
Kafka brokers handle both consumer connections and data storage. This creates contention, because the leader (primary) for each partition has to serve client traffic and handle disk IO on the same machine. Consumers that aren't tailing the stream cause slowdowns, because Kafka has to seek to their offset in log files that aren't cached in RAM.

Pulsar separates storage into a different layer (powered by Apache BookKeeper), which allows reads to be served from multiple storage nodes. There's much more IO throughput available to handle consumers picking up anywhere in the stream.
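To make the caching point concrete, here's a toy simulation (my own sketch, not Kafka or Pulsar code). It assumes the broker's page cache behaves like a small LRU over log segments, with a producer appending one segment per step and a single consumer reading a fixed number of segments behind the head. The names `PageCache` and `simulate` are made up for illustration, and real brokers have readahead and much larger caches, so treat the numbers as qualitative only.

```python
from collections import OrderedDict


class PageCache:
    """Toy LRU cache standing in for the OS page cache on one broker."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()
        self.hits = 0
        self.misses = 0

    def touch(self, segment):
        """Access a segment; return True if it was already cached."""
        if segment in self.pages:
            self.pages.move_to_end(segment)
            return True
        self.pages[segment] = True
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict least recently used
        return False

    def read(self, segment):
        """Consumer read: a miss stands in for a disk seek."""
        if self.touch(segment):
            self.hits += 1
        else:
            self.misses += 1


def simulate(lag_segments, total_segments=1000, cache_capacity=8):
    """Return (hits, misses) for one consumer reading `lag_segments` behind the head."""
    cache = PageCache(cache_capacity)
    for head in range(total_segments):
        cache.touch(head)  # the producer's write lands in the cache
        target = head - lag_segments
        if target >= 0:
            cache.read(target)
    return cache.hits, cache.misses


if __name__ == "__main__":
    print("tailing consumer (hits, misses):", simulate(lag_segments=0))
    print("lagging consumer (hits, misses):", simulate(lag_segments=500))
```

In this model the tailing consumer is always served from memory, while the lagging one misses on every read and also evicts the segments that tailing readers would want. Spreading reads over several storage nodes, as in the Pulsar design, gives those misses more disks to land on.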


This paper is really important: it shows that transfer learning can be applied to a wide variety of NLP problems with great success. The authors show state-of-the-art results on nearly every major class of NLP problem.

The basic approach is the same as our ULMFiT (http://nlp.fast.ai/classification/2018/05/15/introducting-ul...) model - pre-train a language model (a model that predicts the next word in a sequence) on a large corpus, and then modify the language model slightly for whatever task you wish to do (e.g. text classification). Finally, fine-tune that model using your target corpus (e.g. texts labeled with classes).
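For anyone who wants to see the three stages in code, here is a deliberately tiny pure-Python sketch. It is not ULMFiT and not the paper's transformer: the "encoder" is just a mean of word embeddings, the corpus is a handful of made-up sentences, and every name (`sgd_step`, `lm_head`, `clf_head`, the hyperparameters) is an assumption for illustration. What it does show is the shape of the recipe: pre-train a next-word predictor, swap the output head, then fine-tune the same weights on labeled data.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sgd_step(emb, head, word_ids, target, lr):
    """One SGD step of mean-embedding -> linear head -> softmax cross-entropy.

    Updates `emb` and `head` in place and returns the loss before the update.
    """
    dim, n_out = len(head), len(head[0])
    # Toy encoder: average the embeddings of the input words.
    h = [sum(emb[i][k] for i in word_ids) / len(word_ids) for k in range(dim)]
    probs = softmax([sum(h[k] * head[k][c] for k in range(dim)) for c in range(n_out)])
    loss = -math.log(probs[target])
    d_logits = [p - (1.0 if c == target else 0.0) for c, p in enumerate(probs)]
    d_h = [sum(head[k][c] * d_logits[c] for c in range(n_out)) for k in range(dim)]
    for k in range(dim):
        for c in range(n_out):
            head[k][c] -= lr * h[k] * d_logits[c]
    for i in word_ids:
        for k in range(dim):
            emb[i][k] -= lr * d_h[k] / len(word_ids)
    return loss


def init(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim = 8

# Stage 1: pre-train a language model (predict the next word) on unlabeled text.
corpus = ("the movie was great . the food was awful . "
          "the movie was awful . the food was great .").split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
emb = init(len(vocab), dim, rng)
lm_head = init(dim, len(vocab), rng)
pairs = [(idx[a], idx[b]) for a, b in zip(corpus, corpus[1:])]
lm_losses = []
for _ in range(200):
    lm_losses.append(sum(sgd_step(emb, lm_head, [a], b, 0.1) for a, b in pairs) / len(pairs))

# Stage 2: keep the pre-trained embeddings, replace the LM head with a 2-class head.
clf_head = init(dim, 2, rng)

# Stage 3: fine-tune embeddings and the new head on a small labeled set.
labeled = [("the movie was great", 1), ("the food was awful", 0)]
clf_losses = []
for _ in range(200):
    total = 0.0
    for text, label in labeled:
        total += sgd_step(emb, clf_head, [idx[w] for w in text.split()], label, 0.1)
    clf_losses.append(total / len(labeled))

print(f"LM loss: {lm_losses[0]:.2f} -> {lm_losses[-1]:.2f}")
print(f"classifier loss: {clf_losses[0]:.2f} -> {clf_losses[-1]:.2f}")
```

The only thing carried from stage 1 into stage 3 is `emb`. In a real setup that would be the full pre-trained network, and stage 1 would be something you download rather than run yourself.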

This new paper has two significant leaps over ULMFiT:

- Replace the RNN with a transformer model.

- Apply to many more types of problem.

Note that although the original language model takes them a long time to train (a month on 8 GPUs), there's almost no reason for anyone else to train their own model from scratch, unless you need to use this approach on a language that doesn't have a pre-trained model yet. Fine-tuning takes nowhere near as long as the language model pre-training, and you can just use the existing pre-trained weights.

The previous HN discussion on ULMFiT may also be of interest: https://news.ycombinator.com/item?id=17076222

