I fully agree with you - but also with the article.
If I were to index my personal knowledge base, where I have on the order of 1000 documents, I'd need at most 100k embeddings (assuming 100 per document, which is a generous overestimate).
I'm literally talking to a company at the moment that's interested in a chatbot to talk to their internal knowledge base. They have... 100 documents, each around 40 pages long. Again, this will easily fit within 1 million embeddings.
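And using vector databases comes with tradeoffs: they typically rely on approximate nearest-neighbor search, so you will get worse recall than with an exact brute-force search (aka you will sometimes miss the most relevant document).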
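To make the alternative concrete, here's a minimal sketch of exact brute-force search over 100k embeddings with NumPy. It scores every embedding against the query, so nothing is missed. The embedding dimension (128), the random data, and the function name `top_k_exact` are all my own assumptions for illustration; real embeddings would come from whatever model you use.

```python
import numpy as np

def top_k_exact(query, embeddings, k=5):
    """Return indices of the k embeddings most similar to the query (cosine)."""
    # Normalize rows and the query so a dot product equals cosine similarity.
    emb_norm = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    q_norm = query / np.linalg.norm(query)
    scores = emb_norm @ q_norm  # one score per stored embedding
    # argpartition picks the k best without a full sort; then order those k.
    top = np.argpartition(-scores, k - 1)[:k]
    return top[np.argsort(-scores[top])]

# Hypothetical data: 100k random vectors standing in for real embeddings.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((100_000, 128), dtype=np.float32)

# A query that is a slightly perturbed copy of stored vector 42.
query = embeddings[42] + 0.01 * rng.standard_normal(128, dtype=np.float32)

result = top_k_exact(query, embeddings, k=5)
print(result[0])  # index of the best match
```

In practice you'd normalize the stored embeddings once up front rather than on every query, but at this scale a single matrix-vector product per query is usually not the bottleneck.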