3 Comments

Hi Avi. I'll have to disagree with you here. People dramatically overestimate the performance improvements from vector DBs. Even at very large scales, more traditional IR techniques can be a great foundation, with vector embeddings used mainly for precision once we have filtered out the most useless parts. And for that, ordinary DBs are good enough for storing vectors.
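To make the idea concrete, here is a minimal sketch of that two-stage setup: a cheap lexical filter first, then an exact cosine re-rank over the survivors. Everything in it is an assumption for illustration: the toy corpus, the made-up 3-dimensional embeddings, the term-overlap score (a crude stand-in for something like BM25), and the in-memory dict standing in for an ordinary database table.

```python
import math
from collections import Counter

# Hypothetical toy corpus: doc id -> (text, embedding).
# The embeddings are invented; a real system would get them from an embedding model
# and keep them in a regular table next to the text.
DOCS = {
    "a": ("postgres stores vectors in a plain column", [0.9, 0.1, 0.0]),
    "b": ("vector databases use approximate nearest neighbour search", [0.7, 0.6, 0.1]),
    "c": ("recipe for sourdough bread", [0.0, 0.1, 0.9]),
    "d": ("store vectors next to metadata in postgres", [0.8, 0.3, 0.1]),
}


def lexical_score(query, text):
    """Count query terms that also appear in the text (crude stand-in for BM25)."""
    q = Counter(query.lower().split())
    t = Counter(text.lower().split())
    return sum(min(q[w], t[w]) for w in q)


def cosine(u, v):
    """Exact cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def search(query, query_vec, keep=2):
    """Stage 1: drop docs with no term overlap. Stage 2: exact cosine re-rank."""
    candidates = [
        doc_id for doc_id, (text, _) in DOCS.items()
        if lexical_score(query, text) > 0
    ]
    ranked = sorted(
        candidates,
        key=lambda doc_id: cosine(query_vec, DOCS[doc_id][1]),
        reverse=True,
    )
    return ranked[:keep]


if __name__ == "__main__":
    print(search("postgres vectors", [1.0, 0.2, 0.0]))
```

Because the filter shrinks the candidate set before any vector math happens, the re-rank can stay exact (no ANN index) in this sketch. Whether that holds up at a given scale depends on how selective the first stage is.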

author

I see. Can you tell me what performance differences you observed when you used vector DBs, or share the studies you read? (I am assuming that by "performance improvement" you mean run-time.)

At small scales, of course, a vector DB did not make much sense, but at intermediate scales and above, the difference was quite noticeable when I experimented. That said, yes, precision was slightly lower because of approximate nearest neighbor (ANN) search.

Would love to learn more about your perspective.


This is a good one: https://about.xethub.com/blog/you-dont-need-a-vector-database

There's other research that reaches similar conclusions, and my own work with these systems points the same way. Vector DBs are okay (they will never hurt), but they are rarely what drives success.

By "performance", I mean cost, quality of retrieval, and the architectural complexity of the system.
