This is such a basic thing nowadays, and ElasticSearch is massive overkill for it. Something like SQLite or LanceDB or basically any vector database is much more appropriate.
This seems to be coming from the “we must make ElasticSearch AI-compatible” department more than anything.
BiraIgnacio
TIL
- Hybrid recall + reranker: Two searches merged, then re-scored for best matches
- Supersession: Old facts get hidden, new ones take their place
- Decay: Recent or often‑used memories get a score boost
- DLS: Each user only sees their own documents
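For anyone who wants the first and third bullets in concrete form, here is a rough sketch, not the article's implementation. It assumes the two searches are a BM25 list and a vector list, merges them with reciprocal rank fusion (my choice, the article may merge differently), and applies a recency boost with a made-up 30-day half-life. The function names and constants are hypothetical; the reranker, supersession and DLS steps are left out.

```python
def rrf_merge(bm25_ids, vector_ids, k=60):
    """Merge two ranked id lists with reciprocal rank fusion (assumed merge step)."""
    scores = {}
    for ranked in (bm25_ids, vector_ids):
        for rank, doc_id in enumerate(ranked, start=1):
            # Each list contributes 1 / (k + rank); ids found in both lists add up.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores


def decay_boost(score, age_days, half_life_days=30.0):
    """Boost recent memories: factor is 2.0 at age 0 and tends to 1.0 as age grows."""
    return score * (1.0 + 0.5 ** (age_days / half_life_days))


def rank_memories(bm25_ids, vector_ids, age_days_by_id):
    """Return ids ordered by fused score after the recency boost (ties broken by id)."""
    fused = rrf_merge(bm25_ids, vector_ids)
    boosted = {d: decay_boost(s, age_days_by_id[d]) for d, s in fused.items()}
    return sorted(boosted, key=lambda d: (-boosted[d], d))


if __name__ == "__main__":
    ages = {"a": 0, "b": 0, "c": 0, "d": 0}
    print(rank_memories(["a", "b", "c"], ["b", "d", "a"], ages))
```

With equal ages the order comes purely from the fusion step; making one memory much fresher than the rest can move it past ids that only one search returned.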
itissid
I have a request: can this text be even more AI generated?
0xbadcafebee
Summary of the article (https://pastebin.com/aawJfrF6) since the original one is like reading an academic paper filtered through an LLM that hates human readers.
It seems like a cool approach. Don't know if it's novel but it's much smarter than "shove markdown files into directories".
voidUpdate
For someone who isn't super familiar, what is "R@10", and is 0.89 good? It's impossible to google for
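For reference, "R@10" is usually read as recall at 10: of the items that should have been retrieved, the fraction that show up in the top 10 results. Assuming the article uses it that way, 0.89 would mean roughly 89% of the relevant memories made it into the top 10 and about 11% did not. A minimal sketch, with a hypothetical function name and toy ids:

```python
def recall_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the relevant ids that appear in the first k ranked results."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("need at least one relevant id")
    hits = relevant.intersection(ranked_ids[:k])
    return len(hits) / len(relevant)


if __name__ == "__main__":
    # Toy data: three relevant memories, two of them in the top 3 results.
    print(recall_at_k(["a", "x", "b", "y"], ["a", "b", "c"], k=3))
```

Whether a given value is good depends on the benchmark and on how many relevant items each query has, so the number means little without that context.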
reactordev
I built one into my agent using sqlite…
koinedad
I think this is cool and helpful but my biggest complaint is the writing style and word choice just scream LLM
verdverm
I'm using Typesense to power my take on a md kb, highly recommend this option which positions itself against Elasticsearch and Algolia. Combines vector with bm25 and all the extras you get from a trad search tool like Algolia.
dominotw
is there any proof that all these shenanigans improve agent performance?
tuo-lei
so the 11% miss rate - do users actually notice when the agent drops a memory? like if someone already said they tried X and the agent suggests it again.