That is actually what I want to work on: I want more real-world questions and answers to check it against.
It is already working with Phi-2 and other models, and for this I support both LM Studio and Ollama.
I am building a RAG system that uses embedding extraction to talk to a database built from these PDFs. On hallucination, I was using TinyLlama earlier, and after moving to Phi-2 it works better.
I will need suggestions on how to test it, because in my organisation I am not responsible for testing; I mainly do integration and development.
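One rough way to test this kind of setup is a small set of question and expected-answer pairs run through the pipeline, with two checks per case: does the answer contain the expected facts, and how much of the answer is backed by the retrieved context. The sketch below is only an illustration in Python, not the actual .NET code. The `ask` callable, the case format and the `min_grounding` value of 0.6 are all assumptions, and token overlap is a crude hint of hallucination, not a proper measure.

```python
import re


def tokens(text):
    """Lowercase word and number tokens, used for rough overlap checks."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def contains_expected(answer, expected_facts):
    """True if every expected fact string appears in the answer (case-insensitive)."""
    lowered = answer.lower()
    return all(fact.lower() in lowered for fact in expected_facts)


def grounding_ratio(answer, context):
    """Share of answer tokens that also occur in the retrieved context.

    A low value is only a hint that the model may be making things up.
    """
    answer_tokens = tokens(answer)
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & tokens(context)) / len(answer_tokens)


def evaluate(cases, ask, min_grounding=0.6):
    """Run each case through `ask` and collect pass/fail details.

    `ask` is a hypothetical callable: question -> (answer, context).
    `min_grounding` is an assumed cut-off and needs tuning per data set.
    """
    results = []
    for case in cases:
        answer, context = ask(case["question"])
        ratio = grounding_ratio(answer, context)
        results.append({
            "question": case["question"],
            "correct": contains_expected(answer, case["expected"]),
            "grounded": ratio >= min_grounding,
            "grounding_ratio": round(ratio, 2),
        })
    return results
```

Each case would be a dict like `{"question": "...", "expected": ["..."]}` written by hand from the PDFs, so the same set can be re-run whenever the model or the retrieval settings change.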
In the future, finance models or finance-chat fine-tuned models could be added to help with this.
In RAG, the best approach I have found is to stuff the prompt with accurate, good-quality data, controlled by the relevance and limit settings.
For example, for Navin Fluorine after moving to Phi-2 it was a limit of 8 with 40% relevance, while for ITC it was 20 and 60%.
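The relevance and limit idea can be sketched roughly like this in Python. It is a simplified illustration, not the actual Semantic Kernel or Qdrant code: the `Chunk` class, the function names and the prompt wording are made up for the example, the scores are assumed to be similarity values between 0 and 1, and the numbers above are read as limit first, relevance second.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    score: float  # similarity score from the vector store, assumed to be in [0, 1]


def select_context(chunks, min_relevance, limit):
    """Keep chunks scoring at least min_relevance, best first, at most `limit` of them."""
    kept = [c for c in chunks if c.score >= min_relevance]
    kept.sort(key=lambda c: c.score, reverse=True)
    return kept[:limit]


def build_prompt(question, chunks):
    """Stuff the selected chunks into a single prompt for the model."""
    context = "\n\n".join(c.text for c in chunks)
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


# Hypothetical per-document settings, reading the numbers as (limit, relevance).
SETTINGS = {
    "navin_fluorine": {"limit": 8, "min_relevance": 0.40},
    "itc": {"limit": 20, "min_relevance": 0.60},
}
```

A higher relevance cut-off drops weak matches before they reach the model, and the limit keeps the prompt small enough for a small model, so the two values probably need tuning per document set.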
For the tech stack I am using Qdrant, Hugging Face text embeddings, Semantic Kernel, mainly a .NET API format, and Ollama, with everything running in a Docker container and all stored locally.