Description
Large language models, or LLMs, suffer from hallucinations. A hallucination is a type of inaccuracy that occurs when an LLM is "mistaken" about a given fact and generates output and reasoning built on that mistaken assumption. Hallucinations are not only wrong; they also carry tremendous potential for harm in high-risk domains like medicine. Methods have been developed to mitigate this issue, most notably retrieval augmented generation, or RAG. RAG connects an LLM to a database of searchable data points, which are retrieved and supplied to the model so it can reason over them through in-context learning, or ICL. This approach works well and yields strong performance gains on certain tasks. However, when an LLM is not supplied with relevant sources, whether because retrieval fails or the dataset lacks depth, the hallucination problem arises once again. Our objective with this project is to create a methodology to gauge the answerability of a query in three stages:
This project has broad implications: understanding whether a given question is answerable enables the construction of more sophisticated pipelines, stronger guardrails, and dynamic methods for handling queries of varying complexity.
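For concreteness, the sketch below illustrates the kind of retrieval-plus-answerability gate described above: retrieve candidate passages for a query, then only proceed to RAG generation if the best match is relevant enough. It is a minimal, hypothetical illustration, not the project's actual method; the bag-of-words embed(), the cosine retriever, the 0.3 threshold, and the toy corpus are all stand-ins for a real embedding model, retriever, and knowledge base.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for a real embedding model: a sparse
# bag-of-words vector keyed by lowercase alphanumeric tokens.
def embed(text: str) -> Counter:
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

# Cosine similarity between two sparse bag-of-words vectors.
def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Rank corpus passages by similarity to the query and return the
# top-k (score, passage) pairs.
def retrieve(query: str, corpus: list[str], k: int = 3) -> list[tuple[float, str]]:
    q = embed(query)
    scored = sorted(((cosine(q, embed(d)), d) for d in corpus), reverse=True)
    return scored[:k]

# Crude answerability gate: treat the query as answerable only if the
# best-retrieved passage clears a relevance threshold (0.3 is arbitrary).
def answerable(query: str, corpus: list[str], threshold: float = 0.3) -> bool:
    hits = retrieve(query, corpus)
    return bool(hits) and hits[0][0] >= threshold

if __name__ == "__main__":
    corpus = [
        "Metformin is a common first-line medication for type 2 diabetes.",
        "Retrieval augmented generation grounds model outputs in retrieved text.",
    ]
    query = "What is the first-line medication for type 2 diabetes?"
    if answerable(query, corpus):
        print("Relevant context found: proceed with RAG generation.")
    else:
        print("Abstain or escalate: no sufficiently relevant context.")
```

In a real pipeline, the gate's decision would come from stronger signals than a single similarity score, but the control flow (retrieve, judge answerability, then either generate or abstain) is the pattern the project aims to formalize.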
Timeline