Description
Large language models, or LLMs, suffer from hallucinations. A hallucination is a type of inaccuracy that occurs when an LLM is "mistaken" about a given fact and generates output and reasoning built on that mistaken assumption. Hallucinations are not only wrong; they also carry tremendous potential for harm in high-risk domains like medicine. Methods have been developed to mitigate this issue, most notably retrieval augmented generation, or RAG. RAG connects an LLM to a database of searchable data points, which are retrieved and supplied to the model so it can reason over them through in-context learning, or ICL. This approach works well and yields strong performance gains on certain tasks. However, when an LLM is not supplied with relevant sources, whether because retrieval fails or the dataset lacks depth, the hallucination problem arises once again. Our objective with this project is to create a methodology to gauge the answerability of a query in three stages:
This project has broad implications: understanding whether a given question is answerable enables the construction of more sophisticated pipelines, stronger guardrails, and dynamic methods for handling queries of varying complexity.
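For concreteness, the sketch below illustrates the kind of retrieval-plus-answerability gate described above: retrieve candidate passages for a query, then only proceed to RAG generation if the best match is relevant enough. It is a minimal, hypothetical illustration, not the project's actual method; the bag-of-words embed(), the cosine retriever, the 0.3 threshold, and the toy corpus are all stand-ins for a real embedding model, retriever, and knowledge base.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for a real embedding model: a sparse
# bag-of-words vector keyed by lowercase alphanumeric tokens.
def embed(text: str) -> Counter:
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

# Cosine similarity between two sparse bag-of-words vectors.
def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Rank corpus passages by similarity to the query and return the
# top-k (score, passage) pairs.
def retrieve(query: str, corpus: list[str], k: int = 3) -> list[tuple[float, str]]:
    q = embed(query)
    scored = sorted(((cosine(q, embed(d)), d) for d in corpus), reverse=True)
    return scored[:k]

# Crude answerability gate: treat the query as answerable only if the
# best-retrieved passage clears a relevance threshold (0.3 is arbitrary).
def answerable(query: str, corpus: list[str], threshold: float = 0.3) -> bool:
    hits = retrieve(query, corpus)
    return bool(hits) and hits[0][0] >= threshold

if __name__ == "__main__":
    corpus = [
        "Metformin is a common first-line medication for type 2 diabetes.",
        "Retrieval augmented generation grounds model outputs in retrieved text.",
    ]
    query = "What is the first-line medication for type 2 diabetes?"
    if answerable(query, corpus):
        print("Relevant context found: proceed with RAG generation.")
    else:
        print("Abstain or escalate: no sufficiently relevant context.")
```

In a real pipeline, the gate's decision would come from stronger signals than a single similarity score, but the control flow (retrieve, judge answerability, then either generate or abstain) is the pattern the project aims to formalize.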
Timeline