Imane Jaaouine, Ross D. King
2025-11-30
Abstract
Large language models (LLMs) produce context inconsistency hallucinations, which are LLM generated outputs that are misaligned with the user prompt. This research project investigates whether prompt engineering (PE) methods can mitigate context inconsistency hallucinations in zero-shot LLM summarisation of scientific texts, where zero-shot indicates that the LLM relies purely on its pre-training data. Across eight yeast biotechnology research paper abstracts, six instruction-tuned LLMs were prom...
This study addresses context inconsistency hallucinations in LLM-generated summaries of scientific texts. It investigates whether prompt engineering methods can mitigate these hallucinations in zero-shot summarisation, where the LLM relies solely on its pre-training data.
Six instruction-tuned LLMs were prompted with seven methods, including baseline prompts, prompts with increasing instruction complexity, and prompts with context repetition or random sentence addition. The summaries were evaluated using six metrics.
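As a rough illustration of how such prompt variants might be assembled, the sketch below builds a baseline prompt, a context-repetition prompt, and a random-addition prompt. The instruction wording, the function names, the number of repetitions, and the source of the added sentence are all assumptions made for illustration; the text above does not specify them, so this should not be read as the authors' exact templates.

```python
import random

# Hypothetical instruction text; the study's actual wording is not given here.
INSTRUCTION = "Summarise the following abstract."


def baseline_prompt(abstract: str) -> str:
    """Instruction followed by the abstract, with no further manipulation."""
    return f"{INSTRUCTION}\n\n{abstract}"


def context_repetition_prompt(abstract: str, repeats: int = 2) -> str:
    """Repeat the abstract `repeats` times (assumed value) after the instruction."""
    context = "\n\n".join([abstract] * repeats)
    return f"{INSTRUCTION}\n\n{context}"


def random_addition_prompt(abstract: str, pool: list[str], seed: int = 0) -> str:
    """Append one sentence drawn at random from `pool` to the abstract.

    Where the added sentence comes from is an assumption left to the caller.
    """
    rng = random.Random(seed)  # seeded so the prompt is reproducible
    extra = rng.choice(pool)
    return f"{INSTRUCTION}\n\n{abstract} {extra}"
```

Each function returns a plain string, so the same abstract can be passed through every variant and sent to each model under otherwise identical conditions.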
The results indicated that context repetition and random addition significantly improved the lexical alignment of LLM-generated summaries with the original abstracts. However, increasing instruction complexity did not improve semantic alignment and was instead associated with a decline in it.
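To make "lexical alignment" concrete, the sketch below computes a simple unigram-overlap F1 score between a summary and its source abstract. This is a generic overlap measure chosen for illustration; the six metrics used in the study are not named above, so the function names and the tokenisation rule here are assumptions and not the authors' evaluation code.

```python
import re
from collections import Counter


def tokens(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def unigram_f1(summary: str, reference: str) -> float:
    """F1 of unigram overlap between a summary and a reference text."""
    s_counts = Counter(tokens(summary))
    r_counts = Counter(tokens(reference))
    # Multiset intersection: each shared word counts up to its minimum frequency.
    overlap = sum((s_counts & r_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(s_counts.values())
    recall = overlap / sum(r_counts.values())
    return 2 * precision * recall / (precision + recall)
```

A score of this kind rewards shared wording only. A faithful paraphrase can score low, which is why lexical and semantic alignment are reported separately.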