arXiv cs.AI · Saturday · May 23, 2026 · FREE

Improving Quantized Model Performance in Qualitative Analysis with Multi-Pass Prompt Verification

llm · quantization · qualitative-analysis · hallucination

Researchers evaluated quantized LLaMA-3.1 (8B) at 8-bit, 4-bit, 3-bit, and 2-bit precision on qualitative analysis of 82 interview transcripts. The low-bit models hallucinated more and produced less stable results, particularly with non-expert language. To address this, the authors introduce a quantization-aware multi-pass prompt verification method that guides the model through controlled steps, removes unreliable content, and passes results on only after verification. For validation, human coders worked with NVivo and with BF16 LLaMA-3.1; semantic drift and hallucinations in the BF16 output were corrected manually, and the corrected BF16 output together with the NVivo coding formed the gold standard. The method improved accuracy over the baseline quantized models.
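The summary does not spell out the individual passes, so the following is only a minimal sketch of what a multi-pass verification loop could look like, not the authors' implementation. Everything in it is an assumption made for illustration: the `Model` callable, the three passes (propose, grounding check, yes/no confirmation), the prompt wording, and the stub model standing in for a quantized LLaMA-3.1.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical model interface: takes a prompt, returns the model's text reply.
Model = Callable[[str], str]


@dataclass
class Candidate:
    theme: str
    quote: str


def propose(model: Model, transcript: str) -> List[Candidate]:
    """Pass 1 (assumed): ask for themes, one per line as 'theme | quote'."""
    reply = model(f"List themes as 'theme | quote':\n{transcript}")
    candidates = []
    for line in reply.splitlines():
        if "|" not in line:
            continue  # skip lines that ignore the requested format
        theme, quote = line.split("|", 1)
        candidates.append(Candidate(theme.strip(), quote.strip()))
    return candidates


def is_grounded(candidate: Candidate, transcript: str) -> bool:
    """Pass 2 (assumed): the supporting quote must occur in the transcript."""
    return bool(candidate.quote) and candidate.quote.lower() in transcript.lower()


def is_confirmed(model: Model, candidate: Candidate) -> bool:
    """Pass 3 (assumed): a second prompt asks the model to confirm the pairing."""
    reply = model(
        "Does the quote support the theme? Answer YES or NO.\n"
        f"Theme: {candidate.theme}\nQuote: {candidate.quote}"
    )
    return reply.strip().upper().startswith("YES")


def verify(model: Model, transcript: str) -> List[Candidate]:
    """Keep only candidates that survive every verification pass."""
    return [
        c
        for c in propose(model, transcript)
        if is_grounded(c, transcript) and is_confirmed(model, c)
    ]


def stub_model(prompt: str) -> str:
    """Stand-in for a quantized model; its second theme is a hallucination."""
    if prompt.startswith("List themes"):
        return "cost | the rent keeps going up\nsafety | I was robbed twice"
    return "YES"


transcript = "Honestly, the rent keeps going up every year and I can't save."
result = verify(stub_model, transcript)
print([c.theme for c in result])
```

With the stub, the "safety" theme is dropped because its quote never appears in the transcript, which is the kind of unreliable content a verification pass is meant to remove. A real pipeline would replace `stub_model` with calls to the quantized model and would likely need fuzzier matching than a verbatim substring check.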

// why it matters

Improves reliability of quantized LLMs for qualitative analysis, reducing hallucinations in low-resource settings.

Sources

Primary · arXiv cs.AI

▸ Read original at arxiv.org
