Research

Introducing Background Temperature to Characterise Hidden Randomness in Large Language Models

ArXiv CS.AI · 2d ago
AI Summary

Researchers have identified and formalized sources of nondeterminism in large language models that persist even when using deterministic decoding settings. The team introduces "background temperature" to quantify this hidden randomness caused by implementation-level factors like floating-point arithmetic and kernel behavior.

Key Takeaways

  • LLMs exhibit nondeterministic behavior despite temperature T=0 settings due to implementation factors
  • Sources include batch-size variation, kernel non-invariance, and floating-point non-associativity
  • Background temperature concept formalizes and characterizes this previously unexplained hidden randomness
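To make the floating-point point concrete, here is a minimal Python sketch of how accumulation order alone can change a greedy (T=0) choice. It is a toy illustration, not the paper's experiment: the helper names `reduce_in_order` and `greedy_token`, and the deliberately exaggerated values, are assumptions chosen to make the effect visible in ordinary double precision.

```python
def reduce_in_order(terms):
    """Sum floats strictly left to right, like a sequential reduction."""
    total = 0.0
    for t in terms:
        total += t
    return total


def greedy_token(logits):
    """Greedy (T=0) decoding: return the index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Floating-point addition is not associative: grouping changes the result.
left_grouped = (0.1 + 0.2) + 0.3
right_grouped = 0.1 + (0.2 + 0.3)

# The same three terms, accumulated in two different orders
# (hypothetical values, exaggerated so the difference is large).
order_a = [1e16, 1.0, -1e16]  # 1.0 is absorbed by 1e16 before cancellation
order_b = [1e16, -1e16, 1.0]  # cancellation happens first, so 1.0 survives

logit_a = reduce_in_order(order_a)
logit_b = reduce_in_order(order_b)

# A competing token with a fixed logit (also a made-up value).
OTHER_TOKEN_LOGIT = 0.5
token_under_a = greedy_token([OTHER_TOKEN_LOGIT, logit_a])
token_under_b = greedy_token([OTHER_TOKEN_LOGIT, logit_b])

print(left_grouped == right_grouped)
print(logit_a, logit_b)
print(token_under_a, token_under_b)
```

In a real model the discrepancies are far smaller, but the mechanism is the same: if a different batch size or kernel changes the order of a reduction, two nearly tied logits can swap places, and the greedy choice changes with them.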

LLMs produce different outputs even at temperature zero due to hidden computational randomness.

Why It Matters

Understanding hidden sources of randomness in LLMs is critical for reliability, reproducibility, and deployment in high-stakes applications. This research provides a framework for quantifying, and potentially controlling, nondeterminism that practitioners currently cannot fully predict or manage, which could improve model robustness and consistency across different computational environments.

FAQ

Why does temperature zero still produce variable outputs?
Implementation details like floating-point arithmetic, batch processing, and GPU kernel behavior introduce inherent nondeterminism independent of the temperature parameter used for sampling.

How can developers use background temperature in practice?
The framework helps practitioners understand and potentially mitigate hidden randomness sources, leading to more predictable and reproducible LLM behavior across different systems and hardware configurations.
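As a rough practical starting point, one could measure how often repeated runs of the same prompt disagree. The sketch below is a generic repeatability check, not the paper's definition of background temperature (the summary does not give that definition). The function names `output_distribution` and `disagreement_rate` are illustrative, and `fake_generate` is a hypothetical stand-in for a real T=0 model call.

```python
from collections import Counter


def output_distribution(generate, prompt, n_runs=20):
    """Call generate(prompt) n_runs times and tally each distinct output."""
    return Counter(generate(prompt) for _ in range(n_runs))


def disagreement_rate(counts):
    """Fraction of runs that differ from the most common output.

    0.0 means every run produced the same text.
    """
    total = sum(counts.values())
    most_common_count = counts.most_common(1)[0][1]
    return 1.0 - most_common_count / total


def make_fake_generate():
    """Hypothetical stand-in for a T=0 LLM call that diverges on every 5th run."""
    state = {"calls": 0}

    def fake_generate(prompt):
        state["calls"] += 1
        return "answer B" if state["calls"] % 5 == 0 else "answer A"

    return fake_generate


counts = output_distribution(make_fake_generate(), "What is 2 + 2?", n_runs=20)
rate = disagreement_rate(counts)
print(counts)
print(round(rate, 3))
```

Swapping `fake_generate` for a real inference call, and repeating the check under different batch sizes or hardware, would give an empirical view of the variation that the FAQ answer describes.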

This summary was AI-generated. Neural Digest is not liable for the accuracy of source content.