🔥 Turn ANY website into LLM-ready data [Open-source]
AI systems love neatly formatted data: Markdown, structured data, HTML, and so on.
And now it is easier than ever to produce LLM-digestible data!
Firecrawl is an open-source framework that takes a URL, crawls it, and converts it into a clean markdown or structured format.
LLM-ready formats → Markdown, HTML, Structured data, metadata.
Handles the hard stuff → proxies, anti-bots, dynamic content.
Customizable → exclude tags, custom headers, max depth.
Reliable → gets the data you need, no matter what.
Batching → scrape thousands of URLs at once.
Media parsing → PDFs, DOCX, images.
Actions → click, scroll, input, wait.
If you prefer Firecrawl’s managed service, you can use the code “DDODS” for a 10% discount here →
Thanks to Firecrawl for partnering with us today!
What is temperature in LLMs?
A low temperature value produces identical responses from the LLM (shown below):
But a high temperature value produces gibberish.
What exactly is temperature in LLMs?
Let’s understand this today!
Traditional classification models use softmax to generate the final prediction from logits:
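To make that concrete, here is a minimal sketch of turning logits into class probabilities with softmax (the logit values are made up for illustration):

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating
    z = logits - np.max(logits)
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()

# Hypothetical logits for a 4-class classifier
logits = np.array([2.0, 1.0, 0.5, -1.0])
probs = softmax(logits)

print(probs)           # ≈ [0.61, 0.22, 0.14, 0.03]
print(probs.argmax())  # a traditional classifier always picks the argmax -> class 0
```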
In LLMs, the output layer spans the entire vocabulary.
The difference is that a traditional classification model predicts the class with the highest softmax score, which makes it deterministic.
But LLMs sample the prediction from these softmax probabilities:
Thus, even though “Token 1” has the highest probability of being selected (0.86), it may not be chosen as the next token since we are sampling.
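Here is a minimal sketch of that difference. The 0.86 comes from the example above; the token names and remaining probabilities are made-up assumptions:

```python
import numpy as np

# Toy next-token distribution over a 3-token vocabulary
tokens = ["Token 1", "Token 2", "Token 3"]
probs = np.array([0.86, 0.10, 0.04])

# A traditional classifier would always predict the argmax -> "Token 1"
print(tokens[probs.argmax()])

# An LLM samples from the distribution, so other tokens can still appear
rng = np.random.default_rng(0)
for _ in range(5):
    print(rng.choice(tokens, p=probs))
```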
Temperature introduces the following tweak in the softmax function, which, in turn, influences the sampling process:
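Concretely, the logits are divided by the temperature T before applying softmax: p_i = exp(z_i / T) / Σ_j exp(z_j / T). A minimal sketch of that tweak in code (the function name is just illustrative):

```python
import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    # The "tweak": divide the logits by T before exponentiating
    # p_i = exp(z_i / T) / sum_j exp(z_j / T)
    z = logits / temperature
    z = z - np.max(z)          # for numerical stability
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()
```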
If the temperature is low, the probabilities look more like a max value instead of a “soft-max” value.
This means the sampling process will almost certainly choose the token with the highest probability.
This makes the generation process greedy and (almost) deterministic.
If the temperature is high, the probabilities start to look like a uniform distribution:
This means the sampling process may select any token.
This makes the generation process heavily stochastic.
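To see both effects with actual numbers, here is a small standalone sketch (same made-up logits as before) comparing a low, a standard, and a high temperature:

```python
import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    z = logits / temperature
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.5, -1.0])   # made-up logits

for T in (0.1, 1.0, 5.0):
    probs = softmax_with_temperature(logits, T)
    print(f"T={T}: {np.round(probs, 3)}")

# T=0.1 -> [1.    0.    0.    0.   ]   almost greedy / deterministic
# T=1.0 -> [0.61  0.224 0.136 0.03 ]   standard softmax
# T=5.0 -> [0.322 0.263 0.238 0.177]   approaching uniform
```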
A quick note: In practice, the model can generate different outputs even if temperature=0. This is because there are still several other sources of randomness, such as race conditions in multithreaded code.
That said, here are some best practices for using temperature (a quick usage sketch follows this list):
Set a low temperature value to generate predictable responses.
Set a high temperature value to generate more random and creative responses.
An extremely high temperature value rarely has any real utility, as we saw at the top.
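As a usage sketch, assuming the OpenAI Python SDK (most LLM APIs expose the same knob; the model name here is just an example):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Low temperature -> predictable, focused answers (good for extraction, code)
deterministic = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name
    messages=[{"role": "user", "content": "Summarize softmax in one line."}],
    temperature=0.1,
)

# Higher temperature -> more varied, creative answers (good for brainstorming)
creative = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Give me a quirky blog title about softmax."}],
    temperature=1.2,
)

print(deterministic.choices[0].message.content)
print(creative.choices[0].message.content)
```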
And that explains the purpose of temperature in LLMs.
That said, any AI system will only be as good as the data going in.
Firecrawl helps you ensure that your AI systems always receive neatly formatted data: Markdown, structured data, HTML, and so on.
If you prefer Firecrawl’s managed service, you can use the code “DDODS” for a 10% discount here →
👉 Over to you: How do you determine an ideal value of temperature?
Thanks for reading!