
Tuning Model Parameters: Temperature, Top P, and Max Tokens

To control the creativity, length, and randomness of AI responses, configure these model parameters when requesting completions.


1. Temperature (Randomness vs Determinism)

  • Range: 0 to 2 (default is 1).
  • Behavior:
    • Low Temperature (0.0 - 0.2): More deterministic, focused output. Suited to code generation, data extraction, and factual summaries. A low temperature reduces variation but does not by itself guarantee accuracy.
    • High Temperature (0.8 - 1.5): More varied, creative output. Suited to copywriting, creative fiction, or brainstorming. Values approaching 2 can drift into incoherent text.
  • Important: OpenAI's guidance is to adjust either temperature or top_p, not both at once. Leave one at its default value while tuning the other.
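
To make the effect of temperature concrete, here is a minimal sketch of the usual mechanism: the model's raw scores (logits) are divided by the temperature before being turned into probabilities. The function name softmaxWithTemperature and the example logits are hypothetical, chosen only for illustration, and the API's actual internals may differ (for instance, this sketch rejects a temperature of 0, which the API accepts).

```javascript
// Turn raw scores (logits) into probabilities after dividing by a temperature.
// Hypothetical helper for illustration; not part of the OpenAI SDK.
function softmaxWithTemperature(logits, temperature) {
  if (temperature <= 0) {
    throw new RangeError("temperature must be > 0 in this sketch");
  }
  const scaled = logits.map((x) => x / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Made-up scores for three candidate tokens.
const exampleLogits = [2.0, 1.0, 0.1];

const cold = softmaxWithTemperature(exampleLogits, 0.2);
const neutral = softmaxWithTemperature(exampleLogits, 1.0);
const hot = softmaxWithTemperature(exampleLogits, 1.5);

// The top token dominates at low temperature and loses share as temperature rises.
console.log({ cold, neutral, hot });
```

At a low temperature almost all of the probability lands on the highest-scoring token, which is why output becomes repeatable. At a higher temperature the distribution flattens, so less likely tokens get picked more often.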

2. Top P (Nucleus Sampling)

  • Range: 0 to 1 (default is 1).
  • Behavior:
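    • Instead of sampling from every possible token, the model samples only from the smallest set of most likely tokens whose combined probability reaches top_p.
    • Setting top_p to 0.1 means the model only considers the tokens that make up the top 10 percent of probability mass. This is not the same as the top 10 percent of tokens by count; in practice it is often just one or two tokens.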
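
The filtering step described above can be sketched as follows. This is an illustrative sketch rather than the API's actual implementation: nucleusFilter and the toy probabilities are hypothetical, and real implementations may treat the exact cutoff boundary slightly differently.

```javascript
// Keep the smallest set of highest-probability tokens whose cumulative
// probability reaches topP, then renormalize what is left.
// Hypothetical helper for illustration; not part of the OpenAI SDK.
function nucleusFilter(probsByToken, topP) {
  const sorted = Object.entries(probsByToken).sort((a, b) => b[1] - a[1]);
  const kept = [];
  let cumulative = 0;
  for (const [token, p] of sorted) {
    kept.push([token, p]);
    cumulative += p;
    if (cumulative >= topP) break; // the nucleus is complete
  }
  const total = kept.reduce((sum, [, p]) => sum + p, 0);
  return Object.fromEntries(kept.map(([token, p]) => [token, p / total]));
}

// Made-up next-token probabilities.
const tokenProbs = { cat: 0.5, dog: 0.3, fish: 0.15, emu: 0.05 };

const narrow = nucleusFilter(tokenProbs, 0.1); // only "cat" survives
const wide = nucleusFilter(tokenProbs, 0.9); // "cat", "dog", "fish" survive

console.log({ narrow, wide });
```

With top_p at 0.1 a single token already covers the required mass, so sampling becomes effectively deterministic. With top_p at 0.9 only the unlikely tail ("emu" here) is cut off.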

3. Token Limits and Control

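import OpenAI from "openai";

// The client reads the API key from the OPENAI_API_KEY environment variable.
const openai = new OpenAI();

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Tell me a joke." }],

  // 1. Cap the number of tokens generated in the response
  max_tokens: 150,

  // 2. Control randomness (lower is more deterministic)
  temperature: 0.7,

  // 3. Stop generating at the first newline or at "User:"
  stop: ["\n", "User:"],
});

console.log(response.choices[0].message.content);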
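
  • max_tokens: Caps the number of tokens the model can generate, which bounds the cost of each response. A response that hits the cap is cut off where it stands, not shortened to fit.
  • stop: Tells the model to stop generating as soon as it would produce one of the listed sequences. The matched sequence itself is not included in the output.
  • presence_penalty: Range -2 to 2 (default is 0). Positive values penalize any token that has already appeared, which encourages the model to move on to new topics.
  • frequency_penalty: Range -2 to 2 (default is 0). Positive values penalize tokens in proportion to how often they have already appeared, which discourages verbatim repetition.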
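
The difference between the two penalties is easier to see in code. The sketch below follows the adjustment formula as described in OpenAI's parameter documentation, where each token's logit is reduced by count × frequency_penalty plus a flat presence_penalty if the token has appeared at all. Treat it as an approximation: applyPenalties, the logits, and the counts are hypothetical values made up for illustration.

```javascript
// Lower the logits of tokens that have already been generated.
// Hypothetical helper for illustration; not part of the OpenAI SDK.
function applyPenalties(logits, counts, { presencePenalty = 0, frequencyPenalty = 0 } = {}) {
  const adjusted = {};
  for (const [token, logit] of Object.entries(logits)) {
    const count = counts[token] ?? 0; // times this token has appeared so far
    adjusted[token] =
      logit -
      count * frequencyPenalty - // grows with every repetition
      (count > 0 ? presencePenalty : 0); // flat, applied once the token has appeared
  }
  return adjusted;
}

// Made-up scores and usage counts.
const rawLogits = { the: 3.0, cat: 2.0, dog: 2.0 };
const seenCounts = { the: 4, cat: 1 };

const adjusted = applyPenalties(rawLogits, seenCounts, {
  presencePenalty: 0.5,
  frequencyPenalty: 0.25,
});

console.log(adjusted); // { the: 1.5, cat: 1.25, dog: 2 }
```

Here "dog" has not been used yet, so it keeps its score and becomes the most likely next token, while the heavily repeated "the" loses the most.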