OpenAI Request Builder

Build and preview OpenAI Chat Completions API requests with code generation

Configuration

Example request body:
{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello! How can you help me today?"
    }
  ],
  "temperature": 1,
  "max_tokens": 1024,
  "top_p": 1,
  "stream": false
}
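The preview above maps one-to-one onto a Python dict. As a minimal sketch, the hypothetical helper below assembles that request body; the commented lines show how it would be sent with the official `openai` SDK (network call and API key required, so they are not executed here).

```python
import json

def build_chat_request(user_prompt, system_prompt="You are a helpful assistant.",
                       model="gpt-4o", temperature=1, max_tokens=1024,
                       top_p=1, stream=False):
    """Assemble a Chat Completions request body matching the preview above."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": temperature,
        "max_tokens": max_tokens,
        "top_p": top_p,
        "stream": stream,
    }

payload = build_chat_request("Hello! How can you help me today?")
print(json.dumps(payload, indent=2))

# To actually send it with the official SDK:
#   from openai import OpenAI
#   client = OpenAI()  # reads OPENAI_API_KEY from the environment
#   response = client.chat.completions.create(**payload)
```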

Quick Tips

  • Set OPENAI_API_KEY environment variable before running
  • Temperature 0 = deterministic, 2 = maximum randomness
  • Use JSON mode when you need structured output
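The first tip can be checked up front. The SDK reads OPENAI_API_KEY automatically; this small sketch mirrors that lookup so a missing key fails fast with a clear message instead of an authentication error mid-request (`get_api_key` is an illustrative helper, not part of the SDK):

```python
import os

def get_api_key():
    """Fail early if OPENAI_API_KEY is not set."""
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("Set the OPENAI_API_KEY environment variable first.")
    return key
```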

OpenAI Chat Completions API: Complete Request Builder Guide

The OpenAI Chat Completions API is the primary interface for accessing GPT-4o, GPT-4, GPT-3.5-Turbo, and other OpenAI models. This request builder helps you configure API calls visually and generates ready-to-use code in multiple languages.

Whether you're prototyping a new feature, testing different parameters, or learning the API, this tool eliminates the guesswork and helps you build correct API requests quickly.

Understanding Each Parameter

Model

The model determines capabilities, speed, and cost. GPT-4o is the recommended default for most use cases — it's fast, capable, and cost-effective. Use GPT-4o Mini for simpler tasks at lower cost, or o1 for complex reasoning.

Temperature (0-2)

Controls randomness. 0 = mostly deterministic (same input usually gives the same output), 1 = balanced, 2 = maximum randomness. Use 0-0.3 for factual/coding tasks, 0.7-1.0 for creative writing.
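Those rules of thumb can be captured in a small lookup. This is a hypothetical preset table following the guidance above, not an official OpenAI recommendation:

```python
# Illustrative task -> temperature presets based on the guidance above.
TEMPERATURE_PRESETS = {
    "code": 0.0,      # deterministic-leaning: same input, (mostly) same output
    "factual": 0.2,
    "chat": 0.7,
    "creative": 1.0,
}

def temperature_for(task):
    """Pick a temperature for a task type; fall back to the API default of 1."""
    return TEMPERATURE_PRESETS.get(task, 1.0)
```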

Max Tokens

Maximum number of tokens in the response. 1 token ≈ 4 characters in English. This caps cost and response length. Set based on expected output size + buffer.
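Using the ~4 characters per token approximation, a rough sizing helper looks like this. It is only a heuristic (for exact counts, OpenAI's tiktoken library tokenizes precisely); the buffer factor is an assumption, not an API parameter:

```python
import math

def estimate_max_tokens(expected_chars, buffer=1.25):
    """Rough max_tokens sizing: ~4 English characters per token, plus headroom."""
    return math.ceil(expected_chars / 4 * buffer)

# A ~2000-character answer needs roughly 625 tokens of headroom:
print(estimate_max_tokens(2000))  # → 625
```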

Top P (Nucleus Sampling)

Alternative to temperature for controlling randomness. Top P = 0.1 means only top 10% probability tokens are considered. Generally, adjust temperature OR top_p, not both.
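A toy illustration of how nucleus sampling narrows the candidate set: keep the smallest group of tokens whose cumulative probability reaches top_p. The probabilities below are invented for demonstration; the real model works over its full vocabulary.

```python
def nucleus_filter(token_probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability >= top_p."""
    ranked = sorted(token_probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for token, p in ranked:
        kept.append(token)
        total += p
        if total >= top_p:
            break
    return kept

probs = {"the": 0.5, "a": 0.3, "an": 0.15, "xyzzy": 0.05}
print(nucleus_filter(probs, 0.1))  # only the single most likely token survives
print(nucleus_filter(probs, 0.9))  # the low-probability "xyzzy" is cut off
```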

Stream

When enabled, the response is sent incrementally as it's generated. Essential for chat UIs to display typing indicators and reduce perceived latency.

Response Format

JSON Object mode ensures the response is valid JSON. Use it when you need structured output for parsing. Always include instructions in the prompt about the expected JSON structure.
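A sketch of a JSON-mode request: `response_format` asks the API for valid JSON, and the prompt itself spells out the expected structure (the `sentiment`/`score` fields below are illustrative, not required by the API). Since JSON mode guarantees valid JSON, parsing the reply is safe:

```python
import json

payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system",
         "content": 'Reply with JSON: {"sentiment": "positive|negative", "score": 0-1}'},
        {"role": "user", "content": "I love this tool!"},
    ],
    "response_format": {"type": "json_object"},
}

# A reply in this mode parses without a try/except around malformed JSON:
sample_reply = '{"sentiment": "positive", "score": 0.97}'
data = json.loads(sample_reply)
print(data["sentiment"])
```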

Frequently Asked Questions

Which model should I use?

Start with GPT-4o — it's the best balance of capability, speed, and cost. Use GPT-4o Mini for high-volume, simpler tasks. Use o1 or o1-mini for complex reasoning, math, or coding challenges.

How do I get my API key?

Go to platform.openai.com, sign in, navigate to API Keys in your account settings, and create a new secret key. Store it securely — you won't be able to see it again.

What's the difference between temperature and top_p?

Both control randomness but differently. Temperature scales all probabilities; top_p cuts off low-probability options. OpenAI recommends adjusting only one at a time. Temperature is more intuitive for most use cases.

How do I handle streaming responses?

With streaming, the response comes as Server-Sent Events (SSE). Each chunk contains a delta with partial content. Concatenate all deltas to build the complete response. The generated code shows the pattern for both Python and Node.js.
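The concatenation pattern can be shown without a live connection. The fake chunks below mimic the shape of streamed Chat Completions chunks; with the real SDK the loop is the same, iterating over `client.chat.completions.create(..., stream=True)` and reading `chunk.choices[0].delta.content`:

```python
# Simulated stream: each chunk carries a partial "delta"; concatenating
# the deltas rebuilds the complete response.
fake_chunks = [
    {"choices": [{"delta": {"content": "Hel"}}]},
    {"choices": [{"delta": {"content": "lo, "}}]},
    {"choices": [{"delta": {"content": "world!"}}]},
    {"choices": [{"delta": {}}]},  # the final chunk may carry no content
]

parts = []
for chunk in fake_chunks:
    delta = chunk["choices"][0]["delta"].get("content")
    if delta:
        parts.append(delta)

full_response = "".join(parts)
print(full_response)  # → Hello, world!
```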

How is the cost calculated?

Cost = (input tokens × input price) + (output tokens × output price). Different models have different prices. GPT-4o is ~$2.50/M input, $10/M output. GPT-4o Mini is ~$0.15/M input, $0.60/M output.
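The formula above is simple enough to compute directly. This sketch uses the approximate prices quoted in this answer; check openai.com/pricing for current rates before relying on these numbers:

```python
# Approximate prices from the figures above, in USD per 1M tokens.
PRICES = {
    "gpt-4o":      {"input": 2.50, "output": 10.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}

def request_cost(model, input_tokens, output_tokens):
    """Cost = input tokens x input price + output tokens x output price."""
    p = PRICES[model]
    return (input_tokens / 1_000_000 * p["input"]
            + output_tokens / 1_000_000 * p["output"])

# 1,000 prompt tokens + 500 completion tokens on GPT-4o:
print(f"${request_cost('gpt-4o', 1000, 500):.4f}")  # → $0.0075
```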
