
Exploring the OpenAI API: Tokens, Costs, and Usage

This notebook demonstrates how to interact with the OpenAI API from Python in a reproducible classroom or research environment.
The focus is on understanding how tokenization, model usage, and costs per token work in practice.

⚠️ Prices current as of March 2026. OpenAI updates its pricing regularly. Always verify the latest prices at https://openai.com/api/pricing/ before running cost estimates in production.

What the Notebook Does

  1. Connects to the OpenAI API using the shared API key.

  2. Sends example prompts to small and large models (e.g., gpt-4o or gpt-4o-mini) to illustrate response quality and cost trade-offs.

  3. Explores tokenization — how text is converted into tokens and how token counts vary by model.

  4. Calculates API usage costs, showing how prompt length and model choice affect pricing.

  5. Visualizes results, helping students understand the relationship between:

    • Input text length (number of tokens)

    • Model type and context window

    • Cost per request

Learning Goals

  • Understand what a token is and how it differs from characters or words.

  • Learn to estimate and monitor API usage costs.

  • Gain experience working with environment variables and best practices for secret management.

  • Build intuition for trade-offs between model size, latency, and price in practical applications.
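To build intuition for the first goal, we can compare character, word, and approximate token counts for a short sentence. The ~4 characters per token figure used below is a common rule of thumb for English text, not an exact tokenizer; for exact counts you would use a real tokenizer such as OpenAI's tiktoken library.

```python
# Rough comparison of characters, words, and estimated tokens.
# The 4-characters-per-token ratio is only a rule of thumb for English text;
# exact counts require a real tokenizer (e.g., tiktoken).

def rough_token_estimate(text: str) -> int:
    """Estimate token count using the ~4 chars/token heuristic."""
    return max(1, round(len(text) / 4))

sample = "Tariffs raise domestic prices and create deadweight loss."
print("Characters: ", len(sample))
print("Words:      ", len(sample.split()))
print("Est. tokens:", rough_token_estimate(sample))
```

Note how the three counts differ: tokens usually sit between characters and words, because common words map to a single token while rare words split into several.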

import os
from IPython.display import display
import ipywidgets as widgets
try:
    from dotenv import load_dotenv
except ImportError:
    !pip install python-dotenv
    from dotenv import load_dotenv
try:
    from openai import OpenAI
except ImportError:
    !pip install openai
    from openai import OpenAI

API Key Setup

To keep credentials secure, the API key is not stored directly in this notebook.

The API key is linked to my credit card, so if it leaks, unauthorized usage could run up charges. GitHub also automatically scans public repositories for exposed API keys, so a committed key would be flagged (and likely revoked).

Instead, it is stored in a .env file inside a shared directory (../shared/.env) with a line like: OPENAI_API_KEY="..."

# file path for the .env file containing the OpenAI API key
env_file_path = "/home/jovyan/shared/.env"
load_dotenv(env_file_path)

openai_api_key = os.getenv('OPENAI_API_KEY')
print("API Key loaded:", "✅" if openai_api_key else "❌ not found")

(Optional: manually set a different API key)

# this cell is only needed if you want to manually set a different API key
# openai_api_key = ""

Notes for Instructors

  • The shared .env file allows multiple users on the same DataHub instance or Jupyter environment to access a single institutional API key without embedding secrets in their notebooks.

  • Students should never print the API key or share the .env file contents publicly.

  • The key can be rotated by updating the shared .env file; all dependent notebooks will continue to function.

The OpenAI Python Package

The OpenAI Python package provides a simple interface for interacting with OpenAI's models, such as GPT, o1, and o3, directly from Python code. It supports both synchronous and asynchronous API calls, making it easy to send prompts, generate completions, and analyze responses. The package can read the API key from the OPENAI_API_KEY environment variable (or accept it explicitly, as we do below) and returns structured results that can be easily integrated into data workflows, Jupyter notebooks, or applications for natural language processing, code generation, or AI-assisted analysis.

Initializing the OpenAI Client

Once the API key is loaded from the environment, we create a client object that serves as our connection to the OpenAI API.
This client will handle authentication and allow us to make requests to different models.

We’ll initialize it like this:

client = OpenAI(api_key = openai_api_key)

Checking Available Models

Before making any API calls, it’s useful to list the models that your API key can access.
The client.models.list() method returns all available model identifiers for your OpenAI account, such as gpt-4o, gpt-4-turbo, and smaller variants like gpt-4o-mini.
Listing these helps confirm the correct model names to use in later API requests.

models = client.models.list()
print([m.id for m in models])
# Send a chat message to GPT-4o-mini
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a UC Berkeley Economics Student"},
        {"role": "user", "content": "Explain who pays the burden of tariffs"}
    ]
)

# Display response
print(response.choices[0].message.content)

Basic Chat Completion Example

To demonstrate the simplest API call, we can send a chat-style request to one of the OpenAI language models.
Here, we use client.chat.completions.create() to send a short conversation.
The model responds based on the system and user messages provided.

In this example, the system message defines the context (“You are a UC Berkeley Economics student”), and the user asks a question (“Explain who pays the burden of tariffs”).
The model returns a text completion that we can extract and display from the response object.

This basic pattern—system message, user message, and model reply—is the foundation of all chat-based interactions with OpenAI models.

OpenAI Token Pricing (approx., March 2026)

Always verify the latest prices at: https://openai.com/api/pricing/

Prices converted to per 1,000 tokens for readability.

| Model | Input (per 1K tokens) | Output (per 1K tokens) | Context Window |
| --- | --- | --- | --- |
| GPT-5 | $0.00125 | $0.01000 | 256K |
| GPT-5 mini | $0.00025 | $0.00200 | 256K |
| GPT-5 nano | $0.00005 | $0.00040 | 256K |
| GPT-4.1 | $0.00200 | $0.00800 | ~1M |
| GPT-4.1 mini | $0.00040 | $0.00160 | ~1M |
| GPT-4.1 nano | $0.00020 | $0.00080 | ~1M |
| GPT-4o | $0.00250 | $0.01000 | 128K |
| GPT-4o mini | $0.00015 | $0.00060 | 128K |
| o3-mini | $0.00110 | $0.00440 | ~200K |
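OpenAI's pricing page lists rates per 1M tokens; the per-1K figures above are simply those rates divided by 1,000. A minimal sketch of that conversion, using the GPT-5 row from the table above:

```python
# Convert a per-1M-token price to a per-1K-token price.
# The example numbers correspond to the GPT-5 row in the table above.

def per_1k(price_per_million: float) -> float:
    """Divide a per-1M-token rate by 1,000 to get the per-1K rate."""
    return price_per_million / 1000

gpt5_input_per_million = 1.25    # $1.25 per 1M input tokens
gpt5_output_per_million = 10.00  # $10.00 per 1M output tokens

print(f"Input:  ${per_1k(gpt5_input_per_million):.5f} per 1K tokens")
print(f"Output: ${per_1k(gpt5_output_per_million):.5f} per 1K tokens")
```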

Quick Interpretation

| Tier | Typical Use |
| --- | --- |
| Nano models | embeddings, classification, routing |
| Mini models | chatbots, summarization, RAG |
| 4.1 / 5 class models | coding, reasoning, complex tasks |
| O-series models | deliberate reasoning (math, planning, multi-step problems) |

Example: Cost to Generate 1 Million Output Tokens

| Model | Cost |
| --- | --- |
| GPT-5 | $10.00 |
| GPT-4.1 mini | $1.60 |
| GPT-5 nano | $0.40 |

This illustrates how model choice dramatically affects cost.
Using a smaller model for simple tasks can reduce API costs by 10×–25× or more.
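The figures in the table above come from simple multiplication: 1 million tokens is 1,000 blocks of 1K tokens, so the cost is 1,000 times the per-1K output price. A quick check using the output prices listed earlier:

```python
# Cost to generate 1M output tokens:
#   (1,000,000 / 1,000) * price per 1K output tokens
output_price_per_1k = {
    "gpt-5": 0.01000,
    "gpt-4.1-mini": 0.00160,
    "gpt-5-nano": 0.00040,
}

for model, price in output_price_per_1k.items():
    cost = (1_000_000 / 1_000) * price
    print(f"{model}: ${cost:.2f}")
```

This reproduces the $10.00, $1.60, and $0.40 figures in the table, and makes the 25× gap between GPT-5 and GPT-5 nano explicit.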

A widget to calculate costs of token consumption

# Define token prices (per 1K tokens)
# Prices approx. March 2026 — verify at https://openai.com/api/pricing/

token_prices = {
    "gpt-5": {"input": 0.00125, "output": 0.01000},
    "gpt-5-mini": {"input": 0.00025, "output": 0.00200},
    "gpt-5-nano": {"input": 0.00005, "output": 0.00040},

    "gpt-4.1": {"input": 0.00200, "output": 0.00800},
    "gpt-4.1-mini": {"input": 0.00040, "output": 0.00160},
    "gpt-4.1-nano": {"input": 0.00020, "output": 0.00080},

    "gpt-4o": {"input": 0.00250, "output": 0.01000},
    "gpt-4o-mini": {"input": 0.00015, "output": 0.00060},

    "o3-mini": {"input": 0.00110, "output": 0.00440},
}
# Widgets
model_selector = widgets.Dropdown(
    options=list(token_prices.keys()),
    value="gpt-4o-mini",
    description='Model:',)

input_tokens = widgets.IntText(
    value=10000,
    description='Input Tokens:',)

output_tokens = widgets.IntText(
    value=5000,
    description='Output Tokens:',)

estimate_button = widgets.Button(
    description="Estimate Cost",
    button_style="success")

cost_display = widgets.Label(value="")

# Define the estimator
def estimate_cost(b):
    model = model_selector.value
    input_count = input_tokens.value
    output_count = output_tokens.value
    prices = token_prices[model] 
    cost = (input_count / 1000) * prices["input"] + (output_count / 1000) * prices["output"]
    cost_display.value = f"💲 Estimated Cost: ${cost:.6f}"

estimate_button.on_click(estimate_cost)

# Display everything
display(model_selector, input_tokens, output_tokens, estimate_button, cost_display)
# Send a chat message to GPT-4o mini
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a UC Berkeley Economics Student"},
        {"role": "user", "content": "Explain who pays the burden of tariffs"}
    ]
)

# Display response
print(response.choices[0].message.content)

# Display token usage
print("\n🔢 Token Usage:")
print(f"Prompt tokens: {response.usage.prompt_tokens}")
print(f"Completion tokens: {response.usage.completion_tokens}")
print(f"Total tokens: {response.usage.total_tokens}")

How about a more complicated example?

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": """
You are an advanced UC Berkeley undergraduate economics student enrolled in
Intermediate Macroeconomics and International Trade.

Your responses should:
• Use formal economic reasoning and terminology.
• Reference economic models such as supply and demand, partial equilibrium,
  and basic trade theory (e.g., incidence of tariffs, elasticity, welfare effects).
• Explain concepts clearly but at an intermediate level appropriate for
  a second-year economics major.
• Use structured explanations and occasionally reference diagrams conceptually
  (even though you cannot draw them).
• Distinguish between short-run and long-run effects when appropriate.
• Avoid political rhetoric and focus on economic analysis.

When relevant, discuss:
- Consumer surplus
- Producer surplus
- Deadweight loss
- Elasticity of supply and demand
- Terms of trade effects
- Distributional impacts across domestic consumers, domestic producers,
  and foreign exporters.

Respond in 2–3 well-structured paragraphs.
"""
        },
        {
            "role": "user",
            "content": """
Suppose the United States imposes a 25% tariff on imported steel. The policy
is justified politically as a way to protect domestic steel producers and
support domestic manufacturing employment.

Using standard economic theory, analyze who ultimately bears the burden of
this tariff.

In your answer, discuss:

• How the incidence of the tariff depends on the elasticity of supply and demand
  in the domestic and international markets.

• Whether domestic consumers, domestic producers, or foreign exporters bear
  the largest share of the burden.

• The effects on prices, consumer surplus, producer surplus, and government
  tariff revenue.

• Any deadweight losses generated by the tariff.

• How the analysis might differ in the short run versus the long run.

Assume the United States is a large enough country that its tariff can affect
world prices.
"""
        }
    ],
    temperature=0.7,
    top_p=1.0,
    presence_penalty=0.5,
    frequency_penalty=0.3,
    max_tokens=400,
    stop=None
)

# Display the response text
print("📘 Response:")
print(response.choices[0].message.content)

# Display token usage
print("\n🔢 Token Usage:")
print(f"Prompt tokens: {response.usage.prompt_tokens}")
print(f"Completion tokens: {response.usage.completion_tokens}")
print(f"Total tokens: {response.usage.total_tokens}")

Cost modeling (not using the API, just calculating based on token counts and prices)

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(token_prices).T
df
# Example token usage scenarios
token_usage_scenarios = {
    "Small": {"input": 500, "output": 250},
    "Medium": {"input": 2000, "output": 1000},
    "Large": {"input": 10000, "output": 5000},
}
# Cost function
def estimate_cost(model, input_tokens, output_tokens):
    price = token_prices[model]
    return (input_tokens/1000)*price["input"] + (output_tokens/1000)*price["output"]
# Build table of results
cost_records = []

for scenario_name, tokens in token_usage_scenarios.items():
    for model_name in token_prices:
        cost = estimate_cost(model_name, tokens["input"], tokens["output"])
        cost_records.append({
            "scenario": scenario_name,
            "model": model_name,
            "cost": cost
        })

token_cost_table = pd.DataFrame(cost_records)
# Select one scenario for visualization
medium_usage_costs = token_cost_table[token_cost_table["scenario"] == "Medium"]

# Bar chart
plt.figure()
plt.bar(medium_usage_costs["model"], medium_usage_costs["cost"])
plt.xticks(rotation=45)
plt.ylabel("Estimated Cost ($)")
plt.title("Model Cost Comparison (Medium Token Usage)")
plt.show()