OpenAI Realtime API
📝 Overview
Introduction
The OpenAI Realtime API provides two connection methods:
WebRTC - For real-time audio/video interaction in browsers and mobile clients
WebSocket - For server-to-server application integration
Use Cases
Real-time voice conversations
Audio/video conferencing
Real-time translation
Speech transcription
Real-time code generation
Server-side real-time integration
Key Features
Bidirectional audio streaming
Mixed text and audio conversations
Function calling support
Automatic Voice Activity Detection (VAD)
Audio transcription capabilities
WebSocket server-side integration
🔐 Authentication & Security
Authentication Methods
Standard API Key (server-side only)
Ephemeral Token (client-side use)
Ephemeral Token
Validity: 1 minute
Usage limit: Single connection
Generation: Created via server-side API
```http
POST https://your-newapi-server-address/v1/realtime/sessions
Content-Type: application/json
Authorization: Bearer $NEW_API_KEY

{
  "model": "gpt-4o-realtime-preview-2024-12-17",
  "voice": "verse"
}
```
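Server-side, the same request can be made with the Python standard library. The endpoint and payload come from the request above; the helper name is illustrative, and the commented usage requires a valid key and network access.

```python
import json
import urllib.request

def build_session_request(base_url, api_key, model, voice):
    """Build the POST request that creates a Realtime session and mints an ephemeral token."""
    body = json.dumps({"model": model, "voice": voice}).encode()
    return urllib.request.Request(
        f"{base_url}/v1/realtime/sessions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Usage (requires network and a valid key):
# req = build_session_request("https://your-newapi-server-address", "$NEW_API_KEY",
#                             "gpt-4o-realtime-preview-2024-12-17", "verse")
# with urllib.request.urlopen(req) as resp:
#     session = json.load(resp)  # session["client_secret"]["value"] is the ephemeral key
```

The ephemeral key in `client_secret.value` is what gets handed to the browser client; the standard API key never leaves the server.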
Security Recommendations
Never expose standard API keys on the client side
Use HTTPS/WSS for communication
Implement appropriate access controls
Monitor for unusual activity
🔌 Connection Establishment
WebRTC Connection
URL: https://your-newapi-server-address/v1/realtime
Query parameters: model
Headers:
Authorization: Bearer EPHEMERAL_KEY
Content-Type: application/sdp
WebSocket Connection
URL: wss://your-newapi-server-address/v1/realtime
Query parameters: model
Headers:
Authorization: Bearer YOUR_API_KEY
OpenAI-Beta: realtime=v1
Connection Flow
```mermaid
sequenceDiagram
    participant Client
    participant Server
    participant OpenAI

    alt WebRTC Connection
        Client->>Server: Request ephemeral token
        Server->>OpenAI: Create session
        OpenAI-->>Server: Return ephemeral token
        Server-->>Client: Return ephemeral token
        Client->>OpenAI: Create WebRTC offer
        OpenAI-->>Client: Return answer
        Note over Client,OpenAI: Establish WebRTC connection
        Client->>OpenAI: Create data channel
        OpenAI-->>Client: Confirm data channel
    else WebSocket Connection
        Server->>OpenAI: Establish WebSocket connection
        OpenAI-->>Server: Confirm connection
        Note over Server,OpenAI: Begin real-time conversation
    end
```
Data Channel
Name: oai-events
Purpose: Event transmission
Format: JSON
Audio Stream
Input: addTrack()
Output: ontrack event
💬 Conversation Interaction
Conversation Modes
Text-only conversations
Voice conversations
Mixed conversations
Session Management
Create session
Update session
End session
Session configuration
Event Types
Text events
Audio events
Function calls
Status updates
Error events
⚙️ Configuration Options
Audio Configuration
Input formats
pcm16
g711_ulaw
g711_alaw
Output formats
pcm16
g711_ulaw
g711_alaw
Voice types
alloy
echo
shimmer
Model Configuration
Temperature
Maximum output length
System prompt
Tool configuration
VAD Configuration
Threshold
Silence duration
Prefix padding
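The options above are applied with a session.update client event. A minimal sketch, with illustrative values and field names following the session.update reference later in this document:

```python
import json

# Illustrative session.update payload combining audio, VAD, and model options
session_update = {
    "type": "session.update",
    "session": {
        "modalities": ["text", "audio"],
        "voice": "alloy",
        "input_audio_format": "pcm16",
        "output_audio_format": "pcm16",
        "input_audio_transcription": {"model": "whisper-1"},
        "turn_detection": {
            "type": "server_vad",
            "threshold": 0.5,
            "prefix_padding_ms": 500,
            "silence_duration_ms": 1000,
        },
        "temperature": 0.8,
        "max_output_tokens": "inf",
    },
}

payload = json.dumps(session_update)
# ws.send(payload)  # send over an established WebSocket or data channel
```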
💡 Request Examples
WebRTC Connection ❌
Client Implementation (Browser)
```javascript
async function init() {
  // Get ephemeral key from server - see server code below
  const tokenResponse = await fetch("/session");
  const data = await tokenResponse.json();
  const EPHEMERAL_KEY = data.client_secret.value;

  // Create peer connection
  const pc = new RTCPeerConnection();

  // Set up remote audio playback from the model
  const audioEl = document.createElement("audio");
  audioEl.autoplay = true;
  pc.ontrack = (e) => (audioEl.srcObject = e.streams[0]);

  // Add local audio track from browser microphone input
  const ms = await navigator.mediaDevices.getUserMedia({ audio: true });
  pc.addTrack(ms.getTracks()[0]);

  // Set up data channel for sending and receiving events
  const dc = pc.createDataChannel("oai-events");
  dc.addEventListener("message", (e) => {
    // Receive real-time server events here!
    console.log(e);
  });

  // Start session using Session Description Protocol (SDP)
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);

  const baseUrl = "https://your-newapi-server-address/v1/realtime";
  const model = "gpt-4o-realtime-preview-2024-12-17";
  const sdpResponse = await fetch(`${baseUrl}?model=${model}`, {
    method: "POST",
    body: offer.sdp,
    headers: {
      Authorization: `Bearer ${EPHEMERAL_KEY}`,
      "Content-Type": "application/sdp",
    },
  });

  const answer = {
    type: "answer",
    sdp: await sdpResponse.text(),
  };
  await pc.setRemoteDescription(answer);
}

init();
```
Server Implementation (Node.js)
```javascript
import express from "express";

const app = express();

// Create an endpoint for generating ephemeral tokens
// This endpoint works with the client code above
app.get("/session", async (req, res) => {
  const r = await fetch("https://your-newapi-server-address/v1/realtime/sessions", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.NEW_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "gpt-4o-realtime-preview-2024-12-17",
      voice: "verse",
    }),
  });
  const data = await r.json();

  // Send the JSON received from the OpenAI REST API back to the client
  res.send(data);
});

app.listen(3000);
```
WebRTC Event Send/Receive Example
```javascript
// Create data channel from peer connection
const dc = pc.createDataChannel("oai-events");

// Listen for server events on the data channel;
// event data needs to be parsed from a JSON string
dc.addEventListener("message", (e) => {
  const realtimeEvent = JSON.parse(e.data);
  console.log(realtimeEvent);
});

// Send a client event: serialize a valid client event to
// JSON and send it via the data channel
const responseCreate = {
  type: "response.create",
  response: {
    modalities: ["text"],
    instructions: "Write a haiku about code",
  },
};
dc.send(JSON.stringify(responseCreate));
```
WebSocket Connection ✅
Node.js (ws module)
```javascript
import WebSocket from "ws";

const url = "wss://your-newapi-server-address/v1/realtime?model=gpt-4o-realtime-preview-2024-12-17";
const ws = new WebSocket(url, {
  headers: {
    "Authorization": "Bearer " + process.env.NEW_API_KEY,
    "OpenAI-Beta": "realtime=v1",
  },
});

ws.on("open", function open() {
  console.log("Connected to server.");
});

ws.on("message", function incoming(message) {
  console.log(JSON.parse(message.toString()));
});
```
Python (websocket-client)
```python
# Requires the websocket-client library:
# pip install websocket-client
import os
import json
import websocket

NEW_API_KEY = os.environ.get("NEW_API_KEY")
url = "wss://your-newapi-server-address/v1/realtime?model=gpt-4o-realtime-preview-2024-12-17"
headers = [
    "Authorization: Bearer " + NEW_API_KEY,
    "OpenAI-Beta: realtime=v1",
]

def on_open(ws):
    print("Connected to server.")

def on_message(ws, message):
    data = json.loads(message)
    print("Received event:", json.dumps(data, indent=2))

ws = websocket.WebSocketApp(
    url,
    header=headers,
    on_open=on_open,
    on_message=on_message,
)
ws.run_forever()
```
Browser (Standard WebSocket)
```javascript
/*
Note: in browsers and other client environments, we recommend WebRTC.
In browser-like environments such as Deno and Cloudflare Workers,
you can also use the standard WebSocket interface.
*/
const ws = new WebSocket(
  "wss://your-newapi-server-address/v1/realtime?model=gpt-4o-realtime-preview-2024-12-17",
  [
    "realtime",
    // Authentication
    "openai-insecure-api-key." + NEW_API_KEY,
    // Optional
    "openai-organization." + OPENAI_ORG_ID,
    "openai-project." + OPENAI_PROJECT_ID,
    // Beta protocol, required
    "openai-beta.realtime-v1",
  ]
);

// The standard WebSocket API uses event listeners (there is no ws.on())
ws.addEventListener("open", () => {
  console.log("Connected to server.");
});

ws.addEventListener("message", (e) => {
  console.log(e.data);
});
```
Message Send/Receive Example
Node.js/Browser
```javascript
// Receive server events (Node.js ws module);
// the payload needs to be parsed from a JSON string
ws.on("message", function incoming(message) {
  const serverEvent = JSON.parse(message.toString());
  console.log(serverEvent);
});

// Send events: build a JSON data structure conforming to the client event format
const event = {
  type: "response.create",
  response: {
    modalities: ["audio", "text"],
    instructions: "Give me a haiku about code.",
  },
};
ws.send(JSON.stringify(event));
```
Python
```python
# Send client events: serialize a dictionary to JSON
def on_open(ws):
    print("Connected to server.")
    event = {
        "type": "response.create",
        "response": {
            "modalities": ["text"],
            "instructions": "Please assist the user."
        }
    }
    ws.send(json.dumps(event))

# Received messages need to be parsed from the JSON payload
def on_message(ws, message):
    data = json.loads(message)
    print("Received event:", json.dumps(data, indent=2))
```
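Audio input follows the same pattern: raw PCM16 bytes are base64-encoded into input_audio_buffer.append events, then committed as a user message. A minimal sketch; the helper name and chunk size are illustrative:

```python
import base64
import json

def audio_append_events(pcm16_bytes, chunk_size=4096):
    """Split raw PCM16 audio into base64-encoded input_audio_buffer.append events."""
    events = []
    for i in range(0, len(pcm16_bytes), chunk_size):
        chunk = pcm16_bytes[i:i + chunk_size]
        events.append(json.dumps({
            "type": "input_audio_buffer.append",
            "audio": base64.b64encode(chunk).decode("ascii"),
        }))
    return events

# After streaming all chunks, commit the buffer as a user message:
commit_event = json.dumps({"type": "input_audio_buffer.commit"})

# for e in audio_append_events(raw_audio): ws.send(e)
# ws.send(commit_event)
```

With server VAD enabled, the commit happens automatically when the server detects the end of speech, so the explicit commit is only needed in manual turn-taking mode.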
⚠️ Error Handling
Common Errors
Connection errors
Network issues
Authentication failures
Configuration errors
Audio errors
Device permissions
Unsupported formats
Codec issues
Session errors
Token expiration
Session timeout
Concurrency limits
Error Recovery
Automatic reconnection
Session recovery
Error retry
Graceful degradation
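Automatic reconnection is usually implemented with capped exponential backoff. A sketch of the schedule, assuming hypothetical defaults; after reconnecting, the client must re-send session.update, since session configuration does not survive the old connection:

```python
def backoff_delays(max_attempts=5, base=1.0, cap=30.0):
    """Capped exponential backoff schedule (in seconds) for reconnection attempts."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_attempts)]

# A reconnect loop sleeps for each delay in turn, re-establishes the
# WebSocket, then replays session.update to restore the configuration.
```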
📝 Event Reference
All requests must include the following headers:

| Header | Type | Description | Example Value |
|---|---|---|---|
| Authorization | String | Authentication token | Bearer $NEW_API_KEY |
| OpenAI-Beta | String | API version | realtime=v1 |
Client Events
session.update
Update the default configuration for the session.
| Parameter | Type | Required | Description | Example Value/Optional Values |
|---|---|---|---|---|
| event_id | String | No | Client-generated event identifier | event_123 |
| type | String | No | Event type | session.update |
| modalities | String array | No | Modality types the model can respond with | ["text", "audio"] |
| instructions | String | No | System instructions prepended to model calls | "Your knowledge cutoff is 2023-10..." |
| voice | String | No | Voice type used by the model | alloy, echo, shimmer |
| input_audio_format | String | No | Input audio format | pcm16, g711_ulaw, g711_alaw |
| output_audio_format | String | No | Output audio format | pcm16, g711_ulaw, g711_alaw |
| input_audio_transcription.model | String | No | Model used for transcription | whisper-1 |
| turn_detection.type | String | No | Voice detection type | server_vad |
| turn_detection.threshold | Number | No | VAD activation threshold (0.0-1.0) | 0.8 |
| turn_detection.prefix_padding_ms | Integer | No | Audio duration included before speech starts | 500 |
| turn_detection.silence_duration_ms | Integer | No | Silence duration used to detect the end of speech | 1000 |
| tools | Array | No | List of tools available to the model | [] |
| tool_choice | String | No | How the model chooses tools | auto/none/required |
| temperature | Number | No | Model sampling temperature | 0.8 |
| max_output_tokens | String/Integer | No | Maximum tokens per response | "inf"/4096 |
input_audio_buffer.append
Append audio data to the input audio buffer.

| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Client-generated event identifier | event_456 |
| type | String | No | Event type | input_audio_buffer.append |
| audio | String | No | Base64-encoded audio data | Base64EncodedAudioData |
input_audio_buffer.commit
Commit the audio data in the buffer as a user message.

| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Client-generated event identifier | event_789 |
| type | String | No | Event type | input_audio_buffer.commit |
input_audio_buffer.clear
Clear all audio data from the input audio buffer.

| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Client-generated event identifier | event_012 |
| type | String | No | Event type | input_audio_buffer.clear |
conversation.item.create
Add a new conversation item to the conversation.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Client-generated event identifier | event_345 |
| type | String | No | Event type | conversation.item.create |
| previous_item_id | String | No | New item will be inserted after this ID | null |
| item.id | String | No | Unique identifier for the conversation item | msg_001 |
| item.type | String | No | Type of conversation item | message/function_call/function_call_output |
| item.status | String | No | Status of conversation item | completed/in_progress/incomplete |
| item.role | String | No | Role of message sender | user/assistant/system |
| item.content | Array | No | Message content | [text/audio/transcript] |
| item.call_id | String | No | ID of function call | call_001 |
| item.name | String | No | Name of called function | function_name |
| item.arguments | String | No | Arguments for function call | {"param": "value"} |
| item.output | String | No | Output result of function call | {"result": "value"} |
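A minimal conversation.item.create event adding a user text message; all values are illustrative, and the content entry follows the content types listed in this document:

```python
import json

item_create = {
    "event_id": "event_345",  # optional client-generated id
    "type": "conversation.item.create",
    "item": {
        "type": "message",
        "role": "user",
        "content": [{"type": "text", "text": "Hello!"}],
    },
}

payload = json.dumps(item_create)
# ws.send(payload)  # typically followed by a response.create event
```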
conversation.item.truncate
Truncate audio content in assistant messages.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Client-generated event identifier | event_678 |
| type | String | No | Event type | conversation.item.truncate |
| item_id | String | No | ID of assistant message item to truncate | msg_002 |
| content_index | Integer | No | Index of content part to truncate | 0 |
| audio_end_ms | Integer | No | End time point for audio truncation | 1500 |
conversation.item.delete
Delete the specified conversation item from conversation history.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Client-generated event identifier | event_901 |
| type | String | No | Event type | conversation.item.delete |
| item_id | String | No | ID of conversation item to delete | msg_003 |
response.create
Trigger response generation.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Client-generated event identifier | event_234 |
| type | String | No | Event type | response.create |
| response.modalities | String array | No | Modality types for response | ["text", "audio"] |
| response.instructions | String | No | Instructions for the model | "Please assist the user." |
| response.voice | String | No | Voice type used by the model | alloy/echo/shimmer |
| response.output_audio_format | String | No | Output audio format | pcm16 |
| response.tools | Array | No | List of tools available to the model | ["type", "name", "description"] |
| response.tool_choice | String | No | How the model chooses tools | auto |
| response.temperature | Number | No | Sampling temperature | 0.7 |
| response.max_output_tokens | Integer/String | No | Maximum output tokens | 150/"inf" |
response.cancel
Cancel ongoing response generation.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Client-generated event identifier | event_567 |
| type | String | No | Event type | response.cancel |
Server Events
error
Event returned when an error occurs.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_890 |
| type | String | No | Event type | error |
| error.type | String | No | Error type | invalid_request_error/server_error |
| error.code | String | No | Error code | invalid_event |
| error.message | String | No | Human-readable error message | "The 'type' field is missing." |
| error.param | String | No | Parameter related to error | null |
| error.event_id | String | No | ID of related event | event_567 |
conversation.item.input_audio_transcription.completed
Returned when input audio transcription is enabled and transcription succeeds.

| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_2122 |
| type | String | No | Event type | conversation.item.input_audio_transcription.completed |
| item_id | String | No | ID of user message item | msg_003 |
| content_index | Integer | No | Index of content part containing audio | 0 |
| transcript | String | No | Transcribed text content | "Hello, how are you?" |
conversation.item.input_audio_transcription.failed
Returned when input audio transcription is configured but the transcription request for a user message fails.

| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_2324 |
| type | String | No | Event type | conversation.item.input_audio_transcription.failed |
| item_id | String | No | ID of user message item | msg_003 |
| content_index | Integer | No | Index of content part containing audio | 0 |
| error.type | String | No | Error type | transcription_error |
| error.code | String | No | Error code | audio_unintelligible |
| error.message | String | No | Human-readable error message | "The audio could not be transcribed." |
| error.param | String | No | Parameter related to error | null |
conversation.item.truncated
Returned when client truncates previous assistant audio message item.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_2526 |
| type | String | No | Event type | conversation.item.truncated |
| item_id | String | No | ID of truncated assistant message item | msg_004 |
| content_index | Integer | No | Index of truncated content part | 0 |
| audio_end_ms | Integer | No | Time point when audio was truncated (milliseconds) | 1500 |
conversation.item.deleted
Returned when an item in the conversation is deleted.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_2728 |
| type | String | No | Event type | conversation.item.deleted |
| item_id | String | No | ID of deleted conversation item | msg_005 |
input_audio_buffer.committed
Returned when audio buffer data is committed.

| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_1121 |
| type | String | No | Event type | input_audio_buffer.committed |
| previous_item_id | String | No | New conversation item will be inserted after this ID | msg_001 |
| item_id | String | No | ID of user message item to be created | msg_002 |
input_audio_buffer.cleared
Returned when the client clears the input audio buffer.

| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_1314 |
| type | String | No | Event type | input_audio_buffer.cleared |
input_audio_buffer.speech_started
In server voice detection mode, returned when voice input is detected.

| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_1516 |
| type | String | No | Event type | input_audio_buffer.speech_started |
| audio_start_ms | Integer | No | Milliseconds from session start to voice detection | 1000 |
| item_id | String | No | ID of user message item to be created when voice stops | msg_003 |
input_audio_buffer.speech_stopped
In server voice detection mode, returned when voice input stops.

| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_1718 |
| type | String | No | Event type | input_audio_buffer.speech_stopped |
| audio_end_ms | Integer | No | Milliseconds from session start to detected voice stop | 2000 |
| item_id | String | No | ID of user message item to be created | msg_003 |
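A common use of speech_started is "barge-in": when the user starts speaking over assistant audio, the client cancels the in-flight response and truncates the partially played audio. A sketch; the helper name and ids are illustrative:

```python
import json

def barge_in_events(assistant_item_id, played_ms):
    """Events to send when the user starts speaking over assistant audio."""
    cancel = {"type": "response.cancel"}
    truncate = {
        "type": "conversation.item.truncate",
        "item_id": assistant_item_id,
        "content_index": 0,
        "audio_end_ms": played_ms,  # how much audio the user actually heard
    }
    return [json.dumps(cancel), json.dumps(truncate)]

# On input_audio_buffer.speech_started while audio is playing:
# for e in barge_in_events("msg_004", 1500): ws.send(e)
```

Truncating keeps the model's view of the conversation consistent with what the user actually heard before interrupting.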
response.created
Returned when a new response is created.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_2930 |
| type | String | No | Event type | response.created |
| response.id | String | No | Unique identifier for response | resp_001 |
| response.object | String | No | Object type | realtime.response |
| response.status | String | No | Status of response | in_progress |
| response.status_details | Object | No | Additional details about status | null |
| response.output | Array | No | List of output items generated by response | [] |
| response.usage | Object | No | Usage statistics for response | null |
response.done
Returned when response streaming is complete.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_3132 |
| type | String | No | Event type | response.done |
| response.id | String | No | Unique identifier for response | resp_001 |
| response.object | String | No | Object type | realtime.response |
| response.status | String | No | Final status of response | completed/cancelled/failed/incomplete |
| response.status_details | Object | No | Additional details about status | null |
| response.output | Array | No | List of output items generated by response | [...] |
| response.usage.total_tokens | Integer | No | Total tokens | 50 |
| response.usage.input_tokens | Integer | No | Input tokens | 20 |
| response.usage.output_tokens | Integer | No | Output tokens | 30 |
response.output_item.added
Returned when a new output item is created during response generation.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_3334 |
| type | String | No | Event type | response.output_item.added |
| response_id | String | No | ID of response the output item belongs to | resp_001 |
| output_index | Integer | No | Index of output item in response | 0 |
| item.id | String | No | Unique identifier for output item | msg_007 |
| item.object | String | No | Object type | realtime.item |
| item.type | String | No | Type of output item | message/function_call/function_call_output |
| item.status | String | No | Status of output item | in_progress/completed |
| item.role | String | No | Role associated with output item | assistant |
| item.content | Array | No | Content of output item | ["type", "text", "audio", "transcript"] |
response.output_item.done
Returned when output item streaming is complete.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_3536 |
| type | String | No | Event type | response.output_item.done |
| response_id | String | No | ID of response the output item belongs to | resp_001 |
| output_index | Integer | No | Index of output item in response | 0 |
| item.id | String | No | Unique identifier for output item | msg_007 |
| item.object | String | No | Object type | realtime.item |
| item.type | String | No | Type of output item | message/function_call/function_call_output |
| item.status | String | No | Final status of output item | completed/incomplete |
| item.role | String | No | Role associated with output item | assistant |
| item.content | Array | No | Content of output item | ["type", "text", "audio", "transcript"] |
response.content_part.added
Returned when a new content part is added to assistant message item during response generation.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_3738 |
| type | String | No | Event type | response.content_part.added |
| response_id | String | No | ID of response | resp_001 |
| item_id | String | No | ID of message item to add content part to | msg_007 |
| output_index | Integer | No | Index of output item in response | 0 |
| content_index | Integer | No | Index of content part in message item content array | 0 |
| part.type | String | No | Content type | text/audio |
| part.text | String | No | Text content | "Hello" |
| part.audio | String | No | Base64-encoded audio data | "base64_encoded_audio_data" |
| part.transcript | String | No | Transcribed text of audio | "Hello" |
response.content_part.done
Returned when content part in assistant message item streaming is complete.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_3940 |
| type | String | No | Event type | response.content_part.done |
| response_id | String | No | ID of response | resp_001 |
| item_id | String | No | ID of message item the content part belongs to | msg_007 |
| output_index | Integer | No | Index of output item in response | 0 |
| content_index | Integer | No | Index of content part in message item content array | 0 |
| part.type | String | No | Content type | text/audio |
| part.text | String | No | Text content | "Hello" |
| part.audio | String | No | Base64-encoded audio data | "base64_encoded_audio_data" |
| part.transcript | String | No | Transcribed text of audio | "Hello" |
response.text.delta
Returned when text value of "text" type content part is updated.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_4142 |
| type | String | No | Event type | response.text.delta |
| response_id | String | No | ID of response | resp_001 |
| item_id | String | No | ID of message item | msg_007 |
| output_index | Integer | No | Index of output item in response | 0 |
| content_index | Integer | No | Index of content part in message item content array | 0 |
| delta | String | No | Text delta update content | "Sure, I can h" |
response.text.done
Returned when "text" type content part text streaming is complete.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_4344 |
| type | String | No | Event type | response.text.done |
| response_id | String | No | ID of response | resp_001 |
| item_id | String | No | ID of message item | msg_007 |
| output_index | Integer | No | Index of output item in response | 0 |
| content_index | Integer | No | Index of content part in message item content array | 0 |
| text | String | No | Final complete text content | "Sure, I can help with that." |
response.audio_transcript.delta
Returned when transcription content of model-generated audio output is updated.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_4546 |
| type | String | No | Event type | response.audio_transcript.delta |
| response_id | String | No | ID of response | resp_001 |
| item_id | String | No | ID of message item | msg_008 |
| output_index | Integer | No | Index of output item in response | 0 |
| content_index | Integer | No | Index of content part in message item content array | 0 |
| delta | String | No | Transcription text delta update content | "Hello, how can I a" |
response.audio_transcript.done
Returned when transcription of model-generated audio output streaming is complete.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_4748 |
| type | String | No | Event type | response.audio_transcript.done |
| response_id | String | No | ID of response | resp_001 |
| item_id | String | No | ID of message item | msg_008 |
| output_index | Integer | No | Index of output item in response | 0 |
| content_index | Integer | No | Index of content part in message item content array | 0 |
| transcript | String | No | Final complete transcribed text of audio | "Hello, how can I assist you today?" |
response.audio.delta
Returned when model-generated audio content is updated.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_4950 |
| type | String | No | Event type | response.audio.delta |
| response_id | String | No | ID of response | resp_001 |
| item_id | String | No | ID of message item | msg_008 |
| output_index | Integer | No | Index of output item in response | 0 |
| content_index | Integer | No | Index of content part in message item content array | 0 |
| delta | String | No | Base64-encoded audio data delta | "Base64EncodedAudioDelta" |
response.audio.done
Returned when model-generated audio is complete.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_5152 |
| type | String | No | Event type | response.audio.done |
| response_id | String | No | ID of response | resp_001 |
| item_id | String | No | ID of message item | msg_008 |
| output_index | Integer | No | Index of output item in response | 0 |
| content_index | Integer | No | Index of content part in message item content array | 0 |
Function Calling
response.function_call_arguments.delta
Returned when model-generated function call arguments are updated.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_5354 |
| type | String | No | Event type | response.function_call_arguments.delta |
| response_id | String | No | ID of response | resp_002 |
| item_id | String | No | ID of message item | fc_001 |
| output_index | Integer | No | Index of output item in response | 0 |
| call_id | String | No | ID of function call | call_001 |
| delta | String | No | JSON-format function call arguments delta | "{\"location\": \"San\"" |
response.function_call_arguments.done
Returned when model-generated function call arguments streaming is complete.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_5556 |
| type | String | No | Event type | response.function_call_arguments.done |
| response_id | String | No | ID of response | resp_002 |
| item_id | String | No | ID of message item | fc_001 |
| output_index | Integer | No | Index of output item in response | 0 |
| call_id | String | No | ID of function call | call_001 |
| arguments | String | No | Final complete function call arguments (JSON format) | "{\"location\": \"San Francisco\"}" |
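When function_call_arguments.done arrives, the client parses the arguments, executes the function, returns the result as a function_call_output conversation item, and triggers a new response. A sketch; the helper name and example values are illustrative:

```python
import json

def function_output_events(call_id, result):
    """Return a function result to the model and trigger a follow-up response."""
    output_item = {
        "type": "conversation.item.create",
        "item": {
            "type": "function_call_output",
            "call_id": call_id,
            "output": json.dumps(result),
        },
    }
    follow_up = {"type": "response.create"}
    return [json.dumps(output_item), json.dumps(follow_up)]

# done = {"call_id": "call_001", "arguments": "{\"location\": \"San Francisco\"}"}
# args = json.loads(done["arguments"])        # run your function with these
# for e in function_output_events(done["call_id"], {"temperature": "72F"}):
#     ws.send(e)
```

The trailing response.create is what lets the model speak or write its answer using the function result.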
Other Status Updates
rate_limits.updated
Triggered after each "response.done" event to indicate updated rate limits.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_5758 |
| type | String | No | Event type | rate_limits.updated |
| rate_limits | Object array | No | List of rate limit information | [{"name": "requests_per_min", "limit": 60, "remaining": 45, "reset_seconds": 35}] |
conversation.created
Returned when conversation is created.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_9101 |
| type | String | No | Event type | conversation.created |
| conversation | Object | No | Conversation resource object | {"id": "conv_001", "object": "realtime.conversation"} |
conversation.item.created
Returned when conversation item is created.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_1920 |
| type | String | No | Event type | conversation.item.created |
| previous_item_id | String | No | ID of previous conversation item | msg_002 |
| item | Object | No | Conversation item object | {"id": "msg_003", "object": "realtime.item", "type": "message", "status": "completed", "role": "user", "content": [{"type": "text", "text": "Hello"}]} |
session.created
Returned when session is created.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_1234 |
| type | String | No | Event type | session.created |
| session | Object | No | Session object | {"id": "sess_001", "object": "realtime.session", "model": "gpt-4", "modalities": ["text", "audio"]} |
session.updated
Returned when session is updated.
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | No | Unique identifier for server event | event_5678 |
| type | String | No | Event type | session.updated |
| session | Object | No | Updated session object | {"id": "sess_001", "object": "realtime.session", "model": "gpt-4", "modalities": ["text", "audio"]} |
Rate Limit Event Parameter Table
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| name | String | Yes | Limit name | requests_per_min |
| limit | Integer | Yes | Limit value | 60 |
| remaining | Integer | Yes | Remaining available amount | 45 |
| reset_seconds | Integer | Yes | Reset time (seconds) | 35 |
Function Call Parameter Table
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| type | String | Yes | Function type | function |
| name | String | Yes | Function name | get_weather |
| description | String | No | Function description | Get the current weather |
| parameters | Object | Yes | Function parameter definition | {"type": "object", "properties": {...}} |
Audio Format Parameter Table

| Parameter | Type | Description | Optional Values |
|---|---|---|---|
| sample_rate | Integer | Sample rate | 8000, 16000, 24000, 44100, 48000 |
| channels | Integer | Number of channels | 1 (mono), 2 (stereo) |
| bits_per_sample | Integer | Bits per sample | 16 (pcm16), 8 (g711) |
| encoding | String | Encoding method | pcm16, g711_ulaw, g711_alaw |
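The parameters above determine bandwidth: bytes per second = sample_rate × channels × bits_per_sample ÷ 8. A quick helper for sizing buffers:

```python
def bytes_per_second(sample_rate, channels, bits_per_sample):
    """Raw byte rate of an uncompressed audio stream."""
    return sample_rate * channels * bits_per_sample // 8

# pcm16 at 24 kHz mono: 48,000 bytes per second of audio
rate = bytes_per_second(24000, 1, 16)
```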
Voice Detection Parameter Table
| Parameter | Type | Description | Default Value | Range |
|---|---|---|---|---|
| threshold | Float | VAD activation threshold | 0.5 | 0.0-1.0 |
| prefix_padding_ms | Integer | Voice prefix padding (milliseconds) | 500 | 0-5000 |
| silence_duration_ms | Integer | Silence detection duration (milliseconds) | 1000 | 100-10000 |
Tool Configuration Parameter Table

| Parameter | Type | Description | Optional Values |
|---|---|---|---|
| tool_choice | String | Tool selection method | auto, none, required |
| tools | Array | Available tools list | [{type, name, description, parameters}] |
Model Configuration Parameter Table
| Parameter | Type | Description | Range/Optional Values | Default Value |
|---|---|---|---|---|
| temperature | Float | Sampling temperature | 0.0-2.0 | 1.0 |
| max_output_tokens | Integer/String | Maximum output length | 1-4096/"inf" | "inf" |
| modalities | String array | Response modalities | ["text", "audio"] | ["text"] |
| voice | String | Voice type | alloy, echo, shimmer | alloy |
Event Common Parameter Table
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| event_id | String | Yes | Unique identifier for event | event_123 |
| type | String | Yes | Event type | session.update |
| timestamp | Integer | No | Event timestamp (milliseconds) | 1677649363000 |
Session Status Parameter Table
| Parameter | Type | Description | Optional Values |
|---|---|---|---|
| status | String | Session status | active, ended, error |
| error | Object | Error information | {"type": "error_type", "message": "error message"} |
| metadata | Object | Session metadata | {"client_id": "web", "session_type": "chat"} |
Conversation Item Status Parameter Table
| Parameter | Type | Description | Optional Values |
|---|---|---|---|
| status | String | Conversation item status | completed, in_progress, incomplete |
| role | String | Sender role | user, assistant, system |
| type | String | Conversation item type | message, function_call, function_call_output |
Content Type Parameter Table
| Parameter | Type | Description | Optional Values |
|---|---|---|---|
| type | String | Content type | text, audio, transcript |
| format | String | Content format | plain, markdown, html |
| encoding | String | Encoding method | utf-8, base64 |
Response Status Parameter Table
| Parameter | Type | Description | Optional Values |
|---|---|---|---|
| status | String | Response status | completed, cancelled, failed, incomplete |
| status_details | Object | Status details | {"reason": "user_cancelled"} |
| usage | Object | Usage statistics | {"total_tokens": 50, "input_tokens": 20, "output_tokens": 30} |
Audio Transcription Parameter Table
| Parameter | Type | Description | Example Value |
|---|---|---|---|
| enabled | Boolean | Whether transcription is enabled | true |
| model | String | Transcription model | whisper-1 |
| language | String | Transcription language | en, zh, auto |
| prompt | String | Transcription prompt | "Transcript of a conversation" |
Audio Stream Parameter Table
| Parameter | Type | Description | Optional Values |
|---|---|---|---|
| chunk_size | Integer | Audio chunk size (bytes) | 1024, 2048, 4096 |
| latency | String | Latency mode | low, balanced, high |
| compression | String | Compression method | none, opus, mp3 |
WebRTC Configuration Parameter Table
| Parameter | Type | Description | Default Value |
|---|---|---|---|
| ice_servers | Array | ICE server list | [{"urls": "stun:stun.l.google.com:19302"}] |
| audio_constraints | Object | Audio constraints | {"echoCancellation": true} |
| connection_timeout | Integer | Connection timeout (milliseconds) | 30000 |