Google Gemini Chat Format (Generate Content)¶

Official Documentation

📝 Introduction¶

The Google Gemini API generates model responses from a GenerateContentRequest whose input can include text, images, audio, code, and tools. It supports text generation, visual understanding, audio processing, long context, code execution, JSON schema output, function calling, and other features.

💡 Request Examples¶

Basic Text Chat ✅¶

curl "https://your-newapi-server-address/v1beta/models/gemini-2.0-flash:generateContent?key=$NEWAPI_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts":[{"text": "Write a story about a magic backpack."}]
        }]
       }' 2> /dev/null

Image Analysis Chat ✅¶

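# IMG_PATH must point to the JPEG image to analyze
# Pick the base64 flag that matches this platform
if [[ "$(base64 --version 2>&1)" = *"FreeBSD"* ]]; then
  B64FLAGS="--input"
else
  B64FLAGS="-w0"
fi

# Use temporary files for the base64-encoded image data and the JSON payload
TEMP_B64=$(mktemp)
TEMP_JSON=$(mktemp)
# A single trap cleans up both files (a second EXIT trap would replace the first)
trap 'rm -f "$TEMP_B64" "$TEMP_JSON"' EXIT
base64 $B64FLAGS "$IMG_PATH" > "$TEMP_B64"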

cat > "$TEMP_JSON" << EOF
{
  "contents": [{
    "parts":[
      {"text": "Tell me about this instrument"},
      {
        "inline_data": {
          "mime_type":"image/jpeg",
          "data": "$(cat "$TEMP_B64")"
        }
      }
    ]
  }]
}
EOF

curl "https://your-newapi-server-address/v1beta/models/gemini-2.0-flash:generateContent?key=$NEWAPI_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d "@$TEMP_JSON" 2> /dev/null

Function Calling ✅¶

cat > tools.json << EOF
{
  "function_declarations": [
    {
      "name": "enable_lights",
      "description": "Turn on the lighting system."
    },
    {
      "name": "set_light_color",
      "description": "Set the light color. Lights must be enabled for this to work.",
      "parameters": {
        "type": "object",
        "properties": {
          "rgb_hex": {
            "type": "string",
            "description": "The light color as a 6-digit hex string, e.g. ff0000 for red."
          }
        },
        "required": [
          "rgb_hex"
        ]
      }
    },
    {
      "name": "stop_lights",
      "description": "Turn off the lighting system."
    }
  ]
} 
EOF

curl "https://your-newapi-server-address/v1beta/models/gemini-2.0-flash:generateContent?key=$NEWAPI_API_KEY" \
  -H 'Content-Type: application/json' \
  -d @<(echo '
  {
    "system_instruction": {
      "parts": {
        "text": "You are a helpful lighting system bot. You can turn lights on and off, and you can set the color. Do not perform any other tasks."
      }
    },
    "tools": ['$(cat tools.json)'],

    "tool_config": {
      "function_calling_config": {"mode": "auto"}
    },

    "contents": {
      "role": "user",
      "parts": {
        "text": "Turn on the lights please."
      }
    }
  }
') 2>/dev/null |sed -n '/"content"/,/"finishReason"/p'

JSON Schema Response ✅¶

curl "https://your-newapi-server-address/v1beta/models/gemini-2.0-flash:generateContent?key=$NEWAPI_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
    "contents": [{
      "parts":[
        {"text": "List 5 popular cookie recipes"}
        ]
    }],
    "generationConfig": {
        "response_mime_type": "application/json",
        "response_schema": {
          "type": "ARRAY",
          "items": {
            "type": "OBJECT",
            "properties": {
              "recipe_name": {"type":"STRING"}
            }
          }
        }
    }
}' 2> /dev/null | head

Audio Processing 🟡¶

File Upload Limitations

Audio can only be uploaded as base64-encoded inline_data; file_data.file_uri and the File API are not supported.

# Upload audio data in the API request as base64 inline_data
if [[ "$(base64 --version 2>&1)" = *"FreeBSD"* ]]; then
  B64FLAGS="--input"
else
  B64FLAGS="-w0"
fi
AUDIO_B64=$(base64 $B64FLAGS "$AUDIO_PATH")

curl "https://your-newapi-server-address/v1beta/models/gemini-2.0-flash:generateContent?key=$NEWAPI_API_KEY" \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "contents": [{
      "parts": [
        {"text": "Please describe this audio file."},
        {"inline_data": {"mime_type": "audio/mpeg", "data": "'$AUDIO_B64'"}}
      ]
    }]
  }' 2> /dev/null | jq ".candidates[].content.parts[].text"

Video Processing 🟡¶

File Upload Limitations

Video can only be uploaded as base64-encoded inline_data; file_data.file_uri and the File API are not supported.

# Upload video data in the API request as base64 inline_data
if [[ "$(base64 --version 2>&1)" = *"FreeBSD"* ]]; then
  B64FLAGS="--input"
else
  B64FLAGS="-w0"
fi
VIDEO_B64=$(base64 $B64FLAGS "$VIDEO_PATH")

curl "https://your-newapi-server-address/v1beta/models/gemini-2.0-flash:generateContent?key=$NEWAPI_API_KEY" \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "contents": [{
      "parts": [
        {"text": "Transcribe the audio from this video and provide visual descriptions."},
        {"inline_data": {"mime_type": "video/mp4", "data": "'$VIDEO_B64'"}}
      ]
    }]
  }' 2> /dev/null | jq ".candidates[].content.parts[].text"

PDF Processing 🟡¶

File Upload Limitations

PDFs can only be uploaded as base64-encoded inline_data; file_data.file_uri and the File API are not supported.

MIME_TYPE=$(file -b --mime-type "${PDF_PATH}")
# Upload the PDF file in the API request as base64 inline_data
if [[ "$(base64 --version 2>&1)" = *"FreeBSD"* ]]; then
  B64FLAGS="--input"
else
  B64FLAGS="-w0"
fi
PDF_B64=$(base64 $B64FLAGS "$PDF_PATH")

echo $MIME_TYPE

curl "https://your-newapi-server-address/v1beta/models/gemini-2.0-flash:generateContent?key=$NEWAPI_API_KEY" \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "contents": [{
      "parts": [
        {"text": "Can you add a few more lines to this poem?"},
        {"inline_data": {"mime_type": "application/pdf", "data": "'$PDF_B64'"}}
      ]
    }]
  }' 2> /dev/null | jq ".candidates[].content.parts[].text"

Chat Dialog ✅¶

curl "https://your-newapi-server-address/v1beta/models/gemini-2.0-flash:generateContent?key=$NEWAPI_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [
        {"role":"user",
         "parts":[{
           "text": "Hello"}]},
        {"role": "model",
         "parts":[{
           "text": "Great to meet you. What would you like to know?"}]},
        {"role":"user",
         "parts":[{
           "text": "I have two dogs in my house. How many paws are in my house?"}]}
      ]
    }' 2> /dev/null | grep "text"
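
The multi-turn request above can also be assembled programmatically. The following is a minimal Python sketch, not an official client: the `ChatHistory` class and its method names are hypothetical helpers introduced only for illustration. It keeps the alternating `user` and `model` turns in order and serializes them into the `contents` array.

```python
import json


class ChatHistory:
    """Hypothetical helper that accumulates turns for the `contents` array."""

    def __init__(self) -> None:
        self.contents = []

    def add(self, role: str, text: str) -> None:
        # Only the two roles used in the chat example above are accepted here.
        if role not in ("user", "model"):
            raise ValueError(f"unsupported role for a chat turn: {role}")
        self.contents.append({"role": role, "parts": [{"text": text}]})

    def request_body(self) -> str:
        # json.dumps always produces syntactically valid JSON.
        return json.dumps({"contents": self.contents})


history = ChatHistory()
history.add("user", "Hello")
history.add("model", "Great to meet you. What would you like to know?")
history.add("user", "I have two dogs in my house. How many paws are in my house?")
body = history.request_body()
```

The resulting `body` string can be sent as the POST payload; each later turn is appended to the same history before the next request.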

Streaming Response ✅¶

curl "https://your-newapi-server-address/v1beta/models/gemini-2.0-flash:streamGenerateContent?alt=sse&key=$NEWAPI_API_KEY" \
    -H 'Content-Type: application/json' \
    --no-buffer \
    -d '{
      "contents": [{
        "parts": [{"text": "Write a story about a magic backpack."}]
      }]
    }'
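
With `alt=sse`, the response arrives as server-sent events. The sketch below shows one way to reassemble the streamed text in Python. It assumes each event is a single `data:` line carrying a complete JSON chunk shaped like a GenerateContentResponse; the function names and the sample lines are illustrative, not captured server output.

```python
import json


def iter_sse_chunks(lines):
    """Yield one decoded JSON object per `data:` line of an SSE stream."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any other SSE fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


def stream_text(lines) -> str:
    """Concatenate the text parts of every chunk in arrival order."""
    pieces = []
    for chunk in iter_sse_chunks(lines):
        for candidate in chunk.get("candidates", []):
            for part in candidate.get("content", {}).get("parts", []):
                pieces.append(part.get("text", ""))
    return "".join(pieces)


# Hypothetical stream content used only to exercise the parser.
sample_lines = [
    'data: {"candidates": [{"content": {"parts": [{"text": "Once upon "}]}}]}',
    "",
    'data: {"candidates": [{"content": {"parts": [{"text": "a time."}]}, "finishReason": "STOP"}]}',
    "",
]
story = stream_text(sample_lines)
```

In a real client, `lines` would be the decoded lines read incrementally from the HTTP response body.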

Code Execution ✅¶

curl "https://your-newapi-server-address/v1beta/models/gemini-2.0-flash:generateContent?key=$NEWAPI_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts": [{"text": "Calculate the 10th term of the Fibonacci sequence."}]
      }],
      "tools": [{
        "codeExecution": {}
      }]
    }'

Generation Config ✅¶

curl "https://your-newapi-server-address/v1beta/models/gemini-2.0-flash:generateContent?key=$NEWAPI_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
        "contents": [{
            "parts":[
                {"text": "Explain how AI works"}
            ]
        }],
        "generationConfig": {
            "stopSequences": [
                "Title"
            ],
            "temperature": 1.0,
            "maxOutputTokens": 800,
            "topP": 0.8,
            "topK": 10
        }
    }'  2> /dev/null | grep "text"

Safety Settings ✅¶

echo '{
    "safetySettings": [
        {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_ONLY_HIGH"},
        {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_MEDIUM_AND_ABOVE"}
    ],
    "contents": [{
        "parts":[{
            "text": "I support Martians Soccer Club and I think Jupiterians Football Club sucks! Write an ironic phrase about them."}]}]}' > request.json

curl "https://your-newapi-server-address/v1beta/models/gemini-2.0-flash:generateContent?key=$NEWAPI_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d @request.json 2> /dev/null

System Instruction ✅¶

curl "https://your-newapi-server-address/v1beta/models/gemini-2.0-flash:generateContent?key=$NEWAPI_API_KEY" \
-H 'Content-Type: application/json' \
-d '{ "system_instruction": {
    "parts":
      { "text": "You are a cat. Your name is Neko."}},
    "contents": {
      "parts": {
        "text": "Hello there"}}}'

📮 Request¶

Endpoints¶

Generate Content¶

POST https://your-newapi-server-address/v1beta/{model=models/*}:generateContent

Stream Generate Content¶

POST https://your-newapi-server-address/v1beta/{model=models/*}:streamGenerateContent

Authentication Method¶

Include API key in the request URL:

?key=$NEWAPI_API_KEY

Where $NEWAPI_API_KEY is your Google AI API key.
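
To show how the endpoint path, the model name, and the `key` query parameter fit together, here is a small Python sketch. The helper name `build_url` and the `BASE_URL` constant are assumptions made for illustration; the host is the placeholder used throughout this page.

```python
from urllib.parse import urlencode

# Placeholder host from the examples on this page; replace with your server.
BASE_URL = "https://your-newapi-server-address"


def build_url(model: str, api_key: str, stream: bool = False) -> str:
    """Return the generateContent or streamGenerateContent URL for a model.

    `model` is the bare model name (e.g. "gemini-2.0-flash"); the
    "models/" prefix required by the path parameter is added here.
    """
    method = "streamGenerateContent" if stream else "generateContent"
    query = {"key": api_key}
    if stream:
        # The streaming example on this page requests server-sent events.
        query = {"alt": "sse", "key": api_key}
    return f"{BASE_URL}/v1beta/models/{model}:{method}?{urlencode(query)}"
```

`urlencode` also escapes any characters in the key that are not URL-safe, which plain string concatenation would not do.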

Path Parameters¶

`model`¶

Type: string
Required: yes

The name of the model to generate completions for.

Format: models/{model}, e.g. models/gemini-2.0-flash

Request Body Parameters¶

`contents`¶

Type: array
Required: yes

The content of the current conversation with the model. For a single-turn query, this is a single instance. For chat-like multi-turn queries, this is a repeated field containing the conversation history and the latest request.

Content object properties:

Property	Type	Required	Description
`parts`	array	yes	Ordered content parts that make up a single message
`role`	string	no	The producer of the content in the conversation. `user`, `model`, `function`, or `tool`

Part object properties:

Property	Type	Required	Description
`text`	string	no	Pure text content
`inlineData`	object	no	Inline media byte data
`fileData`	object	no	URI reference to the uploaded file
`functionCall`	object	no	Function call request
`functionResponse`	object	no	Function call response
`executableCode`	object	no	Executable code
`codeExecutionResult`	object	no	Code execution result

InlineData object properties:

Property	Type	Required	Description
`mimeType`	string	yes	Media MIME type
`data`	string	yes	Base64 encoded media data

FileData object properties:

Property	Type	Required	Description
`mimeType`	string	yes	File MIME type
`fileUri`	string	yes	File URI
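
The Content, Part, and InlineData objects nest as follows. This Python sketch builds a single user message with one text part and one inline media part; the helper names are hypothetical, and the three bytes stand in for real image data. It uses the camelCase field names from the tables above, whereas the curl examples use the snake_case spelling, so it is worth verifying which spelling your server accepts.

```python
import base64
import json


def text_part(text: str) -> dict:
    """A Part that carries plain text."""
    return {"text": text}


def inline_data_part(raw: bytes, mime_type: str) -> dict:
    """A Part that carries base64-encoded media bytes."""
    encoded = base64.b64encode(raw).decode("ascii")
    return {"inlineData": {"mimeType": mime_type, "data": encoded}}


def user_content(*parts: dict) -> dict:
    """A Content object produced by the user, with its parts in order."""
    return {"role": "user", "parts": list(parts)}


body = {
    "contents": [
        user_content(
            text_part("Tell me about this instrument"),
            # Stand-in bytes (the JPEG magic number), not a real image.
            inline_data_part(b"\xff\xd8\xff", "image/jpeg"),
        )
    ]
}
payload = json.dumps(body)
```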

`tools`¶

Type: array
Required: no

A list of tools that the model might use to generate the next response. Supported tools include functions and code execution.

Tool object properties:

Property	Type	Required	Description
`functionDeclarations`	array	no	Optional list of function declarations
`codeExecution`	object	no	Enable model to execute code

FunctionDeclaration object properties:

Property	Type	Required	Description
`name`	string	yes	Function name
`description`	string	no	Function description
`parameters`	object	no	Function parameters, in JSON Schema format

FunctionCall object properties:

Property	Type	Required	Description
`name`	string	yes	Name of the function to call
`args`	object	no	Key-value pairs of function arguments

FunctionResponse object properties:

Property	Type	Required	Description
`name`	string	yes	Name of the called function
`response`	object	yes	Response data of the function call

ExecutableCode object properties:

Property	Type	Required	Description
`language`	enum	yes	Programming language of the code
`code`	string	yes	Code to execute

CodeExecutionResult object properties:

Property	Type	Required	Description
`outcome`	enum	yes	Code execution result status
`output`	string	no	Output content of the code execution

CodeExecution object properties:

Property	Type	Required	Description
{}	Empty object	-	Empty configuration object to enable code execution

`toolConfig`¶

Type: object
Required: no

Tool configuration for any tools specified in the request.

ToolConfig object properties:

Property	Type	Required	Description
`functionCallingConfig`	object	no	Function calling configuration

FunctionCallingConfig object properties:

Property	Type	Required	Description
`mode`	enum	no	Specifies the mode of function calling
`allowedFunctionNames`	array	no	List of function names allowed to be called

FunctionCallingMode enum values:

MODE_UNSPECIFIED: Default mode, model decides whether to call a function
AUTO: Model automatically decides when to call a function
ANY: Model must call a function
NONE: Model cannot call a function
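
On the client side, function calling is a round trip: the model returns a `functionCall` part, the application runs the function, and the result goes back as a `functionResponse` part. The Python sketch below illustrates both halves under stated assumptions: the helper names are hypothetical, the sample response is hand-written rather than real output, and the `function` role is taken from the role list in the `contents` section.

```python
def function_calls(response: dict) -> list:
    """Collect (name, args) pairs from every functionCall part."""
    calls = []
    for candidate in response.get("candidates", []):
        parts = candidate.get("content", {}).get("parts", [])
        for part in parts:
            call = part.get("functionCall")
            if call is not None:
                # `args` is optional, so default to an empty mapping.
                calls.append((call["name"], call.get("args", {})))
    return calls


def function_response_content(name: str, result: dict) -> dict:
    """Wrap a local function's result as a Content to send back."""
    return {
        "role": "function",
        "parts": [{"functionResponse": {"name": name, "response": result}}],
    }


# Hand-written sample shaped like a response that requests a function call.
sample_response = {
    "candidates": [{
        "content": {
            "role": "model",
            "parts": [{"functionCall": {"name": "set_light_color",
                                        "args": {"rgb_hex": "ff0000"}}}],
        },
        "finishReason": "STOP",
    }]
}
calls = function_calls(sample_response)
reply = function_response_content("set_light_color", {"ok": True})
```

The `reply` object would be appended to `contents`, after the model's own turn, in the follow-up request.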

`safetySettings`¶

Type: array
Required: no

A list of SafetySetting instances to filter out unsafe content.

SafetySetting object properties:

Property	Type	Required	Description
`category`	enum	yes	Safety category
`threshold`	enum	yes	Blocking threshold

HarmCategory enum values:

HARM_CATEGORY_HARASSMENT: Harassment content
HARM_CATEGORY_HATE_SPEECH: Hate speech and content
HARM_CATEGORY_SEXUALLY_EXPLICIT: Explicitly sexual content
HARM_CATEGORY_DANGEROUS_CONTENT: Dangerous content
HARM_CATEGORY_CIVIC_INTEGRITY: Content that might be used to undermine civic integrity

HarmBlockThreshold enum values:

BLOCK_LOW_AND_ABOVE: Allows only content rated NEGLIGIBLE
BLOCK_MEDIUM_AND_ABOVE: Allows content rated NEGLIGIBLE or LOW
BLOCK_ONLY_HIGH: Allows content rated NEGLIGIBLE, LOW, or MEDIUM
BLOCK_NONE: Allows all content
OFF: Turns off safety filters

Complete HarmBlockThreshold enum values:

HARM_BLOCK_THRESHOLD_UNSPECIFIED: Threshold not specified
BLOCK_LOW_AND_ABOVE: Blocks harmful content with a probability of low or higher, only allowing NEGLIGIBLE level content
BLOCK_MEDIUM_AND_ABOVE: Blocks harmful content with a probability of medium or higher, allowing NEGLIGIBLE and LOW level content
BLOCK_ONLY_HIGH: Only blocks harmful content with a high probability, allowing NEGLIGIBLE, LOW, and MEDIUM level content
BLOCK_NONE: Does not block any content, allowing all levels
OFF: Completely turns off safety filters
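
Because both fields of a SafetySetting are enums, a typo in either value is easy to make. The following Python sketch validates a category-to-threshold mapping against the values listed above before building the `safetySettings` array. The function name and the decision to validate on the client are illustrative assumptions, not API requirements.

```python
HARM_CATEGORIES = {
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
    "HARM_CATEGORY_CIVIC_INTEGRITY",
}

THRESHOLDS = {
    "BLOCK_LOW_AND_ABOVE",
    "BLOCK_MEDIUM_AND_ABOVE",
    "BLOCK_ONLY_HIGH",
    "BLOCK_NONE",
    "OFF",
}


def safety_settings(levels: dict) -> list:
    """Turn {category: threshold} into a list of SafetySetting objects."""
    settings = []
    for category, threshold in levels.items():
        if category not in HARM_CATEGORIES:
            raise ValueError(f"unknown harm category: {category}")
        if threshold not in THRESHOLDS:
            raise ValueError(f"unknown threshold: {threshold}")
        settings.append({"category": category, "threshold": threshold})
    return settings


settings = safety_settings({
    "HARM_CATEGORY_HARASSMENT": "BLOCK_ONLY_HIGH",
    "HARM_CATEGORY_HATE_SPEECH": "BLOCK_MEDIUM_AND_ABOVE",
})
```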

`systemInstruction`¶

Type: object (Content)
Required: no

System instruction set by the developer. Currently only supports text.

`generationConfig`¶

Type: object
Required: no

Model generation and output configuration options.

GenerationConfig object properties:

Property	Type	Required	Description
`stopSequences`	array	no	Set of character sequences to stop generation (up to 5)
`responseMimeType`	string	no	MIME type of the generated candidate text
`responseSchema`	object	no	Output schema of the generated candidate text
`responseModalities`	array	no	Response modalities requested
`candidateCount`	integer	no	Number of generated answers to return
`maxOutputTokens`	integer	no	Maximum number of tokens in the candidate answers
`temperature`	number	no	Controls the randomness of output, range [0.0, 2.0]
`topP`	number	no	Cumulative probability upper bound of tokens to consider during sampling
`topK`	integer	no	Maximum number of tokens to consider during sampling
`seed`	integer	no	Seed used for decoding
`presencePenalty`	number	no	Presence penalty
`frequencyPenalty`	number	no	Frequency penalty
`responseLogprobs`	boolean	no	Whether to export logprobs results in the response
`logprobs`	integer	no	Number of top logprobs returned
`enableEnhancedCivicAnswers`	boolean	no	Enables enhanced civic service answers
`speechConfig`	object	no	Speech generation configuration
`thinkingConfig`	object	no	Thinking function configuration
`mediaResolution`	enum	no	Specified media resolution

Supported MIME types:

text/plain: (default) Text output
application/json: JSON response
text/x.enum: ENUM as string response

Modality enum values:

TEXT: Indicates model should return text
IMAGE: Indicates model should return image
AUDIO: Indicates model should return audio

Schema object properties:

Property	Type	Required	Description
`type`	enum	yes	Data type
`description`	string	no	Field description
`enum`	array	no	List of enum values (when type is string)
`example`	any	no	Example value
`nullable`	boolean	no	Whether it can be null
`format`	string	no	String format (e.g., date, date-time)
`items`	object	no	Schema for array items (when type is array)
`properties`	object	no	Schema for object properties (when type is object)
`required`	array	no	List of required property names
`minimum`	number	no	Minimum value for numbers
`maximum`	number	no	Maximum value for numbers
`minItems`	integer	no	Minimum length for arrays
`maxItems`	integer	no	Maximum length for arrays
`minLength`	integer	no	Minimum length for strings
`maxLength`	integer	no	Maximum length for strings

Type enum values:

TYPE_UNSPECIFIED: Type not specified
STRING: String type
NUMBER: Number type
INTEGER: Integer type
BOOLEAN: Boolean type
ARRAY: Array type
OBJECT: Object type
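
To tie `responseMimeType`, `responseSchema`, and the Schema type values together, here is a hedged Python sketch that requests a JSON array of recipe objects and decodes the result. The sample response is hand-written for illustration, the sampling values are arbitrary, and `parse_json_candidate` is a hypothetical helper name.

```python
import json

# Schema: an array of objects, each with a required string field.
recipe_schema = {
    "type": "ARRAY",
    "items": {
        "type": "OBJECT",
        "properties": {"recipe_name": {"type": "STRING"}},
        "required": ["recipe_name"],
    },
}

generation_config = {
    "responseMimeType": "application/json",
    "responseSchema": recipe_schema,
    "temperature": 0.2,       # arbitrary value within [0.0, 2.0]
    "maxOutputTokens": 800,
}


def parse_json_candidate(response: dict):
    """Decode the JSON text carried by the first candidate's text parts."""
    parts = response["candidates"][0]["content"]["parts"]
    text = "".join(part.get("text", "") for part in parts)
    return json.loads(text)


# Hand-written sample shaped like a JSON-mode response.
sample_response = {
    "candidates": [{
        "content": {
            "role": "model",
            "parts": [{"text": '[{"recipe_name": "Chocolate Chip Cookies"}]'}],
        },
        "finishReason": "STOP",
    }]
}
recipes = parse_json_candidate(sample_response)
```

If generation stops early (for example with `MAX_TOKENS`), the text may be truncated JSON, so a real client would want to handle `json.JSONDecodeError`.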

Supported programming languages (ExecutableCode):

LANGUAGE_UNSPECIFIED: Language not specified
PYTHON: Python programming language

Code execution result enum (Outcome):

OUTCOME_UNSPECIFIED: Result not specified
OUTCOME_OK: Code execution successful
OUTCOME_FAILED: Code execution failed
OUTCOME_DEADLINE_EXCEEDED: Code execution timed out

`cachedContent`¶

Type: string
Required: no

The name of cached content, used as context for providing predictions. Format: cachedContents/{cachedContent}

📥 Response¶

GenerateContentResponse¶

Response from the model, which may contain multiple candidate answers. Safety ratings and content filtering are reported for the prompt (in promptFeedback) and for each candidate (in finishReason and safetyRatings).

`candidates`¶

Type: array
Description: List of candidate answers from the model

Candidate object properties:

Property	Type	Description
`content`	object	Generated content returned by the model
`finishReason`	enum	Reason for the model to stop generating tokens
`safetyRatings`	array	List of safety ratings for the candidate answer
`citationMetadata`	object	Citation information for the generated candidate
`tokenCount`	integer	Token count for this candidate
`groundingAttributions`	array	Information about sources that contributed to generating a grounded answer
`groundingMetadata`	object	Grounding metadata for the candidate
`avgLogprobs`	number	Average log probability score for the candidate
`logprobsResult`	object	Log probability scores for answer tokens and preceding tokens
`urlRetrievalMetadata`	object	Metadata related to URL context retrieval tool
`urlContextMetadata`	object	Metadata related to URL context retrieval tool
`index`	integer	Index of the candidate in the response candidate list

FinishReason enum values:

STOP: Natural stopping point or provided stop sequence for the model
MAX_TOKENS: Maximum token limit specified in the request reached
SAFETY: Answer candidate content marked for safety reasons
RECITATION: Answer candidate content marked for recitation reasons
LANGUAGE: Answer candidate content marked for using unsupported language
OTHER: Reason unknown
BLOCKLIST: Token generation operation stopped because content contains prohibited words
PROHIBITED_CONTENT: Token generation operation stopped because content might contain prohibited content
SPII: Token generation operation stopped because content might contain sensitive personal information
MALFORMED_FUNCTION_CALL: Model-generated function call invalid
IMAGE_SAFETY: Token generation stopped because generated image violated safety rules

`promptFeedback`¶

Type: object
Description: Prompt feedback related to content filtering

PromptFeedback object properties:

Property	Type	Description
`blockReason`	enum	Reason for blocking the prompt
`safetyRatings`	array	Safety rating for the prompt

BlockReason enum values:

BLOCK_REASON_UNSPECIFIED: Default value, this value is not used
SAFETY: System blocked prompt due to safety reasons
OTHER: Prompt blocked due to unknown reasons
BLOCKLIST: System blocked this prompt because it contained terms in the blocklist
PROHIBITED_CONTENT: System blocked this prompt because it contained prohibited content
IMAGE_SAFETY: Candidate image blocked because it generated unsafe content

`usageMetadata`¶

Type: object
Description: Metadata about token usage for the generation request

UsageMetadata object properties:

Property	Type	Description
`promptTokenCount`	integer	Token count in the prompt
`cachedContentTokenCount`	integer	Token count in the cached part of the prompt
`candidatesTokenCount`	integer	Total token count in all generated candidate answers
`totalTokenCount`	integer	Total token count for the generation request
`toolUsePromptTokenCount`	integer	Token count in the prompt for tool usage
`thoughtsTokenCount`	integer	Token count for the thinking model's thoughts
`promptTokensDetails`	array	List of modalities processed in the request input
`candidatesTokensDetails`	array	List of modalities returned in the response
`cacheTokensDetails`	array	List of modalities in the cached content of the request input
`toolUsePromptTokensDetails`	array	List of modalities processed for tool usage in the request input

`modelVersion`¶

Type: string
Description: Model version used to generate the answer

`responseId`¶

Type: string
Description: ID for each response

Full response example¶

{
  "candidates": [
    {
      "content": {
        "parts": [
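          {
            "text": "Hello! I am Gemini, an AI assistant developed by Google. I can help you answer questions, provide information, assist with writing, coding, and many other tasks. Please tell me how I can help!"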
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP",
      "index": 0,
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability": "NEGLIGIBLE",
          "blocked": false
        },
        {
          "category": "HARM_CATEGORY_HATE_SPEECH", 
          "probability": "NEGLIGIBLE",
          "blocked": false
        },
        {
          "category": "HARM_CATEGORY_HARASSMENT",
          "probability": "NEGLIGIBLE",
          "blocked": false
        },
        {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability": "NEGLIGIBLE",
          "blocked": false
        }
      ],
      "tokenCount": 47
    }
  ],
  "promptFeedback": {
    "safetyRatings": [
      {
        "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "probability": "NEGLIGIBLE"
      },
      {
        "category": "HARM_CATEGORY_HATE_SPEECH",
        "probability": "NEGLIGIBLE"
      }
    ]
  },
  "usageMetadata": {
    "promptTokenCount": 4,
    "candidatesTokenCount": 47,
    "totalTokenCount": 51,
    "promptTokensDetails": [
      {
        "modality": "TEXT",
        "tokenCount": 4
      }
    ],
    "candidatesTokensDetails": [
      {
        "modality": "TEXT", 
        "tokenCount": 47
      }
    ]
  },
  "modelVersion": "gemini-2.0-flash",
  "responseId": "response-12345"
}
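
A client usually needs only a few fields from this structure. The Python sketch below pulls out the generated text, the finish reason, any block reason, and the total token count. The `summarize` helper and its output keys are hypothetical, and the sample is a trimmed, hand-written response. It assumes a blocked prompt may come back with no candidates, in which case only `promptFeedback` is informative.

```python
def summarize(response: dict) -> dict:
    """Extract the most commonly needed fields from a response."""
    candidates = response.get("candidates") or []
    first = candidates[0] if candidates else {}
    parts = first.get("content", {}).get("parts", [])
    return {
        # Text may be split across several parts; join them in order.
        "text": "".join(part.get("text", "") for part in parts),
        "finish_reason": first.get("finishReason"),
        "block_reason": response.get("promptFeedback", {}).get("blockReason"),
        "blocked_categories": [
            rating["category"]
            for rating in first.get("safetyRatings", [])
            if rating.get("blocked")
        ],
        "total_tokens": response.get("usageMetadata", {}).get("totalTokenCount"),
    }


# Trimmed, hand-written sample for illustration.
sample = {
    "candidates": [{
        "content": {
            "parts": [{"text": "Hello! "}, {"text": "How can I help?"}],
            "role": "model",
        },
        "finishReason": "STOP",
        "safetyRatings": [{
            "category": "HARM_CATEGORY_HARASSMENT",
            "probability": "NEGLIGIBLE",
            "blocked": False,
        }],
    }],
    "usageMetadata": {
        "promptTokenCount": 4,
        "candidatesTokenCount": 47,
        "totalTokenCount": 51,
    },
}
summary = summarize(sample)
```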

🔧 Advanced Features¶

Safety Ratings¶

SafetyRating object properties:

Property	Type	Description
`category`	enum	Category of this rating
`probability`	enum	Harmful probability for this content
`blocked`	boolean	Whether this content was blocked due to this rating

HarmProbability enum values:

NEGLIGIBLE: Harmful probability negligible
LOW: Harmful probability low
MEDIUM: Harmful probability medium
HIGH: Harmful probability high

Citation Metadata¶

CitationMetadata object properties:

Property	Type	Description
`citationSources`	array	Source references for specific replies

CitationSource object properties:

Property	Type	Description
`startIndex`	integer	Start index of the response segment attributed to this source
`endIndex`	integer	End index of the attribution (exclusive)
`uri`	string	URI attributed to the text portion from this source
`license`	string	License of the GitHub project attributed to the source fragment

Code Execution¶

When code execution tools are enabled, the model can generate and execute code to solve problems.

Code execution example response:

{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "Let me calculate the 10th term of the Fibonacci sequence:"
          },
          {
            "executableCode": {
              "language": "PYTHON",
              "code": "def fibonacci(n):\n    if n <= 1:\n        return n\n    else:\n        return fibonacci(n-1) + fibonacci(n-2)\n\nresult = fibonacci(10)\nprint(f'The 10th Fibonacci number is: {result}')"
            }
          },
          {
            "codeExecutionResult": {
              "outcome": "OUTCOME_OK",
              "output": "The 10th Fibonacci number is: 55"
            }
          },
          {
            "text": "So the 10th term of the Fibonacci sequence is 55."
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP"
    }
  ]
}
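
A code execution response interleaves three kinds of parts: text, `executableCode`, and `codeExecutionResult`. The Python sketch below walks the parts of the first candidate and labels each one. The `describe_parts` helper is hypothetical, and the sample is a shortened, hand-written response that uses the `OUTCOME_OK` value from the Outcome enum.

```python
def describe_parts(response: dict) -> list:
    """Return (kind, value) tuples for each part of the first candidate."""
    described = []
    for part in response["candidates"][0]["content"]["parts"]:
        if "text" in part:
            described.append(("text", part["text"]))
        elif "executableCode" in part:
            described.append(("code", part["executableCode"]["code"]))
        elif "codeExecutionResult" in part:
            result = part["codeExecutionResult"]
            # `output` is optional, so fall back to an empty string.
            described.append((result["outcome"], result.get("output", "")))
    return described


# Shortened, hand-written sample for illustration.
sample = {
    "candidates": [{
        "content": {
            "role": "model",
            "parts": [
                {"text": "Computing the value:"},
                {"executableCode": {"language": "PYTHON", "code": "print(55)"}},
                {"codeExecutionResult": {"outcome": "OUTCOME_OK", "output": "55"}},
                {"text": "The answer is 55."},
            ],
        },
        "finishReason": "STOP",
    }]
}
kinds = [kind for kind, _ in describe_parts(sample)]
```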

Grounding¶

GroundingMetadata object properties:

Property	Type	Description
`groundingChunks`	array	List of supporting reference documents retrieved from specified grounding sources
`groundingSupports`	array	Grounding support list
`webSearchQueries`	array	Web search queries for subsequent web searches
`searchEntryPoint`	object	Google search entry point for subsequent web searches
`retrievalMetadata`	object	Metadata related to retrieval in the grounding flow

GroundingAttribution object properties:

Property	Type	Description
`sourceId`	object	Identifier of the source that contributed to this attribution
`content`	object	Content of the source that contributed to this attribution

AttributionSourceId object properties:

Property	Type	Description
`groundingPassage`	object	Identifier of an inline passage
`semanticRetrieverChunk`	object	Identifier of the Chunk extracted by Semantic Retriever

GroundingPassageId object properties:

Property	Type	Description
`passageId`	string	ID of the passage matching GroundingPassage.id from GenerateAnswerRequest
`partIndex`	integer	Index of the part in GroundingPassage.content

SemanticRetrieverChunk object properties:

Property	Type	Description
`source`	string	Source name matching SemanticRetrieverConfig.source from the request
`chunk`	string	Name of the Chunk containing the attributed text

SearchEntryPoint object properties:

Property	Type	Description
`renderedContent`	string	Web content code segment embeddable in a webpage or app WebView
`sdkBlob`	string	Base64 encoded JSON, representing an array of search terms and search URL tuples

Segment object properties:

Property	Type	Description
`partIndex`	integer	Index of the Part object within its parent Content object
`startIndex`	integer	Start index within the given Part, in bytes
`endIndex`	integer	End index within the given Part, in bytes (exclusive)
`text`	string	Text corresponding to the fragment in the response

RetrievalMetadata object properties:

Property	Type	Description
`googleSearchDynamicRetrievalScore`	number	Probability score of information from Google search helping to answer the question, range [0,1]

GroundingChunk object properties:

Property	Type	Description
`web`	object	Grounding chunk from the web

Web object properties:

Property	Type	Description
`uri`	string	URI reference for the chunk
`title`	string	Title of the chunk

GroundingSupport object properties:

Property	Type	Description
`groundingChunkIndices`	array	List of indices into groundingChunks specifying the citations associated with the claim
`confidenceScores`	array	Confidence scores for supporting reference documents, range 0-1
`segment`	object	Segment of the content that this support applies to
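
The grounding objects reference each other by index. The Python sketch below resolves each GroundingSupport to the web URIs of its chunks, assuming that `groundingChunkIndices` are positions in the `groundingChunks` array. The `cited_sources` helper and the sample metadata are illustrative only.

```python
def cited_sources(grounding_metadata: dict) -> list:
    """Pair each supported segment's text with the URIs that back it."""
    chunks = grounding_metadata.get("groundingChunks", [])
    results = []
    for support in grounding_metadata.get("groundingSupports", []):
        text = support.get("segment", {}).get("text", "")
        uris = [
            chunks[i]["web"]["uri"]
            for i in support.get("groundingChunkIndices", [])
            # Ignore out-of-range indices and chunks without a web source.
            if 0 <= i < len(chunks) and "web" in chunks[i]
        ]
        results.append((text, uris))
    return results


# Hand-written sample metadata for illustration.
sample = {
    "groundingChunks": [
        {"web": {"uri": "https://example.com/a", "title": "A"}},
        {"web": {"uri": "https://example.com/b", "title": "B"}},
    ],
    "groundingSupports": [
        {
            "segment": {"text": "First claim."},
            "groundingChunkIndices": [0, 1],
            "confidenceScores": [0.9, 0.7],
        },
        {"segment": {"text": "Second claim."}, "groundingChunkIndices": [1]},
    ],
}
sources = cited_sources(sample)
```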

Multimodal Processing¶

Gemini API supports processing multiple modalities of input and output:

Supported input modalities:

TEXT: Pure text
IMAGE: Images (JPEG, PNG, WebP, HEIC, HEIF)
AUDIO: Audio (WAV, MP3, AIFF, AAC, OGG, FLAC)
VIDEO: Videos (MP4, MPEG, MOV, AVI, FLV, MPG, WEBM, WMV, 3GPP)
DOCUMENT: Documents (PDF)

ModalityTokenCount object properties:

Property	Type	Description
`modality`	enum	Modality associated with this token count
`tokenCount`	integer	Token count

MediaResolution enum values:

MEDIA_RESOLUTION_LOW: Low resolution (64 tokens)
MEDIA_RESOLUTION_MEDIUM: Medium resolution (256 tokens)
MEDIA_RESOLUTION_HIGH: High resolution (zoomed reframing with 256 tokens)

Thinking Function¶

ThinkingConfig object properties:

Property	Type	Description
`includeThoughts`	boolean	Whether to include thinking content in the answer
`thinkingBudget`	integer	Number of thought tokens the model should generate

Speech Generation¶

SpeechConfig object properties:

Property	Type	Description
`voiceConfig`	object	Configuration for single voice output
`multiSpeakerVoiceConfig`	object	Configuration for multi-speaker settings
`languageCode`	string	Language code for speech synthesis

VoiceConfig object properties:

Property	Type	Description
`prebuiltVoiceConfig`	object	Configuration for the prebuilt voice to use

PrebuiltVoiceConfig object properties:

Property	Type	Description
`voiceName`	string	Name of the prebuilt voice to use

MultiSpeakerVoiceConfig object properties:

Property	Type	Description
`speakerVoiceConfigs`	array	All enabled speaker voices

SpeakerVoiceConfig object properties:

Property	Type	Description
`speaker`	string	Name of the speaker to use
`voiceConfig`	object	Configuration for the voice to use

Supported language codes:

zh-CN: Chinese (Simplified)
en-US: English (US)
ja-JP: Japanese
ko-KR: Korean
fr-FR: French
de-DE: German
es-ES: Spanish
pt-BR: Portuguese (Brazil)
hi-IN: Hindi
ar-XA: Arabic
it-IT: Italian
tr-TR: Turkish
vi-VN: Vietnamese
th-TH: Thai
ru-RU: Russian
pl-PL: Polish
nl-NL: Dutch

Logprobs Results¶

LogprobsResult object properties:

Property	Type	Description
`topCandidates`	array	Array of candidates sorted by log probability in descending order
`chosenCandidates`	array	Array of chosen candidates, not necessarily in topCandidates (length equals total decoding steps)

TopCandidates object properties:

Property	Type	Description
`candidates`	array	Candidates sorted by log probability in descending order

Candidate (Logprobs) object properties:

Property	Type	Description
`token`	string	Token string value for the candidate
`tokenId`	integer	Token ID value for the candidate
`logProbability`	number	Log probability for the candidate
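
Log probabilities are easier to read once converted back to probabilities. The Python sketch below does this for the chosen tokens, assuming `logProbability` is a natural logarithm. The helper name and the sample values are hypothetical.

```python
import math


def chosen_token_probabilities(logprobs_result: dict) -> list:
    """Return (token, probability) pairs for each chosen candidate."""
    return [
        (candidate["token"], math.exp(candidate["logProbability"]))
        for candidate in logprobs_result.get("chosenCandidates", [])
    ]


# Hand-written sample; real values come from the logprobsResult field.
sample = {
    "chosenCandidates": [
        {"token": "Hi", "tokenId": 1, "logProbability": 0.0},
        {"token": "!", "tokenId": 2, "logProbability": math.log(0.5)},
    ]
}
probabilities = chosen_token_probabilities(sample)
```

A log probability of 0.0 corresponds to certainty, and more negative values correspond to less likely tokens.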

URL Retrieval Function¶

UrlRetrievalMetadata object properties:

Property	Type	Description
`urlRetrievalContexts`	array	List of URL retrieval contexts

UrlRetrievalContext object properties:

Property	Type	Description
`retrievedUrl`	string	URL retrieved by the tool

UrlContextMetadata object properties:

Property	Type	Description
`urlMetadata`	array	List of URL contexts

UrlMetadata object properties:

Property	Type	Description
`retrievedUrl`	string	URL retrieved by the tool
`urlRetrievalStatus`	enum	URL retrieval status

UrlRetrievalStatus enum values:

URL_RETRIEVAL_STATUS_SUCCESS: URL retrieval successful
URL_RETRIEVAL_STATUS_ERROR: URL retrieval failed due to an error

Complete Harm Categories¶

HarmCategory enum values:

HARM_CATEGORY_UNSPECIFIED: Category not specified
HARM_CATEGORY_DEROGATORY: PaLM - Negative or harmful comments targeting identity and/or protected attributes
HARM_CATEGORY_TOXICITY: PaLM - Rude, impolite, or profane content
HARM_CATEGORY_VIOLENCE: PaLM - Scenarios depicting violence against individuals or groups
HARM_CATEGORY_SEXUAL: PaLM - References to sexual behavior or other explicit content
HARM_CATEGORY_MEDICAL: PaLM - Promoting unverified medical advice
HARM_CATEGORY_DANGEROUS: PaLM - Dangerous content promotes, encourages, or facilitates harmful behavior
HARM_CATEGORY_HARASSMENT: Gemini - Harassment content
HARM_CATEGORY_HATE_SPEECH: Gemini - Hate speech and content
HARM_CATEGORY_SEXUALLY_EXPLICIT: Gemini - Explicitly sexual content
HARM_CATEGORY_DANGEROUS_CONTENT: Gemini - Dangerous content
HARM_CATEGORY_CIVIC_INTEGRITY: Gemini - Content that might be used to undermine civic integrity

HarmProbability enum values:

HARM_PROBABILITY_UNSPECIFIED: Probability not specified
NEGLIGIBLE: Harmful probability negligible
LOW: Harmful probability low
MEDIUM: Harmful probability medium
HIGH: Harmful probability high

Modality enum values:

MODALITY_UNSPECIFIED: Modality not specified
TEXT: Pure text
IMAGE: Image
VIDEO: Video
AUDIO: Audio
DOCUMENT: Document, e.g., PDF

MediaResolution enum values:

MEDIA_RESOLUTION_UNSPECIFIED: Media resolution not set
MEDIA_RESOLUTION_LOW: Media resolution set to low (64 tokens)
MEDIA_RESOLUTION_MEDIUM: Media resolution set to medium (256 tokens)
MEDIA_RESOLUTION_HIGH: Media resolution set to high (zoomed reframing with 256 tokens)

UrlRetrievalStatus enum values:

URL_RETRIEVAL_STATUS_UNSPECIFIED: Default value, this value is not used
URL_RETRIEVAL_STATUS_SUCCESS: URL retrieval successful
URL_RETRIEVAL_STATUS_ERROR: URL retrieval failed due to an error

🔍 Error Handling¶

Common Error Codes¶

Error Code	Description
`400`	Request format error or invalid parameter
`401`	API key invalid or missing
`403`	Insufficient permissions or quota limit
`429`	Request frequency too high
`500`	Server internal error

Detailed Error Code Explanations¶

Error Code	Status	Description	Solution
`400`	`INVALID_ARGUMENT`	Request parameter invalid or format error	Check request parameter format and required fields
`400`	`FAILED_PRECONDITION`	Precondition for the request not met	Ensure API call prerequisites are met
`401`	`UNAUTHENTICATED`	API key invalid, missing, or expired	Check API key validity and format
`403`	`PERMISSION_DENIED`	Insufficient permissions or quota exhausted	Check API key permissions or upgrade quota
`404`	`NOT_FOUND`	Specified model or resource does not exist	Verify model name and resource path
`413`	`PAYLOAD_TOO_LARGE`	Request body too large	Reduce input content size or process in batches
`429`	`RESOURCE_EXHAUSTED`	Request frequency exceeded or quota insufficient	Reduce request frequency or wait for quota reset
`500`	`INTERNAL`	Server internal error	Retry the request, if persistent contact support
`503`	`UNAVAILABLE`	Service temporarily unavailable	Wait for a period and retry
`504`	`DEADLINE_EXCEEDED`	Request timed out	Reduce input size or retry the request

Error Response Example¶

{
  "error": {
    "code": 400,
    "message": "Invalid argument: contents",
    "status": "INVALID_ARGUMENT",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.BadRequest",
        "fieldViolations": [
          {
            "field": "contents",
            "description": "contents is required"
          }
        ]
      }
    ]
  }
}
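
The error body and the solutions in the table above suggest a simple client policy: parse the `error` object, retry only the codes whose listed solution is to wait or retry, and back off between attempts. The Python sketch below is one possible policy rather than documented behavior. The set of retryable codes, the helper names, and the delay values are all assumptions.

```python
# Codes whose "Solution" column above suggests waiting or retrying (assumed policy).
RETRYABLE_STATUS_CODES = {429, 500, 503, 504}


def parse_error(body: dict):
    """Return (code, status, message), or None if the body has no error."""
    error = body.get("error")
    if error is None:
        return None
    return error.get("code"), error.get("status"), error.get("message")


def should_retry(code: int) -> bool:
    """True for errors that are likely transient."""
    return code in RETRYABLE_STATUS_CODES


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list:
    """Exponential delays in seconds: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


sample_error = {
    "error": {
        "code": 400,
        "message": "Invalid argument: contents",
        "status": "INVALID_ARGUMENT",
    }
}
parsed = parse_error(sample_error)
```

Client errors such as `400` and `401` are not retried here because repeating the same request would fail the same way.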
