Microsoft Azure · Schema

CreateChatCompletionRequest

Request body for creating a chat completion.

API ManagementCloudCloud ComputingEnterpriseInfrastructure as a ServicePlatform as a ServiceT1

Properties

Name Type Description
messages array A list of messages comprising the conversation so far.
temperature number Sampling temperature between 0 and 2. Higher values make the output more random, while lower values make it more focused and deterministic.
top_p number An alternative to sampling with temperature called nucleus sampling.
n integer How many chat completion choices to generate for each input message.
stream boolean If set, partial message deltas will be sent as server-sent events.
stop string Up to 4 sequences where the API will stop generating further tokens.
max_tokens integer The maximum number of tokens that can be generated in the chat completion.
presence_penalty number Positive values penalize new tokens based on whether they appear in the text so far.
frequency_penalty number Positive values penalize new tokens based on their existing frequency in the text so far.
response_format object An object specifying the format that the model must output.
seed integer If specified, the system will make a best effort to sample deterministically.
tools array A list of tools the model may call.
tool_choice string Controls which (if any) tool is called by the model.
user string A unique identifier representing your end-user.
View JSON Schema on GitHub

JSON Schema

azure-openai-service-create-chat-completion-request-schema.json Raw ↑
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "CreateChatCompletionRequest",
  "type": "object",
  "description": "Request body for creating a chat completion.",
  "properties": {
    "messages": {
      "type": "array",
      "description": "A list of messages comprising the conversation so far."
    },
    "temperature": {
      "type": "number",
      "description": "Sampling temperature between 0 and 2. Higher values make the output more random, while lower values make it more focused and deterministic."
    },
    "top_p": {
      "type": "number",
      "description": "An alternative to sampling with temperature called nucleus sampling."
    },
    "n": {
      "type": "integer",
      "description": "How many chat completion choices to generate for each input message."
    },
    "stream": {
      "type": "boolean",
      "description": "If set, partial message deltas will be sent as server-sent events."
    },
    "stop": {
      "type": "string",
      "description": "Up to 4 sequences where the API will stop generating further tokens."
    },
    "max_tokens": {
      "type": "integer",
      "description": "The maximum number of tokens that can be generated in the chat completion."
    },
    "presence_penalty": {
      "type": "number",
      "description": "Positive values penalize new tokens based on whether they appear in the text so far."
    },
    "frequency_penalty": {
      "type": "number",
      "description": "Positive values penalize new tokens based on their existing frequency in the text so far."
    },
    "response_format": {
      "type": "object",
      "description": "An object specifying the format that the model must output."
    },
    "seed": {
      "type": "integer",
      "description": "If specified, the system will make a best effort to sample deterministically."
    },
    "tools": {
      "type": "array",
      "description": "A list of tools the model may call."
    },
    "tool_choice": {
      "type": "string",
      "description": "Controls which (if any) tool is called by the model."
    },
    "user": {
      "type": "string",
      "description": "A unique identifier representing your end-user."
    }
  }
}