Model

A deployed AI model for inference serving.

Tags: AI, Cloud Computing, GPU, HPC, Machine Learning, Semiconductor, Fortune 500

Properties

Name       Type    Description
id         string
name       string
framework  string  One of: vLLM, TGI, TorchServe, Triton.
status     string  One of: deploying, serving, stopped, failed.
endpoint   string  Inference endpoint URL.

JSON Schema

cloud-api-model-schema.json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Model",
  "description": "A deployed AI model for inference serving.",
  "type": "object",
  "properties": {
    "id": {
      "type": "string"
    },
    "name": {
      "type": "string"
    },
    "framework": {
      "type": "string",
      "enum": [
        "vLLM",
        "TGI",
        "TorchServe",
        "Triton"
      ]
    },
    "status": {
      "type": "string",
      "enum": [
        "deploying",
        "serving",
        "stopped",
        "failed"
      ]
    },
    "endpoint": {
      "type": "string",
      "description": "Inference endpoint URL."
    }
  }
}
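
To illustrate how the schema constrains a Model object, here is a minimal sketch of a checker written in plain Python with no third-party validator. It covers only the keywords this schema uses ("type": "string" and "enum") and is not a general JSON Schema implementation. The function name validate_model and all values in the example instance (the id, name, and endpoint URL) are hypothetical and chosen only for illustration.

```python
from typing import Any

# The "type" and "properties" portion of cloud-api-model-schema.json,
# transcribed as a Python dict.
MODEL_SCHEMA: dict[str, Any] = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "name": {"type": "string"},
        "framework": {
            "type": "string",
            "enum": ["vLLM", "TGI", "TorchServe", "Triton"],
        },
        "status": {
            "type": "string",
            "enum": ["deploying", "serving", "stopped", "failed"],
        },
        "endpoint": {"type": "string"},
    },
}


def validate_model(instance: Any, schema: dict[str, Any] = MODEL_SCHEMA) -> list[str]:
    """Return a list of violations; an empty list means the instance passes.

    Hypothetical helper: handles only string typing and enum membership.
    """
    if not isinstance(instance, dict):
        return ["instance must be a JSON object"]

    errors: list[str] = []
    for key, rules in schema["properties"].items():
        if key not in instance:
            # The schema declares no "required" list, so every property is optional.
            continue
        value = instance[key]
        if rules.get("type") == "string" and not isinstance(value, str):
            errors.append(f"{key}: expected string, got {type(value).__name__}")
        elif "enum" in rules and value not in rules["enum"]:
            errors.append(f"{key}: {value!r} is not one of {rules['enum']}")
    return errors


# Hypothetical instance; none of these values come from a real deployment.
example = {
    "id": "mdl-123",
    "name": "demo-llm",
    "framework": "vLLM",
    "status": "serving",
    "endpoint": "https://inference.example.com/v1/models/mdl-123",
}

print(validate_model(example))                # no violations
print(validate_model({"framework": "ONNX"}))  # enum violation
```

Because the schema defines neither "required" nor "additionalProperties", an empty object and an object with extra keys both pass. Only a wrongly typed or out-of-enum value for a declared property is reported.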