{
    "title": "How to Check Ray Serve Settings",
    "turns": [{
        "turn": 1,
        "user": "Pip install vllm ray\\[serve\\]\n\nray start --head --dashboard-host=0.0.0.0\n\ncd workspace/ && serve run config.yaml\n\nAfter deploying the server like this, how can I check config = 'embed\\_model'?\n\nHere's what config.yaml looks like.\n\nfrom vllm.entrypoints.openai.serving\\_chat import OpenAIServingChat\n\nfrom transformers import AutoTokenizer, AutoModel\n\nfrom torch.utils.data import DataLoader\n\nimport traceback\n\nimport uuid\n\nassert os.environ\\[\"HUGGING\\_FACE\\_HUB\\_TOKEN\"\\]\n\nlogger = logging.getLogger(\"ray.serve\")\n\napp = FastAPI()\n\nfrom pydantic import BaseModel\n\nclass RequestModel(BaseModel):\n\n    messages : list\n\n    request\\_id : str\n\n    params : dict\n\nclass GenerateResponse(BaseModel):\n\n    output: str\n\n    prompt\\_tokens: int\n\n    output\\_tokens: int\n\n    finish\\_reason: Optional\\[str\\]\n\nclass MultiRequestModel(BaseModel):\n\n    msg : list\n\n    request\\_id : list\n\n    params : dict\n\nclass SimpleRequestModel(BaseModel):\n\n    messages : list\n\n    request\\_id : str\n\nclass Message(BaseModel):\n\n    role: Literal\\[\"system\", \"user\", \"assistant\"\\]\n\n    content: str\n\nclass ChatRequest(BaseModel):\n\n    model: str\n\n    messages: List\\[Message\\]\n\n    temperature: float = 0.7\n\n    max\\_tokens: int = 256\n\n# Define Pydantic models for v1/response\n\nclass InputItem(BaseModel):\n\n    type: Literal\\[\"text\"\\]\n\n    text: str\n\nclass ToolDefinition(BaseModel):\n\n    type: Literal\\[\"function\"\\]\n\n    name: str\n\n    description: Optional\\[str\\]\n\n    parameters: Dict\\[str, Any\\]\n\nclass ToolResult(BaseModel):\n\n    tool\\_name: str\n\n    result: str\n\nclass ResponseRequest(BaseModel):\n\n    model: str\n\n    instructions: Optional\\[str\\] = \"\"\n\n    input: Union\\[str, List\\[InputItem\\]\\]\n\n    temperature: float = 0.7\n\n    max\\_output\\_tokens: int = 512\n\n    tools: Optional\\[List\\[ToolDefinition\\]\\] = \\[\\]\n\n    tool\\_choice: Optional\\[str\\] = \"auto\"\n\n    tool\\_results: Optional\\[List\\[ToolResult\\]\\] = \\[\\]\n\n# Convert messages to prompt\n\ndef convert\\_messages\\_to\\_prompt(messages: List\\[Message\\]) -\\> str:\n\n    prompt = \"\"\n\n    for msg in messages:\n\n        if msg.role == \"system\":\n\n            prompt += f\"\\<|system|\\>\\\\n{msg.content}\\\\n\"\n\n        elif msg.role == \"user\":\n\n            prompt += f\"\\<|user|\\>\\\\n{msg.content}\\\\n\"\n\n        elif msg.role == \"assistant\":\n\n            prompt += f\"\\<|assistant|\\>\\\\n{msg.content}\\\\n\"\n\n    prompt += \"\\<|assistant|\\>\\\\n\"\n\n    return prompt\n\nconfig = {\n\n    'model': 'mistralai/Mistral-Nemo-Instruct-2407',\n\n    'max\\_model\\_len': 1024\\*60,\n\n    'embed\\_model': 'nlpai-lab/KURE-v1'\n\n}",
        "assistant": "Hello! You asked how to check the internal setting value (`config`) of an application deployed using Ray Serve.\n\nBased on the `config.yaml` (which is actually a Python script) you provided, here are the best ways to check the value of `config['embed_model']` on the running server.\n\n### Key Summary\n\nThe most common and recommended way is to **add a status-check API endpoint**. This way, you can easily check the settings anytime using `curl` or a web browser.\n\n* * *\n\n### Method 1: Add an API Endpoint to Check Settings (Most Recommended)\n\nSince you're using a FastAPI application (`app`), you can simply add a GET endpoint that returns the entire `config` dictionary or just part of it as JSON.\n\n#### 1. Modify `config.yaml` (Python Script)\n\nAdd the following to your existing code. This uses the `@app.get` decorator to create a `/config` route that returns the `config` object as JSON.\n\n```"
    }]
}