
Accessing Ollama REST APIs Locally

Ollama runs an HTTP REST API server that listens on port 11434 by default. You can interact with your local models directly using curl or any HTTP client library.


1. Chat Completions Endpoint

Endpoint URL

  • URL: POST http://localhost:11434/api/chat

Request Payload Example

Send a POST request containing the target model name and the message history. By default, the response is streamed as a sequence of newline-delimited JSON objects, each carrying a fragment of the reply:

curl http://localhost:11434/api/chat -d '{
  "model": "qwen2.5",
  "messages": [
    { "role": "user", "content": "Why is the sky blue?" }
  ]
}'
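
The streamed response can also be consumed from an HTTP client library. The following is a minimal Python sketch using only the standard library. It assumes each streamed line is a JSON object with a message.content fragment and a done flag. The helper names collect_stream and stream_chat are hypothetical and chosen for illustration only.

```python
import json
import urllib.request

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def collect_stream(lines):
    """Join the reply fragments from an iterable of newline-delimited JSON lines."""
    parts = []
    for raw in lines:
        if not raw.strip():
            continue  # ignore blank lines between chunks
        chunk = json.loads(raw)
        # Assumption: each chunk carries its text under message.content.
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break  # the final chunk is flagged with done: true
    return "".join(parts)


def stream_chat(model, prompt, url=OLLAMA_CHAT_URL):
    """POST one user message and return the assembled reply text."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # Iterating over the response object yields one line (one JSON chunk) at a time.
    with urllib.request.urlopen(request) as response:
        return collect_stream(response)
```

With the server running and the model pulled, stream_chat("qwen2.5", "Why is the sky blue?") should return the full answer as one string. To display text as it arrives, print each fragment inside the loop instead of collecting them.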

Disabling Streaming

If you want to receive the entire response in a single JSON payload instead of a stream, set the stream parameter to false:

curl http://localhost:11434/api/chat -d '{
  "model": "qwen2.5",
  "messages": [
    { "role": "user", "content": "What is the capital of France?" }
  ],
  "stream": false
}'
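
Because the endpoint takes the whole message history on every call, a multi-turn conversation means resending earlier messages along with the new one. The sketch below illustrates that pattern in Python under the assumption that the non-streamed reply is found under message in the response body. The names post_json and chat_turn, and the injectable send parameter, are hypothetical helpers rather than part of Ollama.

```python
import json
import urllib.request

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def post_json(url, payload):
    """POST a JSON payload and return the decoded JSON response body."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


def chat_turn(history, user_text, model="qwen2.5", send=post_json):
    """Append the user message, call the API once, and record the reply in history."""
    history.append({"role": "user", "content": user_text})
    body = send(
        OLLAMA_CHAT_URL,
        {"model": model, "messages": history, "stream": False},
    )
    reply = body["message"]  # assumption: {"role": "assistant", "content": "..."}
    history.append({"role": reply["role"], "content": reply["content"]})
    return reply["content"]
```

Calling chat_turn repeatedly with the same history list keeps the conversation context growing, so a follow-up such as "And of Italy?" is sent together with the earlier question and answer.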

2. Text Generation Endpoint

Endpoint URL

  • URL: POST http://localhost:11434/api/generate

Use this endpoint when you want to supply a single raw prompt string rather than structured chat messages:

curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5",
  "prompt": "Write a 5-word tagline for a software store.",
  "stream": false
}'
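
The same request can be issued from Python. This is a minimal sketch, assuming the non-streamed reply text is returned in the response field of the JSON body. The function names build_generate_request and generate are hypothetical.

```python
import json
import urllib.request

OLLAMA_GENERATE_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="qwen2.5", url=OLLAMA_GENERATE_URL):
    """Build the POST request for a single raw prompt with streaming disabled."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="qwen2.5"):
    """Send the prompt and return the generated text."""
    with urllib.request.urlopen(build_generate_request(prompt, model)) as response:
        body = json.loads(response.read())
    return body["response"]  # assumption: the text lives under "response"
```

Keeping the request construction separate from the network call makes the payload easy to inspect or log before anything is sent.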

3. Embeddings Generation Endpoint

Endpoint URL

  • URL: POST http://localhost:11434/api/embeddings

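Use this endpoint to convert a text string into a numerical vector, which is useful for semantic search applications. The text goes in the prompt field. The example reuses qwen2.5 for consistency, although a dedicated embedding model is usually a better fit, and newer Ollama releases document an /api/embed endpoint that supersedes this one: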

curl http://localhost:11434/api/embeddings -d '{
  "model": "qwen2.5",
  "prompt": "Next.js routing models"
}'
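
For semantic search, the returned vectors are typically compared with cosine similarity. The sketch below shows one way to do that in Python, assuming the response body holds the vector under an embedding key. The helpers embed, cosine_similarity, and rank are hypothetical names used only for illustration.

```python
import json
import math
import urllib.request

OLLAMA_EMBEDDINGS_URL = "http://localhost:11434/api/embeddings"


def embed(text, model="qwen2.5", url=OLLAMA_EMBEDDINGS_URL):
    """Request the embedding vector for one text string."""
    request = urllib.request.Request(
        url,
        data=json.dumps({"model": model, "prompt": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["embedding"]  # assumed response key


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar to the query."""
    return sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
```

A small search flow would embed each document once, store the vectors, then embed the query and call rank to find the closest documents.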