Accessing Ollama REST APIs Locally

Ollama runs an HTTP REST API server on port 11434 by default. You can interact with your local models directly using curl or any HTTP client library.

1. Chat Completions Endpoint

Endpoint URL

URL: POST http://localhost:11434/api/chat

Request Payload Example

Send a POST request containing the target model name and the message history. By default, the response is streamed as a sequence of JSON objects, one per line:

curl http://localhost:11434/api/chat -d '{
  "model": "qwen2.5",
  "messages": [
    { "role": "user", "content": "Why is the sky blue?" }
  ]
}'
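
If you are calling the endpoint from code rather than curl, the streamed reply has to be read line by line. The following is a minimal Python sketch using only the standard library. The function names iter_chat_tokens and stream_chat are hypothetical helpers chosen for this illustration, and the sketch assumes each streamed line is a JSON object carrying a message.content fragment, with the last object marked "done": true.

```python
import json
import urllib.request


def iter_chat_tokens(lines):
    """Yield text fragments from the lines of a streamed /api/chat reply."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank keep-alive lines
        chunk = json.loads(line)
        content = chunk.get("message", {}).get("content", "")
        if content:
            yield content
        if chunk.get("done"):
            break  # the final object signals the end of the stream


def stream_chat(model, prompt, base_url="http://localhost:11434"):
    """POST a single user message and yield the reply as it arrives."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    # The response object can be iterated line by line.
    with urllib.request.urlopen(request) as response:
        yield from iter_chat_tokens(response)
```

With a local server running, a loop such as for piece in stream_chat("qwen2.5", "Why is the sky blue?"): print(piece, end="", flush=True) would print the answer incrementally.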

Disabling Streaming

If you want to receive the entire response in a single JSON payload instead of a stream, set the stream parameter to false:

curl http://localhost:11434/api/chat -d '{
  "model": "qwen2.5",
  "messages": [
    { "role": "user", "content": "What is the capital of France?" }
  ],
  "stream": false
}'
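
With streaming disabled, the whole reply can be parsed in one step, which makes it easy to keep a running conversation. The sketch below is one possible way to do that in Python with the standard library. The names build_chat_request, parse_chat_reply, and chat are hypothetical, and the code assumes the reply body contains a message object with role and content fields.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default local address used in this guide


def build_chat_request(model, messages):
    """Build a non-streaming POST request for /api/chat."""
    body = json.dumps({"model": model, "messages": messages, "stream": False})
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/chat",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_chat_reply(raw):
    """Return the assistant message dict from a non-streamed reply body."""
    return json.loads(raw)["message"]


def chat(model, messages):
    """Send the message history and return the assistant's message."""
    with urllib.request.urlopen(build_chat_request(model, messages)) as resp:
        return parse_chat_reply(resp.read())
```

To continue a conversation, append the returned message to the same messages list, add the next user message, and call chat again so the model sees the full history.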

2. Text Generation Endpoint

Endpoint URL

URL: POST http://localhost:11434/api/generate

Use this endpoint when you want to supply a single raw prompt string rather than structured chat messages:

curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5",
  "prompt": "Write a 5-word tagline for a software store.",
  "stream": false
}'
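
One difference worth noting when you move from curl to code: this endpoint returns the generated text in a top-level response field, not inside a message object. A small Python sketch, assuming that reply shape and using hypothetical helper names (build_generate_request, extract_generated_text, generate):

```python
import json
import urllib.request


def build_generate_request(model, prompt, base_url="http://localhost:11434"):
    """Build a non-streaming POST request for /api/generate."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_generated_text(raw):
    """Return the generated text from the "response" field of the reply."""
    return json.loads(raw)["response"]


def generate(model, prompt):
    """Send a raw prompt and return the completed text."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return extract_generated_text(resp.read())
```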

3. Embeddings Generation Endpoint

Endpoint URL

URL: POST http://localhost:11434/api/embeddings

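Use this endpoint to convert a text string into a numerical vector, which is useful for semantic search applications. Newer Ollama releases also provide /api/embed, which takes the text in an input field. With /api/embeddings, pass the text in the prompt field: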

curl http://localhost:11434/api/embeddings -d '{
  "model": "qwen2.5",
  "prompt": "Next.js routing models"
}'
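
For semantic search, the vectors are usually compared with cosine similarity. The sketch below shows one way this might look in Python with the standard library. It assumes the /api/embeddings reply contains a single embedding array, and the names parse_embedding, embed, cosine_similarity, and rank_by_similarity are hypothetical helpers for this illustration.

```python
import json
import math
import urllib.request


def parse_embedding(raw):
    """Return the vector from the "embedding" field of the reply body."""
    return json.loads(raw)["embedding"]


def embed(model, text, base_url="http://localhost:11434"):
    """Request an embedding vector for one text string."""
    body = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as resp:
        return parse_embedding(resp.read())


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query_vec, vec) for vec in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
```

In a search setting, you would embed each document once, embed the query at request time, and pass both to rank_by_similarity to get the best matches first.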

Published on Jun 16, 2026. Last updated: Jun 16, 2026.
