Accessing Ollama REST APIs Locally
Ollama serves an HTTP REST API on port 11434 by default. You can interact with your local models directly using curl or any HTTP client library.
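Because any HTTP client library works, the same calls can be made from Python's standard library. The sketch below is one possible shape, not an official client: `build_request` and `post_json` are hypothetical helper names, and `OLLAMA_BASE_URL` assumes the default local address used throughout this page.

```python
import json
import urllib.request

# Assumed default address of a local Ollama server (taken from this page).
OLLAMA_BASE_URL = "http://localhost:11434"


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for an Ollama endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict, timeout: float = 120.0) -> dict:
    """Send the request and decode the reply as one JSON object.

    This only works when the payload sets "stream" to False; a streamed
    reply contains several JSON objects and would fail to parse here.
    """
    request = build_request(path, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (needs a running server with the model already pulled):
#   reply = post_json("/api/chat", {
#       "model": "qwen2.5",
#       "messages": [{"role": "user", "content": "Why is the sky blue?"}],
#       "stream": False,
#   })
#   print(reply["message"]["content"])
```

Keeping request construction separate from sending makes it possible to inspect the URL, method, and body without a server running.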
1. Chat Completions Endpoint
Endpoint URL
- URL: POST http://localhost:11434/api/chat
Request Payload Example
Send a POST request containing the target model name and the message history. By default, the response is streamed:
curl http://localhost:11434/api/chat -d '{
  "model": "qwen2.5",
  "messages": [
    { "role": "user", "content": "Why is the sky blue?" }
  ]
}'
Disabling Streaming
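For context on what disabling streaming saves you: with streaming left on, the server sends one JSON object per line, each carrying a fragment of the assistant message, until an object arrives with `done` set to true. A minimal sketch of reassembling those fragments in Python follows. The function name `collect_chat_stream` is hypothetical, and the sample lines in the comments are illustrative rather than captured server output.

```python
import json
from typing import Iterable, Union


def collect_chat_stream(lines: Iterable[Union[bytes, str]]) -> str:
    """Join the message fragments of a streamed /api/chat reply.

    `lines` can be any iterable of newline-delimited JSON lines, for
    example the response object returned by urllib.request.urlopen,
    which yields one bytes line per iteration.
    """
    parts = []
    for raw in lines:
        text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        text = text.strip()
        if not text:
            continue  # skip blank keep-alive or trailing lines
        chunk = json.loads(text)
        # Each chunk is assumed to look like:
        #   {"message": {"role": "assistant", "content": "..."}, "done": false}
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break  # the final chunk marks the end of the reply
    return "".join(parts)
```

Printing each fragment as it arrives instead of collecting them gives the familiar typing effect, which is the main reason to keep streaming enabled.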
If you want to receive the entire response in a single JSON payload instead of a stream, set the stream parameter to false:
curl http://localhost:11434/api/chat -d '{
  "model": "qwen2.5",
  "messages": [
    { "role": "user", "content": "What is the capital of France?" }
  ],
  "stream": false
}'
2. Text Generation Endpoint
Endpoint URL
- URL: POST http://localhost:11434/api/generate
Use this endpoint when you want to supply a single raw prompt string rather than structured chat messages:
curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5",
  "prompt": "Write a 5-word tagline for a software store.",
  "stream": false
}'
3. Embeddings Generation Endpoint
Endpoint URL
- URL: POST http://localhost:11434/api/embeddings
Use this endpoint to convert a text string into a numerical vector, which is useful for semantic search applications:
curl http://localhost:11434/api/embeddings -d '{
  "model": "qwen2.5",
  "prompt": "Next.js routing models"
}'
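To make the semantic search use case concrete, the sketch below compares two embedding vectors with cosine similarity, a common way to rank texts by closeness of meaning. It assumes the reply from this endpoint carries the vector under an `embedding` key. `extract_embedding` and `cosine_similarity` are hypothetical helper names, not part of Ollama.

```python
import json
import math
from typing import List


def extract_embedding(response_body: str) -> List[float]:
    """Pull the vector out of an embeddings reply.

    Assumes a body shaped like {"embedding": [0.12, -0.03, ...]}.
    """
    return json.loads(response_body)["embedding"]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)
```

In a search setting you would embed the query and each document once, then sort the documents by their similarity to the query vector, highest first.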