# Simple Ollama hosting

We host an always-on Ollama instance on a small server with 2x NVIDIA A100 (40 GB) GPUs.

* Ollama [REST API docs](https://github.com/ollama/ollama/blob/main/docs/api.md)
* Ollama [Python client docs](https://github.com/ollama/ollama-python)

{% hint style="danger" %}
This service is no longer publicly available; it is used in production by UIUC.chat, and for stability we cannot allow arbitrary use.
{% endhint %}

{% hint style="warning" %}
**Only use `llama3.1:70b` and `nomic-embed-text:v1.5`**\
\
Requesting any other model causes GPU-memory "thrashing": there is not enough VRAM to keep additional models loaded, so nobody's jobs complete. Do not `/pull` new models.
{% endhint %}

### Examples

#### Llama 3.1 70B

```bash
curl https://ollama.ncsa.ai/api/chat -d '{
  "model": "llama3.1:70b",
  "messages": [
    { "role": "user", "content": "Write a long detailed bash program" }
  ]
}'
```

```python
from ollama import Client

client = Client(host='https://ollama.ncsa.ai')
response = client.chat(model='llama3.1:70b', messages=[
    {
        'role': 'user',
        'content': 'Why is the sky blue?',
    },
])
```
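Note that `/api/chat` streams its reply as newline-delimited JSON chunks by default. If you would rather receive one complete JSON object, the documented Ollama API accepts `"stream": false` in the request body; a sketch of such a payload:

```json
{
  "model": "llama3.1:70b",
  "stream": false,
  "messages": [
    { "role": "user", "content": "Why is the sky blue?" }
  ]
}
```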

#### Text embeddings

```bash
curl https://ollama.ncsa.ai/api/embeddings -d '{
  "model": "nomic-embed-text:v1.5",
  "prompt": "The sky is blue because of Rayleigh scattering"
}'
```

```python
from ollama import Client

client = Client(host='https://ollama.ncsa.ai')
client.embeddings(model='nomic-embed-text:v1.5', prompt='The sky is blue because of Rayleigh scattering')
```
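Embedding vectors are usually compared with cosine similarity. A minimal self-contained sketch; the two vectors below are placeholders standing in for the `embedding` field returned by the API:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Placeholder vectors; in practice, use response['embedding'] from two
# /api/embeddings calls.
v1 = [0.1, 0.3, 0.5]
v2 = [0.2, 0.1, 0.4]
print(cosine_similarity(v1, v2))
```

Values close to 1.0 indicate semantically similar texts; values near 0 indicate unrelated ones.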
