Go + Ollama = Simple Local AI

Davin Hills
6 min read · Sep 6, 2024


[Image: a cyborg bent over a small laptop, typing]

Are you a Go developer interested in experimenting with local AI? This guide will walk you through setting up and interacting with a local Large Language Model (LLM) in Go.

Step 1: Setting Up a Local LLM

First, you’ll need to get an LLM running on your local machine. For this, we’ll use Ollama (available on GitHub). While there are several options for loading models locally, I’ve found Ollama to be the easiest to set up. Ollama supports macOS, Linux, and Windows.

Since I haven’t tried it on Windows, I’ll focus on macOS and Linux, but the process should be similar on Windows.

1. Download Ollama from the official website.

2. Extract the compressed file.

3. On macOS, move the application bundle to the /Applications directory.

4. Start the application.

Next, open a terminal and choose a model to run. The simplest approach is to pick a model from the Ollama models list. You can install other models from sources like Hugging Face, but we’ll keep it simple for now. Once you’ve selected a model, run one of the following commands in your terminal (pull only downloads the model, while run downloads it if necessary and then drops you into an interactive chat session):

ollama pull <model name>

or

ollama run <model name>

Congratulations! You now have an LLM running locally on your machine.
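
If you want to confirm that the model downloaded successfully, ollama list shows every model installed locally:

ollama list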

Step 2: Setting Up Go for LLM Interaction

Now, let’s focus on the Go side of things. Assuming you already have a Go environment set up (if not, refer to the Go installation instructions), you’re ready to begin.

We’ll skip third-party libraries and build the interaction from scratch with Go’s standard library.

1. Open a terminal and navigate to the directory where you want to create your project, or set it up using your preferred workflow:

mkdir go-ollama
cd go-ollama
go mod init go-ollama

2. Open a new file, main.go, in your favorite editor. Start by defining a chat request structure based on the Ollama API documentation:

package main

type Request struct {
    Model    string    `json:"model"`
    Messages []Message `json:"messages"`
    Stream   bool      `json:"stream"`
}

type Message struct {
    Role    string `json:"role"`
    Content string `json:"content"`
}

Let’s break this down:

Model: The model you downloaded with Ollama. In my case, that’s llama3.1.

Stream: Determines if you receive a constant stream of responses (set to true) or a single response (set to false). We’ll use false for this example, but you can experiment with streaming later.

Messages: A slice of Message values holding the prompts you send to the model. Each message has a Role (such as "user" or "assistant") and the Content text.

Here’s an example using the question, “Why is the sky blue?”:

package main

type Request struct {
    Model    string    `json:"model"`
    Messages []Message `json:"messages"`
    Stream   bool      `json:"stream"`
}

type Message struct {
    Role    string `json:"role"`
    Content string `json:"content"`
}

func main() {
    msg := Message{
        Role:    "user",
        Content: "Why is the sky blue?",
    }
    req := Request{
        Model:    "llama3.1",
        Stream:   false,
        Messages: []Message{msg},
    }
    _ = req // we'll send this request to Ollama in Step 3
}
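
For reference, once this request is marshaled, the JSON payload sent to Ollama looks roughly like this (whitespace added for readability):

{
  "model": "llama3.1",
  "messages": [
    { "role": "user", "content": "Why is the sky blue?" }
  ],
  "stream": false
}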

Step 3: Sending the Request and Receiving a Response

Next, let’s send this request to Ollama and get a response. Here’s the response structure, based on the API documentation. Message holds the model’s reply, Done signals that the response is complete, and the remaining fields are timing and token-count statistics:

package main

import "time"

type Response struct {
    Model              string    `json:"model"`
    CreatedAt          time.Time `json:"created_at"`
    Message            Message   `json:"message"`
    Done               bool      `json:"done"`
    TotalDuration      int64     `json:"total_duration"`
    LoadDuration       int64     `json:"load_duration"`
    PromptEvalCount    int       `json:"prompt_eval_count"`
    PromptEvalDuration int64     `json:"prompt_eval_duration"`
    EvalCount          int       `json:"eval_count"`
    EvalDuration       int64     `json:"eval_duration"`
}

Ollama listens on localhost:11434 by default, so the chat API’s URL is typically http://localhost:11434/api/chat.

Let’s create a simple HTTP client using Go’s standard library:

func talkToOllama(url string, ollamaReq Request) (*Response, error) {
    js, err := json.Marshal(&ollamaReq)
    if err != nil {
        return nil, err
    }
    client := http.Client{}
    httpReq, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(js))
    if err != nil {
        return nil, err
    }
    httpResp, err := client.Do(httpReq)
    if err != nil {
        return nil, err
    }
    defer httpResp.Body.Close()
    ollamaResp := Response{}
    err = json.NewDecoder(httpResp.Body).Decode(&ollamaResp)
    return &ollamaResp, err
}

Here’s a quick breakdown:

1. talkToOllama: This function takes the Ollama API URL and the request structure.

2. JSON Marshaling: Converts the request structure to JSON.

3. HTTP Request: Creates and sends a POST request with the JSON payload.

4. Response Handling: Decodes the JSON response into the Response structure and returns it.
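
One thing this version skips is checking the HTTP status code; if Ollama returns an error (for example, when the model name is wrong), the JSON decode may fail with a less helpful message. A small, optional addition right after the client.Do error check could look like this (a sketch; adjust the error text to taste):

    if httpResp.StatusCode != http.StatusOK {
        return nil, fmt.Errorf("unexpected status from Ollama: %s", httpResp.Status)
    }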

Step 4: Running the Go Program

Here’s the complete program:

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
    "os"
    "time"
)

type Request struct {
    Model    string    `json:"model"`
    Messages []Message `json:"messages"`
    Stream   bool      `json:"stream"`
}

type Message struct {
    Role    string `json:"role"`
    Content string `json:"content"`
}

type Response struct {
    Model              string    `json:"model"`
    CreatedAt          time.Time `json:"created_at"`
    Message            Message   `json:"message"`
    Done               bool      `json:"done"`
    TotalDuration      int64     `json:"total_duration"`
    LoadDuration       int64     `json:"load_duration"`
    PromptEvalCount    int       `json:"prompt_eval_count"`
    PromptEvalDuration int64     `json:"prompt_eval_duration"`
    EvalCount          int       `json:"eval_count"`
    EvalDuration       int64     `json:"eval_duration"`
}

const defaultOllamaURL = "http://localhost:11434/api/chat"

func main() {
    start := time.Now()
    msg := Message{
        Role:    "user",
        Content: "Why is the sky blue?",
    }
    req := Request{
        Model:    "llama3.1",
        Stream:   false,
        Messages: []Message{msg},
    }
    resp, err := talkToOllama(defaultOllamaURL, req)
    if err != nil {
        fmt.Println(err)
        os.Exit(1)
    }
    fmt.Println(resp.Message.Content)
    fmt.Printf("Completed in %v\n", time.Since(start))
}

// talkToOllama marshals the request, POSTs it to the Ollama API,
// and decodes the JSON response.
func talkToOllama(url string, ollamaReq Request) (*Response, error) {
    js, err := json.Marshal(&ollamaReq)
    if err != nil {
        return nil, err
    }
    client := http.Client{}
    httpReq, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(js))
    if err != nil {
        return nil, err
    }
    httpResp, err := client.Do(httpReq)
    if err != nil {
        return nil, err
    }
    defer httpResp.Body.Close()
    ollamaResp := Response{}
    err = json.NewDecoder(httpResp.Body).Decode(&ollamaResp)
    return &ollamaResp, err
}

To run the program, use the terminal:

go run main.go

You should see a response similar to this:

The sky appears blue to us because of a phenomenon called scattering, which occurs when sunlight interacts with the tiny molecules of gases in the atmosphere. Here's a simplified explanation:

1. Sunlight enters the Earth's atmosphere: When sunlight enters our atmosphere, it consists of a broad spectrum of electromagnetic radiation, including all the colors of the visible light (red, orange, yellow, green, blue, indigo, and violet).
2. Scattering occurs: As sunlight travels through the atmosphere, it encounters tiny molecules of gases such as nitrogen (N2) and oxygen (O2). These molecules are much smaller than the wavelength of light, so they scatter the shorter (blue) wavelengths more efficiently than the longer (red) wavelengths.
3. Blue light is scattered in all directions: The scattering process favors blue light because it has a shorter wavelength, which allows it to be deflected by the gas molecules more easily. This scattered blue light reaches our eyes from all parts of the sky.
4. Our eyes perceive the sky as blue: Since we see the scattered blue light from every direction in the atmosphere, our brains interpret this as a blue color for the entire sky.

Other factors can affect the apparent color of the sky, such as:

* Dust and pollutants: Tiny particles in the air can scatter light in a way that adds a reddish tint to the sky.
* Clouds: When sunlight passes through water droplets or ice crystals in clouds, it scatters in all directions, giving the sky a white or gray appearance.
* Time of day: The angle of the sun changes throughout the day, which can alter the intensity and color of the scattered light. For example, during sunrise and sunset, the light has to travel through more of the Earth's atmosphere, scattering off more particles and making the sky appear redder.

In summary, the sky appears blue due to the scattering of sunlight by the tiny molecules in the atmosphere, which favors shorter wavelengths like blue light.
Completed in 38.315152042s
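
Before wrapping up, a word on streaming: setting Stream to true makes Ollama send back a sequence of JSON objects, one per chunk of the answer, instead of a single response. A minimal sketch of reading such a stream, reusing the Response type above, could replace the single Decode call inside talkToOllama (it also needs the io import):

    // With Stream: true, Ollama writes one JSON object per chunk.
    // Decode chunks until one reports Done, printing content as it arrives.
    dec := json.NewDecoder(httpResp.Body)
    for {
        var chunk Response
        if err := dec.Decode(&chunk); err != nil {
            if err == io.EOF {
                break
            }
            return nil, err
        }
        fmt.Print(chunk.Message.Content)
        if chunk.Done {
            break
        }
    }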

Conclusion

In this article, we set up a local LLM and queried it using Go’s standard library. This is just the beginning — feel free to expand on this setup. You could make the code more production-ready with timeouts or by hosting Ollama on a different machine. You could even build out user interactions, create conversations, or develop a logic chain for RAG applications. The possibilities are endless!
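
For example, adding a timeout is a one-line change to the client inside talkToOllama (the two-minute value below is just an illustration; the first request can be slow while the model loads into memory):

    // Give up if Ollama does not respond within two minutes.
    client := http.Client{Timeout: 2 * time.Minute}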
