Building an AI Brain: Part 1

Davin Hills
7 min read · May 5, 2024


Swarm Brain on GitHub https://github.com/dshills/swarm

I’ve been experimenting with AI for some time now, starting with my-ai, a project that demonstrated how to connect to ChatGPT using Go. Then came ai-manager, which integrates with most major LLM text generation APIs, including Ollama, a crucial component for this project. My latest endeavor, TermAI, brings AI capabilities directly into the terminal.

For those unfamiliar, Ollama enables individuals to run LLMs on their own computers. Combined with recent high-profile public model releases from prominent AI companies, such as Meta’s Llama 3, Microsoft’s WizardLM-2 and Phi-3, and other robust models, the potential for running high-performance text generation LLMs privately is becoming increasingly promising.

However, I’ve identified several issues with both home-based and large-scale private models. Despite significant advancements, home models are relatively slow on moderately priced hardware. Moreover, their output quality doesn’t match the standards set by industry giants like OpenAI, Mistral, Google, and Anthropic. Even the most advanced AIs require massive hardware resources to operate efficiently.

To illustrate the limitations, consider GPT-4 as the Einstein of the LLM world. If you had a genius like him within a company, how would you utilize his capabilities? Would you simply have him sit in an office, answering questions from visitors? That wouldn’t be very effective, even with his exceptional intelligence. What about a department of smart, but not genius-level, individuals?

As with any software, is it possible to scale out rather than up? The concept is not novel: in the Python world, where most AI development takes place, agent-style systems like LangChain, CrewAI, and others are working to make AI practical and useful rather than just an impressive magic trick. Of course, I want to build it myself in Go. Not because I think it will be better, but because it’s how I learn, and who knows, maybe it will be better.

What If?

  • I could build a framework that utilizes multiple LLM models
  • I could configure it for general-purpose or specialized tasks
  • I could make it scriptable to allow transformations of the output

So I built it. Has it taken over the world? Ahhh, no (refer to the title: Part 1). It has a long way to go, but I have a map of features I want to add, and it’s already working. If you’ve made it this far, let me tell you about it.

Concepts

Brain

The Brain is the top-level interface of a Swarm Brain, responsible for processing tasks and generating results. It’s loaded from a set of definition files, which define the layers, neurons, and transmitters that make up the system. It has a single function, Think, which takes a task and outputs results.

Layer

Layers are the building blocks of a Swarm Brain, each with a single responsibility: to consider a task. A layer receives a task list, transmits it to a group of neurons for processing, and then updates the task list with the results. A layer can have one or more Neurons, depending on the size of the task list, and it will run every task in the list.

Neuron

Neurons are the AI models that process tasks and generate results. They’re given a persona, prompt, and model, and use these to generate a response to the task.

Transmitter

Transmitters move signals between layers and neurons, allowing scripts to be added that modify tasks, modify results, or split a result into multiple tasks.

Signal

Signals carry tasks and results throughout the system, allowing the Brain to process tasks and generate results.
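
To make that concrete, here is a rough sketch of the kind of information a signal carries, inferred from how it is used in the code later in this post. The field names are illustrative, not the repo’s actual definitions.

// Illustrative sketch only; the real signal package exposes this through methods.
type Signal struct {
    Task          string   // the task currently being worked on
    Result        string   // the model's response to the task
    ParsedResults []string // the result split into follow-up tasks
    ModelUsed     string   // which model produced the result
    Err           error    // any error encountered along the way
    IsComplete    bool     // true when the Brain should stop and return a final result
}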

Implementation

If you want to follow along at home you can find the code at https://github.com/dshills/swarm.

If you just want to try it out and build your own brain, skip this section and take a look at the Configuration section.

type Brain interface {
    Think(task string) <-chan signal.Result
}

type Layer interface {
    Consider(signal.Signal, task.List) signal.Signal
}

type Transmitter interface {
    Transmit(signal.Signal) signal.Signal
}

type Neuron interface {
    Work(signal.Signal) signal.Signal
}

Simplified Flow

Brain

func (b *_brain) Think(task string) <-chan signal.Result {
    resultCh := make(chan signal.Result, 1)
    s := convToSignal(task)
    tl := createTaskList(task)
    for _, layer := range b.layers {
        s = layer.Consider(s, tl)
        if s.IsComplete || s.Err() != nil {
            resultCh <- s.Final()
            break
        }
    }
    close(resultCh)
    return resultCh
}

The Brain converts the task to a signal and adds it to the task list. It then iterates through the layers, calling each one to process the signal. If the signal is complete or an error occurs, it returns the final result.
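
To show how the pieces fit from a caller’s point of view, here is a minimal usage sketch. The loader name swarm.LoadBrain and the directory argument are assumptions for illustration; check the repo for the actual entry point.

package main

import (
    "fmt"
    "log"

    "github.com/dshills/swarm" // assumed import path for the loader
)

func main() {
    // LoadBrain is a hypothetical constructor that reads the brain definition files.
    brain, err := swarm.LoadBrain("./mybrain")
    if err != nil {
        log.Fatal(err)
    }
    // Think returns a channel of results; drain it until the Brain is done.
    for result := range brain.Think("Who won the 1988 Formula One championship?") {
        fmt.Printf("%+v\n", result)
    }
}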

Layer

func (l *_layer) Consider(s Signal, tl taskList) Signal {
    for _, transmitter := range l.transmitters {
        task := tl.Pop()
        if task == "" {
            break
        }
        newSignal := s.NewChild(task)
        newSignal = transmitter.Transmit(newSignal)
        updateTaskList(newSignal, tl)
    }
    return s
}

A Layer loops through its transmitters and task list, delegating one task to each transmitter. A new child signal is created from the original and passed to the transmitter. If a single result comes back, it is added to the task list; if a set of parsed results comes back, they are all added to the task list.
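
As a rough sketch of that update step (Push and the accessor methods here are hypothetical; the repo’s actual task list API may differ):

// Illustrative sketch only: add parsed results back onto the task list,
// otherwise record the single result.
func updateTaskList(s Signal, tl taskList) {
    if parsed := s.ParsedResults(); len(parsed) > 0 {
        for _, p := range parsed {
            tl.Push(p)
        }
        return
    }
    tl.Push(s.Result())
}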

Transmitter

func (t *_transmitter) Transmit(s signal.Signal) signal.Signal {
    // Update the task
    tsk, err := t.hookUpdateTask(s.Task())
    if err != nil {
        s.SetError(err)
        return s
    }
    s.SetTask(tsk)

    // Do the work
    s = t.to.Work(s)

    // Update the result
    res, err := t.hookUpdateResult(s.Result())
    if err != nil {
        s.SetError(err)
        return s
    }
    s.SetResult(res, s.ModelUsed())

    // Parse the result
    parsed, err := t.hookParseResults(s.Result())
    if err != nil {
        s.SetError(err)
        return s
    }
    s.SetParsedResult(parsed...)

    return s
}

The Transmitter updates the task, sends the signal to the Neuron, and then updates the result. Finally, it parses the result into multiple tasks.

Neuron

func (n *_neuron) Work(s signal.Signal) signal.Signal {
    n.makeConv(s)
    resp, err := n.com.Converse(n.modelName, n.conv)
    if err != nil {
        s.SetError(err)
        return s
    }
    n.conv = append(n.conv, resp.Message)
    s.SetResult(resp.Message.Content, n.modelName)
    return s
}

The Neuron converts the signal/task to an AI conversation using its persona and prompt. It then records the result and updates the ongoing conversation.
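
Roughly, makeConv builds that conversation from the neuron’s persona and prompt. The sketch below is a guess at its shape: the Message type and the buildPrompt helper are stand-ins for whatever ai-manager actually provides (buildPrompt is sketched in the Configuration section below).

// Illustrative sketch only; the real conversation type comes from ai-manager.
type Message struct {
    Role    string
    Content string
}

func (n *_neuron) makeConv(s signal.Signal) {
    if len(n.conv) == 0 {
        // The persona frames every exchange as the system message.
        n.conv = append(n.conv, Message{Role: "system", Content: n.persona})
    }
    // The prompt, with the task substituted in, becomes the next user message.
    n.conv = append(n.conv, Message{Role: "user", Content: buildPrompt(n.prompt, s.Task())})
}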

Configuration

Configuring a Brain involves a few files and a specific directory structure. To take advantage of what I consider the real power, a bit of Lua coding is also required.

brain configuration directory layout

models.yaml: This is used to define the available models and how to connect to them.

---
- Host: Groq
  Model: llama3-70b-8192
  API: OpenAI
  BaseURL: https://api.groq.com/openai/v1
  APIKey: <YOUR API KEY HERE>
  Aliases:
    - llama370b
    - groq

It currently supports only the OpenAI and Ollama APIs.

layers: A layer file defines the layers of the brain

---
Persona: You are an expert at answering questions about motor sports.
Prompt: |
  Write a detailed answer about %%TASK%%.

NeuronModels:
  - groq

ChangeResultFns:
  - race_tuna

%%TASK%% can be part of the prompt and will be replaced with the actual task. If not found, the task will be added after the prompt.
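
In Go terms, that substitution rule amounts to something like the following; an illustrative helper, not necessarily the repo’s actual code.

// Illustrative only. Uses the standard library strings package.
func buildPrompt(prompt, task string) string {
    // Substitute the task into the prompt when the placeholder is present...
    if strings.Contains(prompt, "%%TASK%%") {
        return strings.ReplaceAll(prompt, "%%TASK%%", task)
    }
    // ...otherwise append the task after the prompt.
    return prompt + "\n" + task
}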

Lua function names can be added for ChangeResultFns, ChangeTaskFns, and ResultToTasksFns.

lua: Defines Lua functions for the layers.

The file name should match the function name, e.g., function do_something() should be stored in a file named do_something.lua.

Two types of Lua functions are currently supported.

1:1 — Functions that take a string parameter and return a string. Used for ChangeTask and ChangeResult.

1:Many — Functions that take a string parameter and return multiple strings. Used for ResultToTasks.

function race_tuna(str)
    str = string.gsub(str, "motor sport", "tuna hunters")
    str = string.gsub(str, "drivers", "hunters")
    str = string.gsub(str, "driver", "hunter")
    str = string.gsub(str, "racing", "tuna hunting")
    str = string.gsub(str, "race", "tuna")
    return str
end
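
Under the hood, hooks like this need an embedded Lua runtime on the Go side. As an illustration only (an assumption, not necessarily how the swarm repo wires its hooks), a 1:1 function such as race_tuna could be called with the gopher-lua library:

package main

import (
    "fmt"
    "log"

    lua "github.com/yuin/gopher-lua"
)

func main() {
    L := lua.NewState()
    defer L.Close()

    // Load the file that defines the 1:1 hook function.
    if err := L.DoFile("race_tuna.lua"); err != nil {
        log.Fatal(err)
    }

    // Call race_tuna with one string argument and expect one string back.
    err := L.CallByParam(lua.P{
        Fn:      L.GetGlobal("race_tuna"),
        NRet:    1,
        Protect: true,
    }, lua.LString("The driver won the race."))
    if err != nil {
        log.Fatal(err)
    }

    out := lua.LVAsString(L.Get(-1))
    L.Pop(1)
    fmt.Println(out) // "The hunter won the tuna."
}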

brain.yaml: Brings it all together

---
Brain: Test Brain
Layers:
  - test_layer1

Give it a name and list the layers in order of execution.

That’s it… Compile the included swarmBrain executable, point it at your brain definition, and start asking questions.

What’s Next

As we move forward, we’re excited to explore the following developments:

  • Support for additional Large Language Model (LLM) APIs from Google, Anthropic, and Mistral
  • More hook points to inject Lua code for customization
  • Rethinking the error handling process: if an error occurs or the result doesn’t meet the criteria, the task will be retried
  • Quality Assurance (QA) functions for verifying correctness
  • Decision-making capabilities to choose from multiple answers
  • Checkpoints: the ability to pause processing and wait for human input
  • Reducers: converting answers from a list of tasks into a single result
  • Researchers: functions or components that can read files, retrieve data from the web, and integrate it into the conversation or train the brain
  • Public Relations: functions or components that can write to files and create emails to communicate information from the brain

Conclusion

I’m thrilled with the progress we’ve made so far. Although it’s still early days, the potential is exciting.

If you’re interested in contributing to the Go coding side of the brain, feel free to fork the project, submit pull requests, or simply leave a comment.

If you’re interested in building brains, I’d love to hear about your experiences and learn how we can improve the functionality to help you achieve your goals.

Stay tuned for Part 2, where I’ll continue to share my journey towards building a personal AI brain!
