How to set up an LLM locally with Ollama
What is Ollama?
It's a lightweight framework designed for those who wish to experiment with, customize, and deploy large language models without the hassle of cloud platforms. With Ollama, the power of AI is distilled into a simple, local package, allowing developers and hobbyists alike to explore the vast capabilities of machine learning models.
Setting Up Ollama: A Step-by-Step Approach
First, download Ollama for your OS from https://ollama.com/download
Then run the model you want, for example:
ollama run llama2
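Once a model is running, Ollama also serves a local REST API (by default at http://localhost:11434), which is handy for scripting. As a minimal sketch, here is the JSON body the /api/generate endpoint expects; the helper name `build_generate_request` is my own, and this snippet only builds the payload — POSTing it to the endpoint is left to you (e.g. via curl):

```python
import json

def build_generate_request(model: str, prompt: str) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint.

    stream=False asks for a single JSON response instead of a
    stream of partial tokens.
    """
    return {"model": model, "prompt": prompt, "stream": False}

payload = build_generate_request("llama2", "Why is the sky blue?")
print(json.dumps(payload))
# POST this body to http://localhost:11434/api/generate, e.g.:
#   curl http://localhost:11434/api/generate -d "$(python this_script.py)"
```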
Model library
Ollama supports a library of models available at ollama.com/library
Here are some example models that can be downloaded:
| Model | Parameters | Size | Download Command |
|---|---|---|---|
| Llama 2 | 7B | 3.8GB | ollama run llama2 |
| Mistral | 7B | 4.1GB | ollama run mistral |
| Dolphin Phi | 2.7B | 1.6GB | ollama run dolphin-phi |
| Phi-2 | 2.7B | 1.7GB | ollama run phi |
| Neural Chat | 7B | 4.1GB | ollama run neural-chat |
| Starling | 7B | 4.1GB | ollama run starling-lm |
| Code Llama | 7B | 3.8GB | ollama run codellama |
| Llama 2 Uncensored | 7B | 3.8GB | ollama run llama2-uncensored |
| Llama 2 13B | 13B | 7.3GB | ollama run llama2:13b |
| Llama 2 70B | 70B | 39GB | ollama run llama2:70b |
| Orca Mini | 3B | 1.9GB | ollama run orca-mini |
| Vicuna | 7B | 3.8GB | ollama run vicuna |
| LLaVA | 7B | 4.5GB | ollama run llava |
| Gemma | 2B | 1.4GB | ollama run gemma:2b |
| Gemma | 7B | 4.8GB | ollama run gemma:7b |
Memory Requirements: Keep in mind, running these models isn't light on resources. You should have at least 8 GB of RAM to run the 7B models, 16 GB for the 13B models, and 32 GB for the 33B models and larger, to keep your AI running smoothly.
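As a rough back-of-the-envelope check (my own heuristic, not an official Ollama formula): a quantized model needs roughly parameters × bits-per-weight ÷ 8 bytes just for the weights, plus working overhead, which is why the RAM guidance scales with parameter count:

```python
def estimate_model_gb(params_billion: float, bits_per_weight: int = 4) -> float:
    """Rough weight-storage estimate in GB for a quantized model.

    Ollama's default downloads are roughly 4-bit quantized, so a 7B
    model works out to about 7e9 * 4 / 8 bytes = ~3.5 GB, in line
    with the 3.8GB listed for llama2 above (file formats and extra
    tensors add some overhead).
    """
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

print(round(estimate_model_gb(7), 1))   # ~3.5 GB of weights for a 7B model
print(round(estimate_model_gb(70), 1))  # ~35 GB, close to the 39GB llama2:70b row
```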
Customization
With Ollama, you're not just running models; you're tailoring them. Import models with ease and customize prompts to fit your specific needs. Fancy a model that responds as Mario? Ollama makes it possible with simple command lines:
Customize a prompt
Models from the Ollama library can be customized with a prompt. For example, to customize the llama2 model:
ollama pull llama2
Create a Modelfile:
FROM llama2
# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1
# set the system message
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
Next, create and run the model:
ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.
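If you customize many models, a Modelfile like the one above can also be generated programmatically. This is a small convenience sketch of my own (not part of Ollama's tooling) that renders the same three directives:

```python
def render_modelfile(base: str, temperature: float, system: str) -> str:
    """Render a minimal Ollama Modelfile with FROM, PARAMETER and SYSTEM."""
    return (
        f"FROM {base}\n"
        f"PARAMETER temperature {temperature}\n"
        f'SYSTEM """\n{system}\n"""\n'
    )

text = render_modelfile(
    "llama2",
    1,
    "You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.",
)
print(text)
# Write it out and build the model as shown above:
#   with open("Modelfile", "w") as f: f.write(text)
#   then: ollama create mario -f ./Modelfile
```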
Conclusion
For more information on Ollama and to access additional resources, visit Ollama on GitHub: https://github.com/ollama/ollama