Agent Skill
2/7/2026

grepai-ollama-setup

Install and configure Ollama for local embeddings with GrepAI. Use this skill when setting up private, local embedding generation.

yoanbernabeu
npx skills add yoanbernabeu/grepai-skills

SKILL.md


name: grepai-ollama-setup
description: Install and configure Ollama for local embeddings with GrepAI. Use this skill when setting up private, local embedding generation.

Ollama Setup for GrepAI

This skill covers installing and configuring Ollama as the local embedding provider for GrepAI. Ollama enables 100% private code search where your code never leaves your machine.

When to Use This Skill

  • Setting up GrepAI with local, private embeddings
  • Installing Ollama for the first time
  • Choosing and downloading embedding models
  • Troubleshooting Ollama connection issues

Why Ollama?

| Benefit | Description |
| --- | --- |
| 🔒 Privacy | Code never leaves your machine |
| 💰 Free | No API costs |
| ⚡ Fast | Local processing, no network latency |
| 🔌 Offline | Works without internet |

Installation

macOS (Homebrew)

# Install Ollama
brew install ollama

# Start the Ollama service
ollama serve

macOS (Direct Download)

  1. Download from ollama.com
  2. Open the .dmg and drag to Applications
  3. Launch Ollama from Applications

Linux

# One-line installer
curl -fsSL https://ollama.com/install.sh | sh

# Start the server manually (skip this if the installer already started it, for example as a systemd service)
ollama serve

Windows

  1. Download installer from ollama.com
  2. Run the installer
  3. Ollama starts automatically as a service

Downloading Embedding Models

GrepAI requires an embedding model to convert code into vectors.

Recommended Model: nomic-embed-text

# Download the recommended model (768 dimensions)
ollama pull nomic-embed-text

Specifications:

  • Dimensions: 768
  • Size: ~274 MB
  • Performance: Excellent for code search
  • Language: English-optimized

Alternative Models

# Multilingual support (better for non-English code/comments)
ollama pull nomic-embed-text-v2-moe

# Larger, more accurate
ollama pull bge-m3

# Maximum quality
ollama pull mxbai-embed-large

| Model | Dimensions | Size | Best For |
| --- | --- | --- | --- |
| nomic-embed-text | 768 | 274 MB | General code search |
| nomic-embed-text-v2-moe | 768 | 500 MB | Multilingual codebases |
| bge-m3 | 1024 | 1.2 GB | Large codebases |
| mxbai-embed-large | 1024 | 670 MB | Maximum accuracy |
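
A setup script may want to know which vector size to expect from a given model. The sketch below encodes the model list above as a lookup. `model_dims` is a hypothetical helper name, not part of Ollama or GrepAI, and the values are copied from this document rather than queried from Ollama.

```shell
# model_dims NAME: print the embedding dimensions listed in this guide for NAME.
# Hypothetical helper; an optional ":tag" suffix such as ":latest" is ignored.
model_dims() {
  case "${1%%:*}" in
    nomic-embed-text|nomic-embed-text-v2-moe) echo 768 ;;
    bge-m3|mxbai-embed-large)                 echo 1024 ;;
    *) echo "unknown model: $1" >&2; return 1 ;;
  esac
}

model_dims nomic-embed-text   # prints 768
```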

Verifying Installation

Check Ollama is Running

# Check if Ollama server is responding
curl http://localhost:11434/api/tags

# Expected output: JSON with available models

List Downloaded Models

ollama list

# Output:
# NAME                     ID           SIZE    MODIFIED
# nomic-embed-text:latest  abc123...    274 MB  2 hours ago
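
To make the pull step safe to repeat in a script, the `ollama list` output can be checked first. This is a minimal sketch that assumes the column layout shown above (a header row, then the model name in the first column); `has_model` is a hypothetical helper name.

```shell
# has_model NAME: read `ollama list` output on stdin and succeed if NAME is listed.
# Hypothetical helper; the tag after ":" (for example ":latest") is ignored.
has_model() {
  awk -v want="$1" '
    NR > 1 {                     # skip the header row
      split($1, parts, ":")      # "nomic-embed-text:latest" -> "nomic-embed-text"
      if (parts[1] == want) found = 1
    }
    END { exit !found }          # exit status 0 only when the model was seen
  '
}

# Usage: pull the model only when it is missing.
# ollama list | has_model nomic-embed-text || ollama pull nomic-embed-text
```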

Test Embedding Generation

# Quick test (should return embedding vector)
curl http://localhost:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "function hello() { return world; }"
}'
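
The raw response is a long list of numbers, so it is easier to check the vector length than to read it. The sketch below counts the entries using only `sed`, `tr`, and `grep`. It assumes the response is a single-line JSON body shaped like `{"embedding":[...]}`; `count_dims` is a hypothetical helper name, and `jq` would be a sturdier choice where it is installed.

```shell
# count_dims: read an /api/embeddings response on stdin and print the vector length.
# Hypothetical helper; assumes a single-line body shaped like {"embedding":[0.1,0.2,...]}.
# Prints 0 when no embedding array is found (for example, on an error response).
count_dims() {
  sed -n 's/.*"embedding":[[:space:]]*\[\([^]]*\)].*/\1/p' |  # keep only the numbers
    tr ',' '\n' |                                              # one number per line
    grep -c .                                                  # count non-empty lines
}

# Usage against a running server (expect 768 for nomic-embed-text):
# curl -s http://localhost:11434/api/embeddings \
#   -d '{"model": "nomic-embed-text", "prompt": "hello"}' | count_dims
```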

Configuring GrepAI for Ollama

After installing Ollama, configure GrepAI to use it:

# .grepai/config.yaml
embedder:
  provider: ollama
  model: nomic-embed-text
  endpoint: http://localhost:11434

This is the default configuration when you run grepai init, so no changes are needed if using nomic-embed-text.

Running Ollama

Foreground (Development)

# Run in current terminal (see logs)
ollama serve

Background (macOS/Linux)

# Using nohup
nohup ollama serve &

# Or as a systemd service (Linux)
sudo systemctl enable ollama
sudo systemctl start ollama

Check Status

# Check if running
pgrep -f ollama

# Or test the API
curl -s http://localhost:11434/api/tags | head -1
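
When Ollama is started in the background, it may take a moment before the API accepts connections. A script can poll the same `/api/tags` endpoint until it answers. This is a sketch only: `wait_for_ollama`, the 30-attempt default, and the one-second pause are illustrative choices, not Ollama or GrepAI settings.

```shell
# wait_for_ollama [ENDPOINT] [ATTEMPTS]: poll until the Ollama API responds.
# Hypothetical helper; returns 0 once /api/tags answers, 1 after ATTEMPTS failures.
wait_for_ollama() {
  local endpoint="${1:-http://localhost:11434}"
  local attempts="${2:-30}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl fail on HTTP errors; --max-time bounds each probe
    if curl -fsS --max-time 2 "${endpoint}/api/tags" > /dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Usage:
# nohup ollama serve &
# wait_for_ollama && echo "Ollama is ready" || echo "Ollama did not start in time"
```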

Resource Considerations

Memory Usage

Embedding models load into RAM:

  • nomic-embed-text: ~500 MB RAM
  • bge-m3: ~1.5 GB RAM
  • mxbai-embed-large: ~1 GB RAM

CPU vs GPU

Ollama falls back to the CPU when it does not detect a supported GPU. For faster embeddings:

  • macOS: Uses Metal (Apple Silicon) automatically
  • Linux/Windows: Install the NVIDIA (CUDA) drivers for NVIDIA GPU support

Common Issues

āŒ Problem: connection refused to localhost:11434 āœ… Solution: Start Ollama:

ollama serve

āŒ Problem: Model not found āœ… Solution: Pull the model first:

ollama pull nomic-embed-text

āŒ Problem: Slow embedding generation āœ… Solution:

  • Use a smaller model
  • Ensure Ollama is using GPU (check ollama ps)
  • Close other memory-intensive applications

āŒ Problem: Out of memory āœ… Solution: Use a smaller model or increase system RAM

Best Practices

  1. Start Ollama before GrepAI: Make sure ollama serve is running
  2. Use the recommended model: nomic-embed-text offers the best balance of size and search quality
  3. Keep Ollama running: Leave it as a background service
  4. Update periodically: Re-run ollama pull nomic-embed-text to fetch model updates

Output Format

After successful setup:

✅ Ollama Setup Complete

   Ollama Version: 0.1.x
   Endpoint: http://localhost:11434
   Model: nomic-embed-text (768 dimensions)
   Status: Running

   GrepAI is ready to use with local embeddings.
   Your code will never leave your machine.