name: index
description: Index codebases for semantic code search. Use this skill to make a project's source code searchable via vector similarity, enabling natural language queries like "how does authentication work" or "find the database connection logic".

Index Skill

Index codebases for semantic code search via the BrAIny API. Indexing enables natural language queries over source code using vector embeddings.

Quick Start

# Index a codebase
bun scripts/index.ts index /path/to/project --name "my-project"

# Search indexed code
bun scripts/index.ts search "authentication middleware" --limit 10

# Reindex after changes
bun scripts/index.ts reindex

# List indexed codebases
bun scripts/index.ts list

Environment Variables

Variable	Purpose	Default
`BRAINY_URL`	API base URL	`http://localhost:3090`
`BRAINY_PROJECT_ID`	Project ID used to group codebases	None (required)
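
A minimal sketch of how a client script might resolve these variables, assuming the documented default for `BRAINY_URL` and treating a missing `BRAINY_PROJECT_ID` as an error. `loadBrainyConfig` and `BrainyConfig` are illustrative names, not part of the skill's scripts. The environment is passed in as a plain object, so the function can be called with `process.env` under Bun or Node.

```typescript
// Hypothetical helper: resolves BrAIny settings from an environment-like object.
interface BrainyConfig {
  baseUrl: string;
  projectId: string;
}

function loadBrainyConfig(env: Record<string, string | undefined>): BrainyConfig {
  // BRAINY_URL is optional and falls back to the documented default.
  // Trailing slashes are trimmed so paths can be appended safely.
  const baseUrl = (env["BRAINY_URL"] ?? "http://localhost:3090").replace(/\/+$/, "");

  // BRAINY_PROJECT_ID has no default, so fail early with a clear message.
  const projectId = env["BRAINY_PROJECT_ID"];
  if (!projectId) {
    throw new Error("BRAINY_PROJECT_ID is required");
  }
  return { baseUrl, projectId };
}
```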

API Reference

Index a Codebase

POST /v1/projects/{projectId}/codebases
Content-Type: application/json

{
  "name": "brainy",
  "path": "/Users/jp/Projects/brainy"
}

Response

{
  "success": true,
  "codebaseId": "uuid",
  "stats": {
    "totalFiles": 52,
    "newFiles": 52,
    "modifiedFiles": 0,
    "deletedFiles": 0,
    "totalChunks": 215,
    "indexedChunks": 215
  }
}
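
As an illustration of calling this endpoint from TypeScript, the sketch below assembles the request shown above. `buildIndexRequest` and `IndexRequest` are hypothetical names; only the route, header, and body fields come from this reference.

```typescript
// Hypothetical helper: describes the POST request for indexing a codebase.
interface IndexRequest {
  url: string;
  method: "POST";
  headers: { "Content-Type": "application/json" };
  body: string;
}

function buildIndexRequest(
  baseUrl: string,
  projectId: string,
  codebaseName: string,
  codebasePath: string,
): IndexRequest {
  return {
    // The project ID is escaped in case it contains reserved URL characters.
    url: `${baseUrl}/v1/projects/${encodeURIComponent(projectId)}/codebases`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ name: codebaseName, path: codebasePath }),
  };
}
```

In Bun or Node 18+, the result could be sent with `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })` and the JSON response read with `await res.json()`.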

Reindex a Codebase

Performs an incremental update: only files that changed since the last index are processed.

POST /v1/codebases/{codebaseId}/reindex

Response

{
  "success": true,
  "codebaseId": "uuid",
  "stats": {
    "totalFiles": 54,
    "newFiles": 2,
    "modifiedFiles": 3,
    "deletedFiles": 0,
    "totalChunks": 220,
    "indexedChunks": 220
  }
}
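
The `stats` object has the same shape in the index and reindex responses, so a client can type it once. The following is a small sketch with a hypothetical `summarizeStats` helper that turns it into a log line; the field names are taken from the responses above, everything else is illustrative.

```typescript
// Shape of the `stats` object shown in the index and reindex responses.
interface IndexStats {
  totalFiles: number;
  newFiles: number;
  modifiedFiles: number;
  deletedFiles: number;
  totalChunks: number;
  indexedChunks: number;
}

// Hypothetical helper: one-line summary for logs or CLI output.
function summarizeStats(stats: IndexStats): string {
  const files =
    `${stats.totalFiles} files ` +
    `(${stats.newFiles} new, ${stats.modifiedFiles} modified, ${stats.deletedFiles} deleted)`;
  const chunks = `${stats.indexedChunks}/${stats.totalChunks} chunks indexed`;
  return `${files}; ${chunks}`;
}
```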

Search Code

GET /v1/codebases/{codebaseId}/search?query={text}&limit={n}

Parameter	Description
`query`	Natural language search text
`limit`	Max results (default 10, max 50)
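
A sketch of building this URL on the client, assuming the server rejects or ignores limits outside 1 to 50 (the lower bound of 1 is an assumption; only the default of 10 and the maximum of 50 are documented). `buildSearchUrl` is a hypothetical name.

```typescript
// Hypothetical helper: builds the search URL and keeps `limit` within 1..50.
function buildSearchUrl(
  baseUrl: string,
  codebaseId: string,
  query: string,
  limit: number = 10,
): string {
  // Clamp to the documented maximum of 50; use the default of 10 for NaN or Infinity.
  const safeLimit = Number.isFinite(limit)
    ? Math.min(50, Math.max(1, Math.trunc(limit)))
    : 10;
  // The query is free text, so it must be percent-encoded.
  const params = `query=${encodeURIComponent(query)}&limit=${safeLimit}`;
  return `${baseUrl}/v1/codebases/${encodeURIComponent(codebaseId)}/search?${params}`;
}
```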

Response

{
  "success": true,
  "chunks": [
    {
      "id": "uuid",
      "filePath": "src/services/embedding.ts",
      "startLine": 1,
      "endLine": 56,
      "content": "import OpenAI from 'openai'...",
      "tokenCount": 428,
      "similarity": 0.42,
      "language": "typescript"
    }
  ],
  "totalCount": 20
}
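
To show how a caller might consume this response, the sketch below filters and formats the returned chunks. It assumes that a higher `similarity` means a closer match; the `0.3` cutoff and the `formatMatches` name are illustrative choices, not values defined by the API.

```typescript
// Subset of the chunk fields shown in the search response.
interface SearchChunk {
  filePath: string;
  startLine: number;
  endLine: number;
  similarity: number;
  language: string;
}

// Hypothetical helper: drops weak matches and renders the rest as
// "path:start-end (similarity)" lines, best match first.
function formatMatches(chunks: SearchChunk[], minSimilarity: number = 0.3): string[] {
  return chunks
    .filter((c) => c.similarity >= minSimilarity) // filter() copies, so the input is not reordered
    .sort((a, b) => b.similarity - a.similarity)
    .map((c) => `${c.filePath}:${c.startLine}-${c.endLine} (${c.similarity.toFixed(2)})`);
}
```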

List Codebases

GET /v1/projects/{projectId}/codebases

Get Codebase Details

GET /v1/codebases/{codebaseId}

Delete Codebase

Soft-deletes the codebase and all of its chunks:

DELETE /v1/codebases/{codebaseId}

How Indexing Works

1. Crawling

The crawler walks the directory tree and discovers source files.

Automatically Ignored

.git, .svn, .hg (version control)
node_modules, vendor (dependencies)
dist, build, target, out (build outputs)
.idea, .vscode (IDE configs)
*.min.js, *.min.css (minified files)
Binary files (images, PDFs, archives, executables)
Files > 5MB
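
The ignore rules above can be sketched as a single predicate. This is an illustration, not the crawler's actual code: the binary-extension set is a guess at what "binary files" covers, detection is by extension only, paths are assumed to use forward slashes, and 5MB is taken as 5 * 1024 * 1024 bytes.

```typescript
// Illustrative ignore rules mirroring the list above.
const IGNORED_DIRS = new Set([
  ".git", ".svn", ".hg",            // version control
  "node_modules", "vendor",         // dependencies
  "dist", "build", "target", "out", // build outputs
  ".idea", ".vscode",               // IDE configs
]);
// Assumed sample of binary extensions; the real list is not documented here.
const BINARY_EXTENSIONS = new Set([
  "png", "jpg", "jpeg", "gif", "pdf", "zip", "gz", "tar", "exe", "dll", "so",
]);
const MAX_FILE_BYTES = 5 * 1024 * 1024;

function shouldIgnore(relativePath: string, sizeBytes: number): boolean {
  const segments = relativePath.split("/");
  const fileName = segments[segments.length - 1] ?? "";

  // An ignored directory anywhere in the path excludes the file.
  if (segments.slice(0, -1).some((dir) => IGNORED_DIRS.has(dir))) return true;
  // Minified bundles.
  if (fileName.endsWith(".min.js") || fileName.endsWith(".min.css")) return true;
  // Binary files, detected here by extension only.
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  if (BINARY_EXTENSIONS.has(ext)) return true;
  // Oversized files.
  return sizeBytes > MAX_FILE_BYTES;
}
```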

Language Detection

Detected by extension: TypeScript, JavaScript, Python, Go, Rust, Java, Ruby, PHP, C/C++, SQL, HTML, CSS, JSON, YAML, Markdown, and more.
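
Extension-based detection can be as simple as a lookup table. The mapping below is an illustrative subset with lowercase language names (matching the `"language": "typescript"` value in the search response); the skill's real table and naming may differ, and `detectLanguage` is a hypothetical name.

```typescript
// Illustrative extension-to-language table; the actual mapping may differ.
const LANGUAGE_BY_EXTENSION = new Map<string, string>([
  ["ts", "typescript"], ["tsx", "typescript"],
  ["js", "javascript"], ["jsx", "javascript"],
  ["py", "python"], ["go", "go"], ["rs", "rust"],
  ["java", "java"], ["rb", "ruby"], ["php", "php"],
  ["c", "c"], ["h", "c"], ["cpp", "cpp"], ["hpp", "cpp"],
  ["sql", "sql"], ["html", "html"], ["css", "css"], ["json", "json"],
  ["yaml", "yaml"], ["yml", "yaml"], ["md", "markdown"],
]);

function detectLanguage(filePath: string): string | undefined {
  const fileName = filePath.slice(filePath.lastIndexOf("/") + 1);
  const dot = fileName.lastIndexOf(".");
  // No extension, or a dotfile such as ".gitignore".
  if (dot <= 0) return undefined;
  return LANGUAGE_BY_EXTENSION.get(fileName.slice(dot + 1).toLowerCase());
}
```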

2. Chunking

Files are split into chunks optimized for embeddings:

Parameter	Value	Purpose
Target tokens	~500	Optimal semantic density
Max tokens	2000	Safe buffer below model limit
Min tokens	50	Avoid tiny fragments

Semantic Boundaries

Chunks break at natural points:

Empty lines
Function/class/interface definitions
Export statements
Comment blocks (documentation)
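
The following is a rough sketch of boundary-aware chunking under the limits in the table above. It is not the skill's implementation: token counts are approximated as one token per four characters, boundaries are detected with simple regular expressions for JavaScript-style code, and `chunkSource`, `estimateTokens`, and `isBoundary` are hypothetical names. A single line longer than the maximum still becomes its own oversized chunk.

```typescript
interface Chunk {
  content: string;
  startLine: number; // 1-based, inclusive
  endLine: number;   // 1-based, inclusive
  tokenCount: number;
}

const TARGET_TOKENS = 500;
const MAX_TOKENS = 2000;
const MIN_TOKENS = 50;

// Rough estimate (about 4 characters per token); a real tokenizer would be more accurate.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// A line is a candidate boundary if it is blank or starts a definition, export, or comment.
function isBoundary(line: string): boolean {
  return /^\s*$/.test(line) ||
    /^\s*(export\b|function\b|class\b|interface\b|\/\*\*|\/\/)/.test(line);
}

function chunkSource(source: string): Chunk[] {
  const lines = source.split("\n");
  const chunks: Chunk[] = [];
  let start = 0;  // index of the first line in the current chunk
  let tokens = 0; // running estimate for the current chunk

  // Closes the current chunk at `end` (exclusive line index).
  const flush = (end: number): void => {
    const content = lines.slice(start, end).join("\n");
    const tokenCount = estimateTokens(content);
    const last = chunks[chunks.length - 1];
    if (last && tokenCount < MIN_TOKENS && last.tokenCount + tokenCount <= MAX_TOKENS) {
      // Merge a tiny fragment into the previous chunk instead of emitting it.
      last.content += "\n" + content;
      last.endLine = end;
      last.tokenCount = estimateTokens(last.content);
    } else if (content.trim() !== "") {
      chunks.push({ content, startLine: start + 1, endLine: end, tokenCount });
    }
    start = end;
    tokens = 0;
  };

  lines.forEach((line, i) => {
    const lineTokens = estimateTokens(line + "\n");
    // Break before this line once the target is reached at a boundary,
    // or unconditionally when adding the line would exceed the hard maximum.
    const atTarget = tokens >= TARGET_TOKENS && isBoundary(line);
    const overMax = tokens + lineTokens > MAX_TOKENS;
    if (i > start && (atTarget || overMax)) flush(i);
    tokens += lineTokens;
  });
  if (start < lines.length) flush(lines.length);
  return chunks;
}
```

Breaking only at boundaries after the target is reached keeps functions and comment blocks intact where possible, while the hard maximum guarantees a split even in code without blank lines.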

3. Embedding

Each chunk is embedded using OpenAI's text-embedding-3-small model (1536 dimensions). Embeddings enable semantic similarity search.
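
For intuition about what "semantic similarity" means here, this is the standard cosine similarity between two embedding vectors, written as a standalone function. It illustrates the metric only; in this system the comparison is assumed to happen inside the database rather than in application code.

```typescript
// Cosine similarity between two vectors of equal length:
// dot(a, b) / (|a| * |b|), ranging from -1 to 1.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // Cosine similarity is undefined for zero vectors; return 0 by convention here.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}
```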

4. Storage

Chunks are stored in PostgreSQL with pgvector:

code_chunks (
  id UUID PRIMARY KEY,
  codebase_id UUID NOT NULL,
  file_id UUID,
  chunk_index INTEGER NOT NULL,
  content TEXT NOT NULL,
  start_line INTEGER,
  end_line INTEGER,
  token_count INTEGER NOT NULL,
  embedding vector(1536),
  deleted_at TIMESTAMPTZ,
  created_at TIMESTAMPTZ
)

Indexes

HNSW on embedding for fast vector similarity (cosine distance)
B-tree on codebase_id, file_id for filtering
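
A sketch of the kind of query these indexes serve, assuming the `code_chunks` table above and pgvector's cosine distance operator `<=>` (similarity is then `1 - distance`). The query text, the `$1`-style placeholders (as used by node-postgres), and the `buildChunkSearchQuery` name are illustrative; the service's actual SQL is not shown in this document.

```typescript
// Hypothetical query builder for the code_chunks table shown above.
interface SqlQuery {
  text: string;
  values: unknown[];
}

function buildChunkSearchQuery(
  codebaseId: string,
  queryEmbedding: number[],
  limit: number,
): SqlQuery {
  // pgvector accepts a vector as a text literal such as '[0.1,0.2,...]'.
  const vectorLiteral = `[${queryEmbedding.join(",")}]`;
  const text = `
    SELECT id, file_id, content, start_line, end_line, token_count,
           1 - (embedding <=> $2::vector) AS similarity
    FROM code_chunks
    WHERE codebase_id = $1
      AND deleted_at IS NULL          -- skip soft-deleted chunks
    ORDER BY embedding <=> $2::vector -- smallest cosine distance first
    LIMIT $3`;
  return { text, values: [codebaseId, vectorLiteral, limit] };
}
```

Ordering by the distance expression with a `LIMIT` is the query shape that lets PostgreSQL use an HNSW index for an approximate nearest-neighbor search.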

Incremental Updates

When reindexing, the system:

Compares file hashes to detect changes
Skips unchanged files
Re-chunks and re-embeds modified files
Removes chunks from deleted files
Adds chunks for new files

This makes reindexing fast after small changes.
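
The change-detection step can be sketched as a comparison of two path-to-hash maps: one from the previous index (for example, the `file_hash` values stored per file) and one from the current crawl. `diffFileHashes` and `FileDiff` are hypothetical names, and the hash algorithm is left open because it is not specified here.

```typescript
interface FileDiff {
  added: string[];
  modified: string[];
  deleted: string[];
  unchanged: string[];
}

// Compares path -> content-hash maps from the previous index and the current crawl.
function diffFileHashes(
  previous: Map<string, string>,
  current: Map<string, string>,
): FileDiff {
  const diff: FileDiff = { added: [], modified: [], deleted: [], unchanged: [] };

  current.forEach((hash, path) => {
    const oldHash = previous.get(path);
    if (oldHash === undefined) diff.added.push(path);     // new file: chunk and embed
    else if (oldHash !== hash) diff.modified.push(path);  // changed: re-chunk and re-embed
    else diff.unchanged.push(path);                       // same hash: skip
  });

  previous.forEach((_hash, path) => {
    if (!current.has(path)) diff.deleted.push(path);      // gone from disk: remove chunks
  });

  return diff;
}
```

Only the `added` and `modified` lists need embedding calls, which is why a reindex after a small change touches a handful of files rather than the whole tree.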

Search Tips

Natural Language Queries

"How does authentication work"
"Database connection pooling"
"Error handling in API routes"
"Token generation for embeddings"

Specific Code Patterns

"PostgreSQL query with vector search"
"Hono middleware for validation"
"TypeScript interface for memory"

Finding Implementations

"EmbeddingService class"
"chunkByTokens function"
"Memory repository insert"

Database Schema

codebases

codebases (
  id UUID PRIMARY KEY,
  project_id UUID NOT NULL,
  name TEXT NOT NULL,
  path TEXT NOT NULL,
  total_files INTEGER DEFAULT 0,
  total_chunks INTEGER DEFAULT 0,
  last_indexed_at TIMESTAMPTZ,
  deleted_at TIMESTAMPTZ,
  created_at TIMESTAMPTZ,
  updated_at TIMESTAMPTZ
)

codebase_files

codebase_files (
  id UUID PRIMARY KEY,
  codebase_id UUID NOT NULL,
  file_path TEXT NOT NULL,
  file_hash TEXT NOT NULL,
  language TEXT,
  chunk_count INTEGER DEFAULT 0,
  last_indexed_at TIMESTAMPTZ,
  deleted_at TIMESTAMPTZ,
  created_at TIMESTAMPTZ
)

code_chunks

code_chunks (
  id UUID PRIMARY KEY,
  codebase_id UUID NOT NULL,
  file_id UUID,
  chunk_index INTEGER NOT NULL,
  content TEXT NOT NULL,
  start_line INTEGER,
  end_line INTEGER,
  token_count INTEGER NOT NULL,
  embedding vector(1536),
  deleted_at TIMESTAMPTZ,
  created_at TIMESTAMPTZ
)

Usage Guidelines

Index codebases at the start of a project for code-aware assistance
Reindex after significant changes (new features, refactors)
Use semantic search to understand unfamiliar codebases
Search for patterns when implementing similar functionality
Keep unrelated codebases in separate BrAIny projects for isolation
