AI Search

gnaw's AI-powered search capabilities enable semantic code understanding and natural language queries.

How AI Search Works

Vector Embeddings

gnaw uses vector embeddings to understand the semantic meaning of code:

  1. Code Indexing: Code is processed and converted into high-dimensional vectors
  2. Semantic Understanding: AI models understand code structure, functions, and relationships
  3. Similarity Search: Vector similarity search finds semantically related code
  4. Context Awareness: Results include relevant context and explanations
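The indexing and similarity steps above can be sketched in a few lines. This is only an illustration: the bag-of-words `embed` function, the `VOCABULARY` list, and the sample snippets are hypothetical stand-ins, since the embedding model gnaw actually uses is not described here. A learned model would place related code close together even when no words are shared; the ranking step by cosine similarity works the same way.

```python
import math
import re
from collections import Counter

def embed(text, vocabulary):
    """Toy embedding: count how often each vocabulary term appears in the text."""
    tokens = Counter(re.findall(r"[a-z]+", text.lower()))
    return [float(tokens[term]) for term in vocabulary]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def search(query, snippets, vocabulary, top_k=2):
    """Embed the query and every snippet, then return the top_k (score, path) pairs."""
    query_vec = embed(query, vocabulary)
    scored = [
        (cosine_similarity(query_vec, embed(code, vocabulary)), path)
        for path, code in snippets.items()
    ]
    scored.sort(reverse=True)
    return scored[:top_k]

# Hypothetical vocabulary and code snippets, for illustration only.
VOCABULARY = ["token", "user", "login", "query", "database", "connection"]
SNIPPETS = {
    "src/auth.rs": "fn authenticate_user(token) { check user token for login }",
    "src/db.rs": "fn run_query(connection) { send query to database connection }",
}

results = search("user login token", SNIPPETS, VOCABULARY, top_k=1)
print(results[0][1])  # src/auth.rs
```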

HNSW Indexing

The Hierarchical Navigable Small World (HNSW) algorithm provides:

  • Fast Similarity Search: Approximate nearest-neighbor lookups whose cost grows roughly as O(log n) with the number of indexed vectors
  • Scalability: Handles large codebases efficiently
  • Memory Efficiency: Optimized storage for vector embeddings
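To give a feel for how a graph index avoids comparing the query against every vector, here is a minimal sketch of greedy search over a single proximity graph. It is a simplification, not HNSW itself and not gnaw's implementation: HNSW runs a routine like this on each layer of a hierarchy, with sparse upper layers supplying the long jumps, and keeps a candidate list rather than a single current node. The brute-force `build_knn_graph` helper and the sample points are assumptions made for the example.

```python
import math

def build_knn_graph(points, k):
    """Brute-force k-nearest-neighbour graph (a stand-in for HNSW's incremental build)."""
    graph = {}
    for i, point in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(point, points[j]),
        )
        graph[i] = others[:k]
    return graph

def greedy_search(points, graph, query, entry):
    """Move to the neighbour closest to the query until no neighbour is closer."""
    current = entry
    current_dist = math.dist(points[current], query)
    while True:
        best, best_dist = current, current_dist
        for neighbour in graph[current]:
            d = math.dist(points[neighbour], query)
            if d < best_dist:
                best, best_dist = neighbour, d
        if best == current:
            return current  # local minimum: no neighbour improves on this node
        current, current_dist = best, best_dist

# Twenty hypothetical 2-D "embeddings" laid out along a line.
points = [(float(i), 0.0) for i in range(20)]
graph = build_knn_graph(points, k=2)
nearest = greedy_search(points, graph, query=(13.2, 0.0), entry=0)
print(nearest)  # 13
```

With only one layer the walk takes many short hops. The layered structure is what lets HNSW cover most of the distance in a few steps before refining locally.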

Natural Language Queries

Transform natural language into precise code searches:

# Find authentication-related code
gnaw agent ask "authentication functions"

# Find error handling patterns
gnaw agent ask "error handling" --type rs

# Search for database operations
gnaw agent ask "database queries" --dir src/db

Semantic Understanding

gnaw understands code semantics beyond text matching:

  • Identifies function definitions, parameters, and return types
  • Recognizes common coding patterns and idioms
  • Understands how code components relate to each other
  • Interprets the purpose and intent of code blocks

Query Types

Functional Queries

Search for specific functionality:

gnaw agent ask "user authentication"
gnaw agent ask "database connection handling"
gnaw agent ask "error logging functions"

Pattern Queries

Find coding patterns:

gnaw agent ask "async/await patterns"
gnaw agent ask "error handling with try/catch"
gnaw agent ask "API endpoint definitions"

Context Queries

Search within specific contexts:

gnaw agent ask "authentication" --dir src/auth
gnaw agent ask "database queries" --type rs
gnaw agent ask "API routes" --type js
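When these queries are issued from a script, it can help to assemble the argument list in one place. The sketch below uses only the subcommand and flags shown above (`gnaw agent ask`, `--type`, `--dir`); the helper name `build_ask_command` is hypothetical, and passing both flags in a single call is an assumption, since the examples here only show them separately.

```python
import shlex

def build_ask_command(query, file_type=None, directory=None):
    """Return the argv list for a `gnaw agent ask` call with optional filters."""
    command = ["gnaw", "agent", "ask", query]
    if file_type is not None:
        command += ["--type", file_type]
    if directory is not None:
        command += ["--dir", directory]
    return command

command = build_ask_command("database queries", file_type="rs")
print(shlex.join(command))  # gnaw agent ask 'database queries' --type rs
```

The list can be handed to `subprocess.run(command)` as-is, which avoids shell quoting problems when a query contains spaces or quotes.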

Index Management

Building the Index

# Build index for current directory
gnaw agent index build

# Build index for specific directory
gnaw agent index build --target-dir /path/to/code

# Update existing index
gnaw agent index update

Index Status

# Check index status and statistics
gnaw agent index status

Output Format

AI search results include rich context:

Query: authentication functions

→ src/auth.rs (score: 0.95)
  Lines 15-25
  • Function definition matches query
  • Contains authentication keywords
  • High semantic similarity
  pub fn authenticate_user(token: &str) -> Result<User, AuthError> {
      // Authentication logic here
  }
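If the results need to be consumed by another tool, the header and line-range rows can be picked out with two regular expressions. This sketch assumes the output looks exactly like the sample above; the `SearchHit` class and `parse_results` function are hypothetical names, and the real output may include fields or variations that the sample does not show.

```python
import re
from dataclasses import dataclass

# Patterns derived from the sample output only.
HEADER = re.compile(r"^→\s+(?P<path>\S+)\s+\(score:\s*(?P<score>[0-9.]+)\)\s*$")
LINE_RANGE = re.compile(r"^\s*Lines\s+(?P<start>\d+)-(?P<end>\d+)\s*$")

@dataclass
class SearchHit:
    path: str
    score: float
    start_line: int = 0
    end_line: int = 0

def parse_results(text):
    """Collect one SearchHit per '→ path (score: x)' row, with its line range if present."""
    hits = []
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            hits.append(SearchHit(header["path"], float(header["score"])))
            continue
        span = LINE_RANGE.match(line)
        if span and hits:
            hits[-1].start_line = int(span["start"])
            hits[-1].end_line = int(span["end"])
    return hits

SAMPLE = """Query: authentication functions

→ src/auth.rs (score: 0.95)
  Lines 15-25
  • Function definition matches query
"""

hits = parse_results(SAMPLE)
print(hits[0].path, hits[0].score, hits[0].start_line, hits[0].end_line)
```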

Performance Considerations

  • Initial index building can take time for large codebases. Consider running it during off-peak hours.
  • Vector embeddings require memory. Monitor usage for very large codebases.
  • Semantic search is optimized for relevance over speed. Use traditional search for simple patterns.
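A rough lower bound on embedding memory is the number of indexed chunks times the vector dimension times the bytes per value. The figures below (200,000 chunks, 768 dimensions, 32-bit floats) are illustrative assumptions, not gnaw's actual settings, and the estimate leaves out the neighbour links that an HNSW graph stores in addition to the raw vectors.

```python
def estimate_embedding_bytes(num_chunks, dimensions, bytes_per_value=4):
    """Raw vector storage only; index overhead such as graph links is not included."""
    return num_chunks * dimensions * bytes_per_value

# Hypothetical codebase: 200,000 chunks embedded as 768-dimensional float32 vectors.
raw_bytes = estimate_embedding_bytes(num_chunks=200_000, dimensions=768)
print(f"{raw_bytes / 1024**2:.0f} MiB")  # 586 MiB
```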

Best Practices

  1. Build Index Regularly: Update the index when code changes significantly
  2. Use Specific Queries: More specific queries yield better results
  3. Combine with Filters: Use --type and --dir flags to narrow results
  4. Leverage Context: Include context in your queries for better matches

AI search is most effective for understanding large, complex codebases where traditional text search falls short.