gnaw is designed for high-performance search across large codebases and datasets.
gnaw leverages all available CPU cores for parallel processing.
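The per-core shape of that work can be sketched in plain Rust. This is an illustrative example only, not gnaw's actual implementation: input lines are split into chunks, each chunk is scanned on its own thread, and the per-thread counts are summed.

```rust
use std::thread;

// Illustrative sketch (not gnaw's real code): divide lines across worker
// threads and count lines containing a literal pattern.
fn parallel_count_matches(text: &str, pattern: &str, workers: usize) -> usize {
    let lines: Vec<&str> = text.lines().collect();
    if lines.is_empty() {
        return 0;
    }
    // Ceiling division so every line lands in some chunk.
    let chunk = (lines.len() + workers - 1) / workers.max(1);
    thread::scope(|scope| {
        let handles: Vec<_> = lines
            .chunks(chunk.max(1))
            .map(|slice| {
                scope.spawn(move || slice.iter().filter(|l| l.contains(pattern)).count())
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).sum()
    })
}

fn main() {
    let text = "error: disk full\nok\nerror: timeout\nok\n";
    println!("{}", parallel_count_matches(text, "error", 4)); // prints 2
}
```

Scoped threads (`std::thread::scope`, Rust 1.63+) let the workers borrow the line slices without any copying or `Arc` overhead.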
gnaw automatically optimizes for different file sizes:
```toml
# gnaw.toml
[tuning]
tiny = "64 KiB"    # Small files: minimal overhead
small = "8.6 MiB"  # Medium files: balanced approach
medium = "256 MiB" # Large files: streaming optimization
large = "2 GiB"    # Very large files: chunked processing
```
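The tier dispatch this config drives can be sketched as a simple threshold match. The function and strategy names below are invented for illustration; only the thresholds come from the config above, and gnaw's real internals may differ.

```rust
// Hypothetical sketch: pick a processing strategy from file size using the
// gnaw.toml tier boundaries. Strategy names are illustrative only.
fn pick_strategy(file_len: u64) -> &'static str {
    const KIB: u64 = 1024;
    const MIB: u64 = 1024 * KIB;
    const GIB: u64 = 1024 * MIB;
    let small = (8.6 * MIB as f64) as u64; // the "8.6 MiB" tier boundary
    match file_len {
        n if n <= 64 * KIB => "read whole file",   // tiny
        n if n <= small => "buffered read",        // small
        n if n <= 256 * MIB => "streaming",        // medium
        _ => "chunked processing",                 // large and beyond
    }
}

fn main() {
    println!("{}", pick_strategy(4 * 1024));          // prints "read whole file"
    println!("{}", pick_strategy(512 * 1024 * 1024)); // prints "chunked processing"
}
```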
Optimize regex performance for different pattern complexities:
```toml
# gnaw.toml
[regex.simple]
size_limit_mb = 200     # Maximum memory for simple patterns
dfa_size_limit_mb = 200 # DFA size limit
nest_limit = 250        # Maximum nesting depth

[regex.complex]
size_limit_mb = 100     # Reduced limits for complex patterns
dfa_size_limit_mb = 100
nest_limit = 200
```
Tune I/O buffering and chunking:

```toml
# gnaw.toml
# Size of each I/O chunk (in bytes)
io_chunk_size_bytes = 8388608 # 8 MiB

# Number of lines to buffer before streaming
chunk_size = 100
```
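What an `io_chunk_size_bytes`-style setting controls can be shown with a minimal chunked scanner. This is a sketch, not gnaw's actual code: the input is read in fixed-size chunks, and the last `pattern.len() - 1` bytes are carried across each boundary so matches that straddle two chunks are not lost.

```rust
use std::io::Read;

// Illustrative sketch: count occurrences of a literal byte pattern while
// reading in fixed-size chunks, carrying a small tail across boundaries.
fn count_chunked<R: Read>(
    mut reader: R,
    pattern: &[u8],
    chunk_size: usize,
) -> std::io::Result<usize> {
    assert!(!pattern.is_empty() && chunk_size > 0);
    let keep = pattern.len() - 1; // longest prefix of a boundary-straddling match
    let mut count = 0;
    let mut buf = vec![0u8; chunk_size];
    let mut window: Vec<u8> = Vec::new(); // carry-over tail + current chunk
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break;
        }
        window.extend_from_slice(&buf[..n]);
        if window.len() >= pattern.len() {
            count += window.windows(pattern.len()).filter(|w| *w == pattern).count();
        }
        // Keep only the tail that could begin a match across the next boundary.
        let tail_start = window.len().saturating_sub(keep);
        window.drain(..tail_start);
    }
    Ok(count)
}

fn main() {
    let input = std::io::Cursor::new(b"abcabcab".to_vec());
    println!("{}", count_chunked(input, b"abc", 2).unwrap()); // prints 2
}
```

Because the carried tail is shorter than the pattern, no match can be counted twice: every match must include at least one byte from the newest chunk.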
gnaw consistently outperforms grep on large files:
| File Size | grep Time | gnaw Time | Speedup |
|-----------|-----------|-----------|---------|
| 100 MB    | 2.3s      | 0.8s      | 2.9x    |
| 1 GB      | 23.1s     | 7.2s      | 3.2x    |
| 10 GB     | 4m 12s    | 1m 8s     | 3.7x    |
gnaw uses significantly less memory than grep:
| File Size | grep Memory | gnaw Memory | Reduction |
|-----------|-------------|-------------|-----------|
| 100 MB    | 45 MB       | 12 MB       | 73%       |
| 1 GB      | 180 MB      | 35 MB       | 81%       |
| 10 GB     | 1.2 GB      | 120 MB      | 90%       |
Enable performance monitoring:
```bash
# Enable debug logging
RUST_LOG=debug gnaw "pattern" file.txt

# Profile with flamegraph
cargo flamegraph --release --bin gnaw -- "pattern" file.txt
```
Create performance tests:
```bash
# Generate test data
cargo run --bin generate_test_logs

# Run benchmarks
./scripts/bench_compare.sh

# View results
cd dashboard
streamlit run streamlit_app.py
```
Performance tips:

- Use `--stream` for real-time results
- Tune `io_chunk_size_bytes` for your hardware
- Run `gnaw agent index build` for semantic search
- Use the `--type` and `--dir` flags to narrow scope
- Use `--json` for programmatic consumption