Performance Benchmarking
markata-go includes a comprehensive benchmarking suite for measuring and optimizing build performance. This guide covers how to run benchmarks locally, interpret results, and use profiling tools to identify bottlenecks.
Quick Start
Run the end-to-end build benchmark:
just perf
This runs the benchmark 5 times and outputs results to bench.txt.
Running Benchmarks Locally
Prerequisites
Install benchstat for analyzing benchmark results:
go install golang.org/x/perf/cmd/benchstat@latest
Available Commands
| Command | Description |
|---|---|
| just perf | Run end-to-end benchmarks (5 iterations) |
| just perf-profile | Generate CPU and memory profiles |
| just perf-stages | Benchmark individual lifecycle stages |
| just perf-concurrency | Test performance at different concurrency levels |
| just perf-compare old.txt new.txt | Compare two benchmark runs |
| just perf-generate | Regenerate the benchmark fixture |
Running Specific Benchmarks
# All benchmarks
go test -bench=. -run='^$' -benchmem ./benchmarks/...
# Only end-to-end
go test -bench=BenchmarkBuild_EndToEnd -run='^$' -benchmem ./benchmarks/...
# Only stage-specific
go test -bench='BenchmarkStage' -run='^$' -benchmem ./benchmarks/...
# With more iterations for stability
go test -bench=BenchmarkBuild -run='^$' -benchmem -count=10 ./benchmarks/...
Understanding Benchmark Output
Raw Output
BenchmarkBuild_EndToEnd-8 5 234567890 ns/op 123456789 B/op 1234567 allocs/op
| Field | Meaning |
|---|---|
| BenchmarkBuild_EndToEnd-8 | Benchmark name; the -8 suffix is the GOMAXPROCS value |
| 5 | Number of iterations the benchmark ran |
| 234567890 ns/op | Nanoseconds per operation |
| 123456789 B/op | Bytes allocated per operation |
| 1234567 allocs/op | Number of allocations per operation |
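To make the field layout concrete, here is a minimal sketch that splits one raw result line into the fields from the table above. The `BenchResult` type and `parseBenchLine` function are hypothetical names introduced for illustration; they are not part of markata-go, and benchstat remains the better tool for real analysis.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// BenchResult holds the fields of one `go test -bench` result line.
// (Hypothetical type, for illustration only.)
type BenchResult struct {
	Name        string // benchmark name, including the -GOMAXPROCS suffix
	Iterations  int64
	NsPerOp     float64
	BytesPerOp  int64
	AllocsPerOp int64
}

// parseBenchLine splits a result line on whitespace, then reads each
// "value unit" pair that follows the name and the iteration count.
func parseBenchLine(line string) (BenchResult, error) {
	fields := strings.Fields(line)
	if len(fields) < 4 || !strings.HasPrefix(fields[0], "Benchmark") {
		return BenchResult{}, fmt.Errorf("not a benchmark result line: %q", line)
	}
	iters, err := strconv.ParseInt(fields[1], 10, 64)
	if err != nil {
		return BenchResult{}, fmt.Errorf("bad iteration count: %w", err)
	}
	res := BenchResult{Name: fields[0], Iterations: iters}
	for i := 2; i+1 < len(fields); i += 2 {
		value, unit := fields[i], fields[i+1]
		switch unit {
		case "ns/op":
			res.NsPerOp, err = strconv.ParseFloat(value, 64)
		case "B/op":
			res.BytesPerOp, err = strconv.ParseInt(value, 10, 64)
		case "allocs/op":
			res.AllocsPerOp, err = strconv.ParseInt(value, 10, 64)
		}
		if err != nil {
			return BenchResult{}, fmt.Errorf("bad value for %s: %w", unit, err)
		}
	}
	return res, nil
}

func main() {
	line := "BenchmarkBuild_EndToEnd-8 5 234567890 ns/op 123456789 B/op 1234567 allocs/op"
	res, err := parseBenchLine(line)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s: %.1f ms/op over %d iterations\n", res.Name, res.NsPerOp/1e6, res.Iterations)
}
```

Units other than the three shown are ignored by this sketch, so custom metrics reported by a benchmark would need extra cases.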
Using benchstat
benchstat provides statistical analysis of benchmark results:
# Single run analysis
benchstat bench.txt
# Compare two runs
benchstat old.txt new.txt
Example output:
name time/op
Build_EndToEnd-8 235ms ± 2%
name alloc/op
Build_EndToEnd-8 124MB ± 0%
name allocs/op
Build_EndToEnd-8 1.23M ± 0%
The ± value shows the variation between runs. Lower is better for reproducibility.
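As a rough illustration of what a spread figure like ± 2% expresses, the sketch below computes the mean of a few samples and the largest deviation from that mean as a percentage. This is one simple way to quantify run-to-run variation and is an assumption made for illustration: the exact statistic benchstat prints depends on its version. The sample values are made up, and `meanAndSpread` is a hypothetical helper.

```go
package main

import "fmt"

// meanAndSpread returns the mean of samples and the largest deviation
// from that mean, expressed as a percentage of the mean.
// (Illustrative only; not necessarily benchstat's own calculation.)
func meanAndSpread(samples []float64) (mean, spreadPct float64) {
	if len(samples) == 0 {
		return 0, 0
	}
	var sum float64
	for _, s := range samples {
		sum += s
	}
	mean = sum / float64(len(samples))
	if mean == 0 {
		return 0, 0
	}
	var maxDev float64
	for _, s := range samples {
		dev := s - mean
		if dev < 0 {
			dev = -dev
		}
		if dev > maxDev {
			maxDev = dev
		}
	}
	return mean, maxDev / mean * 100
}

func main() {
	// Hypothetical ns/op samples from five runs.
	samples := []float64{230e6, 235e6, 240e6, 233e6, 237e6}
	mean, spread := meanAndSpread(samples)
	fmt.Printf("%.0fms ± %.1f%%\n", mean/1e6, spread)
}
```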
Comparing Runs
When comparing two benchmark files:
name old time/op new time/op delta
Build_EndToEnd-8 250ms ± 3% 235ms ± 2% -6.00% (p=0.008 n=5+5)
| Column | Meaning |
|---|---|
| old time/op | Time from first file |
| new time/op | Time from second file |
| delta | Percentage change (negative = faster) |
| p=0.008 | Statistical significance (p < 0.05 is significant) |
| n=5+5 | Number of samples in each file |
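The delta column can be checked by hand: it is the relative change from the old value to the new one. A minimal sketch, using the 250ms and 235ms figures from the example comparison (`deltaPercent` is a hypothetical helper name):

```go
package main

import "fmt"

// deltaPercent returns the relative change from oldVal to newVal as a
// percentage. A negative result means the new value is smaller, which
// for time/op means faster.
func deltaPercent(oldVal, newVal float64) float64 {
	return (newVal - oldVal) / oldVal * 100
}

func main() {
	// Values from the example comparison: 250ms before, 235ms after.
	fmt.Printf("%+.2f%%\n", deltaPercent(250, 235))
}
```

The arithmetic is (235 - 250) / 250 = -0.06, which matches the -6.00% shown in the example. The p-value is a separate question: it indicates whether a change of that size is distinguishable from noise.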
Profiling
Generating Profiles
just perf-profile
This creates:
- cpu.prof - CPU profile
- mem.prof - Memory allocation profile
Analyzing CPU Profiles
Interactive CLI
go tool pprof cpu.prof
Common commands:
- top - Show top functions by CPU time
- top -cum - Show by cumulative time
- list FunctionName - Show annotated source
- web - Open in browser (requires graphviz)
Web Interface
go tool pprof -http=:8080 cpu.prof
This opens an interactive web UI with:
- Flame graphs
- Call graphs
- Source code annotation
- Top functions
Analyzing Memory Profiles
go tool pprof mem.prof
Useful options:
- go tool pprof -alloc_space mem.prof - Total allocations
- go tool pprof -alloc_objects mem.prof - Number of allocations
- go tool pprof -inuse_space mem.prof - Live memory
Profile Types
| Profile | What it Measures | When to Use |
|---|---|---|
| CPU | Time spent in functions | Slow builds |
| Memory | Allocations | High memory usage |
| Block | Blocking on sync primitives | Deadlocks/contention |
| Mutex | Mutex contention | Lock performance |
Benchmark Fixture
The benchmark suite uses a deterministic fixture at benchmarks/site/:
benchmarks/
├── site/
│ ├── markata-go.toml # Benchmark config
│ └── posts/
│ ├── blog/2024/01/ # 60 blog posts
│ └── docs/guides/ # 40 documentation guides
└── benchmark_test.go # Benchmark tests
Fixture Characteristics
- 100 posts total - Representative of a medium-sized site
- Code-heavy content - Syntax highlighting stress test
- Nested paths - Tests path handling
- Multiple languages - Go, Python, JavaScript, Rust, SQL
- Deterministic - Same content every generation for reproducible results
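One common way to get the "same content every generation" property is to drive all random choices from a fixed seed. How benchmarks/generate_posts.go actually achieves determinism is not described here, so the following is only a sketch of the general technique; `generateTitles` and the title format are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// generateTitles returns n pseudo-random post titles. Because the
// generator is seeded with a fixed value, the sequence is identical on
// every run with the same seed.
func generateTitles(seed int64, n int) []string {
	rng := rand.New(rand.NewSource(seed))
	topics := []string{"Go", "Python", "JavaScript", "Rust", "SQL"}
	titles := make([]string, n)
	for i := range titles {
		titles[i] = fmt.Sprintf("post-%03d-%s", i+1, topics[rng.Intn(len(topics))])
	}
	return titles
}

func main() {
	first := generateTitles(42, 100)
	second := generateTitles(42, 100)
	same := len(first) == len(second)
	for i := range first {
		if first[i] != second[i] {
			same = false
		}
	}
	fmt.Println("identical across runs:", same)
}
```

Using a dedicated `rand.Rand` rather than the package-level functions keeps the sequence independent of any other code that draws random numbers.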
Regenerating the Fixture
just perf-generate
Or directly:
go run benchmarks/generate_posts.go
CI Performance Tracking
Performance benchmarks run automatically:
- Weekly - Sunday at 2 AM UTC
- Manual - Via workflow dispatch
Viewing Results
- Go to the Actions tab in GitHub
- Select “Performance Benchmarks” workflow
- View the job summary for benchstat output
- Download artifacts for profiles
Comparing Branches
Trigger a manual workflow with a comparison branch:
- Go to Actions > Performance Benchmarks
- Click “Run workflow”
- Enter the branch name to compare against
- View the comparison in the job summary
Optimization Tips
Common Bottlenecks
- Markdown rendering - goldmark processing
- Template execution - Pongo2 templates
- File I/O - Reading/writing files
- Syntax highlighting - Chroma processing
- Memory allocations - String operations
Improving Performance
Concurrency
Adjust the concurrency level in markata-go.toml:
[markata-go]
concurrency = 8 # 0 = auto (NumCPU)
Profile-Guided Optimization
- Run profiling: just perf-profile
- Analyze: go tool pprof -http=:8080 cpu.prof
- Identify hot paths
- Optimize targeted functions
- Re-benchmark to verify improvement
Memory Optimization
If memory is the bottleneck:
go tool pprof -alloc_space mem.prof
Look for:
- Large allocations (alloc_space)
- Many small allocations (alloc_objects)
- Functions with high cumulative allocations
Writing Efficient Plugins
- Reuse allocations - Use sync.Pool for buffers
- Minimize copies - Use pointers where appropriate
- Batch operations - Group file writes
- Cache results - Use the lifecycle cache
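The first tip, reusing buffers through sync.Pool, can be sketched as follows. `renderTitle` is a hypothetical plugin step invented for illustration; only `sync.Pool` and `bytes.Buffer` come from the standard library.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers so that each call does not have
// to allocate a fresh one.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// renderTitle is a hypothetical plugin step that builds a small HTML
// fragment using a pooled buffer.
func renderTitle(title string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold data from an earlier use
	defer bufPool.Put(buf)

	buf.WriteString("<h1>")
	buf.WriteString(title)
	buf.WriteString("</h1>")
	return buf.String() // String copies the bytes, so the buffer can be reused
}

func main() {
	fmt.Println(renderTitle("Performance Benchmarking"))
	fmt.Println(renderTitle("Quick Start"))
}
```

Whether pooling helps depends on the workload, so it is worth confirming with alloc_objects in the memory profile before and after the change.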
Troubleshooting
Benchmarks Skip
--- SKIP: BenchmarkBuild_EndToEnd
benchmark_test.go:35: Benchmark fixture not found
Fix: Run just perf-generate to create the fixture.
High Variance
If ± values are high (>10%), try:
- More iterations: -count=10
- Close other applications
- Use a consistent environment
Profile is Empty
Ensure the benchmark runs long enough:
go test -bench=BenchmarkBuild_EndToEnd -run='^$' \
-benchtime=30s \
-cpuprofile=cpu.prof \
./benchmarks/...
See Also
- Configuration Guide - Concurrency settings
- Plugin Development - Writing efficient plugins
- Go Profiling - Official pprof documentation