Performance: optimise file writes #3841
tl;dr: make codegen much faster through parallel writing of files
We have a _very large_ API design, with over 100 services, which
generates almost 3k files with `goa generate`. As our API has grown,
running `goa generate` has got really slow.
This commit adds timing data to the `--debug` output.
Then, the lowest-hanging-fruit optimisation has been applied: writing
all the output files in parallel.
---
As a recap, the generation process has several stages:
1. **Compile temporary binary**: Goa creates a temporary Go program that
   imports the design package (which triggers package initialization and
   DSL execution via blank import)
2. **Execute binary**: The binary runs through multiple phases:
   - Package initialization (runs DSL definitions)
   - `eval.RunDSL()` - processes the DSL in 4 phases (execute, prepare,
     validate, finalize)
   - `generator.Generate()` - produces the actual Go files
Measuring codegen for the incident-io codebase on 3,036 generated files:
**Total time: 61 seconds**
Breakdown:
- build.Import: 117ms
- NewGenerator (packages.Load): 52ms
- Write (generate main.go): 14ms
- Compile (go get + go build): 3.6s
  - packages.Load: 47ms
  - go get: 514ms
  - go build: 3.0s
- **Run (execute binary): 52.2s** ⚠️ 85% of total time
  - Check eval.Context.Errors: <1ms
  - eval.RunDSL(): 105ms
  - **generator.Generate(): 51.6s** ⚠️
    - Stage 1 (Compute design roots): <1ms
    - Stage 2 (Compute gen package): 33ms
    - Stage 3 (Retrieve generators): <1ms
    - Stage 4 (Pre-generation plugins): <1ms
    - **Stage 5 (Generate files): 26.2s** (3 generators, sequential)
      - Generator 0: 7.2s → 1,438 files
      - Generator 1: 18.8s → 1,594 files
      - Generator 2: 0.2s → 4 files
    - Stage 6 (Post-generation plugins): <1ms
    - **Stage 7 (Write files): 32.1s** ⚠️ Biggest bottleneck (52% of generation)
    - Stage 8 (Compute filenames): 2ms
This commit optimises the biggest stage of this process by changing
file rendering from a sequential loop to a parallel worker pool:
```go
// Before: Sequential (32.1s for 3,036 files)
for _, f := range genfiles {
	filename, err := f.Render(dir)
	// ...
}

// After: Parallel with runtime.NumCPU() workers
numWorkers := runtime.NumCPU()
// Worker pool processes files concurrently
```
This reduced execution time from **61 seconds total** to **34 seconds
total**, an overall speedup of 1.8x.
While here, I also tried parallelising Stage 5 (generator functions) but
hit infinite recursion in `AsObject()` when handling circular type
references concurrently. This is where I'd go for the next biggest
speed-up.
Eliminate redundant file I/O operations that were causing severe performance degradation. Previous implementation performed 2 reads and 3 writes per Go file during code generation.

Changes:
- Refactor finalizeGoSource to process entirely in memory
- Render template sections to buffer instead of directly to file
- Perform all formatting/import cleanup in memory
- Write final content exactly once per file

Performance improvement:
- Before: 40s for 3000+ files (5 I/O operations per file)
- After: 9s for 3000+ files (1 write operation per file)
- 77.5% reduction in generation time

This maintains all existing functionality including:
- Import cleanup optimizations
- Proper goimports formatting
- Parallel file writing benefits

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
    }

    // 5. Generate initial set of files produced by goa code generators.
    // NOTE: Parallelization causes infinite recursion in AsObject() for circular type references
I'm going to try to find time to work on this problem in the next couple weeks ideally. This stage is now the slowest (~20s), but parallelising is fiddly due to circular references.
    type ExampleGenerator struct {
        Randomizer
        seen map[string]*any
        mu   sync.RWMutex
While this isn't essential right now, I think it's sensible to add locks around map operations to avoid panics in future!
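As a sketch of the locking pattern suggested here, the example below guards a `seen` map with a `sync.RWMutex`. The `seenCache` type and its methods are hypothetical; they mirror the field names in the diff but are not Goa's `ExampleGenerator`.

```go
package main

import (
	"fmt"
	"sync"
)

// seenCache is a hypothetical cache keyed by type name. It only shows the
// locking pattern and is not Goa's ExampleGenerator.
type seenCache struct {
	mu   sync.RWMutex
	seen map[string]*any
}

func newSeenCache() *seenCache {
	return &seenCache{seen: make(map[string]*any)}
}

// get takes the read lock so concurrent readers do not block each other.
func (c *seenCache) get(key string) (*any, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.seen[key]
	return v, ok
}

// set takes the write lock; unsynchronised concurrent map writes can crash
// a Go program at runtime.
func (c *seenCache) set(key string, v *any) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.seen[key] = v
}

func main() {
	c := newSeenCache()
	var wg sync.WaitGroup
	for i := 0; i < 64; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			var v any = i
			key := fmt.Sprintf("Type%d", i%8)
			c.set(key, &v)
			c.get(key)
		}(i)
	}
	wg.Wait()
	fmt.Println("finished without concurrent map access errors")
}
```

Keeping the map behind small accessor methods means callers cannot forget the lock, which is the main risk once generation starts running on several goroutines.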
This is awesome, thank you for the great work! Just one thing that might be nice before we can merge: would it be possible to add a few tests for the new concurrent part of the code?

I've added some tests that hit the "generate lots of files" code, but they don't directly test the concurrency (although I have run them with Was there anything else you were thinking of @raphael ?
tl;dr: make codegen much faster through parallel writing of files, and writing each file's contents once.

We have a very large API design, with over 100 services, which generates almost 3k files with `goa generate`. As our API has grown, running `goa gen` has got really slow.

There are three changes in this PR:

- `[TIMING]` logs when `--debug` is passed - this makes finding bottlenecks much easier

This cut the time on that stage for us from ~52s to ~10s 🎉