fix: strip Python bytecode from bundled backend #102
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly optimizes the backend packaging process by eliminating Python bytecode artifacts from the bundled runtime. By preventing bytecode generation during dependency installation and implementing a robust cleanup pass, the changes aim to reduce the final installer size and improve extraction performance, addressing a known issue with unnecessary files being shipped.
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="scripts/backend/runtime-layout-utils.mjs" line_range="34" />
<code_context>
+ PYTHONDONTWRITEBYTECODE: '1',
+});
+
+const countFilesInDirectory = (directoryPath) => {
+ let total = 0;
+ for (const entry of fs.readdirSync(directoryPath, { withFileTypes: true })) {
</code_context>
<issue_to_address>
**issue (complexity):** Consider rewriting `prunePythonBytecodeArtifacts` as a single recursive traversal that deletes and counts files without using `countFilesInDirectory`.
You can keep the existing stats API and behavior while simplifying `prunePythonBytecodeArtifacts` to a single-pass traversal and removing the extra full recursion in `countFilesInDirectory`.
Key changes:
- Drop `countFilesInDirectory`.
- Use a single recursive walker that both deletes and counts.
- Track whether you’re inside a `__pycache__` directory to decide which counter to increment.
For example:
```js
// Module-level imports the snippet relies on.
import fs from 'node:fs';
import path from 'node:path';

const isBytecodeFile = (entryName) => entryName.endsWith('.pyc') || entryName.endsWith('.pyo');

export const prunePythonBytecodeArtifacts = (rootDir) => {
  const stats = {
    removedCacheDirs: 0,
    removedBytecodeFiles: 0,
    removedOrphanBytecodeFiles: 0,
  };

  const visit = (directoryPath, { inPycache = false } = {}) => {
    for (const entry of fs.readdirSync(directoryPath, { withFileTypes: true })) {
      const entryPath = path.join(directoryPath, entry.name);

      if (entry.isDirectory()) {
        if (entry.name === '__pycache__' && !inPycache) {
          stats.removedCacheDirs += 1;
          // Recurse, marking that we are inside a __pycache__ tree
          visit(entryPath, { inPycache: true });
          // All files are gone, but nested directories may remain.
          // rmSync throws on a directory unless `recursive` is set.
          fs.rmSync(entryPath, { recursive: true, force: true });
        } else {
          visit(entryPath, { inPycache });
        }
        continue;
      }

      if (inPycache) {
        // Match previous behavior: count all files inside __pycache__
        stats.removedBytecodeFiles += 1;
        fs.rmSync(entryPath, { force: true });
      } else if (isBytecodeFile(entry.name)) {
        stats.removedOrphanBytecodeFiles += 1;
        fs.rmSync(entryPath, { force: true });
      }
    }
  };

  if (fs.existsSync(rootDir)) {
    visit(rootDir);
  }

  return stats;
};
```
This preserves:
- `removedCacheDirs`: number of `__pycache__` directories removed.
- `removedBytecodeFiles`: number of files removed from within `__pycache__` trees (all files, as before).
- `removedOrphanBytecodeFiles`: number of `.pyc`/`.pyo` files removed outside `__pycache__`.
The control flow is flatter (no extra helper traversal, fewer `continue`s) and the stats are computed in a single pass.
</issue_to_address>
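To make the three counters in the review comment concrete, here is a minimal sketch that classifies relative paths the way the suggested walker would count them, without touching the filesystem. `classifyFile` and `tallyPaths` are hypothetical helpers that are not part of the PR, and the sketch assumes POSIX-style `/` separators.

```javascript
// Hypothetical helpers for illustration only; they are not part of the PR.
const isBytecodeFile = (name) => name.endsWith('.pyc') || name.endsWith('.pyo');

// Classify one file by its relative path (assumes '/' separators).
const classifyFile = (relativePath) => {
  const segments = relativePath.split('/');
  const fileName = segments.pop();
  // Any file below a __pycache__ directory counts as a cached file,
  // regardless of its extension.
  if (segments.includes('__pycache__')) return 'cached';
  // Bytecode files anywhere else are treated as orphans.
  if (isBytecodeFile(fileName)) return 'orphan';
  return 'keep';
};

// Tally a list of paths into the same counter names used by the stats object.
const tallyPaths = (relativePaths) => {
  const stats = { removedBytecodeFiles: 0, removedOrphanBytecodeFiles: 0, kept: 0 };
  for (const relativePath of relativePaths) {
    const kind = classifyFile(relativePath);
    if (kind === 'cached') stats.removedBytecodeFiles += 1;
    else if (kind === 'orphan') stats.removedOrphanBytecodeFiles += 1;
    else stats.kept += 1;
  }
  return stats;
};
```

Note that a non-bytecode file such as `pkg/__pycache__/notes.txt` still lands in `removedBytecodeFiles`, which mirrors the "all files, as before" behavior described in the comment.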
Code Review
This pull request introduces functionality to prevent and clean up Python bytecode artifacts during the backend build process. It modifies the pip install command to use --no-compile, adds a utility to set the PYTHONDONTWRITEBYTECODE environment variable, and implements a new function to recursively prune __pycache__ directories and orphan bytecode files. New tests are included for these utility functions. The reviewer suggested improving the readability of a log message by using a single template literal.
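As a rough illustration of the prevention half of the change, the sketch below builds the environment and argument list for a bundled `pip install`. The function names `withBytecodeDisabled` and `buildPipInstallArgs` are assumptions made for this example; only the `--no-compile` flag and the `PYTHONDONTWRITEBYTECODE` variable come from the PR itself.

```javascript
// Hypothetical helper names; the PR's actual utility may be shaped differently.

// Return a copy of the given environment with bytecode writing disabled.
const withBytecodeDisabled = (baseEnv = {}) => ({
  ...baseEnv,
  PYTHONDONTWRITEBYTECODE: '1',
});

// Build the argument list for `python -m pip install` with compilation turned off.
const buildPipInstallArgs = (requirementsPath) => [
  '-m',
  'pip',
  'install',
  '--no-compile',
  '-r',
  requirementsPath,
];
```

A build script could then pass both to a subprocess call, for example `spawnSync(pythonPath, buildPipInstallArgs('requirements.txt'), { env: withBytecodeDisabled(process.env), stdio: 'inherit' })`, where `pythonPath` stands in for the bundled interpreter path.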
console.log(
  '[build-backend] removed Python bytecode artifacts ' +
    `(${bytecodeCleanupStats.removedCacheDirs} cache dirs, ` +
    `${bytecodeCleanupStats.removedBytecodeFiles} cached files, ` +
    `${bytecodeCleanupStats.removedOrphanBytecodeFiles} orphan files).`,
);
For improved readability and consistency, this log message can be constructed using a single template literal instead of mixing string concatenation with template literals.
console.log(`[build-backend] removed Python bytecode artifacts (${bytecodeCleanupStats.removedCacheDirs} cache dirs, ${bytecodeCleanupStats.removedBytecodeFiles} cached files, ${bytecodeCleanupStats.removedOrphanBytecodeFiles} orphan files).`);
@sourcery-ai review
Fixes #101.
This change removes Python bytecode artifacts from the packaged backend runtime before Tauri bundles `resources/backend` into the desktop installer.

On the current packaging path, `copyTree()` already skips `__pycache__`, `.pyc`, and `.pyo` when copying source files and the standalone CPython runtime. The remaining issue is that `build-backend.mjs` runs `python -m pip install -r requirements.txt` inside the bundled runtime, and that step repopulates a large number of bytecode cache files under `resources/backend/python`. Those files are then shipped verbatim in the installer, which increases file count and slows install-time extraction on affected machines.

This PR fixes the problem in two layers:
- Prevention: add `--no-compile` to every bundled `pip install`, and set `PYTHONDONTWRITEBYTECODE=1` for those install subprocesses.
- Cleanup: prune `__pycache__`, `.pyc`, and `.pyo` artifacts as a safety net.

Validation:
- `node --test scripts/backend/runtime-layout-utils.test.mjs`
- `pnpm run test:prepare-resources`

Summary by Sourcery
Ensure bundled Python backend excludes bytecode artifacts and add safeguards and tests around runtime dependency installation.
New Features:
Enhancements:
Tests: