A script to collect source code from multiple directories into a single text file. Perfect for code analysis, sharing projects with AI chatbots, archiving, or creating context for refactoring!
- Collects Python, Java, C, and C++ files by default (option to include all files 🌐)
- Ignores common directories: .idea, .venv, venv, __pycache__, .env 🚫
- Add custom directories to exclude with --exclude 🛑
- Exclude specific file types/extensions with --exclude-langs 🚷
- Supports multiple input folders 🗂️
- Preserves file structure with relative paths 🧭
- Resilient to file read errors — continues even if some files fail 🔒
- User-friendly CLI with full argument support 🖥️
- Make sure you have Python 3.7+ installed 🐍
- Clone this repo or copy the files
brew install MikhailOnyanov/code-collector/code-collector
Pick one of the options below depending on your workflow.
pipx install .
💡 pipx ensures isolated, system-wide access to the CLI tool without polluting your global Python environment.
uv pip install .
pip install .
After installation, use the collect-code command from anywhere! 🚀
collect-code ./src
collect-code ./src ./tests ./utils
collect-code ./project --all-files
collect-code ./src --exclude node_modules build dist
# Exclude Python files
collect-code ./src --exclude-langs=py
# Exclude multiple file types (Java and C++)
collect-code ./src --exclude-langs=java,cpp,hpp
# Works with or without dots in extension names
collect-code ./src --exclude-langs=.py,.java
# Exclude 'build' directory and all Java files
collect-code ./src --exclude build --exclude-langs=java
# Exclude multiple directories and file types
collect-code ./project --exclude node_modules dist --exclude-langs=cpp,h
The generated collected_code.txt will look like:
[project/src/main.py]
def hello():
    print("Hello, world!")
[project/src/utils/helper.py]
class Helper:
    def __init__(self):
        pass
...
Results are saved to collected_code.txt in your current working directory.
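To make the output format concrete, here is a minimal sketch of how such a file could be produced. It is an illustration only, not the actual collect_code.py: the `collect_into` helper, its `base` parameter, and the two constants are assumptions that simply mirror the defaults listed in this README.

```python
import os
from pathlib import Path

# Assumed defaults, copied from the lists in this README.
DEFAULT_EXTS = {".py", ".java", ".c", ".h", ".cpp", ".cc", ".cxx", ".hpp"}
IGNORED_DIRS = {".idea", ".venv", "venv", "__pycache__", ".env"}


def collect_into(out_path, folders, base="."):
    """Write every matching file under `folders` into `out_path`.

    Hypothetical helper: each file is preceded by a `[relative/path]`
    header, with the path taken relative to `base`.
    Returns the number of files written.
    """
    written = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for folder in folders:
            for root, dirs, files in os.walk(folder):
                # Prune ignored directories in place so os.walk skips them.
                dirs[:] = sorted(d for d in dirs if d not in IGNORED_DIRS)
                for name in sorted(files):
                    path = Path(root) / name
                    if path.suffix.lower() not in DEFAULT_EXTS:
                        continue
                    try:
                        text = path.read_text(encoding="utf-8")
                    except (OSError, UnicodeDecodeError) as exc:
                        # Keep going if a single file cannot be read.
                        print(f"Skipping {path}: {exc}")
                        continue
                    rel = os.path.relpath(path, base).replace(os.sep, "/")
                    out.write(f"[{rel}]\n{text}\n")
                    written += 1
    return written
```

Calling `collect_into("collected_code.txt", ["project/src"])` from the directory that contains `project` would yield headers such as `[project/src/main.py]`. Unreadable files are reported and skipped rather than aborting the run.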
- Language: Python 3.7+
- Dependencies: Standard library only 🚫📦
- License: MIT 📜
- Files: collect_code.py, setup.py
- Supported Languages by Default: Python (.py), Java (.java), C (.c, .h), C++ (.cpp, .cc, .cxx, .hpp)
- --exclude: Excludes directories from being traversed (e.g., node_modules, build)
- --exclude-langs: Excludes file types based on their extensions (e.g., py, java)
- --all-files: Overrides default language filtering and collects all file types (but still respects --exclude-langs)
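The way these three options interact can be sketched with argparse from the standard library. This is a hedged illustration, not the tool's real source: `build_parser`, `normalize_exts`, and `should_collect` are hypothetical names, and only the flag names and behaviours described above come from this README.

```python
import argparse
from pathlib import Path

# Assumed default extensions, copied from the list in this README.
DEFAULT_EXTS = {".py", ".java", ".c", ".h", ".cpp", ".cc", ".cxx", ".hpp"}


def build_parser():
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(prog="collect-code")
    parser.add_argument("folders", nargs="+", help="input folders to scan")
    parser.add_argument("--exclude", nargs="*", default=[],
                        help="directory names to skip")
    parser.add_argument("--exclude-langs", default="",
                        help="comma-separated extensions, e.g. py,java")
    parser.add_argument("--all-files", action="store_true",
                        help="collect every file type")
    return parser


def normalize_exts(raw):
    """Turn 'py,.java' into {'.py', '.java'}; leading dots are optional."""
    cleaned = (part.strip().lstrip(".").lower() for part in raw.split(","))
    return {"." + ext for ext in cleaned if ext}


def should_collect(filename, all_files, excluded_exts):
    """Excluded extensions always win; --all-files only lifts the default filter."""
    ext = Path(filename).suffix.lower()
    if ext in excluded_exts:
        return False
    return all_files or ext in DEFAULT_EXTS
```

With this sketch, `--all-files --exclude-langs=py` would still skip `main.py` while collecting `README.md`, which matches the rule that `--all-files` respects `--exclude-langs`.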
Install with uv:
uv pip install -e .
Run directly without installation:
python collect_code.py ./src --all-files
Run the test suite:
python -m unittest test_collect_code -v
Or with pytest (if installed):
pytest test_collect_code.py -v
Format code with ruff:
uv tool run ruff format collect_code.py test_collect_code.py
Lint code:
uv tool run ruff check collect_code.py test_collect_code.py
The test suite includes:
- Unit tests for the collect_files function with detailed docstrings
- Unit tests for CLI argument parsing
- Integration tests for end-to-end scenarios
- Test fixtures to reduce code duplication
All tests run automatically on every push via a GitHub Actions CI/CD pipeline.
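For readers curious what a shared fixture can look like, here is a small unittest sketch. It does not reproduce the project's real tests: `list_sources` is a hypothetical stand-in for the function under test (the actual collect_files signature is not shown here), and the class names are made up for illustration.

```python
import tempfile
import unittest
from pathlib import Path


def list_sources(folder, exts=(".py",)):
    """Hypothetical stand-in for the function under test."""
    return sorted(
        p.relative_to(folder).as_posix()
        for p in Path(folder).rglob("*")
        if p.is_file() and p.suffix in exts
    )


class TempProjectTestCase(unittest.TestCase):
    """Shared fixture: each test gets a small throwaway project tree."""

    def setUp(self):
        self._tmp = tempfile.TemporaryDirectory()
        self.addCleanup(self._tmp.cleanup)
        self.root = Path(self._tmp.name)
        (self.root / "utils").mkdir()
        (self.root / "main.py").write_text("print('hi')\n", encoding="utf-8")
        (self.root / "utils" / "helper.py").write_text("x = 1\n", encoding="utf-8")
        (self.root / "README.md").write_text("# demo\n", encoding="utf-8")


class ListSourcesTests(TempProjectTestCase):
    def test_finds_python_files_in_subfolders(self):
        self.assertEqual(list_sources(self.root), ["main.py", "utils/helper.py"])

    def test_extension_filter_is_respected(self):
        self.assertEqual(list_sources(self.root, exts=(".md",)), ["README.md"])
```

Putting the temporary tree in a base class keeps each test focused on one assertion, and `addCleanup` removes the directory even when a test fails.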
@MikhailOnyanov
Created to simplify code sharing with AI chat interfaces and streamline project analysis. 💬
