messy-folder-reorganizer-ai - 🤖 AI-powered CLI for file reorganization.

Runs fully locally with Ollama or connects to the OpenAI API for enhanced capabilities.

How It Works

The CLI supports multiple commands:

Process

  1. User Input – The user runs the app and provides:

    • a source folder path containing the files to organize
    • a destination folder path where organized files will be placed
    • an AI provider (Ollama or OpenAI) for generating folder names and embeddings
    • model names for the selected provider (Ollama models or OpenAI models)
  2. Destination Folder Scan

    • The app scans the destination folder and generates embeddings for each folder name.
    • These embeddings are stored in a Qdrant vector database.
  3. Source Folder Scan

    • The app scans the source folder and generates embeddings for each file name.
    • It compares each file’s embedding to existing folder embeddings in the database.
    • Files without a sufficiently close match are marked for further processing.
  4. Clustering & AI Folder Naming

    • Unmatched file embeddings are grouped using agglomerative hierarchical clustering.
    • Each cluster is sent to the LLM to generate a suggested folder name.
  5. Preview Results

    • A table is displayed showing the proposed destination for each file.
  6. User Decision

    • The user reviews the suggested structure and decides whether to apply the changes.
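The matching in step 3 can be pictured with a minimal sketch. This is an illustration of the general technique only: the function names, toy vectors, and the 0.8 threshold are invented for the example; the real tool computes embeddings with the configured model and performs the similarity search in Qdrant.

```rust
// Sketch of step 3: compare a file embedding against folder embeddings
// by cosine similarity; if no folder is close enough, the file is left
// for the clustering step. All values here are toy data.

fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

fn main() {
    // Toy embeddings standing in for model output.
    let file_embedding = [0.9_f32, 0.1, 0.0];
    let folders = [
        ("documents", [0.8_f32, 0.2, 0.1]),
        ("music", [0.0_f32, 0.1, 0.9]),
    ];
    let threshold = 0.8; // illustrative cutoff, not the tool's actual value

    let best = folders
        .iter()
        .map(|(name, emb)| (*name, cosine_similarity(&file_embedding, emb)))
        .max_by(|x, y| x.1.total_cmp(&y.1));

    match best {
        Some((name, score)) if score >= threshold => {
            println!("move into existing folder: {name} (similarity {score:.2})")
        }
        _ => println!("no close match; mark file for clustering"),
    }
}
```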

Apply

If you chose not to apply changes after process, you can apply them later with the apply command. It expects that file locations have not changed in the meantime. This command applies the migration plan from the latest successful process run.

Rollback

Use this if you change your mind after a migration and want to move everything back.

⚠️ Warning: Do not use messy-folder-reorganizer-ai on important files such as passwords, confidential documents, or critical system files.
In the event of a bug or interruption, the app may irreversibly modify or delete files. Always create backups before using it on valuable data.
The author assumes no responsibility for data loss or misplaced files caused by this application.

Short articles for curious minds

📌 Adding RAG & ML to the CLI

📌 How cosine similarity helped files find their place

📌 Teaching embeddings to understand folders

📌 Hierarchical clustering for file grouping
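To illustrate the hierarchical-clustering idea behind step 4, here is a minimal single-linkage agglomerative sketch. The distance metric, threshold, and all names are assumptions made for the example, not the project's actual implementation:

```rust
// Illustrative single-linkage agglomerative clustering over toy embeddings:
// start with singleton clusters and repeatedly merge the closest pair until
// no pair is within the distance threshold.

fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 1.0 } else { 1.0 - dot / (na * nb) }
}

fn agglomerate(points: &[Vec<f32>], threshold: f32) -> Vec<Vec<usize>> {
    // Start with each point in its own cluster.
    let mut clusters: Vec<Vec<usize>> = (0..points.len()).map(|i| vec![i]).collect();
    loop {
        let mut best: Option<(usize, usize, f32)> = None;
        for i in 0..clusters.len() {
            for j in (i + 1)..clusters.len() {
                // Single linkage: distance between the closest pair of members.
                let mut d = f32::INFINITY;
                for &a in &clusters[i] {
                    for &b in &clusters[j] {
                        d = d.min(cosine_distance(&points[a], &points[b]));
                    }
                }
                if best.map_or(true, |(_, _, bd)| d < bd) {
                    best = Some((i, j, d));
                }
            }
        }
        match best {
            // Merge the closest pair while it is within the threshold.
            Some((i, j, d)) if d <= threshold => {
                let merged = clusters.remove(j); // j > i, so index i stays valid
                clusters[i].extend(merged);
            }
            _ => return clusters,
        }
    }
}

fn main() {
    // Two near-duplicate embeddings and one outlier.
    let points = vec![vec![1.0, 0.0], vec![0.99, 0.01], vec![0.0, 1.0]];
    let clusters = agglomerate(&points, 0.1);
    println!("{clusters:?}"); // prints [[0, 1], [2]]
}
```

In the tool, each resulting cluster of unmatched files would then be handed to the LLM to propose a folder name.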

Setup

  1. Install core developer tools
  • macOS

    Install or update Xcode
    
  • Linux x86_64

    sudo apt update
    sudo apt install -y build-essential
  2. Install Ollama and start the service.

  3. Download the required LLM via Ollama:

    ollama pull deepseek-r1:latest

    Recommended: Use models with a higher number of parameters for better accuracy.
    This project has been tested with deepseek-r1:latest (4.7 GB, 7.6B params).

  4. Download the embedding model:

    ollama pull mxbai-embed-large:latest
  5. Launch Qdrant vector database (easiest via Docker):

    docker pull qdrant/qdrant
    docker run -p 6333:6333 \
      -v $(pwd)/path/to/data:/qdrant/storage \
      qdrant/qdrant
  6. Download the latest app release:

  • Apple Silicon (macOS ARM64):

    curl -s https://api.github.com/repos/PerminovEugene/messy-folder-reorganizer-ai/releases/tags/v0.2.0 | \
      grep "browser_download_url.*messy-folder-reorganizer-ai-v0.2.0-aarch64-apple-darwin.tar.gz" | \
      cut -d '"' -f 4 | \
      xargs curl -L -o messy-folder-reorganizer-ai-macos-arm64.tar.gz
  • Intel Mac (macOS x86_64):

    curl -s https://api.github.com/repos/PerminovEugene/messy-folder-reorganizer-ai/releases/tags/v0.2.0 | \
      grep "browser_download_url.*messy-folder-reorganizer-ai-v0.2.0-x86_64-apple-darwin.tar.gz" | \
      cut -d '"' -f 4 | \
      xargs curl -L -o messy-folder-reorganizer-ai-macos-x64.tar.gz
  • Linux x86_64:

    curl -s https://api.github.com/repos/PerminovEugene/messy-folder-reorganizer-ai/releases/tags/v0.2.0 | \
      grep "browser_download_url.*messy-folder-reorganizer-ai-v0.2.0-x86_64-unknown-linux-gnu.tar.gz" | \
      cut -d '"' -f 4 | \
      xargs curl -L -o messy-folder-reorganizer-ai-linux-x64.tar.gz
  7. Extract and install:
  • Apple Silicon (macOS ARM64):

    tar -xvzf messy-folder-reorganizer-ai-macos-arm64.tar.gz
    sudo mv messy-folder-reorganizer-ai /usr/local/bin/messy-folder-reorganizer-ai
  • Intel Mac (macOS x86_64):

    tar -xvzf messy-folder-reorganizer-ai-macos-x64.tar.gz
    sudo mv messy-folder-reorganizer-ai /usr/local/bin/messy-folder-reorganizer-ai
  • Linux x86_64:

    tar -xvzf messy-folder-reorganizer-ai-linux-x64.tar.gz
    sudo mv messy-folder-reorganizer-ai /usr/local/bin/messy-folder-reorganizer-ai
  8. Verify the installation:

    messy-folder-reorganizer-ai --help

Build from Source

  1. Clone the repository:

    git clone git@github.com:PerminovEugene/messy-folder-reorganizer-ai.git
  2. Build the project:

    cargo build --release
  3. Run it:

    cargo run -- \
      -E mxbai-embed-large \
      -L deepseek-r1:latest \
      -S ./test_cases/clustering/messy-folder \
      -D ./test_cases/clustering/structured-folder

Usage

Run the App

messy-folder-reorganizer-ai process \
  -E <EMBEDDING_MODEL_NAME> \
  -L <LLM_MODEL_NAME> \
  -S <SOURCE_FOLDER_PATH> \
  -D <DESTINATION_FOLDER_PATH>
messy-folder-reorganizer-ai apply \
  -i <SESSION_ID>
messy-folder-reorganizer-ai rollback \
  -i <SESSION_ID>

AI Provider Configuration

This tool supports using either local AI models via Ollama or remote models via the OpenAI API.

Local AI (Ollama - Default)

By default, or by specifying --ai-provider local, the tool will use Ollama. You must have Ollama installed and running.

  • --language-model / -L <OLLAMA_LLM_MODEL_NAME>: (Required for local) Specifies the Ollama model for generating folder names (e.g., deepseek-r1:latest).
  • --embedding-model / -E <OLLAMA_EMBEDDING_MODEL_NAME>: (Required for local) Specifies the Ollama model for generating embeddings (e.g., mxbai-embed-large).
  • --ollama-server-address / -n <URL>: Specifies the Ollama server address (default: http://localhost:11434).

Example (Local):

messy-folder-reorganizer-ai process \
  -L deepseek-r1:latest \
  -E mxbai-embed-large \
  -S ./messy-folder \
  -D ./organized-folder

OpenAI API (Remote)

To use OpenAI models, specify --ai-provider openai.

  • --openai-api-key <YOUR_API_KEY>: (Required for OpenAI) Your OpenAI API key. Can also be set via the OPENAI_API_KEY environment variable.
  • --openai-llm-model <MODEL_ID>: Specifies the OpenAI model for folder name generation (default: gpt-4o-mini).
  • --openai-embedding-model <MODEL_ID>: Specifies the OpenAI model for embeddings (default: text-embedding-ada-002).
  • --openai-api-base <URL>: Optional. Custom base URL for OpenAI-compatible APIs (default: https://api.openai.com/v1).
  • --openai-temperature <FLOAT>: Optional. Sampling temperature for OpenAI LLM (0.0-2.0).
  • --openai-max-tokens <INT>: Optional. Max completion tokens for OpenAI LLM.
  • --openai-embedding-dimensions <INT>: Optional. Output dimensions for newer OpenAI embedding models (e.g., text-embedding-3-small).

Example (OpenAI):

# Ensure OPENAI_API_KEY is set in your environment or use --openai-api-key
messy-folder-reorganizer-ai process \
  --ai-provider openai \
  --openai-llm-model "gpt-4o-mini" \
  --openai-embedding-model "text-embedding-3-small" \
  -S ./messy-folder \
  -D ./organized-folder \
  -q http://localhost:6334 # Qdrant is still used for local embedding storage

Note on Qdrant with OpenAI: Even when using OpenAI for generating embeddings, Qdrant is still used locally to store these embeddings and perform similarity searches. Ensure Qdrant is running.

Command-Line Arguments

The CLI supports the following subcommands:


process

Processes source files, finds best-matching destination folders using embeddings, and generates a migration plan.

| Argument | Short | Default | Description |
| --- | --- | --- | --- |
| `--ai-provider` | | `local` | AI provider to use (`local` for Ollama, or `openai`). |
| `--language-model` | `-L` | required for local | Ollama LLM model name used to generate semantic folder names. |
| `--embedding-model` | `-E` | required for local | Ollama embedding model used to represent folder and file names as vectors. |
| `--ollama-server-address` | `-n` | `http://localhost:11434` | Address of the Ollama server (if using local provider). |
| `--openai-api-key` | | required for OpenAI | OpenAI API key (can also be set via the `OPENAI_API_KEY` environment variable). |
| `--openai-llm-model` | | `gpt-4o-mini` | OpenAI model for folder name generation (if using OpenAI provider). |
| `--openai-embedding-model` | | `text-embedding-ada-002` | OpenAI model for embeddings (if using OpenAI provider). |
| `--openai-api-base` | | `https://api.openai.com/v1` | Custom base URL for OpenAI-compatible APIs. |
| `--openai-temperature` | | `0.7` | Sampling temperature for the OpenAI LLM (0.0-2.0). |
| `--openai-max-tokens` | | `150` | Max completion tokens for the OpenAI LLM. |
| `--openai-embedding-dimensions` | | model default | Output dimensions for newer OpenAI embedding models. |
| `--source` | `-S` | required | Path to the folder with unorganized files. |
| `--destination` | `-D` | `home` | Path to the folder where organized files should go. |
| `--recursive` | `-R` | `false` | Scan subfolders of the source folder recursively. |
| `--force-apply` | `-F` | `false` | Apply changes after processing without showing a preview. |
| `--continue-on-fs-errors` | `-C` | `false` | Skip files/folders that raise filesystem errors (e.g., permission denied). |
| `--qdrant-address` | `-q` | `http://localhost:6334` | Address of the Qdrant vector database instance. |

apply

Applies a previously saved migration plan using the session ID. The session ID is printed during `process` execution.

| Argument | Short | Description |
| --- | --- | --- |
| `--session-id` | `-i` | The session ID generated by the `process` command. |

🔙 rollback

Rolls back a previously applied migration using the session ID. The session ID is printed during `process` execution.

| Argument | Short | Description |
| --- | --- | --- |
| `--session-id` | `-i` | The session ID identifying which migration to undo. |

Configuration

Model & ML Configuration

On the first run, the app creates a .messy-folder-reorganizer-ai/ directory in your home folder containing:

  • llm_config.toml – LLM model request configuration options
  • embeddings_config.toml – Embedding model request configuration options
  • rag_ml_config.toml – RAG and ML behavior settings

Model request configurations are commented out by default and will fall back to built-in values unless edited.

More information about LLM and embedding model configuration options can be found at https://github.com/ollama/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values.

RAG and ML configuration parameters are required and must always be present in rag_ml_config.toml. You can also set up ignore lists for destination and source paths in that config file.

You can change the path where .messy-folder-reorganizer-ai is created: set the MESSY_FOLDER_REORGANIZER_AI_PATH environment variable to the desired location.

Prompt Customization

Prompts are stored in:

~/.messy-folder-reorganizer-ai/prompts/

You can edit these to experiment with different phrasing.
The source file list will be appended automatically, so do not use {} or other placeholders in the prompt.

Feel free to contribute improved prompts via PR!

Auto-Recovery

If you break or delete any config/prompt files, simply re-run the app; missing files will be regenerated with default values.

Additional help

Contributing

  1. Run the setup script before contributing:

    bash setup-hooks.sh
  2. Lint & format code:

    cargo clippy
    cargo fmt
  3. Check for unused dependencies:

    cargo +nightly udeps

Running tests:

To run all tests:

cargo test

To run integration tests:

cargo test --test '*' -- --nocapture

To run a specific integration test (e.g., file_collision):

cargo test file_collision -- --nocapture

Uninstall & Purge

rm -f /usr/local/bin/messy-folder-reorganizer-ai
rm -rf ~/.messy-folder-reorganizer-ai

License

This project is dual-licensed under either:

at your option.

It interacts with external services including:

  • Ollama – MIT License
  • Qdrant – Apache 2.0 License
