Description
We want to address two issues here:
- define a new folder structure for profiling experiments
- identify which of the components will be version controlled
I will update this comment periodically as the strategy evolves. I realize this is not ideal because it upsets the chronology of discussions.
Our current folder structure is the one specified in the Profiling Handbook. It differs slightly from the folder structure specified in the Cell Painting Gallery. At this level of nesting (under workspace), the only discrepancy is metadata/platemaps (see #70); consensus and collated are currently missing from the Gallery, but that is not a discrepancy per se.
This is the proposed folder structure in the Profiling Handbook:
├── profiles
│   └── 2016_04_01_a549_48hr_batch1
│       └── SQ00015167
│           ├── SQ00015167_augmented.csv
│           ├── SQ00015167_normalized.csv
│           ├── SQ00015167_normalized_feature_select.csv
│           └── SQ00015167_spherized.csv
├── collated (*)
│   └── 2016_04_01_a549_48hr_batch1
│       ├── 2016_04_01_a549_48hr_batch1_augmented.parquet
│       ├── 2016_04_01_a549_48hr_batch1_normalized.parquet
│       ├── 2016_04_01_a549_48hr_batch1_normalized_feature_select.parquet
│       └── 2016_04_01_a549_48hr_batch1_spherized.parquet
├── consensus (*)
│   └── 2016_04_01_a549_48hr_batch1
│       ├── 2016_04_01_a549_48hr_batch1_augmented.parquet
│       ├── 2016_04_01_a549_48hr_batch1_normalized.parquet
│       └── 2016_04_01_a549_48hr_batch1_spherized.parquet
├── backend
│   └── 2016_04_01_a549_48hr_batch1
│       └── SQ00015167
│           ├── SQ00015167.csv
│           └── SQ00015167.sqlite
├── load_data_csv
│   └── 2016_04_01_a549_48hr_batch1
│       └── SQ00015167
│           ├── load_data.csv
│           └── load_data_with_illum.csv
├── log
├── metadata
│   └── 2016_04_01_a549_48hr_batch1
│       ├── barcode_platemap.csv
│       └── platemap
│           └── C-7161-01-LM6-006.txt
└── pipelines
(*) collated and consensus files are saved as Parquet to allow fast loading.
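To make the naming convention in the tree concrete, here is a minimal sketch that builds the expected file paths for a plate or a batch. The function names and the `workspace` root argument are hypothetical and only for illustration; the folder names, processing levels, and file extensions are read off the tree above.

```python
from pathlib import PurePosixPath

# Processing levels that appear under profiles/ and collated/ in the tree.
PROFILE_LEVELS = ("augmented", "normalized", "normalized_feature_select", "spherized")
# The tree lists no normalized_feature_select file under consensus/.
CONSENSUS_LEVELS = ("augmented", "normalized", "spherized")


def plate_profile_paths(workspace, batch, plate):
    """Per-plate CSV profiles: profiles/<batch>/<plate>/<plate>_<level>.csv"""
    base = PurePosixPath(workspace) / "profiles" / batch / plate
    return [base / f"{plate}_{level}.csv" for level in PROFILE_LEVELS]


def batch_level_paths(workspace, folder, batch):
    """Batch-level Parquet files: <folder>/<batch>/<batch>_<level>.parquet"""
    levels = {"collated": PROFILE_LEVELS, "consensus": CONSENSUS_LEVELS}[folder]
    base = PurePosixPath(workspace) / folder / batch
    return [base / f"{batch}_{level}.parquet" for level in levels]


if __name__ == "__main__":
    batch = "2016_04_01_a549_48hr_batch1"
    for path in plate_profile_paths("workspace", batch, "SQ00015167"):
        print(path)
    for path in batch_level_paths("workspace", "consensus", batch):
        print(path)
```

`PurePosixPath` is used so the paths are built with forward slashes regardless of the operating system, which matches how the tree is written.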
We will version these folders by placing them inside the project repo:
| folder | generator |
|---|---|
| profiles | pycytominer |
| collated | pycytominer |
| consensus | pycytominer |
| load_data_csv | pe2loaddata |
| log | GNU parallel (when running various commands) |
| metadata | manual |
| pipelines | manual |
We will not version these folders:
| folder | generator | reason |
|---|---|---|
| backend | cytominer-database | |
| analysis | CellProfiler, Distributed-CellProfiler | redundant with SQLite backend |
| images | Microscope | Never changes, and too big! |
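The two tables can also be written down as data, which makes it easy to check which side of the policy a given path falls on. This is only a sketch: the helper names are hypothetical, and generating `.gitignore` entries assumes the unversioned folders would sit at the root of the project repo, which the plan above does not state.

```python
from pathlib import PurePosixPath

# Folder -> generator, copied from the two tables above.
VERSIONED = {
    "profiles": "pycytominer",
    "collated": "pycytominer",
    "consensus": "pycytominer",
    "load_data_csv": "pe2loaddata",
    "log": "GNU parallel",
    "metadata": "manual",
    "pipelines": "manual",
}
NOT_VERSIONED = {
    "backend": "cytominer-database",
    "analysis": "CellProfiler, Distributed-CellProfiler",
    "images": "Microscope",
}


def is_versioned(relative_path):
    """Return True if the top-level folder of the path is version controlled."""
    top = PurePosixPath(relative_path).parts[0]
    if top in VERSIONED:
        return True
    if top in NOT_VERSIONED:
        return False
    raise ValueError(f"folder not covered by the policy: {top}")


def gitignore_lines():
    """Root-anchored ignore patterns for the unversioned folders (assumption)."""
    return [f"/{folder}/" for folder in sorted(NOT_VERSIONED)]


if __name__ == "__main__":
    print(is_versioned("profiles/2016_04_01_a549_48hr_batch1/SQ00015167"))
    print("\n".join(gitignore_lines()))
```

Raising on an unknown folder is a deliberate choice here, so that a new top-level folder has to be added to one of the two tables explicitly.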