
Robustness Evaluation

Aayush Grover edited this page May 13, 2025 · 2 revisions

An example script that evaluates a model for robustness with asap is provided in tutorials/eval.py.

Create a peak or whole-genome dataset for evaluation on robustness

asap.robustness_peak_dataset(signal_file, peak_file, genome, chroms, generated, blacklist_file=None, unmap_file=None)

or

asap.robustness_wg_dataset(signal_file, genome, chroms, generated, blacklist_file=None, unmap_file=None)

Both functions return a test dataset for robustness evaluation: robustness_peak_dataset builds a peak dataset, and robustness_wg_dataset builds a whole-genome dataset.

Args:

  • signal_file (str): Path to the signal file.
  • peak_file (str): Path to the peak file. Used only by robustness_peak_dataset.
  • genome (str): Path to the genome file.
  • chroms (List[int]): List of chromosomes for evaluation.
  • generated (str): Path to the generated data.
  • blacklist_file (List[str], optional): List of paths to blacklist files (including SNV VCFs). Defaults to None.
  • unmap_file (str, optional): Path to the unmapped regions file. Defaults to None.

Returns:

  • test_dataset (asap.dataloader.BaseDataset): Test dataset for robustness (either peak or whole-genome)
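
A minimal sketch of building both dataset variants follows. Only the two asap function signatures come from the reference above. The file paths, their extensions, the chromosome list, and the helper name build_robustness_datasets are hypothetical placeholders, since this page does not specify the expected file formats.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical input paths; replace them with your own files.
PATHS: Dict[str, str] = {
    "signal_file": "data/signal.bw",
    "peak_file": "data/peaks.bed",
    "genome": "data/genome.fa",
    "generated": "data/generated",
}


def build_robustness_datasets(paths: Dict[str, str], chroms: List[int]) -> Tuple[Any, Any]:
    """Build the peak and whole-genome robustness datasets for the given chromosomes."""
    # Imported lazily so this helper can be defined without asap installed.
    import asap

    # Peak dataset: positional arguments follow the documented order.
    peak_dataset = asap.robustness_peak_dataset(
        paths["signal_file"],
        paths["peak_file"],
        paths["genome"],
        chroms,
        paths["generated"],
        blacklist_file=None,  # documented default; pass a list of paths instead
        unmap_file=None,      # documented default
    )

    # Whole-genome dataset: same arguments, minus the peak file.
    wg_dataset = asap.robustness_wg_dataset(
        paths["signal_file"],
        paths["genome"],
        chroms,
        paths["generated"],
        blacklist_file=None,
        unmap_file=None,
    )
    return peak_dataset, wg_dataset
```

With asap installed and real files in place, a call such as build_robustness_datasets(PATHS, chroms=[8, 9]) would return the two test datasets; the chromosome numbers here are illustrative only.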

Evaluate a pre-trained model on robustness

asap.eval_robustness(experiment_name, model, eval_dataset, logs_dir, batch_size=64, use_map=False, nr_samples_for_var=17)

Evaluates the pre-trained model for robustness on peak or whole-genome datasets.

Args:

  • experiment_name (str): The name of the experiment. This will be used to load model checkpoints.
  • model (str): The model name to evaluate. Choose from [cnn, lstm, dcnn, convnext_cnn, convnext_lstm, convnext_dcnn, convnext_transformer].
  • eval_dataset (asap.dataloader.BaseDataset): The test dataset used for model evaluation.
  • logs_dir (str): The directory to load model checkpoints from.
  • batch_size (int): The batch size for evaluation. Defaults to 64.
  • use_map (bool): Whether mappability information was used during training. Defaults to False.
  • nr_samples_for_var (int): The number of samples for variance calculation. Defaults to 17.

Returns:

  • scores (Dict[Dict]): A dictionary keyed by test chromosome. Each value is a dictionary holding the average coefficient of variation (cov) and the average coefficient of variation stratified by position (cov_per_bin).
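
The call and its return value can be sketched as follows. The helper names run_robustness_eval and cov_by_chromosome are hypothetical, as are any experiment names or directories you pass in. The layout of scores is inferred from the description above. The sketch assumes cov is a scalar that float() can convert, and it leaves cov_per_bin untouched because its type is not specified here.

```python
from typing import Any, Dict

# Model names accepted by eval_robustness, as listed in the Args above.
SUPPORTED_MODELS = (
    "cnn", "lstm", "dcnn",
    "convnext_cnn", "convnext_lstm", "convnext_dcnn", "convnext_transformer",
)


def run_robustness_eval(
    experiment_name: str, model: str, eval_dataset: Any, logs_dir: str
) -> Dict[Any, Dict[str, Any]]:
    """Evaluate a pre-trained model for robustness using the documented defaults."""
    # Fail early on a model name that the documentation does not list.
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"Unknown model {model!r}; choose from {SUPPORTED_MODELS}")

    # Imported lazily so this helper can be defined without asap installed.
    import asap

    return asap.eval_robustness(
        experiment_name,
        model,
        eval_dataset,
        logs_dir,
        batch_size=64,
        use_map=False,  # set to True if mappability was used during training
        nr_samples_for_var=17,
    )


def cov_by_chromosome(scores: Dict[Any, Dict[str, Any]]) -> Dict[Any, float]:
    """Extract the average coefficient of variation for each test chromosome."""
    # Assumption: each per-chromosome entry exposes a scalar under the "cov" key.
    return {chrom: float(result["cov"]) for chrom, result in scores.items()}
```

Validating the model name before calling asap is a design choice of this sketch, not documented library behavior; it simply surfaces typos before any checkpoint is loaded.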
