I tried to reproduce the ISIC 2018 experiments described in the MedSegDiff paper (Paper 2), following the repo instructions to train and evaluate the model.
However, my results are much lower than those reported in the paper. Specifically:
IoU: 0.1628
Dice: 0.2706
A gap this large suggests that my setup differs from yours, rather than ordinary run-to-run variance.
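In case part of the gap comes from how the metrics are computed, here is a minimal, dependency-free sketch of the standard per-image IoU and Dice definitions on binary masks. The function name `iou_and_dice`, the 0.5 threshold on predictions, and the convention for two empty masks are my own assumptions, not taken from the repo's evaluation code.

```python
def iou_and_dice(pred_rows, target_rows, threshold=0.5):
    """Return (IoU, Dice) for one image given as 2D lists of pixel values.

    Assumptions (hypothetical, not from the repo):
    - predictions are probabilities, binarized at `threshold`
    - any nonzero ground-truth pixel is foreground (works for 0/1 or 0/255 masks)
    """
    intersection = 0
    pred_count = 0
    target_count = 0
    for pred_row, target_row in zip(pred_rows, target_rows):
        for p, t in zip(pred_row, target_row):
            p_fg = p >= threshold
            t_fg = t > 0
            pred_count += int(p_fg)
            target_count += int(t_fg)
            intersection += int(p_fg and t_fg)

    union = pred_count + target_count - intersection
    if union == 0:
        # Both masks are empty; treating this as a perfect match is a convention.
        return 1.0, 1.0

    iou = intersection / union
    dice = 2 * intersection / (pred_count + target_count)
    return iou, dice


if __name__ == "__main__":
    pred = [[0.9, 0.2], [0.8, 0.1]]
    target = [[255, 0], [0, 0]]
    print(iou_and_dice(pred, target))
```

If your evaluation uses a different threshold, averages over the whole dataset instead of per image, or handles empty masks differently, please let me know so I can match it.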
Could you please clarify:
The exact data split you used for ISIC 2018 (train/val/test)?
Any preprocessing steps (e.g., normalization, resizing, augmentation) that may not be included in the repo?
The hyperparameters or training schedule you used for ISIC 2018?