Reproduced ISIC 2018 experiments, but results are much worse than reported

I tried to reproduce the ISIC 2018 experiments described in the MedSegDiff paper (Paper 2). I followed the repo instructions for both training and evaluation.

However, my results are much lower than those reported in the paper. Specifically:

IoU: 0.1628

Dice: 0.2706

This is significantly worse than the paper’s reported numbers.
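To help rule out a metric mismatch, here is a minimal sketch of how IoU and Dice are commonly computed on binary masks and averaged per image. It is not the repo's evaluation code: the function names (`binarize`, `iou_and_dice`, `mean_scores`), the 0.5 threshold, and the per-image averaging are all assumptions made for illustration. One thing a check like this can expose is a scale mismatch, for example ground-truth masks stored as 0/255 being compared against 0/1 predictions without binarizing both first.

```python
import numpy as np


def binarize(mask, threshold=0.5):
    """Turn a probability map or a 0/255 mask into a boolean array.

    The 0.5 threshold is an assumption, not a value taken from the repo.
    """
    return np.asarray(mask, dtype=np.float64) > threshold


def iou_and_dice(pred, target, threshold=0.5, eps=1e-7):
    """Return (IoU, Dice) for a single predicted/ground-truth mask pair."""
    p = binarize(pred, threshold)
    t = binarize(target, threshold)
    intersection = np.logical_and(p, t).sum()
    union = np.logical_or(p, t).sum()
    total = p.sum() + t.sum()
    # eps keeps both scores defined (and equal to 1) when both masks are empty.
    iou = (intersection + eps) / (union + eps)
    dice = (2.0 * intersection + eps) / (total + eps)
    return float(iou), float(dice)


def mean_scores(preds, targets, threshold=0.5):
    """Average the per-image IoU and Dice over a set of mask pairs."""
    scores = [iou_and_dice(p, t, threshold) for p, t in zip(preds, targets)]
    ious, dices = zip(*scores)
    return float(np.mean(ious)), float(np.mean(dices))
```

Running `mean_scores` on a few saved predictions and their ground-truth masks, and comparing against the numbers printed by the repo's evaluation script, would show whether the gap comes from the metric computation or from the model itself.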

Could you please clarify:

1. The exact data split you used for ISIC 2018 (train/val/test)?
2. Any preprocessing steps (e.g., normalization, resizing, augmentation) that may not be included in the repo?
3. The hyperparameters or training schedule you used for ISIC 2018?

Reproduced ISIC 2018 experiments, but results are much worse than reported #224
