64 changes: 64 additions & 0 deletions vision/classification_and_detection/yolo/README.md
@@ -0,0 +1,64 @@
# YOLO README - a working design doc

## What does it take to get YOLO fully working?
- Local performance comparison against Ultralytics
- Performance and accuracy runs with LoadGen
- Calculate and output accuracy from a LoadGen run
- Compliance
- Model distribution script
- Dataset processing / distribution script

## Dataset processing
The full COCO dataset contains images that are not compliant with MLC legal rules. To run inference and accuracy on the YOLOv11 benchmark with the safe version of the dataset, execute the following commands:
`python filter_coco_safe_images.py` to create a new folder with just the images that comply with license agreements
`python create_safe_annotations.py` to create the matching annotations file, needed for the mAP accuracy calculations.

The dataset has been uploaded to the MLC S3 bucket; instructions for pulling the safe dataset are coming shortly. Your dataset path should look like this:

```
coco_safe/
├── annotations/
│   └── instances_val2017_safe.json
└── val2017_safe/
    └── 1525 images
coco_safe.checksums
```
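Assuming the layout above, a quick stdlib-only sanity check can confirm the image count matches the annotation file (the function name is illustrative, not part of the benchmark scripts):

```python
import json
import os


def check_coco_safe(root):
    """Sanity-check the coco_safe/ layout described above.

    Returns (num_image_files, num_annotated_images) so the two counts
    can be compared; raises FileNotFoundError if either path is missing.
    """
    ann_path = os.path.join(root, "annotations", "instances_val2017_safe.json")
    img_dir = os.path.join(root, "val2017_safe")

    with open(ann_path) as f:
        ann = json.load(f)

    images = [n for n in os.listdir(img_dir) if n.endswith(".jpg")]
    return len(images), len(ann["images"])


# Example: the two counts should match (1525 for the current safe set).
# n_files, n_ann = check_coco_safe("coco_safe")
# assert n_files == n_ann
```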

## How to run yolo_loadgen.py
Example usage:
Perf run
`python yolo_loadgen.py --dataset-path {DATASET_PATH} --model {MODEL_FILE} --scenario {Offline, SingleStream, MultiStream} --output {OUTPUT_RESULTS_DIR}`

Accuracy run
`python yolo_loadgen.py --dataset-path {DATASET_PATH} --model {MODEL_FILE} --scenario {Offline, SingleStream, MultiStream} --accuracy --output {OUTPUT_RESULTS_DIR}`

Arguments:
`--dataset-path` -> path to dataset images
`--model` -> path to YOLO model
`--device` -> inference device (leave as the default)
`--scenario` -> ["Offline", "SingleStream", "MultiStream"]
`--accuracy` -> run accuracy mode
`--count` -> number of samples
`--output` -> output directory
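The flag list above could be declared with `argparse` roughly as follows. This is an illustrative sketch of the documented interface, not the actual `yolo_loadgen.py` source; the defaults and the `yolo11n.pt` model name are assumptions:

```python
import argparse


def build_parser():
    # Mirrors the documented yolo_loadgen.py flags; defaults are illustrative.
    p = argparse.ArgumentParser(description="YOLO LoadGen harness (sketch)")
    p.add_argument("--dataset-path", required=True, help="path to dataset images")
    p.add_argument("--model", required=True, help="path to YOLO model file")
    p.add_argument("--device", default="cuda", help="inference device (leave as default)")
    p.add_argument("--scenario", default="Offline",
                   choices=["Offline", "SingleStream", "MultiStream"])
    p.add_argument("--accuracy", action="store_true", help="run in accuracy mode")
    p.add_argument("--count", type=int, default=None, help="number of samples")
    p.add_argument("--output", default="results", help="output directory")
    return p


# Parsing the accuracy-run command line shown above:
args = build_parser().parse_args(
    ["--dataset-path", "coco_safe/val2017_safe", "--model", "yolo11n.pt",
     "--scenario", "Offline", "--accuracy"]
)
```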

Example output for the YOLOv11 [N, S, M, L, X] models is under inference/vision/classification_and_detection/yolo_result_10232025/

## How to get mAP accuracy results with yolo_ultra_map.py
`python yolo_ultra_map.py --option {1, 2} --model {MODEL_FILE} --images {DATASET PATH} --data {DATA YAML FILE PATH} --annotations {ANNOTATION JSON FILE PATH} --output_json {OUTPUT JSON FILE PATH}`
`--option` -> 1 uses the built-in YOLO method (currently not working); 2 uses the pycocotools approach, which requires the predictions.json as well as the annotations file.
`--model` -> model to run test on
`--images` -> path for the dataset of images
`--data` -> YAML file containing the path to the image directory as well as the labels
`--annotations` -> path to the annotations json file
`--output_json` -> output file
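For `--option 2`, pycocotools' `COCO.loadRes()` expects `predictions.json` to be a flat list of per-detection records in the standard COCO results format. A minimal sketch of building those records (the function name and the xyxy input convention are illustrative):

```python
def detections_to_coco_results(image_id, boxes_xyxy, scores, class_ids):
    """Convert one image's detections into COCO results entries.

    COCO.loadRes() expects bbox as [x, y, width, height]. Note: COCO
    category ids are not contiguous (1..90 with gaps), so a mapping from
    YOLO class indices (0..79) to category ids may be needed here.
    """
    results = []
    for (x1, y1, x2, y2), score, cls in zip(boxes_xyxy, scores, class_ids):
        results.append({
            "image_id": image_id,
            "category_id": int(cls),
            "bbox": [x1, y1, x2 - x1, y2 - y1],  # xyxy -> xywh
            "score": float(score),
        })
    return results


# All images' entries concatenated would then be dumped to predictions.json:
# with open("predictions.json", "w") as f:
#     json.dump(all_results, f)
```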

## How accuracy is computed from mlperf_log_accuracy.json
1. LoadGen runs → creates mlperf_log_accuracy.json (hex-encoded)
2. yolo_loadgen.py calls validate_accuracy_requirement()
3. Passes mlperf_log_accuracy.json to yolo_ultra_map.py
4. yolo_ultra_map.py:
- Detects that it is an MLPerf log
- Decodes hex data → decoded_predictions.json
- Evaluates with COCO tools
- Validates against threshold
5. Returns PASS/FAIL
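The hex decode in step 4 can be sketched as follows. The `qsl_idx` and `data` field names follow the standard LoadGen accuracy log; interpreting each entry's bytes as little-endian float32 values is an assumption about this harness's response buffer layout:

```python
import json
import struct


def decode_accuracy_log(path):
    """Decode mlperf_log_accuracy.json.

    Each entry's "data" field is the hex-encoded byte buffer the SUT
    returned for that sample; here it is unpacked as float32 values.
    """
    with open(path) as f:
        entries = json.load(f)

    decoded = []
    for entry in entries:
        raw = bytes.fromhex(entry["data"])
        floats = struct.unpack(f"<{len(raw) // 4}f", raw)  # little-endian f32
        decoded.append({"qsl_idx": entry["qsl_idx"], "values": list(floats)})
    return decoded
```

The decoded records would then be reshaped into detection tuples and written to `decoded_predictions.json` for the pycocotools evaluation.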



84 changes: 84 additions & 0 deletions vision/classification_and_detection/yolo/coco_safe.yaml
@@ -0,0 +1,84 @@
path: /mnt/data/yolo/coco
train: train2017
val: val2017_safe
names:
  0: person
  1: bicycle
  2: car
  3: motorcycle
  4: airplane
  5: bus
  6: train
  7: truck
  8: boat
  9: traffic light
  10: fire hydrant
  11: stop sign
  12: parking meter
  13: bench
  14: bird
  15: cat
  16: dog
  17: horse
  18: sheep
  19: cow
  20: elephant
  21: bear
  22: zebra
  23: giraffe
  24: backpack
  25: umbrella
  26: handbag
  27: tie
  28: suitcase
  29: frisbee
  30: skis
  31: snowboard
  32: sports ball
  33: kite
  34: baseball bat
  35: baseball glove
  36: skateboard
  37: surfboard
  38: tennis racket
  39: bottle
  40: wine glass
  41: cup
  42: fork
  43: knife
  44: spoon
  45: bowl
  46: banana
  47: apple
  48: sandwich
  49: orange
  50: broccoli
  51: carrot
  52: hot dog
  53: pizza
  54: donut
  55: cake
  56: chair
  57: couch
  58: potted plant
  59: bed
  60: dining table
  61: toilet
  62: tv
  63: laptop
  64: mouse
  65: remote
  66: keyboard
  67: cell phone
  68: microwave
  69: oven
  70: toaster
  71: sink
  72: refrigerator
  73: book
  74: clock
  75: vase
  76: scissors
  77: teddy bear
  78: hair drier
  79: toothbrush
@@ -0,0 +1,67 @@
import json
import os
import shutil

# Paths
ann_file = "/mnt/data/yolo/coco/annotations/instances_val2017.json"
images_dir = "/mnt/data/yolo/coco/val2017"
output_images_dir = "/mnt/data/yolo/coco/val2017_safe"
output_ann_file = "/mnt/data/yolo/coco/annotations/instances_val2017_safe.json"

os.makedirs(output_images_dir, exist_ok=True)

# Load COCO annotations
with open(ann_file, "r") as f:
    coco = json.load(f)

licenses = {l["id"]: l for l in coco["licenses"]}

# Define safe license keywords
safe_keywords = [
    "creativecommons.org/licenses/by/",
    "creativecommons.org/licenses/by-sa/",
    "creativecommons.org/licenses/by-nd",
    "flickr.com/commons/usage",
    "www.usa.gov",
]

# Filter safe images
safe_images = []
for img in coco["images"]:
    lic = licenses.get(img["license"], {})
    url = lic.get("url", "").lower()
    if any(k in url for k in safe_keywords):
        safe_images.append(img)

print(f"Total images: {len(coco['images'])}")
print(f"Safe images: {len(safe_images)}")

# Copy safe images
for img in safe_images:
    src = os.path.join(images_dir, img["file_name"])
    dst = os.path.join(output_images_dir, img["file_name"])
    if os.path.exists(src):
        shutil.copy2(src, dst)

print(f"Copied {len(safe_images)} images to {output_images_dir}")

# Filter annotations for safe images
safe_image_ids = {img["id"] for img in safe_images}
safe_annotations = [ann for ann in coco["annotations"]
                    if ann["image_id"] in safe_image_ids]

# Build new COCO annotation structure
safe_coco = {
    "info": coco.get("info", {}),
    "licenses": coco.get("licenses", []),
    "images": safe_images,
    "annotations": safe_annotations,
    "categories": coco.get("categories", []),
}

# Save new annotation file
with open(output_ann_file, "w") as f:
    json.dump(safe_coco, f)

print(f"Saved filtered annotations to {output_ann_file}")
print(f"Total annotations: {len(safe_annotations)}")
@@ -0,0 +1,47 @@
import json
import os
import shutil

# Paths (adjust these)
ann_file = "/mnt/data/yolo/annotations/instances_val2017.json"
images_dir = "/mnt/data/yolo/coco/val2017"
output_dir = "/mnt/data/yolo/coco/val2017_safe"

os.makedirs(output_dir, exist_ok=True)

# Load COCO annotations
with open(ann_file, "r") as f:
    coco = json.load(f)

# Map license id to name/url
licenses = {l["id"]: l for l in coco["licenses"]}

# Define safe license keywords (earlier candidate lists kept below for reference)
# safe_keywords = ["creativecommons.org/licenses/by/", "creativecommons.org/licenses/by-sa/"]
# safe_keywords = ["creativecommons.org/licenses/by/", "creativecommons.org/licenses/by-sa/", "creativecommons.org/licenses/by-nc" , "creativecommons.org/licenses/by-nc-nd" , "creativecommons.org/licenses/by-nd", "creativecommons.org/licenses/by-nc-sa" ]
safe_keywords = [
    "creativecommons.org/licenses/by/",
    "creativecommons.org/licenses/by-sa/",
    "creativecommons.org/licenses/by-nd",
    "flickr.com/commons/usage",
    "www.usa.gov",
]


safe_images = []
for img in coco["images"]:
    lic = licenses.get(img["license"], {})
    url = lic.get("url", "").lower()
    if any(k in url for k in safe_keywords):
        safe_images.append(img)

print(f"Total images: {len(coco['images'])}")
print(f"Safe images: {len(safe_images)}")

# Copy safe images
for img in safe_images:
    src = os.path.join(images_dir, img["file_name"])
    dst = os.path.join(output_dir, img["file_name"])
    if os.path.exists(src):
        shutil.copy2(src, dst)

print(f"Copied {len(safe_images)} images to {output_dir}")
60 changes: 60 additions & 0 deletions vision/classification_and_detection/yolo/requirements.txt
@@ -0,0 +1,60 @@
certifi>=2025.10.5
charset-normalizer>=3.4.4
contourpy>=1.3.3
cycler>=0.12.1
filelock>=3.20.0
fonttools>=4.60.1
fsspec>=2025.9.0
idna>=3.11
ijson>=3.4.0.post0
Jinja2>=3.1.6
kiwisolver>=1.4.9
MarkupSafe>=3.0.3
matplotlib>=3.10.7
mlcommons_loadgen>=5.1.1
mpmath>=1.3.0
networkx>=3.5
numpy>=2.2.6
nvidia-cublas-cu12>=12.8.4.1
nvidia-cuda-cupti-cu12>=12.8.90
nvidia-cuda-nvrtc-cu12>=12.8.93
nvidia-cuda-runtime-cu12>=12.8.90
nvidia-cudnn-cu12>=9.10.2.21
nvidia-cufft-cu12>=11.3.3.83
nvidia-cufile-cu12>=1.13.1.3
nvidia-curand-cu12>=10.3.9.90
nvidia-cusolver-cu12>=11.7.3.90
nvidia-cusparse-cu12>=12.5.8.93
nvidia-cusparselt-cu12>=0.7.1
nvidia-nccl-cu12>=2.27.5
nvidia-nvjitlink-cu12>=12.8.93
nvidia-nvshmem-cu12>=3.3.20
nvidia-nvtx-cu12>=12.8.90
opencv-python>=4.12.0.88
packaging>=25.0
pandas>=2.3.3
pillow>=12.0.0
pip>=24.0
polars>=1.34.0
polars-runtime-32>=1.34.0
psutil>=7.1.0
pycocotools>=2.0.10
pyparsing>=3.2.5
python-dateutil>=2.9.0.post0
pytz>=2025.2
PyYAML>=6.0.3
requests>=2.32.5
scipy>=1.16.2
setuptools>=80.9.0
six>=1.17.0
sympy>=1.14.0
tabulate>=0.9.0
torch>=2.9.0
torchvision>=0.24.0
triton>=3.5.0
typing_extensions>=4.15.0
tzdata>=2025.2
ultralytics>=8.3.214
ultralytics-thop>=2.0.17
urllib3>=2.5.0
wget>=3.2