Submission checker modularization #2397

pgmpablo157321 · 2025-11-24T17:54:55Z

#1670
Testing command (outside the inference repo):

python -m inference.tools.submission.submission_checker.main --input inference_results_v5.1

github-actions · 2025-11-24T17:55:05Z

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

nv-alicheng · 2025-12-02T17:45:23Z

tools/submission/submission_checker/checks/base.py

@@ -0,0 +1,43 @@
+from abc import ABC, abstractmethod
+
+class BaseCheck(ABC):


We can utilize the __init_subclass__ feature of Python to handle this registry-like functionality.

https://peps.python.org/pep-0487/#subclass-registration

I believe a better paradigm would be something like:

Implement a Check subclass that inherits from BaseCheck.

Each Check class implements some number of methods prefixed with check_.

BaseCheck's __call__ and execute() methods inspect all attributes on the class and run, in sequence, every attribute that is callable and whose name starts with check_.

If there is a dependency, we can implement an @BaseCheck.mark_dependency(<str>, ...) decorator that takes the names of the functions that must be executed before the current check.

From what I could tell, all the implemented Check classes have an __init__ method with the arguments log, path, config, submission_logs. The BaseCheck class should probably take the same arguments and store them on self for the subclasses to use later.
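A minimal sketch of the paradigm described in this comment, under stated assumptions: the registry attribute, the _depends_on marker, and the rule that a check whose dependency failed is reported as failed without running are all illustrative choices, not code from the PR. Dependency cycles are not handled.

```python
from abc import ABC


class BaseCheck(ABC):
    """Hypothetical base class; attribute and helper names are illustrative."""

    registry = {}  # subclass name -> subclass, filled in by __init_subclass__

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        BaseCheck.registry[cls.__name__] = cls

    def __init__(self, log, path, config, submission_logs):
        # Stored once here so subclasses do not need their own __init__.
        self.log = log
        self.path = path
        self.config = config
        self.submission_logs = submission_logs

    @staticmethod
    def mark_dependency(*names):
        """Record the names of checks that must run before the decorated one."""
        def decorator(func):
            func._depends_on = tuple(names)
            return func
        return decorator

    def _check_names(self):
        # Walk the MRO so checks keep definition order (dir() would sort them).
        names = []
        for klass in reversed(type(self).__mro__):
            for name, value in vars(klass).items():
                if (name.startswith("check_") and callable(value)
                        and name not in names):
                    names.append(name)
        return names

    def execute(self):
        """Run every check_ method once and return {method_name: passed}."""
        results = {}

        def run(name):
            if name in results:
                return results[name]
            method = getattr(self, name)
            deps_ok = all(run(dep) for dep in getattr(method, "_depends_on", ()))
            # Assumption: a check with a failed dependency counts as failed.
            results[name] = bool(method()) if deps_ok else False
            return results[name]

        for name in self._check_names():
            run(name)
        return results

    def __call__(self):
        return all(self.execute().values())


class ExampleCheck(BaseCheck):
    def check_files_exist(self):
        return True

    @BaseCheck.mark_dependency("check_files_exist")
    def check_files_parse(self):
        return True
```

With this shape, ExampleCheck(log, path, config, submission_logs)() runs both checks in order, and BaseCheck.registry lists every subclass without any explicit registration call.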

nv-alicheng · 2025-12-16T21:03:18Z

tools/submission/submission_checker/checks/base.py

+            v = self.execute(check)
+            valid &= v
+            if not valid:
+                return False


I'm wondering if it makes more sense to run every check here and return the success value of each, keyed by the checker's class? Something like:

{ AccuracyCheck: True, ComplianceCheck: False, PerformanceCheck: True, ... }

I can see this being clunky if many tests depend on each other, since check failures will then cascade. In that case, should there be a system for declaring the dependencies of each test? For example, MeasurementsCheck depends on DirectoryStructureCheck, and so on.
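One way this could look, as a rough sketch rather than a proposal for the final API: run_checks and its dependencies argument are hypothetical, and the classes below are stand-ins that only borrow the names used in this thread. A check whose dependency did not pass is recorded as None (skipped), so a cascade can be told apart from a genuine failure.

```python
def run_checks(checks, dependencies=None):
    """Run each check once and return {check_class: True | False | None}.

    checks:       maps a check class to a zero-argument callable returning bool
    dependencies: maps a check class to the classes that must pass first
    None in the result marks a check skipped because a dependency did not pass.
    """
    dependencies = dependencies or {}
    results = {}

    def run(key):
        if key in results:
            return results[key]
        dep_results = [run(dep) for dep in dependencies.get(key, ())]
        if all(result is True for result in dep_results):
            results[key] = bool(checks[key]())
        else:
            results[key] = None  # skipped, not failed
        return results[key]

    for key in checks:
        run(key)
    return results


# Stand-in classes for illustration only.
class DirectoryStructureCheck:
    def __call__(self):
        return False


class MeasurementsCheck:
    def __call__(self):
        return True


class PerformanceCheck:
    def __call__(self):
        return True


results = run_checks(
    {
        MeasurementsCheck: MeasurementsCheck(),
        DirectoryStructureCheck: DirectoryStructureCheck(),
        PerformanceCheck: PerformanceCheck(),
    },
    dependencies={MeasurementsCheck: [DirectoryStructureCheck]},
)
```

Here MeasurementsCheck is never executed because DirectoryStructureCheck failed, while the independent PerformanceCheck still runs.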

nv-alicheng · 2025-12-16T21:04:35Z

tools/submission/submission_checker/checks/compliance_check.py

+        if model in self.config.base["models_TEST04"]:
+            test_list.append("TEST04")
+        if model in self.config.base["models_TEST06"]:
+            test_list.append("TEST06")


Does it make more sense to have a ComplianceCheck class (which inherits BaseCheck), then have individual TEST0XCheck subclasses?
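A possible shape for that split, sketched under assumptions: the test_name and config_key attributes, the applies() and tests_for() helpers, and the model names in the usage note are all hypothetical. ComplianceCheck is shown as a plain class to keep the example self-contained; in the PR it would inherit from BaseCheck.

```python
class ComplianceCheck:
    """Stand-in base; in the PR this would inherit from BaseCheck."""

    test_name = None   # e.g. "TEST04"
    config_key = None  # key in config.base listing the models the test covers

    def __init__(self, config_base, model):
        self.config_base = config_base
        self.model = model

    def applies(self):
        # Same membership test as the diff, driven by a class attribute.
        return self.model in self.config_base.get(self.config_key, ())

    @classmethod
    def tests_for(cls, config_base, model):
        # __subclasses__() lists direct subclasses in definition order.
        return [
            sub.test_name
            for sub in cls.__subclasses__()
            if sub(config_base, model).applies()
        ]


class TEST04Check(ComplianceCheck):
    test_name = "TEST04"
    config_key = "models_TEST04"


class TEST06Check(ComplianceCheck):
    test_name = "TEST06"
    config_key = "models_TEST06"
```

Given a config base such as {"models_TEST04": ["model-a"], "models_TEST06": ["model-a", "model-b"]}, ComplianceCheck.tests_for(base, "model-a") replaces the chain of if statements, and adding a new compliance test means adding one subclass.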

nv-alicheng · 2025-12-16T21:27:28Z

tools/submission/submission_checker/main.py

+
+    if args.scenarios_to_skip:
+        scenarios_to_skip = [
+            scenario for scenario in args.scenarios_to_skip.split(",")]


Can you add a formatter to the project like autopep8 or black?

nv-alicheng · 2025-12-16T21:54:12Z

tools/submission/submission_checker/main.py

+    for logs in loader.load():
+        # Initialize check classes
+        performance_checks = PerformanceCheck(
+            log, logs.loader_data["perf_path"], config, logs)


The overloading of the term log here feels clunky and confusing. A few comments here:

It seems like bad practice to create a logger named 'main' and pass it around to each Checker. It would be better for each file to have its own logger (logging.getLogger(__file__)) so that if, for instance, ExampleCheck did log.info("Missing file ___"), the console message would show that it originated in ExampleCheck rather than main.

If each file has its own logger, you no longer need to pass log around everywhere, which makes the code more concise.

If we are passing in logs, is there a point in also passing in logs.loader_data[key]? Can't that just be extracted by the Check's __init__ method?

If this is simplified down to just xxxx_check = XXXXCheck(config, logs), then it can further be simplified down to:

for logs in loader.load():
    for check_cls in [PerformanceCheck, ...]:
        check_cls(config, logs)()
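A small sketch of the per-module logger idea, with several assumptions: ExampleCheck, the run() helper, and the logger name are hypothetical, and the logs object is faked with SimpleNamespace. The sketch uses a module-style logger name (what logging.getLogger(__name__) would produce inside a package) instead of __file__, since that is the more common convention; either gives each file its own logger.

```python
import logging
from types import SimpleNamespace

# In a real module this would be logging.getLogger(__name__); the name is
# spelled out here so the sketch runs as a single file.
log = logging.getLogger("submission_checker.checks.example_check")


class ExampleCheck:
    def __init__(self, config, logs):
        self.config = config
        self.logs = logs
        # Extracted here instead of being passed in separately by the caller.
        self.path = logs.loader_data["perf_path"]

    def __call__(self):
        log.info("Checking %s", self.path)
        return True


def run(config, all_logs, check_classes=(ExampleCheck,)):
    """Instantiate and call every check class for every loaded submission."""
    return [
        check_cls(config, logs)()
        for logs in all_logs
        for check_cls in check_classes
    ]


if __name__ == "__main__":
    logging.basicConfig(
        format="%(name)s %(levelname)s %(message)s", level=logging.INFO
    )
    fake_logs = SimpleNamespace(loader_data={"perf_path": "some/perf/path"})
    print(run({}, [fake_logs]))
```

Because the format string includes %(name)s, each message is prefixed with the logger name of the module that emitted it, without any logger being passed through constructors.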

nv-alicheng · 2025-12-16T21:56:38Z

tools/submission/submission_checker/main.py

+        measurements_checks()
+        power_checks()
+
+    with open(args.csv, "w") as csv:


Is this a TODO?

nv-alicheng · 2025-12-16T22:09:34Z

tools/submission/submission_checker/configuration/v5.0/config.yml

Why are these empty files here?

nv-alicheng · 2025-12-16T22:20:18Z

tools/submission/submission_checker/configuration/configuration.py

+
+    def load_config(self, version):
+        # TODO: Load values from 
+        self.models = self.base["models"]


I mentioned this in the GH Issue, but if the giant model dict is already being stored in a Python file, it should probably be refactored into a hierarchy of dataclasses. A class-based representation would also make the key -> property remapping you're doing here either easier or unnecessary.

Having it as dataclasses rather than a dict also makes the config schema better defined and easier to navigate.
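To illustrate, a rough sketch of what a dataclass hierarchy might look like. ModelSpec, VersionConfig, and from_dict are hypothetical names, and the sketch assumes that "models" is a list of model names and that keys such as "models_TEST04" hold lists of model names, which the diff suggests but does not confirm.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class ModelSpec:
    name: str
    compliance_tests: Tuple[str, ...] = ()


@dataclass(frozen=True)
class VersionConfig:
    version: str
    models: Dict[str, ModelSpec] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, version, raw):
        """Build the typed config from the existing dict-based layout."""
        models = {}
        for name in raw.get("models", []):
            # Fold the per-test membership lists into each model's spec.
            tests = tuple(
                key[len("models_"):]
                for key in sorted(raw)
                if key.startswith("models_TEST") and name in raw[key]
            )
            models[name] = ModelSpec(name=name, compliance_tests=tests)
        return cls(version=version, models=models)


raw = {
    "models": ["model-a", "model-b"],
    "models_TEST04": ["model-a"],
    "models_TEST06": ["model-a", "model-b"],
}
cfg = VersionConfig.from_dict("vX", raw)
```

A check can then read cfg.models["model-a"].compliance_tests instead of probing several dict keys, and the schema is visible in one place.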

pgmpablo157321 requested a review from a team as a code owner November 24, 2025 17:54

pgmpablo157321 changed the title ~~First sketch of submission checker~~ Submission checker modularization Nov 24, 2025

pgmpablo157321 force-pushed the submission_checker_refactor branch from ae987c9 to 9f7e9f4 Compare November 25, 2025 17:00

pgmpablo157321 force-pushed the submission_checker_refactor branch from f95f29a to cb7db52 Compare December 2, 2025 23:56

pgmpablo157321 force-pushed the submission_checker_refactor branch from 8053b06 to cf5ff27 Compare December 12, 2025 23:04

pgmpablo157321 added 7 commits December 16, 2025 11:09

First sketch of submission checker

8a2c8b8

Add initial loader loop

786a3b5

Quick fixes for loader class

177fc01

Add performance checks new submission checker

9dffd33

Add accuracy checks to submission checker

59c0e01

Add next batch of checks

c827408

Add compliance check

a4fca64

pgmpablo157321 force-pushed the submission_checker_refactor branch from 237bbe3 to a4fca64 Compare December 16, 2025 16:10

pgmpablo157321 and others added 3 commits December 16, 2025 13:01

Add power check

bd72ec6

Add additional bandwidth check

a59e4f3

[Automated Commit] Format Codebase

b678a7a

nv-alicheng suggested changes Dec 16, 2025
