Skip to content

Conversation

@pgmpablo157321
Copy link
Contributor

@pgmpablo157321 pgmpablo157321 commented Nov 24, 2025

#1670
Testing command (outside the inference repo):

python -m inference.tools.submission.submission_checker.main --input inference_results_v5.1

@pgmpablo157321 pgmpablo157321 requested a review from a team as a code owner November 24, 2025 17:54
@github-actions
Copy link
Contributor

github-actions bot commented Nov 24, 2025

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@pgmpablo157321 pgmpablo157321 changed the title First sketch of submission checker Submission checker modularization Nov 24, 2025
@pgmpablo157321 pgmpablo157321 force-pushed the submission_checker_refactor branch from ae987c9 to 9f7e9f4 Compare November 25, 2025 17:00
@pgmpablo157321 pgmpablo157321 force-pushed the submission_checker_refactor branch from f95f29a to cb7db52 Compare December 2, 2025 23:56
@pgmpablo157321 pgmpablo157321 force-pushed the submission_checker_refactor branch from 8053b06 to cf5ff27 Compare December 12, 2025 23:04
@pgmpablo157321 pgmpablo157321 force-pushed the submission_checker_refactor branch from 237bbe3 to a4fca64 Compare December 16, 2025 16:10
@@ -0,0 +1,43 @@
from abc import ABC, abstractmethod

class BaseCheck(ABC):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can utilize the __init_subclass__ feature of Python to handle this registry-like functionality.

https://peps.python.org/pep-0487/#subclass-registration

I believe a better paradigm would be something like:

  1. Implement a Checker subclass that inherits BaseChecker
  2. Each Checker class will implement some number of methods that are prefixed with check_
  3. BaseChecker's __call__ and execute() methods will check all attributes on the class and run (in sequence) all attributes that are callable and start with the string check_
  4. If there is a dependency, we can implement an @BaseCheck.mark_dependency(<str>, ...) decorator where you can pass in a list of strings that are the function names, which need to be executed before the current check.

From what I could tell, all the implemented Check classes have an init method with the arguments log, path, config, submission_logs - The BaseCheck class should probably do the same and just store the values in self to be used by the subclasses later.

v = self.execute(check)
valid &= v
if not valid:
return False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if it makes more sense to run every check here and return the success value of each, keyed by the checker's class? Something like:

{
    AccuracyCheck: True,
    ComplianceCheck: False,
    PerformanceCheck: True,
    ...
}

I can see this being clunky if many tests depend on each other, which means that check failures will cascade. In which case my question is should there be a system to determine the dependencies of each test? Something like MeasurementsCheck depends on DirectoryStructureCheck, etc.

if model in self.config.base["models_TEST04"]:
test_list.append("TEST04")
if model in self.config.base["models_TEST06"]:
test_list.append("TEST06")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make more sense to have a ComplianceCheck class (which inherits BaseCheck), then have individual TEST0XCheck subclasses?


if args.scenarios_to_skip:
scenarios_to_skip = [
scenario for scenario in args.scenarios_to_skip.split(",")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a formatter to the project like autopep8 or black?

for logs in loader.load():
# Initialize check classes
performance_checks = PerformanceCheck(
log, logs.loader_data["perf_path"], config, logs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overloading of the term log here feels clunky and confusing. A few comments here:

  1. It seems like bad practices to create a logger named 'main' and pass it around to each Checker. It would be better that each file has it's own logger (logging.getLogger(__file__)) so that if (for instance) ExampleCheck did log.info("Missing file ___"), the message in console would show that it originated in ExampleCheck rather than main.
  2. If each file has its own logger, you no longer need to pass around log everywhere (makes it more concise)
  3. If we are passing in logs, is there a point to also pass in logs.loader_data[key]? Can't that just be extracted by the Check's init method?
  4. If this is simplified down to just xxxx_check = XXXXCheck(config, logs), then it can further be simplified down to
for logs in loader.load():
    for check_cls in [PerformanceCheck, ...]:
        check_cls(config, logs)()

measurements_checks()
power_checks()

with open(args.csv, "w") as csv:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a TODO?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these empty files here?


def load_config(self, version):
# TODO: Load values from
self.models = self.base["models"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned this in the GH Issue, but if the giant model dict is already being stored in a Python file, it should probably be refactored into some hierarchy of dataclasses. Having a class based representation would also make doing the key -> property remapping you're doing here either easier or unnecessary.

Having it as dataclasses rather than a dict also makes the schema of the config more defined and easier to navigate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants