Add a Job Status enum + scheduler refactoring by AlexJones0 · Pull Request #92 · lowRISC/dvsim

AlexJones0 · 2026-01-30T17:09:46Z

This PR performs some small cleanup of the scheduler in preparation for potential larger-scale changes. Specifically, it aims to fix any outstanding pyright and ruff check failures that exist within the scheduler itself, to provide a good base for further refactoring or modification.

One other clear deficiency in the design that I wanted to address was the use of a string (always a single upper-case character) like "D", "P" or "K" to refer to the Job status, rather than the more standard Enumerated type. An enum would improve readability & ergonomics, and add stricter typing of status values - so I've opted to make this change, which is quite wide-spread (albeit simple).

Address the `S101` failure with ruff check. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>

Break apart a couple of the more complex functions to resolve ruff lint warnings related to method complexity. Also take the opportunity to slightly improve some of the documentation on these functions and to refactor some of the code to be nicer to read. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>

There may still be more work needed on the typing front but this resolves the singular remaining type error given by pyright within the scheduler. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>

This 'E' error launcher status is not well-defined or well-known within the rest of DVSim, but can be seen as meaningfully different from the 'F'/'FAILED' status which describes the dispatched job failing, as this denotes a failure within the launcher. As DVSim currently does not handle this status (almost at-all!) and this is the single occurrence of the 'E' status in the code, we instead convert this into a more appropriate raised `LauncherError`, as in reality this case only happens when calling `poll()` either before `launch()` or after `launch()` already reported some error. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>

Currently the job status is passed around DVSim as a string, where the expectation is that it is always a single upper case character in ('Q', 'D', 'P', 'F', 'K') - some limited parts of the code also handle an 'E' error case. Turn these into a small Enum type so that we have much stricter typing on these values and improved ergonomics when referring to different job statuses in code. This status Enum lives in its own module instead of with the other job data to avoid circular imports (as the job status' registered callbacks reference the launcher definition). Signed-off-by: Alex Jones <alex.jones@lowrisc.org>

rswarbrick

Lots of nitty comments (sorry!) but I like this. I especially like the enum change: it's a great improvement from magic one-character strings!

rswarbrick · 2026-02-05T20:28:05Z

src/dvsim/scheduler.py

-                    assert job.dependencies
+                    if not job.dependencies:
+                        err = "Job item exists but the next_item's dependency list is empty?"
+                        raise RuntimeError(err)


Probably better to skip the variable, I think:

if not job.dependencies: raise RuntimeError("Job item exists but the " "next_item's dependency list is empty?")

I'd probably also write it as "... the next item's...", to make a bit more sense to the person reading it.

rswarbrick · 2026-02-05T20:30:49Z

src/dvsim/scheduler.py

+        job: JobSpec = self._jobs[job_name]
+
+        if job.target not in self._scheduled:
+            msg = f"Scheduler does not contain target {job.target}"


Again, I'd compute the message inside the function call parentheses.

The linting rules explicitly reject that. Can't remember why exactly, something to do with the stack trace. I think you end up with the message there twice, pre formatted and then formatted.

We could disable the lint rule if you feel strongly about it. But that's how it is at the moment.

rswarbrick · 2026-02-05T20:33:17Z

src/dvsim/scheduler.py


-        We choose the target that follows the 'item''s current target and find
-        the list of successors whose dependency list contains 'item'. If 'item'
+        We choose the target that follows the "item"'s current target and find


I think the correct change here is to remove the quotes entirely for the first occurrence and replace the other two with "job_name".

rswarbrick · 2026-02-05T20:33:43Z

src/dvsim/scheduler.py

-            cfgs = set(self._scheduled[target])
-
+            target = next(iter(self._scheduled), None)
+            cfgs = {} if target is None else set(self._scheduled[target])


Oops: {} is an empty dictionary, not a set.

rswarbrick · 2026-02-05T20:34:41Z

src/dvsim/scheduler.py

                    if not job.dependencies:
-                        err = "Job item exists but the next_item's dependency list is empty?"
-                        raise RuntimeError(err)
+                        msg = "Job item exists but the next_item's dependency list is empty?"


I know this code was here already, but you're touching it! Probably put the message inside the function call and get rid of a variable.

Again, this is due to the linting rules. Doesn't mean it's what we want, but that's why.

rswarbrick · 2026-02-05T20:41:03Z

src/dvsim/launcher/base.py

-    def _check_status(self) -> tuple[str, ErrorMessage | None]:
-        """Determine the outcome of the job (P/F if it ran to completion).
+    def _check_status(self) -> tuple[JobStatus, ErrorMessage | None]:
+        """Determine the outcome of the job (Passed/Failed if it ran to completion).


Again, I think the enum names need to be ALLCAPS

rswarbrick · 2026-02-05T20:41:28Z

src/dvsim/launcher/base.py


        Args:
-            status: status of the job, either 'P', 'F' or 'K'.
+            status: status of the completed job (must be either Passed, Failed or Killed).


ALL_CAPS here as well, I think.

rswarbrick · 2026-02-05T20:42:05Z

src/dvsim/launcher/local.py

-        This returns 'D', 'P', 'F', or 'K'. If 'D', the job is still running.
-        If 'P', the job finished successfully. If 'F', the job finished with
-        an error. If 'K' it was killed.
+        This returns a job status. If dispatched, the job is still running.


More capitals here, I think

rswarbrick · 2026-02-05T20:42:27Z

src/dvsim/launcher/nc.py

-        This returns 'D', 'P', 'F', or 'K'. If 'D', the job is still running.
-        If 'P', the job finished successfully. If 'F', the job finished with
-        an error. If 'K' it was killed.
+        This returns a job status. If dispatched, the job is still running.


Yet more capitals (sorry).

rswarbrick · 2026-02-05T20:42:44Z

src/dvsim/launcher/slurm.py


-        This returns 'D', 'P' or 'F'. If 'D', the job is still running. If 'P',
-        the job finished successfully. If 'F', the job finished with an error.
+        This returns a job status. If dispatched, the job is still running.


More capitals! :-)

machshev

Thanks @AlexJones0 this is a really nice tidy up. Some comments, but nothing major.

machshev · 2026-02-05T20:12:00Z

src/dvsim/flow/one_shot.py

+                JobStatus.DISPATCHED: self.scratch_path + "/" + "dispatched",
+                JobStatus.PASSED: self.scratch_path + "/" + "passed",
+                JobStatus.FAILED: self.scratch_path + "/" + "failed",
+                JobStatus.KILLED: self.scratch_path + "/" + "killed",


This could be improved further by converting it to a dictionary comprehension. The path based on lowercase enumeration key.

That way further job statuses can be added by just extending the enum.

machshev · 2026-02-05T20:20:22Z

src/dvsim/launcher/nc.py

        return (datetime.datetime.now() - self.start_time).total_seconds() / 60

-    def poll(self):
+    def poll(self) -> JobStatus | None:


Under what conditions is the status None? Are we missing a value in JobStatus?

machshev · 2026-02-05T20:25:17Z

src/dvsim/launcher/local.py

+                "poll() was called either before calling launch() or after "
+                "ignoring a LauncherError from launch()."
+            )
+            raise LauncherError(msg)


Is this Exception caught by the scheduler? We don't want to kill the DVSim process due to exceptions in launchers.

machshev · 2026-02-05T20:30:54Z

src/dvsim/scheduler.py

-            cfgs = set(self._scheduled[target])
-
+            target = next(iter(self._scheduled), None)
+            cfgs = {} if target is None else set(self._scheduled[target])


Fairly sure {} is an empty dictionary. I'm assuming you wanted an empty set instead?

machshev · 2026-02-05T20:46:14Z

src/dvsim/scheduler.py

            cfgs = {job.block.name}

+        if target is None:
+            return []


If the typing is correct and Sequence is all that's needed then you could return an empty tuple instead (). Generally I'd lean towards immutable types where possible.

machshev · 2026-02-05T20:49:40Z

src/dvsim/sim/flow.py

+                JobStatus.PASSED: self.scratch_path + "/" + "passed",
+                JobStatus.FAILED: self.scratch_path + "/" + "failed",
+                JobStatus.KILLED: self.scratch_path + "/" + "killed",
            }


This could also be a comprehension.

AlexJones0 added 3 commits January 30, 2026 17:02

refactor: replace assert with RuntimeError

e31c00e

Address the `S101` failure with ruff check. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>

fix: resolve JobSpec/str type error

fbe9914

There may still be more work needed on the typing front but this resolves the singular remaining type error given by pyright within the scheduler. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>

AlexJones0 force-pushed the scheduler_cleanup branch from c8dd48b to 4f91b01 Compare January 30, 2026 17:13

AlexJones0 added 2 commits February 5, 2026 17:06

AlexJones0 force-pushed the scheduler_cleanup branch from 4f91b01 to 32564b2 Compare February 5, 2026 17:22

AlexJones0 requested review from hcallahan-lowrisc, machshev and rswarbrick February 5, 2026 17:25

rswarbrick reviewed Feb 5, 2026

View reviewed changes

machshev approved these changes Feb 5, 2026

View reviewed changes

Conversation

AlexJones0 commented Jan 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rswarbrick left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

machshev left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

AlexJones0 commented Jan 30, 2026 •

edited

Loading