Add a Job Status enum + scheduler refactoring#92
Add a Job Status enum + scheduler refactoring#92AlexJones0 wants to merge 5 commits intolowRISC:masterfrom
Conversation
Address the `S101` failure with ruff check. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
Break apart a couple of the more complex functions to resolve ruff lint warnings related to method complexity. Also take the opportunity to slightly improve some of the documentation on these functions and to refactor some of the code to be nicer to read. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
There may still be more work needed on the typing front but this resolves the singular remaining type error given by pyright within the scheduler. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
c8dd48b to
4f91b01
Compare
This 'E' error launcher status is not well-defined or well-known within the rest of DVSim, but can be seen as meaningfully different from the 'F'/'FAILED' status which describes the dispatched job failing, as this denotes a failure within the launcher. As DVSim currently does not handle this status (almost at-all!) and this is the single occurrence of the 'E' status in the code, we instead convert this into a more appropriate raised `LauncherError`, as in reality this case only happens when calling `poll()` either before `launch()` or after `launch()` already reported some error. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
Currently the job status is passed around DVSim as a string, where the
expectation is that it is always a single upper case character in ('Q',
'D', 'P', 'F', 'K') - some limited parts of the code also handle an 'E'
error case.
Turn these into a small Enum type so that we have much stricter typing
on these values and improved ergonomics when referring to different job
statuses in code.
This status Enum lives in its own module instead of with the other job
data to avoid circular imports (as the job status' registered callbacks
reference the launcher definition).
Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
4f91b01 to
32564b2
Compare
rswarbrick
left a comment
There was a problem hiding this comment.
Lots of nitty comments (sorry!) but I like this. I especially like the enum change: it's a great improvement from magic one-character strings!
src/dvsim/scheduler.py
Outdated
| assert job.dependencies | ||
| if not job.dependencies: | ||
| err = "Job item exists but the next_item's dependency list is empty?" | ||
| raise RuntimeError(err) |
There was a problem hiding this comment.
Probably better to skip the variable, I think:
if not job.dependencies:
raise RuntimeError("Job item exists but the "
"next_item's dependency list is empty?")I'd probably also write it as "... the next item's...", to make a bit more sense to the person reading it.
| job: JobSpec = self._jobs[job_name] | ||
|
|
||
| if job.target not in self._scheduled: | ||
| msg = f"Scheduler does not contain target {job.target}" |
There was a problem hiding this comment.
Again, I'd compute the message inside the function call parentheses.
There was a problem hiding this comment.
The linting rules explicitly reject that. Can't remember why exactly, something to do with the stack trace. I think you end up with the message there twice, pre formatted and then formatted.
We could disable the lint rule if you feel strongly about it. But that's how it is at the moment.
|
|
||
| We choose the target that follows the 'item''s current target and find | ||
| the list of successors whose dependency list contains 'item'. If 'item' | ||
| We choose the target that follows the "item"'s current target and find |
There was a problem hiding this comment.
I think the correct change here is to remove the quotes entirely for the first occurrence and replace the other two with "job_name".
| cfgs = set(self._scheduled[target]) | ||
|
|
||
| target = next(iter(self._scheduled), None) | ||
| cfgs = {} if target is None else set(self._scheduled[target]) |
There was a problem hiding this comment.
Oops: {} is an empty dictionary, not a set.
| if not job.dependencies: | ||
| err = "Job item exists but the next_item's dependency list is empty?" | ||
| raise RuntimeError(err) | ||
| msg = "Job item exists but the next_item's dependency list is empty?" |
There was a problem hiding this comment.
I know this code was here already, but you're touching it! Probably put the message inside the function call and get rid of a variable.
There was a problem hiding this comment.
Again, this is due to the linting rules. Doesn't mean it's what we want, but that's why.
| def _check_status(self) -> tuple[str, ErrorMessage | None]: | ||
| """Determine the outcome of the job (P/F if it ran to completion). | ||
| def _check_status(self) -> tuple[JobStatus, ErrorMessage | None]: | ||
| """Determine the outcome of the job (Passed/Failed if it ran to completion). |
There was a problem hiding this comment.
Again, I think the enum names need to be ALLCAPS
|
|
||
| Args: | ||
| status: status of the job, either 'P', 'F' or 'K'. | ||
| status: status of the completed job (must be either Passed, Failed or Killed). |
There was a problem hiding this comment.
ALL_CAPS here as well, I think.
| This returns 'D', 'P', 'F', or 'K'. If 'D', the job is still running. | ||
| If 'P', the job finished successfully. If 'F', the job finished with | ||
| an error. If 'K' it was killed. | ||
| This returns a job status. If dispatched, the job is still running. |
There was a problem hiding this comment.
More capitals here, I think
| This returns 'D', 'P', 'F', or 'K'. If 'D', the job is still running. | ||
| If 'P', the job finished successfully. If 'F', the job finished with | ||
| an error. If 'K' it was killed. | ||
| This returns a job status. If dispatched, the job is still running. |
There was a problem hiding this comment.
Yet more capitals (sorry).
|
|
||
| This returns 'D', 'P' or 'F'. If 'D', the job is still running. If 'P', | ||
| the job finished successfully. If 'F', the job finished with an error. | ||
| This returns a job status. If dispatched, the job is still running. |
machshev
left a comment
There was a problem hiding this comment.
Thanks @AlexJones0 this is a really nice tidy up. Some comments, but nothing major.
| JobStatus.DISPATCHED: self.scratch_path + "/" + "dispatched", | ||
| JobStatus.PASSED: self.scratch_path + "/" + "passed", | ||
| JobStatus.FAILED: self.scratch_path + "/" + "failed", | ||
| JobStatus.KILLED: self.scratch_path + "/" + "killed", |
There was a problem hiding this comment.
This could be improved further by converting it to a dictionary comprehension. The path based on lowercase enumeration key.
That way further job statuses can be added by just extending the enum.
| return (datetime.datetime.now() - self.start_time).total_seconds() / 60 | ||
|
|
||
| def poll(self): | ||
| def poll(self) -> JobStatus | None: |
There was a problem hiding this comment.
Under what conditions is the status None? Are we missing a value in JobStatus?
| "poll() was called either before calling launch() or after " | ||
| "ignoring a LauncherError from launch()." | ||
| ) | ||
| raise LauncherError(msg) |
There was a problem hiding this comment.
Is this Exception caught by the scheduler? We don't want to kill the DVSim process due to exceptions in launchers.
| cfgs = set(self._scheduled[target]) | ||
|
|
||
| target = next(iter(self._scheduled), None) | ||
| cfgs = {} if target is None else set(self._scheduled[target]) |
There was a problem hiding this comment.
Fairly sure {} is an empty dictionary. I'm assuming you wanted an empty set instead?
| cfgs = {job.block.name} | ||
|
|
||
| if target is None: | ||
| return [] |
There was a problem hiding this comment.
If the typing is correct and Sequence is all that's needed then you could return an empty tuple instead (). Generally I'd lean towards immutable types where possible.
| JobStatus.PASSED: self.scratch_path + "/" + "passed", | ||
| JobStatus.FAILED: self.scratch_path + "/" + "failed", | ||
| JobStatus.KILLED: self.scratch_path + "/" + "killed", | ||
| } |
There was a problem hiding this comment.
This could also be a comprehension.
This PR performs some small cleanup of the scheduler in preparation for potential larger-scale changes. Specifically, it aims to fix any outstanding
pyrightandruff checkfailures that exist within the scheduler itself, to provide a good base for further refactoring or modification.One other clear deficiency in the design that I wanted to address was the use of a string (always a single upper-case character) like "D", "P" or "K" to refer to the Job status, rather than the more standard Enumerated type. An enum would improve readability & ergonomics, and add stricter typing of status values - so I've opted to make this change, which is quite wide-spread (albeit simple).