Rework some preprocessing functions, test jobdb implementation for preprocessing #1348

mmccrackan · 2025-08-22T14:57:17Z

Sort of an update to #1156. Does the following:

Switches to per-group parallelization for preprocess_tod and multilayer_preprocess_tod.
Reworks preprocess_tod and multilayer_preprocess_tod to rely more on preproc_or_load_group since it does many of the same things.
Adds a progress bar for preprocess_tod and multilayer_preprocess_tod.
Starts adding jobdb support for both preprocess_tod and multilayer_preprocess_tod, following a similar implementation to record_qa.
Adds some standardized error messages in a class which can be input into a preprocessing error db.
Some more cleanup of functions and docstrings in preprocess_util.
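
A minimal sketch of the per-group pattern listed above, assuming one task per (obs_id, group) pair and a worker that returns an errors tuple instead of raising. The names `process_group` and `run_all` are hypothetical, a thread pool stands in for whatever executor the pipeline actually uses, and a plain counter stands in for the progress bar.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_group(obs_id, group):
    """Toy per-group worker (hypothetical).

    Returns (obs_id, group, errors), where errors is (kind, message, detail)
    and kind is None on success.
    """
    try:
        if group == "bad":
            raise ValueError(f"no detectors in group {group}")
        return obs_id, group, (None, None, None)
    except Exception as exc:
        # Report the failure as data so one group cannot stop the others.
        return obs_id, group, ("ProcessError", str(exc), repr(exc))


def run_all(jobs, max_workers=4):
    """Submit one task per (obs_id, group) and collect results as they finish."""
    done, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(process_group, o, g) for o, g in jobs]
        for n, fut in enumerate(as_completed(futures), start=1):
            obs_id, group, errors = fut.result()
            target = failed if errors[0] is not None else done
            target.append((obs_id, group))
            print(f"[{n}/{len(futures)}] finished {obs_id}:{group}")  # crude progress
    return sorted(done), sorted(failed)


jobs = [("obs_a", "ws0"), ("obs_a", "bad"), ("obs_b", "ws0")]
done, failed = run_all(jobs)
```

Because each group is its own task, a failing group only lands in `failed`; the other groups of the same observation still complete.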

Might split into multiple PRs if possible. Will need to adjust some things to make sure it doesn't break things downstream.

msilvafe

I like how this is looking. Comments for site_pipeline.multilayer_preprocess_tod also apply to site_pipeline.preprocess_tod.

msilvafe · 2025-12-01T22:16:37Z

sotodlib/preprocess/preprocess_util.py

-        return error[0], [error[1], error[2]], [error[1], error[2]], None
+    if errors[0] is not None:
+        logger.error(f"Get Groups Error for {obs_id}: {group}\n{errors[1]}\n{errors[2]}")
+        return None, None, None, (errors[0], errors[1], errors[2])


Why doesn't this one get the error class?

error[0] is set to PreprocessErrors.GetGroupsError in the get_groups function itself since that gets called outside of preproc_or_load_group sometimes.

I guess a side question here is whether calling get_groups to verify that the group exists is worth it. preproc_or_load_group is called by preprocess_tod/multilayer_preprocess_tod, which already call get_groups, so we could just let it fail later on if the group doesn't exist.

Alright, yes, let's delete the get_groups call here. Then, in the get_obs/load_and_preprocess try/except blocks, call get_groups before returning the error to check for the GetGroupsError and distinguish it from other metadata errors. That way you only suffer the time penalty of calling get_groups twice if it fails on data load.

Already discussed offline, but we don't really need the get_groups call in the excepts: the errors should be fairly explanatory (IncompleteMetadata was the primary reason we added the try/except in get_groups before), they will occur early on before processing (in get_obs/get_meta), and we have the full traceback. We're also doing per-group runs now, so we don't need to worry about a single group messing an entire obs up like before.
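
The convention described earlier in this thread (the callee tags its own error, so the caller only checks `errors[0]` and passes the tuple along) can be sketched as below. This is an illustration of the diff as it stood under review, not the final code: the `PreprocessErrors` class here is a stand-in with only the one label named in the thread, the `available` lookup is a toy replacement for the real metadata access, and the first three return slots are placeholders.

```python
import logging
import traceback

logger = logging.getLogger("preprocess_sketch")


class PreprocessErrors:
    """Stand-in class; only GetGroupsError is named in the thread."""
    GetGroupsError = "GetGroupsError"


def get_groups(obs_id, available):
    """Toy lookup returning (groups, errors); errors is (kind, message, traceback)."""
    try:
        return available[obs_id], (None, None, None)
    except Exception as exc:
        # The error is labelled here, at its source, because get_groups can
        # also be called from outside preproc_or_load_group.
        return [], (PreprocessErrors.GetGroupsError, repr(exc), traceback.format_exc())


def preproc_or_load_group(obs_id, group, available):
    """Return a four-tuple whose last element is the errors tuple.

    The first three slots are placeholders in this sketch.
    """
    groups, errors = get_groups(obs_id, available)
    if errors[0] is not None:
        logger.error("Get Groups Error for %s: %s\n%s\n%s",
                     obs_id, group, errors[1], errors[2])
        return None, None, None, (errors[0], errors[1], errors[2])
    return f"processed {obs_id}:{group}", None, None, (None, None, None)


available = {"obs_a": ["ws0", "ws1"]}
ok = preproc_or_load_group("obs_a", "ws0", available)
bad = preproc_or_load_group("obs_missing", "ws0", available)
```

With the label attached inside `get_groups`, the caller never needs to know which error class applies; it only forwards the tuple.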

mmccrackan · 2025-12-05T18:35:50Z

Okay, after addressing comments and running some tests, I think this is probably in a state where it can be reviewed more officially. I'll do some more testing of the various logic branches to look for failure cases but I've run the first 25 obs_ids from ISO v3 for SATp3 here:

/global/cfs/cdirs/sobs/users/mmccrack/so/preprocess/common/20250821_preproc_jobdb/archives

I've made some edits to the mapmaker to accommodate these changes, but these are untested.

We need to check and update the sims versions.

For future to-dos, I'd like to update the error class to actually have its own exceptions, but will do that in a different PR. I was also thinking of removing error logging from the standard output to clean that up a bit and only putting it in the error log, which should be much more parsable now. I'll also do the merging of load_and_preprocess and multilayer_load_and_preprocess in a different PR after this.

adrien-laposta

I have no particular comments on the code of this PR.
I mainly focused on the functions in preprocess_util.py that are used in the sim filtering routines.
I tested it on v3 preprocessing configuration files, loading a data AxisManager from multilayer_load_and_preprocess and using multilayer_and_preprocess_sim to filter a simulated CAR map. I did not get any errors and did a quick comparison with processed TOD from the master branch: there is perfect agreement between timestreams.

This PR looks good to me, let me know if you'd like me to test specific features beyond the standard functions we are using to filter simulations.

kmharrington · 2025-12-16T20:21:35Z

sotodlib/site_pipeline/preprocess_tod.py

+                with jdb.locked(job) as j:
+                    j.mark_visited()
+                    if errors[0] is not None:
+                        j.jstate = "failed"


Throughout this whole space, I think you should import jobdb.JState and use that class instead of these strings.
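
A small sketch of why an enum is safer than bare strings here. The `JState` and `Job` classes below are self-contained stand-ins so the example runs on its own; their member names are assumptions and are not copied from sotodlib's jobdb module.

```python
from enum import Enum


class JState(Enum):
    # Stand-in enum; member names are assumptions for illustration only.
    open = "open"
    done = "done"
    failed = "failed"


class Job:
    """Toy job record with the two attributes used in the diff."""

    def __init__(self):
        self.jstate = JState.open
        self.visit_count = 0

    def mark_visited(self):
        self.visit_count += 1


def record_result(job, errors):
    """Mark the job visited and set its state from the errors tuple."""
    job.mark_visited()
    job.jstate = JState.failed if errors[0] is not None else JState.done


good_job, bad_job = Job(), Job()
record_result(good_job, (None, None, None))
record_result(bad_job, ("SomeError", "message", "traceback"))

# A misspelled enum member fails immediately, whereas the string "faild"
# would be stored without complaint.
try:
    JState.faild
    typo_caught = False
except AttributeError:
    typo_caught = True
```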

kmharrington · 2025-12-20T19:41:26Z

I've noticed what might be an edge case in this setup (but it could also be related to how I decided to try to run a bunch of things straight off a jobdb). The save_group_and_cleanup function does not update the jobdb. So if a run has made a bunch of open jobs, they've been saved in temp but not added into the manifestdb, and then things were cancelled, the current setup doesn't have an obvious way to set all those completed jobs to done when things are restarted.
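
One possible shape for handling this on restart, as a sketch only: scan the still-open jobs and pick out those whose temporary output already exists, so they can be pushed through the save/cleanup step and marked done instead of being recomputed. The job dictionaries, the file naming scheme, and the function names are all hypothetical and do not reflect the actual jobdb or temp-file layout.

```python
import tempfile
from pathlib import Path


def temp_output_path(temp_dir, obs_id, group):
    """Hypothetical naming scheme for a group's temporary output file."""
    return Path(temp_dir) / f"{obs_id}_{group}.h5"


def find_recoverable_jobs(jobs, temp_dir):
    """Return open jobs whose temporary output is already on disk."""
    return [
        job for job in jobs
        if job["state"] == "open"
        and temp_output_path(temp_dir, job["obs_id"], job["group"]).exists()
    ]


with tempfile.TemporaryDirectory() as tmp:
    # Simulate a cancelled run: one group was written to temp, one was not.
    temp_output_path(tmp, "obs_a", "ws0").touch()
    jobs = [
        {"obs_id": "obs_a", "group": "ws0", "state": "open"},
        {"obs_id": "obs_a", "group": "ws1", "state": "open"},
        {"obs_id": "obs_b", "group": "ws0", "state": "done"},
    ]
    recovered = find_recoverable_jobs(jobs, tmp)
```

The jobs in `recovered` are the ones a restart would need to finalize; everything else that is still open would be run again from scratch.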

Michael McCrackan added 3 commits August 22, 2025 07:50

rework some preproc functions, test jobdb implementation

b0892f2

some updates

520cb90

fix typo, only remove failed jobs

1051c87

mmccrackan added the preprocess label Aug 22, 2025

mmccrackan mentioned this pull request Nov 4, 2025

On the fly load_and_preprocess #1440

Closed

Michael McCrackan and others added 6 commits November 19, 2025 12:13

push for to allow merge

59bf9a4

Merge branch 'master' into 20250822_preproc_update

675e0db

merge fixes

f4a3f54

some fixes/cleanup

20a8537

rework outputs, add jobdb tag

5a09ff4

fix some errors

7370dd9

msilvafe requested changes Dec 1, 2025

Michael McCrackan added 8 commits December 2, 2025 11:38

address some comments, rework jobdb, cleanup docstring format

7ff3d39

match dets dict for jobdb, try updating mapmaker

1ada5df

fix init db jobdb check

bc6414f

fix for race condition

e139212

add ignore config check option, tqdm to file, docstring fixes

e0e45ad

fix typos

355dccd

rework group handing, fix error output

9033846

handle None in errlog

111e78e

msilvafe mentioned this pull request Dec 5, 2025

PSD noise ratio cuts #1406

Merged

Michael McCrackan and others added 3 commits December 5, 2025 07:57

track failed vs done, fix subdir

b4bbaa6

Merge branch 'master' into 20250822_preproc_update

fe4b2ed

fix db creation problem

10670ed

mmccrackan marked this pull request as ready for review December 5, 2025 18:29

mmccrackan requested review from adrien-laposta, chervias, kwolz and msilvafe December 5, 2025 18:36

fix save_proc_aman

4ecd0f2

adrien-laposta approved these changes Dec 15, 2025

mmccrackan mentioned this pull request Dec 15, 2025

preprocess-tod cannot accept just an obs-id #1500

Open

kmharrington reviewed Dec 16, 2025

Michael McCrackan added 6 commits December 18, 2025 13:28

many updates

559919d

fix for jobdb

7d33589

docstring fix

3eeb165

fix for docstring

843f713

add compression

088478c

fix quanta for float32

b52c448

This was referenced Dec 22, 2025

Move PSD back to process #1507

Open

feat: add functions to add multiple jobs at once #1487

Open

Michael McCrackan and others added 3 commits December 23, 2025 08:56

update site pipeline utils imports

13a1e62

Merge branch 'master' into 20250822_preproc_update

78c2d59

disable existence check, handle cleaned up jobs

dc6f345

kmharrington reviewed Dec 30, 2025

sotodlib/site_pipeline/preprocess_tod.py (outdated)

Michael McCrackan added 2 commits January 2, 2026 06:14

fix state

84a418d

update jobdb

9aff589

msilvafe Dec 1, 2025

mmccrackan Dec 2, 2025

msilvafe Dec 3, 2025

mmccrackan Dec 4, 2025
