-
Notifications
You must be signed in to change notification settings - Fork 1
feat(era5): rewrite era5 package based on ecmwf-datastores-client #193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Hey, Yeah we could add modular installation options , i can do a pr for this |
| msg = "Dataset still contains 'step' dimension. Please aggregate to daily data first." | ||
| raise ValueError(msg) | ||
| da = ds[variable] | ||
| area_weights = np.cos(np.deg2rad(ds.latitude)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the weighted area was not considered in the previous implementation right?
Just wondering if this could introduce a noticeable difference with the climate data that has already been imported to DHIS2 PNLP, I guess this will not produce a big difference so it should be fine..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The difference is going to be higher as the distance to the equator increases, so it should be quite low for DRC
openhexa/toolbox/era5/extract.py
Outdated
| download_format="unarchived", | ||
| ) | ||
|
|
||
| max_requests = 100 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering, this max request is based on some ERA5 daily download limit that gets reset after 24 hrs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the max limit is 200 active requests in the queue. But it can change depending on the load... :/
openhexa/toolbox/era5/README.md
Outdated
| weekly_data = aggregate_in_time( | ||
| results, | ||
| period=Period.WEEK, | ||
| agg="mean" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the existing implementations (DRC PNLP and SNT) we have all the functionalities in terms of aggregation we need ;) .
Now I'm afraid I'll have to change the era5 pipelines a bit 🤪
Congrats monsieur, nice job ! 🥇
3436ab1 to
34012bf
Compare
The previous Python lib we were using to interact with the Climate Data Store have been deprecated by ECMWF (https://github.com/ecmwf-projects/datapi). They recommend to switch to
ecmwf-datastores-client(https://github.com/ecmwf/ecmwf-datastores-client).We needed a big update to accommodate the change - plus we always had issue with previous version of the package anyway.
The rewrite uses the new
ecmwf-datastores-clientinstead of the deprecateddatapiand reworks the data acquisition pipeline:See README: https://github.com/BLSQ/openhexa-toolbox/tree/feat/era5-rewrite/openhexa/toolbox/era5
@EstebanMontandon can you have a look at the README and tell me if the API makes sense to you? Or if there is any missing feature?
@nazarfil this implies 2 new dependencies for the toolbox: zarr and ecmwf-datastores-client. I wonder if we should make the toolbox more modular to avoid having to install all dependencies when you only want to use the toolbox for the
hexaordhis2modules for example.