Add CLI for ingesting tabular forecasts and observations into EMOS#16
Add CLI for ingesting tabular forecasts and observations into EMOS#16gavinevans wants to merge 17 commits into
Conversation
|
I'm reviewing this |
8b134fa to
ed31c06
Compare
Codecov Report
@@ Coverage Diff @@
## improver1538_tabular_ingestion_functions #16 +/- ##
============================================================================
+ Coverage 98.04% 98.08% +0.03%
============================================================================
Files 109 110 +1
Lines 9872 10069 +197
============================================================================
+ Hits 9679 9876 +197
Misses 193 193
Continue to review full report at Codecov.
|
5288575 to
23cb8c1
Compare
… of forecast and observation tables. (metoppv#1572)
bayliffe
left a comment
There was a problem hiding this comment.
Goodness you've written a lot of code for all of this calibration work. Two cli exceptions need tests, otherwise this looks fine to me. I've not run the tests as I am waiting on very slow conda dependency solving.
| f"The requested filepath {forecast} does not contain the " | ||
| f"requested contents: {filters}" | ||
| ) | ||
| raise IOError(msg) |
| f"The requested filepath {truth} does not contain the " | ||
| f"requested contents: {filters}" | ||
| ) | ||
| raise IOError(msg) |
…s cube (metoppv#1582) * Modifications to functions required to support converting tabular data into iris cubes. * Modifications to the columns expected within the truth table. * Modify environments in preparation for changes required for ingestion of forecast and observation tables. * Edits to expect the period column to be a timedelta64 dtype. * Corrections to __init__.py * Modifications to use assertCubeEqual to ensure the full cubes are compared. * Modifications following review comments. * Minor updates to docstrings. * Add missing unit tests. * Correction to csv files. * Sort lists. * Minor docstring amendment. * NOT WORKING: Working commit to try to avoid using the truth altitude, latitude and longitude at all and replace with those from the forecast. * Extend docstrings. * Extended documentation updates. * Refinement and addition of tests for column name checking. * Modifications to tidy up dataframe preparation. * Minor extended documentation edits. * Further minor docstring edit. * Correct test class naming. * Fix isort.
| iterations may require increasing, as there will be more | ||
| coefficients to solve. | ||
| percentiles (List[float]): | ||
| The set of percentiles to be used for estimating EMOS coefficients. |
There was a problem hiding this comment.
| The set of percentiles to be used for estimating EMOS coefficients. | |
| The set of percentiles to be used for estimating EMOS coefficients. | |
| These should be a set of equally spaced quantiles. |
There was a problem hiding this comment.
I've updated this docstring.
| time, wmo_id, diagnostic, latitude, longitude and altitude. | ||
| Any other columns are ignored. | ||
| percentiles: | ||
| The set of percentiles to be used for estimating EMOS coefficients. |
There was a problem hiding this comment.
| The set of percentiles to be used for estimating EMOS coefficients. | |
| The set of percentiles to be used for estimating EMOS coefficients. | |
| These should be a set of equally spaced quantiles. |
There was a problem hiding this comment.
I've updated this docstring.
| forecast_period, | ||
| training_length, | ||
| percentiles=percentiles, | ||
| ) |
There was a problem hiding this comment.
I think as we discussed last week, some leadtime-cycle combinations won't be possible. This will result in no cubes being returned from this function. In this case we should just return and not produce output
There was a problem hiding this comment.
You're correct that this is handling within the plugins by None being returned, although this is a bit trickier within the CLIs, as the CLIs are expecting to save a netCDF file, if a --output is provided. I've made a minor edit to skip the saving if the result is None.
…passing in a list of percentiles, improvements to the unit tests to make them more reflective of the intended input files, checks to ensure the percentiles can be considered to be quantiles and other improvements.
…5.0." This reverts commit 5ccce69.
5ddf334 to
a390238
Compare
… dependencies between the calibration and ECC code.
…rover1538_tabular_ingestion_cli4 * improver1538_move_tabular_ingestion_functions: Move dataframe to cube utilities to a separate file to avoid circular dependencies between the calibration and ECC code.
* Move dataframe to cube utilities to a separate file to avoid circular dependencies between the calibration and ECC code. * Minor docstring updates.
…lar_ingestion_cli4 * upstream/master: Move dataframe to cube utilities (metoppv#1593) Added constant for ultraviolet_index_daytime_max. (metoppv#1590)
Closes: metoppv#1538
Dependent upon metoppv#1572, metoppv#1582.
Description
This PR builds upon work done in metoppv#1582 to add an
estimate_emos_coefficients_from_tableCLI that converts the pandas dataframe into an iris cube (metoppv#1582) and passes that iris cube to the existing EMOS functionality.Note that this PR does contain an update to acceptance.py to support providing a directory as a known good output in the acceptance tests. This is required for compatibility with a partitioned parquet file.
Please note that the decision about whether to use fastparquet 0.5.0 or 0.7.1 is subject to change due to issues with the Conda environments (https://github.com/MetOffice/improver_suite/pull/1026). The first three commits in this PR assumed the use of 0.7.1. This commit reverts the version of fastparquet to 0.5.0.
Further information is available in this comment.
Testing: