Given a description yaml file, generate a set of types and datasets for ServiceX
Produce an analysis-focused Python package that:
- Has types for the objects a user is likely to access on a particular backend
- Has any injected code needed (to access things like special collections off the main Event object)
- Can inject intelligent code, like applying special scripts or include files for the C++ backend.
- Defines a typed flavor of a servicex dataset object.
After pip installing this package, the following command will write out a package in the parent directory:
sx_type_gen <path-to-type-yaml-file> --version 1.1.0b2 [--output_directory <dir-for-output>]
Note that the output package name is configured to be func_adl_servicex_xaodrRXX, where XX comes from the input yaml file.
Full set of options:
usage: sx_type_gen [-h] [--version VERSION] [--output_directory OUTPUT_DIRECTORY] yaml_type_file

Generate python package

positional arguments:
  yaml_type_file        The yaml file that contains the type info

options:
  -h, --help            show this help message and exit
  --version VERSION     The version of the package to generate (1.1.0b2 or 1.1.0, etc.)
  --output_directory OUTPUT_DIRECTORY
                        The output directory for the generated python package
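The help text above has the shape of standard `argparse` output, so a parser along the following lines would accept the same arguments. This is only a sketch reconstructed from the help text: the function name `build_parser` is made up for illustration, and the real tool may use different argument types, defaults, or validation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented sx_type_gen options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="sx_type_gen", description="Generate python package"
    )
    # Positional argument: the yaml description of the types
    parser.add_argument(
        "yaml_type_file", help="The yaml file that contains the type info"
    )
    # Assumption: a plain string option with no default; the real tool may differ
    parser.add_argument(
        "--version",
        help="The version of the package to generate (1.1.0b2 or 1.1.0, etc.)",
    )
    parser.add_argument(
        "--output_directory",
        help="The output directory for the generated python package",
    )
    return parser


# Parse the same kind of command line shown earlier in this document
args = build_parser().parse_args(
    ["184.yaml", "--version", "1.1.0b2", "--output_directory", "out"]
)
print(args.yaml_type_file, args.version, args.output_directory)
```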
You'll need to set up:
- The func-adl-types-atlas package will need to be checked out
- This package, func_adl_servicex_type_generator, will need to be pip installed
Then follow these steps:
- Build the type file for a given ATLAS release:
  func-adl-types-atlas\scripts\build_xaod_edm.ps1 21.2.184 184.yaml
- Build the package:
  sx_type_gen 184.yaml --version 1.X.XaX --output_directory <dir>
- Publish the package. Use a shell that has poetry installed:
  poetry build
  poetry publish
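If you want to script the build and publish steps, something like the following could work. It is a sketch, not part of this package: `release_commands` and `run_release` are hypothetical helpers, the PowerShell type-file step is left out, and it assumes the `poetry` commands are run from inside the output directory. It defaults to a dry run that only prints the commands.

```python
import subprocess
from typing import List


def release_commands(yaml_file: str, version: str, output_dir: str) -> List[List[str]]:
    """Return the documented build and publish commands, in order, as argument lists."""
    return [
        ["sx_type_gen", yaml_file, "--version", version, "--output_directory", output_dir],
        ["poetry", "build"],
        ["poetry", "publish"],
    ]


def run_release(yaml_file: str, version: str, output_dir: str, dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them one after another."""
    for cmd in release_commands(yaml_file, version, output_dir):
        # Assumption: poetry is run from inside the generated package directory
        cwd = output_dir if cmd[0] == "poetry" else None
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the sequence at the first failing command
            subprocess.run(cmd, cwd=cwd, check=True)


# Dry run: shows what would be executed without touching anything
run_release("184.yaml", "1.1.0b2", "out", dry_run=True)
```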
To get things set up:
- Load the yaml file that contains a definition of the collections and other object types necessary to access the data in the type of file we are producing for.
- Use a small amount of other metadata in addition:
  - The name of what it is producing, like xAODR21 or similar.
- Write out a python package that a user can install.
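The load-then-write flow above can be pictured with a toy generator. Everything here is an assumption made for illustration: the `TYPE_INFO` dictionary stands in for the parsed yaml (the real schema is not described in this document), and `render_class`, `write_package`, and the package name `demo_types_pkg` are hypothetical. The real generator does considerably more.

```python
import tempfile
from pathlib import Path
from typing import Dict

# Hypothetical stand-in for the parsed yaml: class name -> {method name: return type}
TYPE_INFO: Dict[str, Dict[str, str]] = {
    "Jet": {"pt": "float", "eta": "float"},
    "Event": {"Jets": "Iterable[Jet]"},
}


def render_class(name: str, methods: Dict[str, str]) -> str:
    """Render one typed stub class as Python source."""
    lines = [f"class {name}:"]
    for method, return_type in methods.items():
        lines.append(f"    def {method}(self) -> {return_type}:")
        lines.append("        ...")
        lines.append("")
    return "\n".join(lines)


def write_package(out_dir: Path, package_name: str, type_info: Dict[str, Dict[str, str]]) -> Path:
    """Write a one-file package containing the typed stubs and return the file path."""
    pkg = out_dir / package_name
    pkg.mkdir(parents=True, exist_ok=True)
    header = "from __future__ import annotations\nfrom typing import Iterable\n\n\n"
    body = "\n\n".join(render_class(n, m) for n, m in type_info.items())
    init_file = pkg / "__init__.py"
    init_file.write_text(header + body)
    return init_file


with tempfile.TemporaryDirectory() as tmp:
    generated = write_package(Path(tmp), "demo_types_pkg", TYPE_INFO)
    source = generated.read_text()
    # Check that the generated file is at least valid Python
    compile(source, str(generated), "exec")
    print(source)
```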
Then the user:
- User pip installs the package in the environment
- User starts a typed func_adl query from the dataset provided by this package.
  - User will need to specify the dataset and the backend name.
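To give a feel for what a typed query buys the user, here is a self-contained toy. It does not use this package or func_adl: `TypedStream`, `ToyDataset`, `Jet`, and `Event` are invented stand-ins, the dataset and backend names are placeholders, and the toy evaluates lambdas directly on in-memory objects, whereas a real query is sent to a ServiceX backend. The point is only that, with typed objects, an editor or type checker can see that `e.Jets()` yields jets and that `j.pt()` is a float.

```python
from typing import Callable, Generic, Iterable, List, TypeVar

T = TypeVar("T")
S = TypeVar("S")


class Jet:
    """Toy jet with a single typed accessor."""

    def __init__(self, pt: float) -> None:
        self._pt = pt

    def pt(self) -> float:
        return self._pt


class Event:
    """Toy event exposing a typed jet collection."""

    def __init__(self, jets: List[Jet]) -> None:
        self._jets = jets

    def Jets(self) -> List[Jet]:
        return self._jets


class TypedStream(Generic[T]):
    """Toy sequence whose element type follows through Select and Where."""

    def __init__(self, items: Iterable[T]) -> None:
        self._items = list(items)

    def Select(self, f: Callable[[T], S]) -> "TypedStream[S]":
        return TypedStream([f(i) for i in self._items])

    def Where(self, f: Callable[[T], bool]) -> "TypedStream[T]":
        return TypedStream([i for i in self._items if f(i)])

    def value(self) -> List[T]:
        return list(self._items)


class ToyDataset(TypedStream[Event]):
    """Stand-in for a typed dataset: takes a dataset name and a backend name."""

    def __init__(self, dataset: str, backend: str) -> None:
        # A real dataset would query ServiceX; this toy uses two in-memory events
        super().__init__([Event([Jet(25.0), Jet(60.0)]), Event([Jet(10.0)])])
        self.dataset = dataset
        self.backend = backend


ds = ToyDataset("some-dataset-name", backend="some-backend")
jet_pts = ds.Select(lambda e: [j.pt() for j in e.Jets() if j.pt() > 20.0]).value()
print(jet_pts)  # [[25.0, 60.0], []]
```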
This package is now in production. Below is a list of the features that were built (the list itself should be removed eventually):
- Produce very simple ATLAS xAOD typed objects to access collections like Jets, etc, in an R21 xAOD (C++ backend). This should include a locally installable package (pip install -e).
- In a second package start developing a Jupyter notebook/book showing off the features for accessing the above collections
- After Jets, do 'EventInfo' and 'MissingET'. These two should generalize the system to other types.
- Add automatic collection injection (so that we don't need definitions in the xAOD backend)
- Access Jet constituents from Jet objects
- Access truth particle arrays from their parent collection particles
- Do something that requires a separate include file to access an object (include file injection).
- Add support for arbitrary injection of other packages in the ATLAS C++ backend (e.g. corrections). Use Jets to develop this.
- Support getting a single systematic error or nominal.
- Use common knowledge (CP groups) to get the first set of collections and implement those:
  - Muons
  - Electrons
  - Taus
  - Tracks
  - Primary Vertices
  - MissingET (basics)
  - Trigger
- Technical Debt Cleanup
  - Use python decorators for all class methods and classes themselves (and convert everything to use them)
  - Track changes to the AST inside nested functions (a default argument to a function inside a select)
  - Make sure the type propagation works inside the lambda functions for Select, Where, etc.
  - Fix the trigger object matching
  - Get rid of ctor generation (oops!)
  - Add extra methods to make method resolution in jets work properly
  - Add missing Where, etc., so predicate type checking works in all demos
  - Once calibrations are fixed, make sure calibration=None (if value) is allowed by the type checker
- Add support for Jet::getAttribute, which is a C++ code-behind function, but likely it is a method
- Add support for decorator access
- Fix up calibration model
- Enums
This package uses poetry. As of writing, the following works on Windows (the latest version of poetry is broken on Windows):
pip install poetry==1.1.7
cd func_adl_servicex_type_generator
poetry env use python
poetry install
poetry shell
code .
All tests should run out of the box with pytest. Everything on master should always pass all tests and have excellent code coverage. Work should occur on branches.