Commit 184a759

Merge branch 'develop'
2 parents dda5005 + da71554 commit 184a759

62 files changed

+2752
-756
lines changed

.gitignore

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,4 +9,8 @@ __pycache__/
99
/tests/data/icatdump-*.xml
1010
/tests/data/icatdump-*.yaml
1111
/tests/data/ingest-*.xml
12+
/tests/data/ingest-*.xsd
13+
/tests/data/ingest.xslt
14+
/tests/data/metadata-*-inl.xml
15+
/tests/data/metadata-*-sep.xml
1216
/tests/scripts/

.readthedocs.yaml

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
# Read the Docs configuration file for Sphinx projects
2+
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
3+
4+
version: 2
5+
6+
build:
7+
os: ubuntu-22.04
8+
tools:
9+
python: "3.11"
10+
jobs:
11+
post_checkout:
12+
- git fetch --unshallow
13+
post_install:
14+
- python setup.py meta
15+
16+
sphinx:
17+
configuration: doc/src/conf.py
18+
19+
python:
20+
install:
21+
- requirements: .rtd-require

.rtd-require

Lines changed: 6 additions & 3 deletions
@@ -1,5 +1,8 @@
1-
docutils<0.18
2-
setuptools_scm
3-
suds-community
41
PyYAML
52
lxml
3+
packaging
4+
setuptools
5+
setuptools_scm
6+
suds
7+
sphinx>=2,<3
8+
sphinx-rtd-theme>=0.5,<1

CHANGES.rst

Lines changed: 55 additions & 0 deletions
@@ -2,6 +2,61 @@ Changelog
22
=========
33

44

5+
1.1.0 (2023-06-30)
6+
~~~~~~~~~~~~~~~~~~
7+
8+
New features
9+
------------
10+
11+
+ `#113`_, `#123`_: Add module :mod:`icat.ingest`.
12+
13+
+ `#124`_: Add an optional keyword argument `keepInstRel` to
14+
:meth:`icat.entity.Entity.truncateRelations`.
15+
16+
Bug fixes and minor changes
17+
---------------------------
18+
19+
+ `#126`_, `#127`_: Update outdated documentation.
20+
21+
+ `#112`_, `#118`_: Extend icatdata XSD adding extra attributes to
22+
reference objects.
23+
24+
+ `#111`_, `#121`_: Change the type of
25+
:attr:`icat.client.Client.Register` to
26+
:class:`weakref.WeakValueDictionary`, fixing a memory leak.
27+
28+
+ `#119`_, `#120`_: Remove `_config` attribute from
29+
:class:`icat.config.Configuration`.
30+
31+
+ `#115`_, `#116`_: Fix the test suite to work if either PyYAML or
32+
lxml is not available.
33+
34+
+ `#128`_: Return an empty list from
35+
:func:`icat.dump_queries.getDataPublicationQueries` when talking to
36+
an ICAT server older than 5.0.
37+
38+
+ `#117`_: Fix deprecation warnings from upcoming Python 3.12.
39+
40+
+ `#129`_: Review the build of the documentation at Read the Docs.
41+
42+
.. _#111: https://github.com/icatproject/python-icat/issues/111
43+
.. _#112: https://github.com/icatproject/python-icat/issues/112
44+
.. _#113: https://github.com/icatproject/python-icat/issues/113
45+
.. _#115: https://github.com/icatproject/python-icat/issues/115
46+
.. _#116: https://github.com/icatproject/python-icat/pull/116
47+
.. _#117: https://github.com/icatproject/python-icat/pull/117
48+
.. _#118: https://github.com/icatproject/python-icat/pull/118
49+
.. _#119: https://github.com/icatproject/python-icat/issues/119
50+
.. _#120: https://github.com/icatproject/python-icat/pull/120
51+
.. _#121: https://github.com/icatproject/python-icat/pull/121
52+
.. _#123: https://github.com/icatproject/python-icat/pull/123
53+
.. _#124: https://github.com/icatproject/python-icat/pull/124
54+
.. _#126: https://github.com/icatproject/python-icat/issues/126
55+
.. _#127: https://github.com/icatproject/python-icat/pull/127
56+
.. _#128: https://github.com/icatproject/python-icat/pull/128
57+
.. _#129: https://github.com/icatproject/python-icat/pull/129
58+
59+
560
1.0.0 (2022-12-21)
661
~~~~~~~~~~~~~~~~~~
762
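The 1.1.0 entry for `#111`_ and `#121`_ says that switching the register to :class:`weakref.WeakValueDictionary` fixed a memory leak. The following is a minimal sketch of why such a switch helps. The `Client` class here is a hypothetical stand-in, not the real `icat.client.Client`, and keying the register by `id(self)` is an assumption made only for illustration.

```python
import gc
import weakref


class Client:
    """Stand-in for a client object; not the real icat.client.Client."""

    # Class-level registry; values are held only by weak reference,
    # so the registry alone does not keep a client alive.
    Register = weakref.WeakValueDictionary()

    def __init__(self, name):
        self.name = name
        # Keying by id(self) is an assumption for this sketch.
        Client.Register[id(self)] = self


c = Client("demo")
registered_while_alive = len(Client.Register)

# Dropping the last strong reference lets the entry vanish from the registry.
del c
gc.collect()
registered_after_del = len(Client.Register)
```

With a plain `dict` in place of the `WeakValueDictionary`, the registry itself would hold a strong reference and the object would never be freed.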

MANIFEST.in

Lines changed: 4 additions & 0 deletions
@@ -9,12 +9,16 @@ include doc/examples/icat.cfg
99
include doc/examples/icatdump-*.xml
1010
include doc/examples/icatdump-*.yaml
1111
include doc/examples/ingest-*.xml
12+
include doc/examples/metadata-*.xml
1213
include doc/icatdata*.xsd
1314
include doc/man/*
1415
include doc/tutorial/*.py
16+
include etc/ingest-*.xsd
17+
include etc/ingest.xslt
1518
include tests/conftest.py
1619
include tests/data/legacy-icatdump-*.xml
1720
include tests/data/legacy-icatdump-*.yaml
21+
include tests/data/metadata-5.0-badref.xml
1822
include tests/data/ref-icatdump-*.xml
1923
include tests/data/ref-icatdump-*.yaml
2024
include tests/data/summary*

Makefile

Lines changed: 3 additions & 1 deletion
@@ -18,9 +18,11 @@ doc-man: meta
1818

1919
clean:
2020
rm -rf build
21-
rm -rf __pycache__
21+
rm -rf __pycache__ icat/__pycache__
2222
rm -rf tests/data/example_data.yaml
2323
rm -rf tests/data/icatdump-* tests/data/ingest-*.xml
24+
rm -rf tests/data/ingest-*.xsd tests/data/ingest.xslt
25+
rm -rf tests/data/metadata-*-inl.xml tests/data/metadata-*-sep.xml
2426
rm -rf tests/scripts
2527

2628
distclean: clean

README.rst

Lines changed: 5 additions & 2 deletions
@@ -1,4 +1,7 @@
1-
|rtd| |pypi|
1+
|doi| |rtd| |pypi|
2+
3+
.. |doi| image:: https://zenodo.org/badge/37250056.svg
4+
:target: https://zenodo.org/badge/latestdoi/37250056
25

36
.. |rtd| image:: https://img.shields.io/readthedocs/python-icat/latest
47
:target: https://python-icat.readthedocs.io/en/latest/
@@ -47,7 +50,7 @@ the reason why the example scripts require PyYAML.
4750
Copyright and License
4851
---------------------
4952

50-
Copyright 2013–2022
53+
Copyright 2013–2023
5154
Helmholtz-Zentrum Berlin für Materialien und Energie GmbH
5255

5356
Licensed under the `Apache License`_, Version 2.0 (the "License"); you

doc/examples/create-datafile.py

Lines changed: 2 additions & 1 deletion
@@ -66,7 +66,8 @@
6666
investigation = client.assertedSearch(query)[0]
6767

6868
fstats = df_path.stat()
69-
modTime = datetime.datetime.utcfromtimestamp(fstats.st_mtime).isoformat() + "Z"
69+
utc = datetime.timezone.utc
70+
modTime = datetime.datetime.fromtimestamp(fstats.st_mtime, tz=utc)
7071
datafile = client.new("Datafile")
7172
datafile.datafileFormat = dff
7273
datafile.name = conf.datafile.name
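The hunk above drops the naive `utcfromtimestamp()` call, which is deprecated as of Python 3.12, in favour of a timezone-aware `fromtimestamp()`. A minimal sketch of the new form follows. The fixed timestamp is an illustrative stand-in for `fstats.st_mtime`, chosen so the result is reproducible.

```python
import datetime

# Fixed POSIX timestamp standing in for fstats.st_mtime (illustrative value).
mtime = 1686906075.0

utc = datetime.timezone.utc
# Timezone-aware datetime in UTC, the form used in the new code.
mod_time = datetime.datetime.fromtimestamp(mtime, tz=utc)

# The aware object carries its UTC offset, so isoformat() ends in "+00:00"
# and no manual "Z" suffix is needed.
iso = mod_time.isoformat()
```

The old code produced a string with a hand-appended "Z"; the new code passes a `datetime` object that knows its own offset.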

doc/examples/ingest.py

Lines changed: 153 additions & 0 deletions
@@ -0,0 +1,153 @@
1+
#! /usr/bin/python3
2+
"""Ingest metadata into ICAT.
3+
4+
This script demonstrates how to use class IngestReader from the
5+
icat.ingest module to read metadata from a file and add that to ICAT.
6+
The script intends to model the use case of ingesting raw datasets
7+
from the experiment.
8+
9+
The script expects an input directory containing one metadata input
10+
file and one or more subdirectories, one for each dataset,
11+
e.g. something like::
12+
13+
input_dir
14+
├── metadata.xml
15+
├── dataset_1
16+
│ ├── datafile_a.dat
17+
│ ├── datafile_b.dat
18+
│ └── datafile_c.dat
19+
└── dataset_2
20+
├── datafile_d.dat
21+
├── datafile_e.dat
22+
└── datafile_f.dat
23+
24+
The script takes the name of an investigation as argument. The
25+
investigation MUST exist in ICAT beforehand and all datasets in the
26+
input directory MUST belong to this investigation. The script will
27+
create the datasets in ICAT, i.e. they MUST NOT exist in ICAT
28+
beforehand. The metadata input file may contain attributes and
29+
related objects (datasetInstrument, datasetTechnique,
30+
datasetParameter) for the datasets provided in the input directory.
31+
The metadata input is restricted in that sense, i.e. this script
32+
enforces that the metadata does not contain any other input.
33+
34+
The XML Schema Definition and XSL Transformation files (ingest.xsd and
35+
ingest.xslt) provided by python-icat (or customized versions thereof)
36+
need to be installed so that class IngestReader will find them
37+
(e.g. in the IngestReader.SchemaDir directory).
38+
39+
There are some limitations to keep things simple:
40+
41+
* the script creates the dataset and datafile objects in ICAT, but
42+
does not upload the file content to IDS. In a real production
43+
workflow, you'd probably have a separate step that copies the files
44+
to the storage managed by IDS while creating the dataset and
45+
datafile objects in ICAT at the same time.
46+
47+
* the script does not care to add a datafileFormat or any descriptive
48+
attributes (fileSize, checksum, datafileModTime) to the datafiles it
49+
creates.
50+
51+
* it is assumed that the investigation can be unambiguously found by
52+
its name.
53+
54+
* a real production workflow would probably apply much stricter
55+
conformance checks on the input (e.g. restrictions on allowed
56+
dataset or datafile names, make sure not to follow any symlinks from
57+
the input directory) and have more elaborate error handling.
58+
59+
"""
60+
61+
import logging
62+
from pathlib import Path
63+
import icat
64+
import icat.config
65+
from icat.ingest import IngestReader
66+
from icat.query import Query
67+
68+
69+
logging.basicConfig(level=logging.DEBUG)
70+
# Silence some rather chatty modules.
71+
logging.getLogger('suds.client').setLevel(logging.CRITICAL)
72+
logging.getLogger('suds').setLevel(logging.ERROR)
73+
74+
logger = logging.getLogger(__name__)
75+
76+
77+
config = icat.config.Config(ids=False)
78+
config.add_variable('investigation', ("investigation",),
79+
dict(help="name of the investigation"))
80+
config.add_variable('inputdir', ("inputdir",),
81+
dict(help="path to the input directory"),
82+
type=Path)
83+
client, conf = config.getconfig()
84+
client.login(conf.auth, conf.credentials)
85+
86+
query = Query(client, "Investigation", conditions={
87+
"name": "= '%s'" % conf.investigation
88+
})
89+
investigation = client.assertedSearch(query)[0]
90+
91+
92+
class ContentError(RuntimeError):
93+
"""Some invalid content in the input directory.
94+
"""
95+
def __init__(self, base, p, msg):
96+
p = p.relative_to(base)
97+
super().__init__("%s: %s" % (p, msg))
98+
99+
100+
def check(client, path, investigation):
101+
"""Verify the content of the input directory.
102+
103+
The idea is to check the input directory for conformance as much
104+
as possible and to fail early if anything is not as required,
105+
before having committed anything to ICAT.
106+
107+
Returns a tuple with two items: a list of datasets and an
108+
IngestReader.
109+
"""
110+
datasets = []
111+
metadata_path = path / "metadata.xml"
112+
for p0 in path.iterdir():
113+
if p0.name.startswith('.') or p0 == metadata_path:
114+
continue
115+
elif p0.is_dir():
116+
is_empty = True
117+
dataset = client.new("dataset")
118+
dataset.name = p0.name
119+
dataset.complete = False
120+
for p1 in p0.iterdir():
121+
if p1.is_file():
122+
is_empty = False
123+
datafile = client.new("datafile")
124+
datafile.name = p1.name
125+
dataset.datafiles.append(datafile)
126+
else:
127+
raise ContentError(path, p1, 'unexpected item')
128+
if is_empty:
129+
raise ContentError(path, p0, 'empty dataset directory')
130+
datasets.append(dataset)
131+
else:
132+
raise ContentError(path, p0, 'unexpected item')
133+
try:
134+
reader = IngestReader(client, metadata_path, investigation)
135+
reader.ingest(datasets, dry_run=True, update_ds=True)
136+
except (icat.InvalidIngestFileError, icat.SearchResultError) as e:
137+
raise ContentError(path, metadata_path,
138+
"%s: %s" % (type(e).__name__, e))
139+
return (datasets, reader)
140+
141+
logger.info("ingesting from directory %s into investigation %s",
142+
conf.inputdir, investigation.name)
143+
datasets, reader = check(client, conf.inputdir, investigation)
144+
logger.debug("input directory checked, found %d datasets", len(datasets))
145+
for ds in datasets:
146+
ds.create()
147+
ds.truncateRelations(keepInstRel=True)
148+
logger.debug("created dataset %s", ds.name)
149+
reader.ingest(datasets)
150+
for ds in datasets:
151+
ds.complete = True
152+
ds.update()
153+
logger.debug("ingest done")
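The `check()` function in the script walks the input directory and fails early on anything unexpected, before anything is committed to ICAT. The sketch below isolates just that directory walk so it can be run without an ICAT server. The function name `scan_input_dir` and the use of `ValueError` are assumptions for this sketch. The script itself builds `Dataset` objects through the client and raises its own `ContentError`.

```python
import tempfile
from pathlib import Path


def scan_input_dir(path):
    """Map each dataset directory name to the sorted names of its files.

    Raises ValueError on stray items or empty dataset directories.
    """
    metadata_path = path / "metadata.xml"
    datasets = {}
    for p0 in sorted(path.iterdir()):
        # Hidden entries and the metadata file itself are skipped.
        if p0.name.startswith('.') or p0 == metadata_path:
            continue
        if not p0.is_dir():
            raise ValueError("%s: unexpected item" % p0.relative_to(path))
        files = []
        for p1 in sorted(p0.iterdir()):
            if not p1.is_file():
                raise ValueError("%s: unexpected item" % p1.relative_to(path))
            files.append(p1.name)
        if not files:
            raise ValueError("%s: empty dataset directory"
                             % p0.relative_to(path))
        datasets[p0.name] = files
    return datasets


# Build a throwaway input directory matching the layout in the docstring.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "metadata.xml").write_text("<icatingest/>")
    layout = {"dataset_1": ["datafile_a.dat", "datafile_b.dat"],
              "dataset_2": ["datafile_d.dat"]}
    for ds, names in layout.items():
        (base / ds).mkdir()
        for n in names:
            (base / ds / n).write_text("")
    found = scan_input_dir(base)
```

In the script, the same walk is followed by a dry-run ingest (`dry_run=True`), so metadata errors surface in the same early phase as layout errors.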

doc/examples/metadata-4.4-inl.xml

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
1+
<?xml version='1.0' encoding='UTF-8'?>
2+
<icatingest version="1.0">
3+
<head>
4+
<date>2023-06-16T11:01:15+02:00</date>
5+
<generator>metadata-writer 0.27a</generator>
6+
</head>
7+
<data>
8+
<dataset id="Dataset_1">
9+
<name>testingest_inl_1</name>
10+
<description>Dy01Cp02 at 2.7 K</description>
11+
<startDate>2022-02-03T15:40:12+01:00</startDate>
12+
<endDate>2022-02-03T17:04:22+01:00</endDate>
13+
<parameters>
14+
<stringValue>neutron</stringValue>
15+
<type name="Probe"/>
16+
</parameters>
17+
<parameters>
18+
<numericValue>5.3</numericValue>
19+
<type name="Reactor power" units="MW"/>
20+
</parameters>
21+
<parameters>
22+
<numericValue>2.74103</numericValue>
23+
<rangeBottom>2.7408</rangeBottom>
24+
<rangeTop>2.7414</rangeTop>
25+
<type name="Sample temperature" units="K"/>
26+
</parameters>
27+
<parameters>
28+
<numericValue>4.1357</numericValue>
29+
<rangeBottom>4.0573</rangeBottom>
30+
<rangeTop>4.1567</rangeTop>
31+
<type name="Magnetic field" units="T"/>
32+
</parameters>
33+
<parameters>
34+
<stringValue>Dy01Cp02</stringValue>
35+
<type name="Comment"/>
36+
</parameters>
37+
</dataset>
38+
<dataset id="Dataset_2">
39+
<name>testingest_inl_2</name>
40+
<description>Dy01Cp02 at 5.1 K</description>
41+
<startDate>2022-02-03T17:13:10+01:00</startDate>
42+
<endDate>2022-02-03T18:45:27+01:00</endDate>
43+
<parameters>
44+
<stringValue>neutron</stringValue>
45+
<type name="Probe"/>
46+
</parameters>
47+
<parameters>
48+
<numericValue>5.3</numericValue>
49+
<type name="Reactor power" units="MW"/>
50+
</parameters>
51+
<parameters>
52+
<numericValue>5.1239</numericValue>
53+
<rangeBottom>5.1045</rangeBottom>
54+
<rangeTop>5.1823</rangeTop>
55+
<type name="Sample temperature" units="K"/>
56+
</parameters>
57+
<parameters>
58+
<numericValue>3.9345</numericValue>
59+
<rangeBottom>3.7253</rangeBottom>
60+
<rangeTop>4.0365</rangeTop>
61+
<type name="Magnetic field" units="T"/>
62+
</parameters>
63+
<parameters>
64+
<stringValue>Dy01Cp02</stringValue>
65+
<type name="Comment"/>
66+
</parameters>
67+
</dataset>
68+
</data>
69+
</icatingest>
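To make the structure of the example file above easier to see, here is a small sketch that reads an abbreviated copy of it with the standard library and collects the parameters per dataset. This is only an illustration of the file layout. It is not how `IngestReader` processes the input: according to the script's docstring, that relies on the XSD and XSLT files shipped with python-icat. The helper name `summarize` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of one dataset from the example file above.
SAMPLE = """<?xml version='1.0' encoding='UTF-8'?>
<icatingest version="1.0">
  <data>
    <dataset id="Dataset_1">
      <name>testingest_inl_1</name>
      <parameters>
        <stringValue>neutron</stringValue>
        <type name="Probe"/>
      </parameters>
      <parameters>
        <numericValue>5.3</numericValue>
        <type name="Reactor power" units="MW"/>
      </parameters>
    </dataset>
  </data>
</icatingest>
"""


def summarize(xml_bytes):
    """Return {dataset name: {(type name, units): value}}."""
    root = ET.fromstring(xml_bytes)
    summary = {}
    for ds in root.iterfind("data/dataset"):
        params = {}
        for p in ds.iterfind("parameters"):
            ptype = p.find("type")
            # A parameter carries either a numeric or a string value.
            value = p.findtext("numericValue")
            if value is not None:
                value = float(value)
            else:
                value = p.findtext("stringValue")
            params[(ptype.get("name"), ptype.get("units"))] = value
        summary[ds.findtext("name")] = params
    return summary


summary = summarize(SAMPLE.encode("utf-8"))
```

Each parameter is identified by the name and units of its type, which is how the `<type>` element in the example refers to an existing parameter type.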
