spech5: h5py-like API to SpecFile
This module provides an h5py-like API to access SpecFile data.
API description
Specfile data structure exposed by this API:
/
1.1/
title = "…"
start_time = "…"
instrument/
specfile/
file_header = "…"
scan_header = "…"
positioners/
motor_name = value
…
mca_0/
data = …
calibration = …
channels = …
preset_time = …
elapsed_time = …
live_time = …
mca_1/
…
…
measurement/
colname0 = …
colname1 = …
…
mca_0/
data -> /1.1/instrument/mca_0/data
info -> /1.1/instrument/mca_0/
…
2.1/
…
file_header and scan_header are the raw headers as they
appear in the original file, as a string of lines separated by newline (\n) characters.
The title is the content of the #S scan header line without the leading
#S (e.g. "1 ascan ss1vo -4.55687 -0.556875 40 0.2").
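The title rule above can be sketched in a few lines. This is only an illustration of the described behaviour; the function name title_from_scan_header is hypothetical and is not part of silx.

```python
def title_from_scan_header(line):
    """Return the content of a '#S' scan header line without the leading '#S'.

    Hypothetical helper, written only to illustrate the rule described above.
    """
    if not line.startswith("#S"):
        raise ValueError("not a #S scan header line")
    # Drop the two marker characters, then the surrounding whitespace
    return line[2:].strip()


title = title_from_scan_header("#S 1 ascan ss1vo -4.55687 -0.556875 40 0.2")
print(title)
```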
The start time is converted to ISO8601 format ("2016-02-23T22:49:05Z"),
if the original date format is standard.
Numeric datasets are stored in float32 format, except for scalar integers which are stored as int64.
Motor positions (e.g. /1.1/instrument/positioners/motor_name) can be
1D numpy arrays if they are measured as scan data, or else scalars as defined
on #P scan header lines. A simple test is done to check if the motor name
is also a data column header defined in the #L scan header line.
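The test described above could look roughly like the following sketch. It uses plain Python lists and dictionaries instead of the module's internal objects, and the function and parameter names are assumptions made for this example.

```python
def motor_position(motor_name, column_labels, columns, header_positions):
    """Return a motor position as described above (illustrative sketch).

    column_labels: labels from the #L line (hypothetical input)
    columns: one list of values per column, in the same order as the labels
    header_positions: dict of scalar positions taken from the #P lines
    """
    if motor_name in column_labels:
        # The motor was measured as scan data: return the whole column
        return columns[column_labels.index(motor_name)]
    # Otherwise fall back to the scalar value from the scan header
    return header_positions[motor_name]


labels = ["Pslit HGap", "Counter"]
data = [[0.1, 0.2, 0.3], [10, 12, 11]]
header = {"Pslit HGap": 0.1, "tx3": -28.5}

scanned = motor_position("Pslit HGap", labels, data, header)
fixed = motor_position("tx3", labels, data, header)
```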
Scan data (e.g. /1.1/measurement/colname0) is accessed by column,
the dataset name colname0 being the column label as defined in the #L
scan header line.
If a / character is present in a column label or in a motor name in the
original SPEC file, it will be substituted with a % character in the
corresponding dataset name.
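A minimal sketch of this substitution, assuming it is a plain character replacement (the helper name is hypothetical):

```python
def dataset_name_from_label(label):
    """Replace '/' with '%' so the label can be used as a path component."""
    return label.replace("/", "%")


name = dataset_name_from_label("I/I0")
```

Without this substitution a label such as "I/I0" would be read as a path with two components.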
MCA data is exposed as a 2D numpy array containing all spectra for a given analyser. The number of analysers is calculated as the number of MCA spectra per scan data line. Demultiplexing is then performed to assign the correct spectra to a given analyser.
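The demultiplexing step can be pictured with the sketch below. It assumes that spectra are stored in file order, data line by data line, with the analysers interleaved within each line; this ordering and the function name are assumptions for illustration, not a description of the module's internals.

```python
def demultiplex(spectra, n_data_lines):
    """Group a flat list of spectra by analyser (illustrative sketch).

    spectra: all MCA spectra of a scan, in file order (assumed interleaved)
    n_data_lines: number of scan data lines
    """
    # Number of analysers = number of MCA spectra per scan data line
    n_analysers = len(spectra) // n_data_lines
    # Every n_analysers-th spectrum belongs to the same analyser
    return [spectra[i::n_analysers] for i in range(n_analysers)]


# 3 data lines, 2 analysers: spectra alternate between analyser 0 and 1
flat = [[1, 1], [10, 10], [2, 2], [20, 20], [3, 3], [30, 30]]
per_analyser = demultiplex(flat, n_data_lines=3)
```

Each element of per_analyser is then the list of spectra for one analyser, which corresponds to one 2D data array per mca_N group.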
MCA calibration is an array of 3 scalars, from the #@CALIB header line.
It is identical for all MCA analysers, as there can be only one
#@CALIB line per scan.
MCA channels is an array containing all channel numbers. This information is
computed from the #@CHANN scan header line (if present), or else from
the length of the first spectrum in a scan ([0, …, len(first_spectrum) - 1]).
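The two ways of obtaining channel numbers can be sketched as follows. The layout assumed for the #@CHANN line (number of channels, first channel, last channel, step) is an assumption of this example and should be checked against the SPEC file format; the function name is hypothetical.

```python
def mca_channels(first_spectrum, chann_line=None):
    """Return the list of channel numbers (illustrative sketch)."""
    if chann_line is not None:
        # Assumed layout: "#@CHANN <count> <first> <last> <step>"
        _count, first, last, step = (int(v) for v in chann_line.split()[1:5])
        return list(range(first, last + 1, step))
    # Fallback: 0 .. len(first_spectrum) - 1
    return list(range(len(first_spectrum)))


from_header = mca_channels([], "#@CHANN 4 0 3 1")
from_shape = mca_channels([5.0, 6.0, 7.0])
```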
Accessing data
Data and groups are accessed in h5py fashion:
from silx.io.spech5 import SpecH5
# Open a SpecFile
sfh5 = SpecH5("test.dat")
# using SpecH5 as a regular group to access scans
scan1group = sfh5["1.1"]
instrument_group = scan1group["instrument"]
# alternative: full path access
measurement_group = sfh5["/1.1/measurement"]
# accessing a scan data column by name as a 1D numpy array
data_array = measurement_group["Pslit HGap"]
# accessing all mca-spectra for one MCA device
mca_0_spectra = measurement_group["mca_0/data"]
SpecH5 and SpecH5Group both provide a keys() method:
>>> sfh5.keys()
['96.1', '97.1', '98.1']
>>> sfh5['96.1'].keys()
['title', 'start_time', 'instrument', 'measurement']
They can also be treated as iterators:
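from silx.io.spech5 import SpecH5, SpecH5Dataset
for scan_group in SpecH5("test.dat"):
    # keep only the datasets (data columns), skipping subgroups such as mca_0
    dataset_names = [item.name for item in scan_group["measurement"]
                     if isinstance(item, SpecH5Dataset)]
    print("Found data columns in scan " + scan_group.name)
    print(", ".join(dataset_names))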
You can test for existence of data or groups:
>>> "/1.1/measurement/Pslit HGap" in sfh5
True
>>> "positioners" in sfh5["/2.1/instrument"]
True
>>> "spam" in sfh5["1.1"]
False
Strings are stored encoded as numpy.string_, as recommended by
the h5py documentation.
This ensures maximum compatibility with third party software libraries,
when saving a SpecH5 to a HDF5 file using silx.io.spectoh5.
The type numpy.string_ is a byte-string format. The consequence of this
is that you should decode strings before using them in Python 3:
>>> from silx.io.spech5 import SpecH5
>>> sfh5 = SpecH5("31oct98.dat")
>>> sfh5["/68.1/title"]
b'68 ascan tx3 -28.5 -24.5 20 0.5'
>>> sfh5["/68.1/title"].decode()
'68 ascan tx3 -28.5 -24.5 20 0.5'
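When the same code must handle both byte strings and regular strings, a small helper can hide the difference. The sketch below is self-contained and uses a plain bytes literal in place of a value read from a SpecH5 object; the helper name and the UTF-8 default are assumptions of this example.

```python
def as_text(value, encoding="utf-8"):
    """Return value as a str, decoding it first if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return str(value)


raw_title = b"68 ascan tx3 -28.5 -24.5 20 0.5"
text_title = as_text(raw_title)
```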
Classes
-
silx.io.spech5.is_group(name)
Check if name matches a valid group name pattern in a SpecH5.
Parameters: name (str) – Full name of member
For example:
- is_group("/123.456/instrument/") returns True.
- is_group("spam") returns False because "spam" is not at all a valid group name.
- is_group("/1.2/instrument/positioners/xyz") returns False because this key would point to a motor position, which is a dataset and not a group.
-
silx.io.spech5.is_dataset(name)
Check if name matches a valid dataset name pattern in a SpecH5.
Parameters: name (str) – Full name of member
For example:
- is_dataset("/1.2/instrument/positioners/xyz") returns True because this name could be the key to the dataset recording motor positions for motor xyz in scan 1.2.
- is_dataset("/123.456/instrument/") returns False because this name points to a group.
- is_dataset("spam") returns False because "spam" is not at all a valid dataset name.
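To show how such name checks can work, here is a simplified sketch based on regular expressions. The patterns are written for this example from the tree shown in the API description; they are not the module's actual patterns, and the names looks_like_group and looks_like_dataset are hypothetical.

```python
import re

SCAN = r"^/[0-9]+\.[0-9]+"

# Groups: a scan, instrument and its subgroups, measurement and its mca_N groups
GROUP_RE = re.compile(
    SCAN + r"(/instrument(/(specfile|positioners|mca_[0-9]+))?"
           r"|/measurement(/mca_[0-9]+)?)?/?$"
)

# Datasets: leaves of the tree (links under measurement/mca_N are left out)
DATASET_RE = re.compile(
    SCAN + r"/(title|start_time"
           r"|instrument/specfile/(file_header|scan_header)"
           r"|instrument/positioners/[^/]+"
           r"|instrument/mca_[0-9]+/(data|calibration|channels"
           r"|preset_time|elapsed_time|live_time)"
           r"|measurement/(?!mca_[0-9]+$)[^/]+)$"
)


def looks_like_group(name):
    return GROUP_RE.match(name) is not None


def looks_like_dataset(name):
    return DATASET_RE.match(name) is not None
```

These checks only look at the form of the name. They do not verify that the scan, motor or column actually exists in a given file.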
-
silx.io.spech5.is_link_to_group(name)
Check if name is a valid link to a group in a SpecH5. Return True or False.
Parameters: name (str) – Full name of member
-
silx.io.spech5.is_link_to_dataset(name)
Check if name is a valid link to a dataset in a SpecH5. Return True or False.
Parameters: name (str) – Full name of member
-
silx.io.spech5.spec_date_to_iso8601(date, zone=None)
Convert SpecFile date to ISO8601.
Parameters:
- date (str) – Date (see supported formats below)
- zone – Time zone as it appears in an ISO8601 date
Supported formats:
- DDD MMM dd hh:mm:ss YYYY
- DDD YYYY/MM/dd hh:mm:ss YYYY
where DDD is the abbreviated weekday, MMM is the abbreviated month name, MM is the month number (zero padded), dd is the day of the month (zero padded), YYYY is the year, hh the hour (zero padded), mm the minute (zero padded) and ss the second (zero padded). All names are expected to be in English.
Examples:
>>> spec_date_to_iso8601("Thu Feb 11 09:54:35 2016")
'2016-02-11T09:54:35'
>>> spec_date_to_iso8601("Sat 2015/03/14 03:53:50")
'2015-03-14T03:53:50'
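A comparable conversion can be sketched with the standard library alone. This is not the module's implementation: it ignores the zone parameter, it only handles the two date shapes used in the examples above, and it relies on the default English weekday and month names. The function name is hypothetical.

```python
from datetime import datetime

# Date shapes taken from the two examples above
_FORMATS = ("%a %b %d %H:%M:%S %Y", "%a %Y/%m/%d %H:%M:%S")


def date_to_iso8601_sketch(date):
    """Return an ISO8601 string, or None if no known format matches."""
    for fmt in _FORMATS:
        try:
            return datetime.strptime(date, fmt).isoformat()
        except ValueError:
            continue
    return None


first = date_to_iso8601_sketch("Thu Feb 11 09:54:35 2016")
second = date_to_iso8601_sketch("Sat 2015/03/14 03:53:50")
```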
-
class silx.io.spech5.SpecH5Dataset(value, name, file_, parent)
Bases: object
Emulate h5py.Dataset for a SpecFile object.
A SpecH5Dataset instance is basically a proxy for the numpy array value attribute, with additional attributes for compatibility with h5py datasets.
Parameters:
- value – Actual dataset value
- name (str) – Dataset full name (posix path format, starting with /)
- file_ – Parent SpecH5
- parent – Parent SpecH5Group which contains this dataset
-
value = None
Actual dataset, can be a numpy array, a numpy.string_, a numpy.int_ or a numpy.float32.
All operations applied to an instance of the class use this.
-
shape = None
Dataset shape, as a tuple with the length of each dimension of the dataset.
-
dtype = None
Dataset dtype
-
size = None
Dataset size (number of elements)
-
name = None
Dataset name (posix path format, starting with /)
-
parent = None
Parent SpecH5Group object which contains this dataset
-
attrs = None
Attributes dictionary
-
compression = None
Compression attribute as provided by h5py.Dataset
-
compression_opts = None
Compression options attribute as provided by h5py.Dataset
-
h5py_class
Return h5py class which is mimicked by this class: h5py.Dataset.
Accessing this attribute if h5py is not installed causes an ImportError to be raised.
-
class silx.io.spech5.SpecH5LinkToDataset(value, name, file_, parent, target)
Bases: silx.io.spech5.SpecH5Dataset
Special SpecH5Dataset representing a link to a dataset. It works like a regular dataset, but SpecH5Group.visit() and SpecH5Group.visititems() methods will recognize that it is a link and will ignore it.
A special attribute contains the name of the target dataset: target
-
target = None
Name of the target dataset
-
class silx.io.spech5.SpecH5Group(name, specfileh5)
Bases: object
Emulate h5py.Group for a SpecFile object.
Parameters:
- name (str) – Group full name (posix path format, starting with /)
- specfileh5 – parent SpecH5 instance
-
name = None
Full name/path of group
-
file = None
Parent SpecH5 object
-
attrs = None
Attributes dictionary
-
h5py_class
Return h5py class which is mimicked by this class: h5py.Group.
Accessing this attribute if h5py is not installed causes an ImportError to be raised.
-
parent
Parent group (group that contains this group)
-
__contains__(key)
Parameters: key – Path to child element (e.g. "mca_0/info") or full name of group or dataset (e.g. "/2.1/instrument/positioners")
Returns: True if key refers to a valid member of this group, else False
-
get(name, default=None, getclass=False, getlink=False)
Retrieve an item by name, or a default value if name does not point to an existing item.
Parameters:
- name (str) – name of the item
- default – Default value returned if the name is not found
- getclass (bool) – if True, the returned object is the class of the item, instead of the item instance.
- getlink (bool) – Not implemented. This method always returns an instance of the original class of the requested item (or just the class, if getclass is True)
Returns: The requested item, or its class if getclass is True, or the specified default value if the group does not contain an item with the requested name.
-
__getitem__(key)
Return a SpecH5Group or a SpecH5Dataset if key is a valid name of a group or dataset.
key can be a member of self.keys(), i.e. an immediate child of the group, or a path reaching into subgroups (e.g. "instrument/positioners").
In the special case where this group is the root group, key can start with a / character.
Parameters: key (str) – Name of member
Raise: KeyError if key is not a known member of this group.
-
visit(func, follow_links=False)
Recursively visit all names in this group and subgroups.
Parameters: func (function) – Callable (function, method or callable object)
You supply a callable (function, method or callable object); it will be called exactly once for each link in this group and every group below it. Your callable must conform to the signature:
func(<member name>) => <None or return value>
Returning None continues iteration, returning anything else stops and immediately returns that value from the visit method. No particular order of iteration within groups is guaranteed.
Example:
# Get a list of all contents (groups and datasets) in a SpecFile
mylist = []
f = SpecH5('foo.dat')
f.visit(mylist.append)
-
visititems(func, follow_links=False)
Recursively visit names and objects in this group.
Parameters: func (function) – Callable (function, method or callable object)
You supply a callable (function, method or callable object); it will be called exactly once for each member in this group and every group below it. Your callable must conform to the signature:
func(<member name>, <object>) => <None or return value>
Returning None continues iteration, returning anything else stops and immediately returns that value from the visit method. No particular order of iteration within groups is guaranteed.
Example:
# Get a list of all datasets in a specific scan
mylist = []
def func(name, obj):
    if isinstance(obj, SpecH5Dataset):
        mylist.append(name)

f = SpecH5('foo.dat')
f["1.1"].visititems(func)
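The callback contract described for visit and visititems (returning None continues, any other value stops the traversal and is returned) can be tried out without a SpecFile. The sketch below applies the same contract to nested dictionaries standing in for groups; it is an illustration of the contract only, not the class's implementation, and all names in it are hypothetical.

```python
def visititems_sketch(group, func, prefix=""):
    """Visit nested dicts the way visititems is described above."""
    for key, obj in group.items():
        name = prefix + key
        result = func(name, obj)
        if result is not None:
            # Any non-None return value stops the traversal
            return result
        if isinstance(obj, dict):
            result = visititems_sketch(obj, func, name + "/")
            if result is not None:
                return result
    return None


# Dicts play the role of groups, other values the role of datasets
tree = {"1.1": {"title": "1 ascan", "measurement": {"colname0": [1.0, 2.0]}}}

dataset_names = []


def collect(name, obj):
    if not isinstance(obj, dict):
        dataset_names.append(name)


def find_first_list(name, obj):
    return name if isinstance(obj, list) else None


visititems_sketch(tree, collect)
first_list = visititems_sketch(tree, find_first_list)
```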
-
class silx.io.spech5.SpecH5LinkToGroup(name, specfileh5, target)
Bases: silx.io.spech5.SpecH5Group
Special SpecH5Group representing a link to a group.
It works like a regular group but SpecH5Group.visit() and SpecH5Group.visititems() methods will recognize it as a link and will ignore it.
An additional attribute indicates the name of the target group: target
-
target = None
Name of the target group.
-
class silx.io.spech5.SpecH5(filename)
Bases: silx.io.spech5.SpecH5Group
Special SpecH5Group representing the root of a SpecFile.
Parameters: filename (str) – Path to SpecFile in filesystem
In addition to all generic SpecH5Group attributes, this class also keeps a reference to the original SpecFile object and has a filename attribute.
Its immediate children are scans, but it also gives access to any group or dataset in the entire SpecFile tree by specifying the full path.
-
close()
Close the object, and free up associated resources.
After calling this method, attempts to use the object may fail.
-
h5py_class
h5py class which is mimicked by this class