HDF5 files / Data Evaluation
Display HDF5 File
You can display the HDF5 file you obtained from CAMELS by dragging and dropping it into the following webpage or any other HDF5 viewer.
NOMAD CAMELS toolbox
To assist with the evaluation of data, we provide the package nomad_camels_toolbox.
Currently, it only helps with reading the data from the HDF5 file. More functionality is planned for the future.
Installation
To install the NOMAD CAMELS toolbox, run
pip install nomad-camels-toolbox[pandas]
in the Python environment you use for your evaluation.
Along with the toolbox, this installs pandas, a powerful package for data evaluation, so the data can be read directly as a pandas.DataFrame.
Note
If you do not want to install the functionality that comes with pandas, you can run pip install nomad-camels-toolbox instead. However, we recommend using pandas.
Reading Data
The main use of the toolbox is to read the data from files produced by CAMELS. To read the file at file_path, you can run:
import nomad_camels_toolbox as nct
data = nct.read_camels_file(file_path)
If the HDF5 file contains only one entry, this code automatically reads the main dataset of that entry.
Note
For more information, see the code reference.
Your data is then available as a pandas DataFrame, and the individual channels can be accessed like this:
detector_data = data['demo_instrument_detectorComm']
motorx_data = data['demo_instrument_motorX']
motory_data = data['demo_instrument_motorY']
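Because the result is a regular pandas DataFrame, the usual pandas tools apply. The following minimal sketch uses a small hand-made DataFrame as a stand-in for the result of read_camels_file and looks up the motor positions at which the detector signal was largest. Only the column names are taken from the example above; the values are invented for illustration.

```python
import pandas as pd

# Stand-in for `data = nct.read_camels_file(file_path)`; the values are invented.
data = pd.DataFrame(
    {
        "demo_instrument_motorX": [0.0, 1.0, 0.0, 1.0],
        "demo_instrument_motorY": [0.0, 0.0, 1.0, 1.0],
        "demo_instrument_detectorComm": [0.2, 0.9, 0.4, 0.1],
    }
)

# Row label of the largest detector reading
peak_index = data["demo_instrument_detectorComm"].idxmax()

# Motor positions that belong to that reading
peak_row = data.loc[peak_index]
print(peak_row["demo_instrument_motorX"], peak_row["demo_instrument_motorY"])
```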
You can find more examples for data evaluation below.
Using h5py
Note
For most data, we provide a more convenient way to handle it; see the NOMAD CAMELS toolbox.
Warning
This section is based on older versions of the CAMELS data structure, so parts of it may be outdated. The general concepts of using h5py still apply.
Reading HDF5 Files from CAMELS
You can read a measurement file using the Python h5py package.
To work with your data in Python, run:
import h5py
f = h5py.File('mytestfile.hdf5', 'r')
You can access the contents of the HDF5 file just like a dictionary.
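To illustrate what dictionary-like access means in practice, the sketch below lists every path in a nested, dictionary-like structure. The helper name list_paths is hypothetical, and a plain nested dict with invented keys stands in for the file so that the example runs without a measurement file. It relies on duck typing (groups provide .items(), datasets do not), which is an assumption of this sketch; with a real file, the opened h5py.File object could be passed instead.

```python
def list_paths(group, prefix=""):
    """Return the paths of all entries below `group`, depth first."""
    paths = []
    for key, item in group.items():
        path = f"{prefix}/{key}"
        paths.append(path)
        # Group-like objects (and dicts) provide .items(); leaf values do not.
        if hasattr(item, "items"):
            paths.extend(list_paths(item, path))
    return paths


# Stand-in for an opened file; keys and values are invented for illustration.
fake_file = {
    "entry": {
        "data": {"channel_a": [1, 2, 3], "channel_b": [4, 5, 6]},
        "notes": b"example",
    }
}

for path in list_paths(fake_file):
    print(path)
```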
Create Data Plots
The following example creates a 2D plot from a detector that was read at each position while a motor was moved in the x and y directions:
import h5py
import matplotlib.pyplot as plt
f = h5py.File("data.h5", 'r')
all_measurement_keys = list(f.keys()) # List of all the measurements that were performed
# Pick a single measurement
measurement = f['2023-12-01T10:47:44.718633+01:00']
keys_measurement = list(measurement.keys()) # List of all the entries to this measurement
# Pick a single data set
data = measurement['data']
data_keys = list(data.keys()) # List of all the channels that were read
# Get the data of a single channel you read
detector_data = data['demo_instrument_detectorComm']
motorx_data = data['demo_instrument_motorX']
motory_data = data['demo_instrument_motorY']
# HDF5 datasets behave much like NumPy ndarrays
# and can be used in much the same way as any np.array
# Compute the color limits once instead of once per point
vmin, vmax = min(detector_data), max(detector_data)
for index, detector_data_point in enumerate(detector_data):
    plt.scatter(motorx_data[index],
                motory_data[index],
                c=detector_data_point,
                marker=',',
                cmap='viridis',
                vmin=vmin,
                vmax=vmax,
                )
plt.show()
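If the motors were stepped over a regular grid, it can be convenient to arrange the readings as a 2D array, which can then be drawn in a single call. The helper below is a sketch and not part of CAMELS: the name to_grid is hypothetical, and it assumes that every position is recorded with exactly repeating x and y values. Noisy read-back positions would have to be rounded or binned first.

```python
import numpy as np


def to_grid(x, y, values):
    """Arrange scattered (x, y, value) samples on a regular 2D grid."""
    x, y, values = np.asarray(x), np.asarray(y), np.asarray(values)
    # Sorted unique coordinates plus, for each sample, its index on that axis
    xs, ix = np.unique(x, return_inverse=True)
    ys, iy = np.unique(y, return_inverse=True)
    # Grid positions that were never measured stay NaN
    grid = np.full((ys.size, xs.size), np.nan)
    grid[iy, ix] = values
    return xs, ys, grid


# Invented sample data: a 2 x 2 scan
xs, ys, grid = to_grid([0, 1, 0, 1], [0, 0, 1, 1], [1.0, 2.0, 3.0, 4.0])
print(grid)
```

With data read as in the example above, this could be called as to_grid(motorx_data[:], motory_data[:], detector_data[:]), and the result could be plotted with, for instance, plt.pcolormesh(xs, ys, grid, shading="nearest").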
Get Metadata
Navigate the file like a Python dictionary to get the desired metadata:
Protocol Overview and Python Script
To get the protocol overview of the performed measurement, run:
import h5py
f = h5py.File("data.h5", 'r')
all_measurement_keys = list(f.keys()) # List of all the measurements that were performed
# Pick a single measurement
measurement = f['2023-12-01T10:47:44.718633+01:00']
# The strings are saved as UTF-8 encoded bytes
protocol_overview = measurement['protocol_overview'][()].decode('utf-8')
print(protocol_overview)
To get the Python script of the performed measurement, run:
python_script = measurement['python_script'][()].decode('utf-8')
print(python_script)
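Depending on how a string was written and which h5py version reads it, a stored string may come back as bytes or already as str. The small helper below is a sketch that handles both cases, so that the decoding step does not fail on a value that is already decoded. The name as_text is hypothetical and not part of h5py or CAMELS.

```python
def as_text(value, encoding="utf-8"):
    """Return `value` as str, decoding it first if it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return str(value)


# Invented sample values standing in for measurement['protocol_overview'][()]
print(as_text(b"Protocol: demo"))
print(as_text("Protocol: demo"))
```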
Instrument Settings
All instrument settings are found in measurement/instrument/environment/<instrument_name>. To list the saved metadata of a single instrument, run:
meta_data_demo_instrument = measurement['instrument']['environment']['demo_instrument']
meta_data_demo_instrument_keys = list(meta_data_demo_instrument.keys())
print(meta_data_demo_instrument_keys)
... ['model', 'name', 'settings', 'short_name']
To list all the actual settings entered in the Manage Instruments window, run:
list(meta_data_demo_instrument['settings'].keys())
... ['amps',
'description',
'detector_noises',
'motor_noises',
'mus',
'set_delays',
'sigmas',
'system_delays'
]
# Get the Gaussian line widths (sigma) of the demo_instrument
meta_data_demo_instrument['settings']['sigmas'][()]
... array([5. , 7. , 0.1])
Convert the full HDF5 File to Python Dictionary
You can use this Python script to read your HDF5 file recursively and convert it to a nested dictionary.
Note
This is typically not necessary, as the HDF5 file can be used like a dictionary in Python, but it might be useful in some situations.
import h5py

def h5_to_dict(h5file):
    def h5_to_dict_rec(h5group):
        d = {}
        for key, item in h5group.items():
            if isinstance(item, h5py.Dataset):
                d[key] = item[()]
            elif isinstance(item, h5py.Group):
                d[key] = h5_to_dict_rec(item)
        return d

    with h5py.File(h5file, 'r') as f:
        return h5_to_dict_rec(f)
# Example usage:
data = h5_to_dict(r'C:\Users\file.h5')
Then access the relevant data by navigating through the dictionary.
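As a sketch of what that navigation can look like, the helper below follows a slash-separated path through a nested dictionary. The name get_by_path is hypothetical. The sample dictionary is a hand-written stand-in for one measurement of the converted file; its keys mirror the layout described under Instrument Settings and its values echo the example output shown there.

```python
def get_by_path(nested, path):
    """Follow a slash-separated `path` through nested dictionaries."""
    current = nested
    for key in path.strip("/").split("/"):
        current = current[key]  # raises KeyError if a level is missing
    return current


# Hand-written stand-in for one measurement of the converted file
measurement = {
    "instrument": {
        "environment": {
            "demo_instrument": {"settings": {"sigmas": [5.0, 7.0, 0.1]}}
        }
    }
}

sigmas = get_by_path(
    measurement, "instrument/environment/demo_instrument/settings/sigmas"
)
print(sigmas)
```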