HDF5 files / Data Evaluation

Display HDF5 File

You can display the HDF5 file you obtained from CAMELS by dragging and dropping it into the following webpage or any other HDF5 viewer.

NOMAD CAMELS toolbox

To assist with the evaluation of data, we provide the package nomad_camels_toolbox.

Currently, it only helps with reading the data from the HDF5 file. More functionality is planned for the future.

Installation

To install the NOMAD CAMELS toolbox, run

pip install nomad-camels-toolbox[pandas]

in the Python environment you use for your evaluation. This installs the toolbox together with pandas, a powerful package for data evaluation, so that the data can be read directly as a pandas.DataFrame.

Note

If you do not need the functionality that comes with pandas, you can run pip install nomad-camels-toolbox instead. However, we recommend using pandas.
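
To confirm that the packages are available in the environment you actually use for evaluation, a quick check like the following can help. This is a minimal sketch that relies only on the standard library; the helper name is_installed is hypothetical and not part of the toolbox.

```python
import importlib.util


def is_installed(module_name):
    """Return True if module_name can be imported in the current environment."""
    return importlib.util.find_spec(module_name) is not None


# nomad_camels_toolbox is the import name used in this guide;
# pandas is only present if it was installed alongside the toolbox.
for name in ("nomad_camels_toolbox", "pandas"):
    status = "available" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If one of the packages is reported as missing, the pip command was most likely run in a different environment.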

Reading Data

The main usage is to read the data from files produced by CAMELS. To read the file at file_path, you can run:

import nomad_camels_toolbox as nct

data = nct.read_camels_file(file_path)

If the HDF5 file contains only one entry, this code automatically reads the main dataset of that entry.

Note

For more information, see the code reference.

Your data is then available as a pandas DataFrame, and the individual channels can be accessed by name:

detector_data = data['demo_instrument_detectorComm']
motorx_data = data['demo_instrument_motorX']
motory_data = data['demo_instrument_motorY']

You can find more examples for data evaluation below.
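
As a small illustration of what the DataFrame form makes possible, the sketch below arranges detector readings on a grid of motor positions and looks up the position of the largest reading. The DataFrame here is a stand-in with made-up values so the example runs without a measurement file; only the column names are taken from the example above. With real data, the object returned by nct.read_camels_file(file_path) would take its place.

```python
import pandas as pd

# Stand-in for the DataFrame returned by nct.read_camels_file(file_path);
# the values are invented for illustration.
data = pd.DataFrame({
    "demo_instrument_motorX": [0.0, 1.0, 0.0, 1.0],
    "demo_instrument_motorY": [0.0, 0.0, 1.0, 1.0],
    "demo_instrument_detectorComm": [0.1, 0.4, 0.3, 0.9],
})

# Arrange the detector readings on a motorY (rows) by motorX (columns) grid.
grid = data.pivot_table(
    index="demo_instrument_motorY",
    columns="demo_instrument_motorX",
    values="demo_instrument_detectorComm",
)

# Row of the original table that holds the largest detector reading.
best = data.loc[data["demo_instrument_detectorComm"].idxmax()]

print(grid)
print(best["demo_instrument_motorX"], best["demo_instrument_motorY"])
```

Note that pivot_table averages readings that share the same pair of positions, which may or may not be what you want for repeated scans.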

Using h5py

Note

For most data, we provide a more convenient way to handle it; see the NOMAD CAMELS toolbox.

Warning

This section uses older versions of the CAMELS data structure and may be outdated in some parts. The general concepts of using h5py are still valid.

Reading HDF5 Files from CAMELS

You can read a measurement file using the Python h5py package.

To work with your data in Python, run:

import h5py
f = h5py.File('mytestfile.hdf5', 'r')

You can access the contents of the HDF5 file just like a dictionary.
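
Because groups behave like dictionaries, the layout of a file can be explored with an ordinary recursive walk. The following is only a sketch: list_paths is a hypothetical helper, and the nested dict merely stands in for an opened file so that the example runs without a measurement file. h5py groups implement the standard mapping interface, so passing the open file f instead should work the same way.

```python
from collections.abc import Mapping


def list_paths(node, prefix=""):
    """Yield the slash-separated path of every non-group entry below node."""
    for key, item in node.items():
        path = f"{prefix}/{key}"
        if isinstance(item, Mapping):
            # Groups (or nested dicts) are walked recursively.
            yield from list_paths(item, path)
        else:
            # Anything else is treated as a dataset.
            yield path


# Stand-in for an opened file; keys and values are made up for illustration.
example = {
    "entry": {
        "data": {"channel_a": [1, 2, 3], "channel_b": [4, 5, 6]},
        "notes": b"hello",
    }
}

for path in list_paths(example):
    print(path)
```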

Create Data Plots

The following example creates a 2D plot from a scan in which the motor was moved in the x and y directions and the detector was read at each position:

import h5py
import matplotlib.pyplot as plt

f = h5py.File("data.h5", 'r')
all_measurement_keys = list(f.keys()) # List of all the measurements that were performed

# Pick a single measurement
measurement = f['2023-12-01T10:47:44.718633+01:00']
keys_measurement = list(measurement.keys()) # List of all the entries to this measurement

# Pick a single data set
data = measurement['data']
data_keys = list(data.keys()) # List of all the channels that were read

# Get the data of the channels you read
# Indexing with [()] loads an HDF5 dataset into memory as a NumPy array
detector_data = data['demo_instrument_detectorComm'][()]
motorx_data = data['demo_instrument_motorX'][()]
motory_data = data['demo_instrument_motorY'][()]

# Plot all points with a single scatter call, colored by the detector value
plt.scatter(motorx_data,
            motory_data,
            c=detector_data,
            marker=',',
            cmap='viridis',
            vmin=detector_data.min(),
            vmax=detector_data.max())
plt.colorbar(label='demo_instrument_detectorComm')
plt.show()

Get Metadata

Navigate the file like a Python dictionary to get the desired metadata:

Protocol Overview and Python Script

To get the protocol overview of the performed measurement, run:

import h5py

f = h5py.File("data.h5", 'r')
all_measurement_keys = list(f.keys()) # List of all the measurements that were performed

# Pick a single measurement
measurement = f['2023-12-01T10:47:44.718633+01:00']

# The strings are saved as UTF-8 encoded bytes
protocol_overview = measurement['protocol_overview'][()].decode('utf-8')
print(protocol_overview)

To get the Python script of the performed measurement, run:

python_script = measurement['python_script'][()].decode('utf-8')
print(python_script)
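
If you are unsure whether a value arrives as bytes or as an already decoded string (this can differ between h5py versions), a small wrapper avoids calling decode on the wrong type. The helper name as_text is hypothetical, and the inputs below are made-up stand-ins for values read with [()].

```python
def as_text(value, encoding="utf-8"):
    """Return value as str, decoding it first if it is a bytes object."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return str(value)


# Stand-ins for values read from a file; the content is invented.
print(as_text(b"Set motorX, then read detector"))
print(as_text("already a string"))
```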

Instrument Settings

All instrument settings are found in measurement/instrument/environment/<instrument_name>. To list the saved metadata of a single instrument, run:

meta_data_demo_instrument = measurement['instrument']['environment']['demo_instrument']
meta_data_demo_instrument_keys = list(meta_data_demo_instrument.keys())
print(meta_data_demo_instrument_keys)
... ['model', 'name', 'settings', 'short_name']

To list all the actual settings entered in the Manage Instruments window, run:

list(meta_data_demo_instrument['settings'].keys())
... ['amps',
    'description',
    'detector_noises',
    'motor_noises',
    'mus',
    'set_delays',
    'sigmas',
    'system_delays'
    ]
# Get the Gaussian line widths (sigma) of the demo_instrument
meta_data_demo_instrument['settings']['sigmas'][()]
... array([5. , 7. , 0.1])

Convert the full HDF5 File to Python Dictionary

You can use this Python script to read your HDF5 file recursively and convert it to a nested dictionary.

Note

This is typically not necessary, as the HDF5 file can be used like a dictionary in Python, but it might be useful in some situations.

import h5py


def h5_to_dict(h5file):
    def h5_to_dict_rec(h5group):
        d = {}
        for key, item in h5group.items():
            if isinstance(item, h5py.Dataset):
                d[key] = item[()]
            elif isinstance(item, h5py.Group):
                d[key] = h5_to_dict_rec(item)
        return d
    with h5py.File(h5file, 'r') as f:
        return h5_to_dict_rec(f)


# Example usage:
data = h5_to_dict(r'C:\Users\file.h5')

Then access the relevant data by navigating through the dictionary.
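
For deeply nested entries, chained indexing becomes tedious. One possible convenience, sketched here under assumptions, is a lookup by slash-separated path. The helper get_by_path is hypothetical, and the dictionary is a made-up stand-in for the result of h5_to_dict whose structure mirrors the instrument settings layout described earlier.

```python
from functools import reduce


def get_by_path(nested, path, sep="/"):
    """Follow a separator-delimited path through nested dictionaries."""
    keys = [key for key in path.split(sep) if key]
    return reduce(lambda node, key: node[key], keys, nested)


# Stand-in for the dictionary returned by h5_to_dict; values are invented.
example = {
    "instrument": {
        "environment": {
            "demo_instrument": {"settings": {"sigmas": [5.0, 7.0, 0.1]}}
        }
    }
}

sigmas = get_by_path(example, "instrument/environment/demo_instrument/settings/sigmas")
print(sigmas)
```

A missing key raises a KeyError, just as direct indexing would.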