espaloma.data.dataset.Dataset

class espaloma.data.dataset.Dataset(graphs=None)[source]

Bases: abc.ABC, torch.utils.data.dataset.Dataset

The base class for map-style datasets.

Parameters

graphs (List) – objects in the dataset

shuffle()[source]

Randomly shuffle the graphs in the dataset.

apply(fn, in_place=True)[source]

Apply a function to every graph in the dataset. If in_place=True, modify the graph in-place.

split(partitions)[source]

Split the dataset into partitions.

subsample(ratio, seed=None)[source]

Subsample the dataset.

save(path)[source]

Save the dataset to a local path.

load(path)[source]

Load a dataset from a local path.

Note

This also supports iterable-style datasets by deleting the __getitem__ and __len__ methods.

Variables

transforms (an iterable of callables that transform the input) – the __getitem__ method applies these transforms later.

Examples

>>> data = Dataset([esp.Graph("C")])

__init__(graphs=None)[source]

Methods

__init__([graphs])

apply(fn[, in_place])

Apply functions to the elements of the dataset.

load(path)

Load a dataset from path.

regenerate_impropers([improper_def])

Regenerate the improper nodes for all graphs.

save(path)

Save dataset to path.

shuffle([seed])

split(partition)

Split the dataset according to some partition.

subsample(ratio[, seed])

Subsample the dataset according to some ratio.

apply(fn, in_place=False)[source]

Apply functions to the elements of the dataset.

Parameters

fn (callable)

Note

If in_place is False, fn is added to the transforms; otherwise it is applied to the elements and modifies them.
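The difference between the two modes can be illustrated with a minimal sketch. ToyDataset below is a hypothetical stand-in written for this example, not espaloma's implementation; it only assumes the behaviour described in the note above, with deferred transforms applied when an element is read.

```python
class ToyDataset:
    """Hypothetical stand-in for illustration; not espaloma's Dataset."""

    def __init__(self, graphs=None):
        self.graphs = list(graphs) if graphs is not None else []
        self.transforms = []  # callables applied lazily on element access

    def __len__(self):
        return len(self.graphs)

    def __getitem__(self, idx):
        graph = self.graphs[idx]
        # Deferred transforms run every time an element is read.
        for fn in self.transforms:
            graph = fn(graph)
        return graph

    def apply(self, fn, in_place=False):
        if in_place:
            # Eager: replace the stored elements right away.
            self.graphs = [fn(graph) for graph in self.graphs]
        else:
            # Lazy: remember fn and apply it on access.
            self.transforms.append(fn)
        return self


# Plain integers stand in for graphs (an assumption of this sketch).
lazy = ToyDataset([1, 2, 3]).apply(lambda x: x * 10)
eager = ToyDataset([1, 2, 3]).apply(lambda x: x * 10, in_place=True)
```

In this sketch, lazy.graphs keeps the original values and only lazy[0] reflects the transform, whereas eager.graphs is rewritten immediately.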

classmethod load(path)[source]

Load a dataset from path.

regenerate_impropers(improper_def='smirnoff')[source]

Regenerate the improper nodes for all graphs.

Parameters

improper_def (str) – Which convention to use for permuting impropers.

save(path)[source]

Save dataset to path.

Parameters

path (path-like object)
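This page does not state the on-disk format. As a rough illustration of a save/load round trip, the sketch below assumes the elements are serialised with Python's pickle module; the free functions save and load and the file name dataset.pkl are hypothetical and are not espaloma's methods.

```python
import os
import pickle
import tempfile


def save(graphs, path):
    """Write the list of elements to path (pickle format is an assumption)."""
    with open(path, "wb") as handle:
        pickle.dump(graphs, handle)


def load(path):
    """Read the list of elements back from path."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Strings stand in for graphs; a temporary directory keeps the example clean.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dataset.pkl")
    save(["C", "CC"], path)
    restored = load(path)
```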

split(partition)[source]

Split the dataset according to some partition.

Parameters

partition (sequence of integers or floats)
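One plausible reading of partition is a sequence of relative weights, so that [8, 1, 1] and [0.8, 0.1, 0.1] request the same proportions. The sketch below works under that assumption; split_sizes and split are hypothetical helpers operating on a plain list, not espaloma's code, and the truncation to integer sizes is also an assumption.

```python
def split_sizes(n_data, partition):
    """Convert relative weights into integer subset sizes (truncating)."""
    total = sum(partition)
    return [int(n_data * weight / total) for weight in partition]


def split(items, partition):
    """Cut items into consecutive slices whose sizes follow partition."""
    subsets, start = [], 0
    for size in split_sizes(len(items), partition):
        subsets.append(items[start:start + size])
        start += size
    return subsets


# Ten elements split 8:1:1 into train, validation and test subsets.
train, valid, test = split(list(range(10)), [8, 1, 1])
```

Because the sizes are truncated, a few trailing elements can be left out of every subset when the dataset length is not divisible by the weights.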

subsample(ratio, seed=None)[source]

Subsample the dataset according to some ratio.

Parameters

ratio (float) – Ratio between the size of the subsampled dataset and the original dataset.
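As a sketch of what ratio and seed control, the hypothetical helper below draws int(len(items) * ratio) elements from a plain list. Sampling without replacement is an assumption of this example; the page does not say whether the library samples with or without replacement.

```python
import random


def subsample(items, ratio, seed=None):
    """Return a random subset whose size is ratio times the original size."""
    rng = random.Random(seed)  # local generator, so global state is untouched
    n_keep = int(len(items) * ratio)
    return rng.sample(items, n_keep)


# Keep half of ten elements; the same seed reproduces the same subset.
subset = subsample(list(range(10)), ratio=0.5, seed=0)
repeat = subsample(list(range(10)), ratio=0.5, seed=0)
```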