NXTfusion.NXDatasetUtils module

class NXTfusion.NXDatasetUtils.MetaDataset(datasetList, domain1, domain2, name, ignore_index, side1=None, side2=None)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Represents a MetaRelation in the NNwrapper's internal, Dataset-based version of the ERgraph, which enables fast and consistent multi-task mini-batching. Each MetaDataset can contain many SubDatasets and, when asked, provides a minibatch sampled from all of them in parallel.

__init__(datasetList, domain1, domain2, name, ignore_index, side1=None, side2=None)[source]

Constructor method for the MetaDataset class. It places the data corresponding to a target MetaRelation in a PyTorch-friendly structure by storing several SubDatasets (each corresponding to a Relation/DataMatrix/matrix).

Parameters
  • datasetList (list of SubDatasets) – List of SubDatasets. Each SubDataset corresponds to a Relation. The MetaDataset thus corresponds to a MetaRelation.

  • domain1 (NX.Entity) – First entity involved in this MetaRelation (all the Relations in it are between the same entities)

  • domain2 (NX.Entity) – Second entity involved in this list of relations (MetaRelation).

  • name (str) – Name of the corresponding MetaRelation

  • ignore_index (int) – Value used to mark missing values. It allows fast runs on GPUs and minibatching even when the Relations/SubDatasets in the same MetaRelation/MetaDataset have different percentages of missing values.
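
The role of ignore_index can be illustrated with a small, library-independent sketch. It is not the NXTfusion implementation: the helper names (align_relations, minibatches, masked_mse), the sentinel value -999, and the assumption that each relation is a dict keyed by (object1, object2) pairs are all hypothetical. The idea shown is that relations with different missing entries are aligned on the same pairs, the gaps are filled with the sentinel, and the sentinel positions are skipped when a loss is computed.

```python
IGNORE_INDEX = -999  # hypothetical sentinel, not a value taken from NXTfusion


def align_relations(relations, ignore_index=IGNORE_INDEX):
    """Align sparse relations on the union of their (obj1, obj2) pairs.

    Each relation is assumed to be a dict {(obj1, obj2): value}.
    Missing entries are filled with ignore_index so every row has
    one target per relation.
    """
    pairs = sorted(set().union(*(r.keys() for r in relations)))
    targets = [[r.get(p, ignore_index) for r in relations] for p in pairs]
    return pairs, targets


def minibatches(pairs, targets, batch_size):
    """Yield aligned slices of pairs and targets, one minibatch at a time."""
    for start in range(0, len(pairs), batch_size):
        end = start + batch_size
        yield pairs[start:end], targets[start:end]


def masked_mse(predictions, targets, ignore_index=IGNORE_INDEX):
    """Mean squared error over the entries that are not marked as missing."""
    total, count = 0.0, 0
    for pred_row, target_row in zip(predictions, targets):
        for pred, target in zip(pred_row, target_row):
            if target == ignore_index:
                continue  # missing value: contributes nothing to the loss
            total += (pred - target) ** 2
            count += 1
    return total / count if count else 0.0
```

Because every row carries one target per relation, a single minibatch can serve all the relations at once, even when some of them have no observation for a given pair.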

countBalance()[source]
countInstances()[source]
getEstBatchSizeForXsamples(targetDomain1, samplesPerBatch)[source]
getEstBatchSizeForXsamples2(numSamples)[source]
getEstSize()[source]
getTypes()[source]
mergeDataSimple(v, idx)[source]

class NXTfusion.NXDatasetUtils.PredictionDataset(x, label=True)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

class NXTfusion.NXDatasetUtils.SideDataset(side)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

class NXTfusion.NXDatasetUtils.SubDataset(xht, typep='binary')[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Within the NNwrapper, during training, batches need to be rapidly provided for all the MetaRelations in the ERgraph and for each Relation in every MetaRelation. To do so, the NNwrapper.processDatasets function builds an internal Dataset structure that mimics the structure of the input ERgraph. In this structure, each MetaDataset corresponds to a MetaRelation, and each Relation in a MetaRelation is represented by a SubDataset in the corresponding MetaDataset.

This structure is internal, however, and transparent to the user.
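
The mirroring described above can be sketched with plain Python containers. This is only an illustration of the nesting, not what NNwrapper.processDatasets actually does: the classes ToySubDataset and ToyMetaDataset, the function mirror_graph, and the nested-dict layout used for the input graph are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToySubDataset:
    """Stand-in for a SubDataset: one per Relation (hypothetical)."""
    relation_name: str
    data: dict


@dataclass
class ToyMetaDataset:
    """Stand-in for a MetaDataset: one per MetaRelation (hypothetical)."""
    name: str
    domain1: str
    domain2: str
    sub_datasets: list = field(default_factory=list)


def mirror_graph(graph):
    """Build one ToyMetaDataset per meta-relation in the input graph.

    The graph layout is an assumption made for this sketch:
    {meta_name: {"domains": (d1, d2), "relations": {rel_name: data}}}
    """
    mirrored = []
    for meta_name, meta in graph.items():
        domain1, domain2 = meta["domains"]
        subs = [ToySubDataset(rel_name, rel_data)
                for rel_name, rel_data in meta["relations"].items()]
        mirrored.append(ToyMetaDataset(meta_name, domain1, domain2, subs))
    return mirrored
```

Keeping the containers in the same shape as the graph means a training loop can walk the meta-relations and their relations in a fixed order when assembling batches.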

__init__(xht, typep='binary')[source]

Constructor method for the SubDataset class. It places the matrix corresponding to a target Relation in a PyTorch-friendly structure by transforming its DataMatrix into a PyTorch Dataset.

Parameters
  • xht (dict) – Dict used to represent the matrix/relation data within a DataMatrix object

  • typep (str) – String specifying the type of the prediction. It must be “regression” or “binary”.
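
As a rough illustration of turning a relation dict into an indexable dataset, the sketch below follows the map-style dataset protocol (__len__ and __getitem__) that PyTorch datasets use, without importing torch. The class ToyRelationDataset is hypothetical, and the assumption that xht maps (object1, object2) pairs to labels is made only for this example; the actual layout of xht is defined by the DataMatrix object.

```python
class ToyRelationDataset:
    """Hypothetical map-style dataset built from a relation dict."""

    VALID_TYPES = ("regression", "binary")

    def __init__(self, xht, typep="binary"):
        # Reject prediction types other than the two documented ones.
        if typep not in self.VALID_TYPES:
            raise ValueError(
                f"typep must be one of {self.VALID_TYPES}, got {typep!r}"
            )
        self.typep = typep
        # Assumed layout: {(obj1, obj2): label}. Freeze the key order so
        # that integer indices always map to the same pair.
        self.pairs = list(xht.keys())
        self.labels = [xht[pair] for pair in self.pairs]

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        obj1, obj2 = self.pairs[idx]
        return obj1, obj2, self.labels[idx]
```

An object with these two methods can be indexed by position, which is what a batch sampler needs in order to draw minibatches from it.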

countBalance()[source]
countInstances()[source]
dump(name)[source]
static load(name)[source]