shuffle
Module: missions.operations.shuffle
Combine the datasets in the summary pairwise, using one dataset for training and the other for testing.
The input of this operation has to contain several comparable datasets of the same type. Depending on whether the input datasets contain split data, the behavior of this operation differs slightly.
Note
This operation creates an output directory with links, not duplicated files!
If the input datasets are not split, the result of this operation contains one dataset for every ordered pair of distinct datasets in input_path. For instance, if the input consists of the three datasets “A”, “B”, and “C”, the result will contain at least the 6 datasets “A_vs_B”, “A_vs_C”, “B_vs_A”, “B_vs_C”, “C_vs_A”, and “C_vs_B”. The result dataset “A_vs_B” uses the feature vectors from dataset “A” as training data and the feature vectors from dataset “B” as test data.
If the input datasets contain split data, the input datasets are additionally copied to the result directory, so that in the example above it would contain 9 datasets. The dataset “X_vs_Y” contains the training data of dataset X from the respective split for training and the test data of dataset Y for testing.
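The naming scheme described above can be sketched in a few lines of Python. This is an illustration only, not pySPACE's implementation: the helper `plan_result_datasets` and its `has_splits` flag are hypothetical, and the sketch only computes names; it does not link or copy any files.

```python
from itertools import permutations


def plan_result_datasets(dataset_names, has_splits=False):
    """Return (cross, copied) for the given input dataset names.

    cross maps each result name "X_vs_Y" to its (train, test) source pair.
    copied lists the inputs that are carried over unchanged, which the
    description above only mentions for split data.
    Hypothetical helper, for illustration only.
    """
    cross = {
        f"{train}_vs_{test}": (train, test)
        for train, test in permutations(dataset_names, 2)
    }
    copied = list(dataset_names) if has_splits else []
    return cross, copied


cross, copied = plan_result_datasets(["A", "B", "C"], has_splits=True)
print(list(cross))
# ['A_vs_B', 'A_vs_C', 'B_vs_A', 'B_vs_C', 'C_vs_A', 'C_vs_B']
print(len(cross) + len(copied))
# 9
```

`itertools.permutations(names, 2)` yields ordered pairs of distinct items, which is why “A_vs_B” and “B_vs_A” both appear while a dataset is never paired with itself.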
A typical operation specification file is shown under “Exemplary Call” below.
Specification file Parameters
dataset_constraints
Optionally, constraints can be passed to the operation that specify, based on the dataset names, which datasets are combined. For instance, the constraint '"%(dataset_name1)s".strip("}{").split("}{")[1:] == "%(dataset_name2)s".strip("}{").split("}{")[1:]' ensures that only datasets created by the same preprocessing with the same parameterization are combined.
(optional, default: [])
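To make the example constraint more concrete, the following sketch shows one way such a template could be evaluated for a pair of dataset names. It assumes that the two placeholders are filled by `%` string formatting and that the resulting Python expression is then evaluated; the helper `satisfies` and the dataset names are hypothetical and not taken from pySPACE.

```python
def satisfies(constraint_template, name1, name2):
    """Fill in both dataset names and evaluate the resulting expression.

    Hypothetical helper. eval() runs arbitrary code, so this is only
    acceptable for constraints from a trusted specification file.
    """
    expression = constraint_template % {
        "dataset_name1": name1,
        "dataset_name2": name2,
    }
    return bool(eval(expression))


# The constraint from the text: compare everything after the first
# "{...}" component of the two dataset names.
same_parameterization = (
    '"%(dataset_name1)s".strip("}{").split("}{")[1:] == '
    '"%(dataset_name2)s".strip("}{").split("}{")[1:]'
)

# Made-up names of the form {origin}{parameterization}
print(satisfies(same_parameterization, "{A}{lowpass_4Hz}", "{B}{lowpass_4Hz}"))
# True
print(satisfies(same_parameterization, "{A}{lowpass_4Hz}", "{B}{lowpass_8Hz}"))
# False
```

For a name such as `{A}{lowpass_4Hz}`, `strip("}{")` removes the outer braces and `split("}{")` yields `["A", "lowpass_4Hz"]`, so the `[1:]` slice compares only the components after the first one.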
Exemplary Call
type: shuffle
input_path: "operation_results/2009_8_13_15_8_57"
dataset_constraints:
# Combine only datasets that have been created using the same parameterization
- '"%(dataset_name1)s".strip("}{").split("}{")[1:] == "%(dataset_name2)s".strip("}{").split("}{")[1:]'
Inheritance diagram for pySPACE.missions.operations.shuffle
Class Summary
ShuffleOperation(processes, operation_spec, ...) | Forwards processing to process
ShuffleProcess(input_dataset, ...) | The shuffle process
Classes
ShuffleOperation
class pySPACE.missions.operations.shuffle.ShuffleOperation(processes, operation_spec, result_directory, number_processes, create_process=None)
Bases: pySPACE.missions.operations.base.Operation
Forwards processing to process
Class Components Summary
_createProcesses(processes, ...) | Function that creates the shuffle process.
consolidate() | Consolidation of the operation’s results
create(operation_spec, result_directory[, ...]) | Factory method that creates a ShuffleOperation
__init__(processes, operation_spec, result_directory, number_processes, create_process=None)
classmethod create(operation_spec, result_directory, debug=False, input_paths=[])
Factory method that creates a ShuffleOperation based on the information given in the operation specification operation_spec.
ShuffleProcess
class pySPACE.missions.operations.shuffle.ShuffleProcess(input_dataset, result_directory, dataset_constraints)
Bases: pySPACE.missions.operations.base.Process
The shuffle process
Combines datasets that fulfill all dataset_constraints
Class Components Summary
__call__() | Executes this process on the respective modality
_copy_arff_file(input_arff_file_name, ...) | Copy the arff files and adjust the relation name in the arff file
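The summary of `_copy_arff_file` mentions adjusting the relation name of a copied ARFF file. As a rough sketch of what such an adjustment could look like (not the actual pySPACE code), the hypothetical helper below rewrites the `@relation` header line of ARFF text and leaves everything else untouched. It assumes that the new name contains no quote characters.

```python
def with_relation_name(arff_text, new_relation_name):
    """Return arff_text with its first @relation line renamed.

    Hypothetical helper for illustration; assumes new_relation_name
    contains no single quotes.
    """
    lines = arff_text.splitlines(keepends=True)
    for index, line in enumerate(lines):
        # ARFF keywords are case-insensitive
        if line.lstrip().lower().startswith("@relation"):
            newline = "\n" if line.endswith("\n") else ""
            lines[index] = f"@relation '{new_relation_name}'{newline}"
            break
    return "".join(lines)


original = "@relation 'A'\n@attribute x numeric\n@data\n1.0\n"
print(with_relation_name(original, "A_vs_B"))
```

Working on the text rather than on file paths keeps the sketch independent of how the copy itself is performed.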