consume_training_data
Module: missions.nodes.meta.consume_training_data
Splits training data for internal usage and usage of successor nodes
[Figure: inheritance diagram for pySPACE.missions.nodes.meta.consume_training_data]
ConsumeTrainingDataNode
class pySPACE.missions.nodes.meta.consume_training_data.ConsumeTrainingDataNode(wrapped_node, consumption_rate, random_seed=0, *args, **kwargs)
Bases: pySPACE.missions.nodes.base_node.BaseNode
Split training data for internal usage and usage of successor nodes
This node handles situations in which a model must be trained and later evaluated on the given training data (for example, because using test data is not permitted). Simply training and evaluating the model on the same data is not an option, since the evaluation would be strongly optimistically biased: the model is well adapted to the data it was trained on.
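The optimistic bias described above can be illustrated with a deliberately extreme toy model. The sketch below is not part of pySPACE; `MemorizingClassifier` and `accuracy` are hypothetical names introduced only for this illustration. The model memorizes its training labels, so it scores perfectly on the data it was trained on and no better than a constant guess on held-out data.

```python
class MemorizingClassifier:
    """Toy model that memorizes training labels (hypothetical, illustration only)."""

    def __init__(self):
        self.memory = {}

    def train(self, samples, labels):
        # Store the label of every training sample verbatim.
        for sample, label in zip(samples, labels):
            self.memory[sample] = label

    def predict(self, sample):
        # Unseen samples fall back to a fixed default label of 0.
        return self.memory.get(sample, 0)


def accuracy(model, samples, labels):
    """Fraction of samples for which the prediction matches the label."""
    correct = sum(model.predict(s) == l for s, l in zip(samples, labels))
    return correct / len(samples)


# Samples are plain integers; the label is the parity of the sample.
train_samples = list(range(10))
train_labels = [s % 2 for s in train_samples]
held_out_samples = list(range(10, 20))
held_out_labels = [s % 2 for s in held_out_samples]

model = MemorizingClassifier()
model.train(train_samples, train_labels)

train_accuracy = accuracy(model, train_samples, train_labels)
held_out_accuracy = accuracy(model, held_out_samples, held_out_labels)
print(train_accuracy, held_out_accuracy)
```

Evaluating on the training samples reports a perfect score even though the model has learned nothing that transfers, which is why a separate portion of the data is needed for evaluation.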
One example is a node chain that is trained on the current data and is later combined into an ensemble with node chains trained on historic data. The ensemble should not be trained on the same data that was used to train the node chain.
This node therefore splits the training data into two parts: one for internal use (training the model) and one for use by successor nodes (model evaluation). The fraction of the training data that is used internally is controlled by the argument consumption_rate (a value between 0.0 and 1.0).
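The split can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the pySPACE implementation: the function name `split_training_data`, the shuffle-then-cut strategy, and the rounding of the split point are all assumptions made here. Only the parameter names `consumption_rate` and `random_seed` are taken from the class signature.

```python
import random


def split_training_data(samples, consumption_rate, random_seed=0):
    """Split samples into (internal, remaining) parts.

    Hypothetical helper: shuffles with a seeded generator and cuts at
    consumption_rate * len(samples). The real node may split differently.
    """
    if not 0.0 <= consumption_rate <= 1.0:
        raise ValueError("consumption_rate must lie between 0.0 and 1.0")

    # A dedicated generator keeps the split reproducible for a given seed.
    rng = random.Random(random_seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)

    n_internal = int(round(consumption_rate * len(samples)))
    internal = [samples[i] for i in indices[:n_internal]]
    remaining = [samples[i] for i in indices[n_internal:]]
    return internal, remaining


# Ten (data, label) pairs; 80% are consumed internally, 20% are passed on.
samples = [("sample_%d" % i, "label_%d" % (i % 2)) for i in range(10)]
internal, remaining = split_training_data(samples, consumption_rate=0.8)
print(len(internal), len(remaining))
```

With a fixed seed, repeated calls yield the same partition, so the wrapped node and the successor nodes see the same data across runs.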
Note
When defining this node in the pySPACE YAML syntax, “wrapped_node” can be the definition of a node in YAML syntax (see below). The node object is then created automatically based on this definition.
Parameters
wrapped_node: The node that is trained with the internally used training data.
consumption_rate: The rate of training data that is used internally for training wrapped_node. The remaining data is supplied for the successor nodes.
random_seed: The seed of the random generator. Defaults to 0.
Exemplary Call
-
    node : ConsumeTrainingData
    parameters :
        consumption_rate : 0.8
        wrapped_node :
            node : Flow_Node
            parameters :
                input_dim : 64
                output_dim : 1
                nodes :
                    ......
Author: Jan Hendrik Metzen (jhm@informatik.uni-bremen.de)
Created: 2010/08/06
POSSIBLE NODE NAMES:
- ConsumeTrainingData
- ConsumeTrainingDataNode
POSSIBLE INPUT TYPES:
- PredictionVector
- FeatureVector
- TimeSeries
Class Components Summary
_execute(data): Executes the node on the given data vector data
_get_train_set([use_test_data]): Returns the data that can be used for training
_stop_training(): Finish the training of the node.
_train(data, label): Trains the wrapped nodes on the given data vector data
get_output_type(input_type[, as_string]): Return the output type
input_types
is_supervised(): Returns whether this node requires supervised training
is_trainable(): Returns whether this node is trainable.
node_from_yaml(node_spec): Creates a node based on the node_spec to overwrite default
request_data_for_training(use_test_data): Returns data for training of subsequent nodes
store_state(result_dir[, index]): Stores this node in the given directory result_dir
static node_from_yaml(node_spec)
Creates a node based on the node_spec to overwrite default
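How a nested specification might be turned into node objects can be sketched as below. Everything here is hypothetical and only illustrates the idea of building the wrapped node from its own definition first: `FlowNodeStub`, `ConsumeTrainingDataStub`, `NODE_REGISTRY`, and `build_node` are names invented for this sketch and are not the pySPACE API. The spec mirrors the exemplary call, with the elided `nodes` entry left out.

```python
class FlowNodeStub:
    """Hypothetical stand-in for a wrapped node; not the pySPACE Flow_Node."""

    def __init__(self, **parameters):
        self.parameters = parameters


class ConsumeTrainingDataStub:
    """Hypothetical stand-in that only stores its constructor arguments."""

    def __init__(self, wrapped_node, consumption_rate, random_seed=0):
        self.wrapped_node = wrapped_node
        self.consumption_rate = consumption_rate
        self.random_seed = random_seed


# Assumed lookup table from node names to classes (illustration only).
NODE_REGISTRY = {
    "Flow_Node": FlowNodeStub,
    "ConsumeTrainingData": ConsumeTrainingDataStub,
}


def build_node(node_spec):
    """Create a node from a spec dict, building a nested wrapped_node first."""
    parameters = dict(node_spec.get("parameters", {}))
    wrapped = parameters.get("wrapped_node")
    if isinstance(wrapped, dict):
        # The wrapped node is itself given as a spec, so build it recursively.
        parameters["wrapped_node"] = build_node(wrapped)
    return NODE_REGISTRY[node_spec["node"]](**parameters)


# Dict form of the exemplary call; the elided "nodes" entry is omitted.
spec = {
    "node": "ConsumeTrainingData",
    "parameters": {
        "consumption_rate": 0.8,
        "wrapped_node": {
            "node": "Flow_Node",
            "parameters": {"input_dim": 64, "output_dim": 1},
        },
    },
}
node = build_node(spec)
print(type(node.wrapped_node).__name__)
```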
get_output_type(input_type, as_string=True)
Return the output type
The method calls the corresponding method in the wrapped node
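The delegation to the wrapped node can be pictured with the following sketch. The classes `PredictionVectorStub`, `ClassifierStub`, and `WrapperSketch` are hypothetical, and the behaviour for `as_string=False` (returning a type object instead of a name) is an assumption made for illustration rather than documented pySPACE behaviour.

```python
class PredictionVectorStub:
    """Hypothetical placeholder for an output data type."""


class ClassifierStub:
    """Hypothetical wrapped node that always produces prediction vectors."""

    def get_output_type(self, input_type, as_string=True):
        # Assumption: the name is returned as a string, otherwise the type.
        return "PredictionVector" if as_string else PredictionVectorStub


class WrapperSketch:
    """Hypothetical wrapper that forwards the query to its wrapped node."""

    def __init__(self, wrapped_node):
        self.wrapped_node = wrapped_node

    def get_output_type(self, input_type, as_string=True):
        # The wrapper has no output type of its own; it asks the wrapped node.
        return self.wrapped_node.get_output_type(input_type, as_string)


wrapper = WrapperSketch(ClassifierStub())
output_name = wrapper.get_output_type("FeatureVector")
output_type = wrapper.get_output_type("FeatureVector", as_string=False)
print(output_name)
```

Because the wrapper forwards the call unchanged, successor nodes see whatever output type the wrapped node reports.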
input_types = ['PredictionVector', 'FeatureVector', 'TimeSeries']