ensemble

Module: missions.nodes.classification.ensemble

Ensemble classifiers

http://en.wikipedia.org/wiki/Ensemble_learning

Current implementations use gating for training ensemble methods.

Each gating function expects a special kind of PredictionVector as input: each component of the vector corresponds to the classification of one node chain of the ensemble. That is, the dimensionality equals the cardinality of the ensemble, each value is one of the prediction scores, and the vector carries a matching list of labels.

Such a vector can be created using the ClassificationFlowsLoader or the SameInputLayerNode.
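
As a rough illustration of the expected input, the sketch below models the ensemble output for one sample as two parallel lists, one entry per node chain. The name EnsemblePrediction and its fields are hypothetical stand-ins chosen for this example; they are not the pySPACE PredictionVector API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EnsemblePrediction:
    """Hypothetical stand-in for an ensemble PredictionVector (not the pySPACE class)."""
    labels: List[str]         # one predicted label per node chain
    predictions: List[float]  # one prediction score per node chain

    def __post_init__(self):
        # The dimensionality has to equal the cardinality of the ensemble.
        if len(self.labels) != len(self.predictions):
            raise ValueError("need exactly one label and one score per node chain")

    @property
    def ensemble_size(self) -> int:
        return len(self.labels)


# An ensemble of three node chains classifying a single sample.
sample = EnsemblePrediction(labels=["Target", "Standard", "Target"],
                            predictions=[0.8, -0.3, 0.4])
print(sample.ensemble_size)
```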

[Inheritance diagram for pySPACE.missions.nodes.classification.ensemble]

Class Summary

ProbVotingGatingNode([enforce_absolute_values]) Add up prediction values for labels to find out most probable label
LabelVotingGatingNode([enforce_absolute_values]) Gating function to classify based on the majority vote
PrecisionWeightedGatingNode(class_labels[, ...]) Gating function to classify based on weighted majority vote
ChampionGatingNode([relevant_class]) Gating function to classify with the classifier that performs best on training data
RidgeRegressionGatingNode([class_labels, ...]) Gating function using ridge regression to learn weighting
KNNGatingNode([n]) Gating function based on k-Nearest-Neighbors

Classes

ProbVotingGatingNode

class pySPACE.missions.nodes.classification.ensemble.ProbVotingGatingNode(enforce_absolute_values=False, **kwargs)[source]

Bases: pySPACE.missions.nodes.base_node.BaseNode

Add up prediction values for labels to find out most probable label

Parameters

enforce_absolute_values:

Switch to map the prediction values to their absolute value.

(optional, default: False)
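
The voting rule can be sketched in a few lines of plain Python. This is a minimal illustration under the assumption that labels and prediction values arrive as parallel lists; the function name prob_voting is hypothetical and the tie handling shown is not taken from the pySPACE source.

```python
from collections import defaultdict


def prob_voting(labels, predictions, enforce_absolute_values=False):
    """Return the label whose summed prediction values are highest."""
    sums = defaultdict(float)
    for label, value in zip(labels, predictions):
        if enforce_absolute_values:
            value = abs(value)
        sums[label] += value
    # max() returns the first maximal entry; tie handling is an assumption here.
    return max(sums, key=sums.get)


winner = prob_voting(["Target", "Standard", "Target"], [0.4, 0.9, 0.3])
print(winner)
```

With enforce_absolute_values set, a strongly negative score counts as strong evidence for its label instead of reducing that label's sum.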

Exemplary Call

-
    node : ProbVotingGating
Author:

Mario M. Krell (mario.krell@dfki.de)

Created:

2012/10/01

POSSIBLE NODE NAMES:
 
  • ProbVotingGating
  • ProbVotingGatingNode
POSSIBLE INPUT TYPES:
 
  • PredictionVector

Class Components Summary

_execute(data) Label with highest sum of prediction values wins
input_types
__init__(enforce_absolute_values=False, **kwargs)[source]
_execute(data)[source]

Label with highest sum of prediction values wins

input_types = ['PredictionVector']

LabelVotingGatingNode

class pySPACE.missions.nodes.classification.ensemble.LabelVotingGatingNode(enforce_absolute_values=False, **kwargs)[source]

Bases: pySPACE.missions.nodes.classification.ensemble.ProbVotingGatingNode

Gating function to classify based on the majority vote

This gating function counts how often each class occurs among the labels predicted by the ensemble and assigns the instance to the class that got the most votes. It does not require training. If there is no clear vote, the behavior of the base class (ProbVotingGatingNode) is used.
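
A minimal sketch of majority voting with a fallback on ties is given below. It assumes that "no clear vote" means a tie for first place and that the fallback sums prediction values per label, as the base class description suggests; the function name label_voting is hypothetical.

```python
from collections import Counter


def label_voting(labels, predictions):
    """Majority vote over labels; fall back to summed prediction values on a tie."""
    counts = Counter(labels).most_common()
    if len(counts) == 1 or counts[0][1] > counts[1][1]:
        return counts[0][0]
    # No clear vote: assumed fallback, summing the prediction values per label.
    sums = {}
    for label, value in zip(labels, predictions):
        sums[label] = sums.get(label, 0.0) + value
    return max(sums, key=sums.get)


clear = label_voting(["Target", "Standard", "Target"], [0.1, 0.9, 0.1])
tied = label_voting(["Target", "Standard"], [0.2, 0.7])
print(clear, tied)
```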

Parameters

see: base node documentation

Exemplary Call

-
    node : LabelVotingGating
Author:

Mario M. Krell (mario.krell@dfki.de)

Created:

2012/10/01

POSSIBLE NODE NAMES:
 
  • LabelVotingGatingNode
  • LabelVotingGating
  • Voting_Gating_Function
POSSIBLE INPUT TYPES:
 
  • PredictionVector

Class Components Summary

_execute(data) Executes the classifier on the given data vector data
input_types
_execute(data)[source]

Executes the classifier on the given data vector data

input_types = ['PredictionVector']

PrecisionWeightedGatingNode

class pySPACE.missions.nodes.classification.ensemble.PrecisionWeightedGatingNode(class_labels, required_vote_ratio=0.5, **kwargs)[source]

Bases: pySPACE.missions.nodes.base_node.BaseNode

Gating function to classify based on weighted majority vote

This gating function computes weights for the ensemble’s classification results based on training data. These weights are set based on the relative precision (compared to the other classification results) on the predicted class. If more than required_vote_ratio of the sum of weighted votes are for class 1, then this node classifies as class 1 from class_labels, else as class 2 from class_labels.

Parameters

class_labels:

Determines the order of the two classes. This matters when the prediction value should be negative for the first class and positive for the other one. Here it is used to define the relevant class for the voting.

required_vote_ratio:

Determines the value that the weighted share of votes for the first class has to exceed for the sample to be classified as the first class. The acceptable range is from zero to one: zero means the classification is always class one, and one means the classification is class one only if all the votes are for class one.

(optional, default: 0.5)
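
The following sketch shows one way a precision-weighted vote could work. It is an illustration under stated assumptions, not the pySPACE implementation: each classifier's vote is weighted by its training precision on the label it predicts, undefined precisions are set to 0.0, and the helper names fit_precisions and precision_weighted_vote are hypothetical.

```python
def fit_precisions(train_votes, train_labels, class_labels):
    """Per-classifier precision for each class, estimated on training data.

    train_votes[s][i] is the label that classifier i predicted for sample s.
    """
    n_classifiers = len(train_votes[0])
    precisions = []
    for i in range(n_classifiers):
        per_class = {}
        for label in class_labels:
            predicted = [s for s, votes in enumerate(train_votes) if votes[i] == label]
            correct = sum(1 for s in predicted if train_labels[s] == label)
            # Precision is undefined without predictions; 0.0 is an assumed convention.
            per_class[label] = correct / len(predicted) if predicted else 0.0
        precisions.append(per_class)
    return precisions


def precision_weighted_vote(votes, precisions, class_labels, required_vote_ratio=0.5):
    """Classify one sample from the label votes of the ensemble."""
    weights = [precisions[i][label] for i, label in enumerate(votes)]
    total = sum(weights)
    first = sum(w for w, label in zip(weights, votes) if label == class_labels[0])
    ratio = first / total if total > 0 else 0.0
    return class_labels[0] if ratio > required_vote_ratio else class_labels[1]


class_labels = ["Target", "Standard"]
train_votes = [["Target", "Target"], ["Target", "Standard"],
               ["Standard", "Target"], ["Standard", "Standard"]]
train_labels = ["Target", "Target", "Standard", "Standard"]
precisions = fit_precisions(train_votes, train_labels, class_labels)
print(precision_weighted_vote(["Target", "Standard"], precisions, class_labels))
```

In this toy data the first classifier is always right and the second is right half of the time, so the first classifier's vote dominates at the default ratio.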

Exemplary Call

-
    node : Precision_Weighted_Gating_Function
    parameters :
        class_labels : ["Target","Standard"]
        required_vote_ratio : 0.25
Author:

Jan Hendrik Metzen (jhm@informatik.uni-bremen.de)

Created:

2010/05/21

POSSIBLE NODE NAMES:
 
  • Precision_Weighted_Gating_Function
  • PrecisionWeightedGating
  • PrecisionWeightedGatingNode
POSSIBLE INPUT TYPES:
 
  • PredictionVector

Class Components Summary

_execute(data) Executes the classifier on the given data vector data
_stop_training([debug])
_train(data, class_label)
input_types
is_supervised() Returns whether this node requires supervised training
is_trainable() Returns whether this node is trainable.
__init__(class_labels, required_vote_ratio=0.5, **kwargs)[source]
is_trainable()[source]

Returns whether this node is trainable.

is_supervised()[source]

Returns whether this node requires supervised training

_train(data, class_label)[source]
_stop_training(debug=False)[source]
_execute(data)[source]

Executes the classifier on the given data vector data

input_types = ['PredictionVector']

ChampionGatingNode

class pySPACE.missions.nodes.classification.ensemble.ChampionGatingNode(relevant_class=None, **kwargs)[source]

Bases: pySPACE.missions.nodes.base_node.BaseNode

Gating function to classify with the classifier that performs best on training data

This gating function evaluates the ensemble classifiers on the training data. It picks the classifier that maximizes the F-Measure on the relevant_class and uses this one to classify instances from the test data.

Parameters

relevant_class:

Determines the class being relevant for the F-measure calculation.

(optional, default: first occurring class in training phase)
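
The selection step can be sketched as follows. This is a minimal illustration that assumes the balanced F-measure (F1) and label votes given as nested lists; the names f_measure and pick_champion are hypothetical and do not come from pySPACE.

```python
def f_measure(predicted, actual, relevant_class):
    """Balanced F-measure (F1) of the predictions with respect to relevant_class."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == relevant_class and a == relevant_class)
    fp = sum(1 for p, a in pairs if p == relevant_class and a != relevant_class)
    fn = sum(1 for p, a in pairs if p != relevant_class and a == relevant_class)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def pick_champion(train_votes, train_labels, relevant_class=None):
    """Index of the ensemble member with the highest F-measure on training data."""
    if relevant_class is None:
        relevant_class = train_labels[0]  # first class seen during training
    n_classifiers = len(train_votes[0])
    scores = [f_measure([votes[i] for votes in train_votes], train_labels, relevant_class)
              for i in range(n_classifiers)]
    return max(range(n_classifiers), key=scores.__getitem__)


train_votes = [["Target", "Target"], ["Target", "Standard"],
               ["Standard", "Target"], ["Standard", "Standard"]]
train_labels = ["Target", "Target", "Standard", "Standard"]
champion = pick_champion(train_votes, train_labels, relevant_class="Target")
# At test time only the champion's vote is used.
print(["Standard", "Target"][champion])
```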

Exemplary Call

-
    node : Champion_Gating_Function
    parameters :
        relevant_class : "Target"
Author:

Jan Hendrik Metzen (jhm@informatik.uni-bremen.de)

Created:

2010/05/21

POSSIBLE NODE NAMES:
 
  • Champion_Gating_Function
  • ChampionGatingNode
  • ChampionGating
POSSIBLE INPUT TYPES:
 
  • PredictionVector

Class Components Summary

_execute(data) Executes the classifier on the given data vector data
_stop_training()
_train(data, label)
input_types
is_supervised() Returns whether this node requires supervised training
is_trainable() Returns whether this node is trainable.
__init__(relevant_class=None, **kwargs)[source]
is_trainable()[source]

Returns whether this node is trainable.

is_supervised()[source]

Returns whether this node requires supervised training

_train(data, label)[source]
_stop_training()[source]
_execute(data)[source]

Executes the classifier on the given data vector data

input_types = ['PredictionVector']

RidgeRegressionGatingNode

class pySPACE.missions.nodes.classification.ensemble.RidgeRegressionGatingNode(class_labels=['Standard', 'Target'], use_labels=True, regularization_coefficient=0.0, classification_threshold=0.0, **kwargs)[source]

Bases: pySPACE.missions.nodes.base_node.BaseNode

Gating function using ridge regression to learn weighting

This method performs ridge regression, solving the linear least squares problem with Tikhonov regularization: weights = (A^T A + Tau^T Tau)^-1 A^T b, where A is the feature matrix, b is the class vector and Tau is the Tikhonov regularization matrix. It classifies as class 1 from class_labels if the dot product of weights and data is larger than the classification_threshold, else as class 2 from class_labels.

The regularization matrix is diag(regularization_coefficient**0.5).
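
The closed-form solution can be worked through on a small example. The sketch below is dependency-free on purpose and assumes Tau = sqrt(regularization_coefficient) * I, so that Tau^T Tau adds the coefficient to the diagonal of A^T A; the helper names solve and ridge_weights are hypothetical, and in practice a linear algebra routine such as numpy.linalg.solve would typically replace the hand-written solver.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def ridge_weights(A, b, regularization_coefficient=0.0):
    """weights = (A^T A + Tau^T Tau)^-1 A^T b with Tau = sqrt(coefficient) * I."""
    n = len(A[0])
    gram = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    for i in range(n):
        gram[i][i] += regularization_coefficient  # Tau^T Tau = coefficient * I
    atb = [sum(row[i] * target for row, target in zip(A, b)) for i in range(n)]
    return solve(gram, atb)


# Rows are samples, columns are the prediction values of two ensemble members.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, -1.0, 0.0]
plain = ridge_weights(A, b)
shrunk = ridge_weights(A, b, regularization_coefficient=1.0)
print(plain, shrunk)
```

A positive regularization coefficient shrinks the weights toward zero, which is visible when comparing the two results.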

Parameters

class_labels:

Determines the order of the two classes. This matters when the prediction value should be negative for the first class and positive for the other one. Here it is used to define the relevant class where the resulting voting value has to exceed the threshold.

(optional, default: ["Standard","Target"])

use_labels:

Should determine whether the labels are mapped to -1 and 1 or if the prediction value is used. NOT yet implemented!

(optional, default:True)

regularization_coefficient:

Coefficient of the Tikhonov regularization. With the default value of 0.0, the regularization is not active.

(optional, default: 0.0)

classification_threshold:
 

Threshold which has to be exceeded by regression, such that the sample is classified with the second class.

(optional, default:0.0)

Exemplary Call

-
    node : Ridge_Regression_Gating_Function
    parameters :
        class_labels : ["Target","Standard"]
        regularization_coefficient : 0.0
        classification_threshold : 0.2
Author:

Jan Hendrik Metzen (jhm@informatik.uni-bremen.de)

Created:

2010/05/21

POSSIBLE NODE NAMES:
 
  • RidgeRegressionGating
  • Ridge_Regression_Gating_Function
  • RidgeRegressionGatingNode
POSSIBLE INPUT TYPES:
 
  • PredictionVector

Class Components Summary

_execute(data) Executes the classifier on the given data vector data
_stop_training([debug])
_train(data, class_label)
input_types
is_supervised() Returns whether this node requires supervised training
is_trainable() Returns whether this node is trainable.
__init__(class_labels=['Standard', 'Target'], use_labels=True, regularization_coefficient=0.0, classification_threshold=0.0, **kwargs)[source]
is_trainable()[source]

Returns whether this node is trainable.

is_supervised()[source]

Returns whether this node requires supervised training

_train(data, class_label)[source]
_stop_training(debug=False)[source]
_execute(data)[source]

Executes the classifier on the given data vector data

Classifies as class 1 if the dot product of weights and data is larger than the classification threshold, else as class 2.

Todo: Check mapping

input_types = ['PredictionVector']

KNNGatingNode

class pySPACE.missions.nodes.classification.ensemble.KNNGatingNode(n=1, **kwargs)[source]

Bases: pySPACE.missions.nodes.base_node.BaseNode

Gating function based on k-Nearest-Neighbors

Parameters

n:

Number of considered neighbors

(optional, default: 1)
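
Since the description above is brief, the following sketch shows one plausible reading of k-nearest-neighbor gating. It assumes Euclidean distance between prediction vectors and a majority vote among the n nearest training vectors; neither detail is confirmed by this documentation, and the function name knn_gating is hypothetical.

```python
import math
from collections import Counter


def knn_gating(train_vectors, train_labels, query, n=1):
    """Label a prediction vector by majority vote among its n nearest training vectors."""
    order = sorted(range(len(train_vectors)),
                   key=lambda i: math.dist(train_vectors[i], query))
    nearest = [train_labels[i] for i in order[:n]]
    return Counter(nearest).most_common(1)[0][0]


# Each training vector holds the prediction values of two ensemble members.
train_vectors = [[0.9, 0.8], [0.7, 0.9], [-0.8, -0.6], [-0.5, -0.9]]
train_labels = ["Target", "Target", "Standard", "Standard"]
print(knn_gating(train_vectors, train_labels, [0.6, 0.7], n=1))
```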

Exemplary Call

-
    node : KNN_Gating_Function
    parameters :
        n : 1
Author:

Jan Hendrik Metzen (jhm@informatik.uni-bremen.de)

Created:

2010/05/21

POSSIBLE NODE NAMES:
 
  • KNNGatingNode
  • KNN_Gating_Function
  • KNNGating
POSSIBLE INPUT TYPES:
 
  • PredictionVector

Class Components Summary

_execute(data) Executes the classifier on the given data vector data
_train(data, label)
input_types
is_supervised() Returns whether this node requires supervised training
is_trainable() Returns whether this node is trainable.
__init__(n=1, **kwargs)[source]
is_trainable()[source]

Returns whether this node is trainable.

is_supervised()[source]

Returns whether this node requires supervised training

_train(data, label)[source]
_execute(data)[source]

Executes the classifier on the given data vector data

input_types = ['PredictionVector']