analysis

Module: missions.operations.analysis

Create one plot for each possible parameter combination from a PerformanceResultSummary

This module contains implementations for analyzing data contained in a csv file (e.g. the result of a Weka Classification Operation).

An AnalysisProcess consists of evaluating the effect of several parameters on a set of metrics. For each numeric parameter, each pair of numeric parameters and each nominal parameter, one plot is created for each metric.
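
The plot combinations described above can be enumerated as in the following minimal sketch. The helper name `plan_plots`, the tuple layout, and the example parameter names are assumptions made for illustration; they are not part of pySPACE.

```python
from itertools import combinations


def plan_plots(numeric_parameters, nominal_parameters, metrics):
    """List the plots of one analysis step as (kind, parameters, metric)."""
    plots = []
    for metric in metrics:
        # One plot per single numeric parameter
        for name in numeric_parameters:
            plots.append(("numeric", (name,), metric))
        # One plot per unordered pair of numeric parameters
        for pair in combinations(numeric_parameters, 2):
            plots.append(("numeric_vs_numeric", pair, metric))
        # One plot per nominal parameter
        for name in nominal_parameters:
            plots.append(("nominal", (name,), metric))
    return plots


plots = plan_plots(["complexity", "num_features"],
                   ["classifier"],
                   ["accuracy", "f_measure"])
print(len(plots))
```

With two numeric parameters, one nominal parameter and two metrics, this gives four plots per metric.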

Furthermore, for each value of each parameter, the rows of the data where the specific parameter takes on the specific value are selected and the same analysis is done for this subset recursively.
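
Selecting the rows in which a parameter takes a specific value might look like the sketch below. It assumes the data is held as a dictionary of equally long lists; `select_rows` is a hypothetical helper, not a pySPACE function.

```python
def select_rows(data_dict, parameter, value):
    """Return a new dict of lists with only the rows where parameter == value."""
    keep = [i for i, v in enumerate(data_dict[parameter]) if v == value]
    return {key: [column[i] for i in keep] for key, column in data_dict.items()}


# Example data, invented for illustration
data = {
    "classifier": ["svm", "lda", "svm", "lda"],
    "complexity": [0.1, 0.1, 1.0, 1.0],
    "accuracy": [0.80, 0.75, 0.85, 0.78],
}
svm_only = select_rows(data, "classifier", "svm")
print(svm_only["accuracy"])
```

The same analysis can then be repeated on `svm_only`, which is what makes the procedure recursive.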

This is useful for large experiments in which several parameters are varied. For instance, to analyze the performance for certain settings of certain parameters, one can find all plots in the respective subdirectories; if one is interested only in the performance of one classifier, one can go into the subdirectory of that classifier.

Note

This operation should no longer be used, since it produces too many files. If you want to draw all interesting pictures, use comp_analysis instead. If you want only a few pictures, use the performance_results_analysis gui.

[Figure: inheritance diagram of pySPACE.missions.operations.analysis]

Class Summary

AnalysisOperation(processes, operation_spec, ...) Operation to analyze and plot performance result data
AnalysisProcess(result_dir, data_dict, ...) Process for analyzing and plotting data

Classes

AnalysisOperation

class pySPACE.missions.operations.analysis.AnalysisOperation(processes, operation_spec, result_directory, number_processes, create_process=None)[source]

Bases: pySPACE.missions.operations.base.Operation

Operation to analyze and plot performance result data

An AnalysisOperation loads the data from a csv-file (typically the result of a Weka Classification Operation) and evaluates the effect of various parameters on several metrics.
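
As a rough illustration of the loading step, the sketch below reads a csv file into a dictionary of columns using only the standard library. It assumes a comma-separated file with a header row and treats every value that parses as a float as numeric; the actual loader used by pySPACE may differ, and `load_csv_as_columns` is a hypothetical name.

```python
import csv
import io


def load_csv_as_columns(handle):
    """Read a CSV with a header row into a dict mapping column name -> list."""
    reader = csv.DictReader(handle)
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, raw in row.items():
            try:
                columns[name].append(float(raw))  # numeric value
            except ValueError:
                columns[name].append(raw)         # nominal value, kept as text
    return columns


# Hypothetical result file content, for illustration only
sample = io.StringIO(
    "classifier,complexity,accuracy\n"
    "svm,0.1,0.80\n"
    "lda,0.1,0.75\n"
)
data_dict = load_csv_as_columns(sample)
print(sorted(data_dict))
```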

Class Components Summary

_createProcesses(processes, result_dir, ...) Recursive function that is used to create the analysis processes
_numberOfProcesses(number_of_processes, ...) Recursive function to determine the number of processes that will be created
consolidate()
create(operation_spec, result_directory[, ...]) A factory method that creates an Analysis operation based on the operation specification
__init__(processes, operation_spec, result_directory, number_processes, create_process=None)[source]
classmethod create(operation_spec, result_directory, debug=False, input_paths=[])[source]

A factory method that creates an Analysis operation based on the information given in the operation specification operation_spec

classmethod _numberOfProcesses(number_of_processes, number_of_parameter_values)[source]

Recursive function to determine the number of processes that will be created for the given number_of_parameter_values

classmethod _createProcesses(processes, result_dir, data_dict, parameters, metrics, top_level)[source]

Recursive function that is used to create the analysis processes

Each process creates one plot for each numeric parameter, each pair of numeric parameters, and each nominal parameter based on the data contained in the data_dict. The results are stored in result_dir. The method calls itself recursively for each value of each parameter.
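
The recursion can be pictured with the following sketch, which only plans the directories and data subsets instead of creating real processes. The directory naming scheme and the choice to drop a parameter once its value is fixed are assumptions for illustration, and `plan_processes` is a hypothetical function, not the pySPACE implementation.

```python
import os


def plan_processes(result_dir, data_dict, parameters):
    """Return a list of (directory, subset) pairs, one per analysis step."""
    planned = [(result_dir, data_dict)]
    for parameter in parameters:
        remaining = [p for p in parameters if p != parameter]
        for value in sorted(set(data_dict[parameter])):
            keep = [i for i, v in enumerate(data_dict[parameter]) if v == value]
            subset = {k: [col[i] for i in keep] for k, col in data_dict.items()}
            sub_dir = os.path.join(result_dir, "%s_%s" % (parameter, value))
            # Recurse with the fixed parameter removed so the recursion ends
            planned.extend(plan_processes(sub_dir, subset, remaining))
    return planned


# Example data, invented for illustration
data = {
    "classifier": ["svm", "lda", "svm", "lda"],
    "complexity": [0.1, 0.1, 1.0, 1.0],
    "accuracy": [0.80, 0.75, 0.85, 0.78],
}
planned = plan_processes("results", data, ["classifier", "complexity"])
dirs = [directory for directory, _ in planned]
print(len(planned))
```

Even with two parameters of two values each, this plan already contains 13 analysis steps, which shows how quickly the number of output directories grows.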

consolidate()[source]

AnalysisProcess

class pySPACE.missions.operations.analysis.AnalysisProcess(result_dir, data_dict, parameters, metrics)[source]

Bases: pySPACE.missions.operations.base.Process

Process for analyzing and plotting data

An AnalysisProcess consists of evaluating the effect of several parameters on a set of metrics. For each numeric parameter, each pair of numeric parameters and each nominal parameter, one plot is created for each metric.

Expected arguments

result_dir: The directory in which the actual results are stored.
data_dict: A dictionary containing all the data. It maps each attribute (e.g. accuracy) to the list of values taken by that attribute. The i-th entry consists of the i-th value of every list in the dictionary.
parameters: The parameters that have been varied during the experiment and whose effect on the metrics should be investigated. These must be keys of the data_dict.
metrics: The metrics that should be evaluated. These must be keys of the data_dict.
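
To make the layout of data_dict concrete, here is a small sketch with invented values. The helpers `get_entry` and `check_keys` are hypothetical and only illustrate the two constraints stated above: entries are read column-wise by index, and parameters and metrics must be keys of the dictionary.

```python
def get_entry(data_dict, index):
    """Collect the index-th value of every column into one entry (row)."""
    return {key: column[index] for key, column in data_dict.items()}


def check_keys(data_dict, parameters, metrics):
    """Raise KeyError if a parameter or metric is not a key of data_dict."""
    missing = [k for k in list(parameters) + list(metrics) if k not in data_dict]
    if missing:
        raise KeyError("not in data_dict: %s" % ", ".join(missing))


# Example data, invented for illustration
data_dict = {
    "classifier": ["svm", "lda", "svm"],
    "complexity": [0.1, 0.1, 1.0],
    "accuracy": [0.80, 0.75, 0.85],
}
check_keys(data_dict, ["classifier", "complexity"], ["accuracy"])
print(get_entry(data_dict, 1))
```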

Class Components Summary

__call__() Executes this process on the respective modality
_plot_nominal(data, result_dir, x_key, y_key) Creates a boxplot of the y_keys for the given nominal parameter x_key.
_plot_numeric(data, result_dir, x_key, y_key) Creates a plot of the y_keys for the given numeric parameter x_key.
_plot_numeric_vs_nominal(data, result_dir, ...) Plot for comparison of several different values of a nominal parameter
_plot_numeric_vs_numeric(data, result_dir, ...) Contour plot of the value_keys for the two numeric parameters axis_keys.
_scalar_metric(metric, numeric_parameters, ...) Creates the plots for a scalar metric
_sequence_metric(metric, numeric_parameters, ...) Creates the plots for a sequence metric
__init__(result_dir, data_dict, parameters, metrics)[source]
__call__()[source]

Executes this process on the respective modality

_scalar_metric(metric, numeric_parameters, nominal_parameters)[source]

Creates the plots for a scalar metric

_sequence_metric(metric, numeric_parameters, nominal_parameters, mwa_window_length)[source]

Creates the plots for a sequence metric

_plot_numeric(data, result_dir, x_key, y_key, conditions=[], one_figure=False, show_errors=False)[source]

Creates a plot of the y_keys for the given numeric parameter x_key.

Creates a plot that visualizes the effect of varying one variable on a second one (e.g. the effect of varying the number of features on the accuracy).

Expected arguments

data: A dictionary that maps each attribute (e.g. accuracy) to the list of values taken by that attribute. The i-th entry consists of the i-th value of every list in the dictionary.
result_dir: The directory in which the plots will be saved.
x_key: The key of the dictionary whose values should be used as values for the x-axis (the independent variable).
y_key: The key of the dictionary whose values should be used as values for the y-axis (the dependent variable).
conditions: A list of functions that must all return true for an entry to be used in the plot. Each function takes two arguments: the data dictionary containing all entries and the index of the entry to check. Each condition must return a boolean value.
one_figure: If true, all curves are plotted in the same figure. Otherwise, for each value of curve_key, a new figure is generated (currently ignored).
show_errors: If true, error bars are plotted.
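
The conditions argument can be illustrated with the sketch below, which filters entries and then aggregates the y values per x value. The condition signature (data dictionary, entry index) follows the description above; the helper names, the use of the mean as curve value, and the population standard deviation as error bar are assumptions, not the documented behaviour of _plot_numeric.

```python
from collections import defaultdict
from statistics import mean, pstdev


def filter_indices(data, conditions):
    """Return indices of the entries for which every condition returns True."""
    n_entries = len(next(iter(data.values())))
    return [i for i in range(n_entries)
            if all(condition(data, i) for condition in conditions)]


def curve_points(data, x_key, y_key, conditions=()):
    """Return sorted (x, mean of y, std of y) tuples for the kept entries."""
    groups = defaultdict(list)
    for i in filter_indices(data, conditions):
        groups[data[x_key][i]].append(data[y_key][i])
    return [(x, mean(ys), pstdev(ys)) for x, ys in sorted(groups.items())]


# Each condition takes the data dictionary and the index of one entry
def is_svm(data, index):
    return data["classifier"][index] == "svm"


def low_complexity(data, index):
    return data["complexity"][index] <= 1.0


# Example data, invented for illustration
data = {
    "classifier": ["svm", "lda", "svm", "svm", "svm"],
    "complexity": [0.1, 0.1, 0.1, 1.0, 10.0],
    "accuracy": [0.5, 0.6, 0.75, 0.8, 0.9],
}
kept = filter_indices(data, [is_svm, low_complexity])
points = curve_points(data, "complexity", "accuracy", [is_svm, low_complexity])
print(kept)
```
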
_plot_numeric_vs_numeric(data, result_dir, axis_keys, value_key)[source]

Contour plot of the value_keys for the two numeric parameters axis_keys.

Creates a contour plot that visualizes the effect of varying two variables on a third one (e.g. the effect of varying the lower and upper cutoff frequency of a bandpass filter on the accuracy).

Expected arguments

data: A dictionary that maps each attribute (e.g. accuracy) to the list of values taken by that attribute. The i-th entry consists of the i-th value of every list in the dictionary.
result_dir: The directory in which the plots will be saved.
axis_keys: The two keys of the dictionary that are assumed to have an effect on a third variable (the dependent variable).
value_key: The dependent variable whose values determine the color of the contour plot.
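
Before a contour plot can be drawn, the values have to be arranged on a grid spanned by the two axis parameters. The sketch below shows one way to do this; averaging repeated measurements per grid cell is an assumption, and `build_grid` is a hypothetical helper rather than part of _plot_numeric_vs_numeric.

```python
from collections import defaultdict
from statistics import mean


def build_grid(data, axis_keys, value_key):
    """Average value_key over all entries sharing the same pair of axis values."""
    x_key, y_key = axis_keys
    cells = defaultdict(list)
    for x, y, value in zip(data[x_key], data[y_key], data[value_key]):
        cells[(x, y)].append(value)
    xs = sorted({x for x, _ in cells})
    ys = sorted({y for _, y in cells})
    # Rows follow ys, columns follow xs; None marks combinations without data
    grid = [[mean(cells[(x, y)]) if (x, y) in cells else None for x in xs]
            for y in ys]
    return xs, ys, grid


# Example data, invented for illustration
data = {
    "lower_cutoff": [0.1, 0.1, 1.0, 1.0, 0.1],
    "upper_cutoff": [4.0, 8.0, 4.0, 8.0, 4.0],
    "accuracy": [0.5, 0.75, 0.25, 1.0, 0.75],
}
xs, ys, grid = build_grid(data, ("lower_cutoff", "upper_cutoff"), "accuracy")
print(grid)
```

The row-per-y, column-per-x layout matches what plotting libraries such as matplotlib commonly expect for contour data, although cells marked None would need to be masked or filled first.
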
_plot_numeric_vs_nominal(data, result_dir, numeric_key, nominal_key, value_key)[source]

Plot for comparison of several different values of a nominal parameter

Creates a plot that visualizes the effect of varying one numeric parameter on the performance for several different values of a nominal parameter.

Expected arguments

data: A dictionary that maps each attribute (e.g. accuracy) to the list of values taken by that attribute. The i-th entry consists of the i-th value of every list in the dictionary.
result_dir: The directory in which the plots will be saved.
numeric_key: The numeric parameter whose effect (together with the nominal parameter) on the dependent variable should be investigated.
nominal_key: The nominal parameter whose effect (together with the numeric parameter) on the dependent variable should be investigated.
value_key: The dependent variable whose values determine the color of the contour plot.
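
One way to prepare the data for such a comparison is to build one curve per nominal value, as in the following sketch. Using the mean as the curve value is an assumption, and `curves_by_nominal` is a hypothetical helper, not the pySPACE implementation.

```python
from collections import defaultdict
from statistics import mean


def curves_by_nominal(data, numeric_key, nominal_key, value_key):
    """Return {nominal value: [(numeric value, mean of value_key), ...]}."""
    groups = defaultdict(lambda: defaultdict(list))
    for label, x, value in zip(data[nominal_key], data[numeric_key],
                               data[value_key]):
        groups[label][x].append(value)
    return {label: [(x, mean(vals)) for x, vals in sorted(by_x.items())]
            for label, by_x in groups.items()}


# Example data, invented for illustration
data = {
    "classifier": ["svm", "svm", "svm", "lda", "lda"],
    "complexity": [0.1, 0.1, 1.0, 0.1, 1.0],
    "accuracy": [0.5, 0.75, 1.0, 0.25, 0.5],
}
curves = curves_by_nominal(data, "complexity", "classifier", "accuracy")
print(sorted(curves))
```

Each dictionary value is then a sorted list of points that can be drawn as one line in a shared figure.
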
_plot_nominal(data, result_dir, x_key, y_key)[source]

Creates a boxplot of the y_keys for the given nominal parameter x_key.

Creates a plot that visualizes the effect of varying one nominal variable on a second one (e.g. the effect of changing the classifier on the accuracy).

Expected arguments

data: A dictionary that maps each attribute (e.g. accuracy) to the list of values taken by that attribute. The i-th entry consists of the i-th value of every list in the dictionary.
result_dir: The directory in which the plots will be saved.
x_key: The key of the dictionary whose values should be used as values for the x-axis (the independent variable).
y_key: The key of the dictionary whose values should be used as values for the y-axis (the dependent variable).
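
A boxplot needs one sample of y values per nominal value. The sketch below shows this grouping step only; `group_by_nominal` is a hypothetical helper and the data is invented for illustration.

```python
from collections import defaultdict


def group_by_nominal(data, x_key, y_key):
    """Return the sorted nominal values and one list of y values per value."""
    groups = defaultdict(list)
    for label, value in zip(data[x_key], data[y_key]):
        groups[label].append(value)
    labels = sorted(groups)
    return labels, [groups[label] for label in labels]


# Example data, invented for illustration
data = {
    "classifier": ["svm", "lda", "svm", "lda", "svm"],
    "accuracy": [0.80, 0.75, 0.85, 0.78, 0.90],
}
labels, samples = group_by_nominal(data, "classifier", "accuracy")
print(labels)
```

The list of samples is a sequence of vectors, which is the input form that boxplot functions such as matplotlib's `boxplot` accept.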