analysis
Module: missions.operations.analysis
Create one plot for each possible parameter combination from a PerformanceResultSummary
This module contains implementations for analyzing data contained in a csv file (e.g. the result of a Weka Classification Operation).
An AnalysisProcess consists of evaluating the effect of several parameters on a set of metrics. For each numeric parameter, each pair of numeric parameters and each nominal parameter, one plot is created for each metric.
Furthermore, for each value of each parameter, the rows of the data where the specific parameter takes on the specific value are selected and the same analysis is done for this subset recursively.
This is useful for large experiments in which several parameters are varied. For instance, if one wants to analyze the performance for certain settings of certain parameters, one can find all plots in the respective subdirectories. If one is interested only in the performance of one classifier, one can go into the subdirectory of the respective classifier.
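The recursive selection described above can be sketched as follows. This is a minimal illustration, not the pySPACE implementation: the helper names `select_rows` and `subset_directories`, the `parameter_value` directory naming, and the choice to drop a parameter once its value is fixed are all assumptions made for the example.

```python
def select_rows(data, key, value):
    """Return the rows of a column-oriented dict where data[key] == value."""
    keep = [i for i, v in enumerate(data[key]) if v == value]
    return {k: [column[i] for i in keep] for k, column in data.items()}


def subset_directories(data, parameters, path="."):
    """List one (hypothetical) directory per recursive restriction of the data."""
    dirs = [path]
    for i, parameter in enumerate(parameters):
        # Assumption: a parameter whose value is fixed is not varied further.
        remaining = parameters[:i] + parameters[i + 1:]
        for value in sorted(set(data[parameter])):
            subset = select_rows(data, parameter, value)
            sub_path = "%s/%s_%s" % (path, parameter, value)
            dirs.extend(subset_directories(subset, remaining, sub_path))
    return dirs


data = {
    "classifier": ["svm", "svm", "lda"],
    "features": [10, 20, 10],
    "accuracy": [0.8, 0.9, 0.7],
}
dirs = subset_directories(data, ["classifier", "features"])
```

Even with two parameters and three rows this toy example yields eleven directories, which illustrates why the number of generated files grows quickly.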
Note
This operation should no longer be used, since it produces too many files. If you want to draw all interesting pictures, use comp_analysis instead. If you want only a few pictures, use the performance_results_analysis GUI.
Class Summary
AnalysisOperation(processes, operation_spec, ...): Operation to analyze and plot performance result data
AnalysisProcess(result_dir, data_dict, ...): Process for analyzing and plotting data
Classes
AnalysisOperation
-
class pySPACE.missions.operations.analysis.AnalysisOperation(processes, operation_spec, result_directory, number_processes, create_process=None)
Bases: pySPACE.missions.operations.base.Operation
Operation to analyze and plot performance result data
An AnalysisOperation loads the data from a csv-file (typically the result of a Weka Classification Operation) and evaluates the effect of various parameters on several metrics.
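As a rough illustration of the column-oriented data this operation works on, the following sketch reads a CSV file into a dictionary that maps each column name to the list of its values. The helper `load_columns` and the sample columns are assumptions for this example; pySPACE may load the file differently.

```python
import csv
import io


def load_columns(csv_file):
    """Read a CSV file object into a dict mapping column name -> list of values."""
    reader = csv.DictReader(csv_file)
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            columns[name].append(value)
    return columns


# In-memory stand-in for a result file; values are kept as strings here.
example = io.StringIO(
    "classifier,features,accuracy\n"
    "svm,10,0.8\n"
    "lda,10,0.7\n"
)
data_dict = load_columns(example)
```

The i-th row of the file corresponds to the i-th value of every list in `data_dict`.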
Class Components Summary
_createProcesses(processes, result_dir, ...): Recursive function that is used to create the analysis processes
_numberOfProcesses(number_of_processes, ...): Recursive function to determine the number of processes that ...
consolidate()
create(operation_spec, result_directory[, ...]): A factory method that creates an Analysis operation based on the ...
-
__init__(processes, operation_spec, result_directory, number_processes, create_process=None)
-
classmethod create(operation_spec, result_directory, debug=False, input_paths=[])
A factory method that creates an Analysis operation based on the information given in the operation specification operation_spec.
-
classmethod _numberOfProcesses(number_of_processes, number_of_parameter_values)
Recursive function to determine the number of processes that will be created for the given number_of_parameter_values.
-
classmethod _createProcesses(processes, result_dir, data_dict, parameters, metrics, top_level)
Recursive function that is used to create the analysis processes.
Each process creates one plot for each numeric parameter, each pair of numeric parameters, and each nominal parameter based on the data contained in the data_dict. The results are stored in result_dir. The method calls itself recursively for each value of each parameter.
-
AnalysisProcess
-
class pySPACE.missions.operations.analysis.AnalysisProcess(result_dir, data_dict, parameters, metrics)
Bases: pySPACE.missions.operations.base.Process
Process for analyzing and plotting data
An AnalysisProcess consists of evaluating the effect of several parameters on a set of metrics. For each numeric parameter, each pair of numeric parameters and each nominal parameter, one plot is created for each metric.
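The combination rule above can be made concrete with a small sketch that enumerates the plots for a given set of parameters and metrics. The function names, the plot labels, and the way a parameter is classified as numeric (every value converts to `float`) are assumptions for illustration and may differ from what pySPACE does internally.

```python
from itertools import combinations


def is_numeric(values):
    """Treat a parameter as numeric if every value converts to float (assumption)."""
    try:
        for value in values:
            float(value)
    except (TypeError, ValueError):
        return False
    return True


def planned_plots(data_dict, parameters, metrics):
    """Enumerate (plot kind, parameters, metric) triples, one per plot."""
    numeric = [p for p in parameters if is_numeric(data_dict[p])]
    nominal = [p for p in parameters if p not in numeric]
    plots = []
    for metric in metrics:
        plots += [("numeric", (p,), metric) for p in numeric]
        plots += [("numeric_vs_numeric", pair, metric)
                  for pair in combinations(numeric, 2)]
        plots += [("nominal", (p,), metric) for p in nominal]
    return plots


data_dict = {
    "classifier": ["svm", "lda"],
    "low": [1, 2],
    "high": [10, 20],
    "accuracy": [0.8, 0.7],
    "f1": [0.75, 0.6],
}
plots = planned_plots(data_dict, ["classifier", "low", "high"], ["accuracy", "f1"])
```

With two numeric parameters and one nominal parameter, each metric gets four plots in this sketch, so two metrics give eight plots.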
Expected arguments
result_dir: The directory in which the actual results are stored.
data_dict: A dictionary containing all the data. The dictionary maps an attribute (e.g. accuracy) to the list of values taken by that attribute. The i-th entry consists of the i-th values of all lists in the dictionary.
parameters: The parameters which have been varied during the experiment and whose effect on the metrics should be investigated. These must be keys of the data_dict.
metrics: The metrics that should be evaluated. These must be keys of the data_dict.
Class Components Summary
__call__(): Executes this process on the respective modality
_plot_nominal(data, result_dir, x_key, y_key): Creates a boxplot of the y_keys for the given nominal parameter x_key.
_plot_numeric(data, result_dir, x_key, y_key): Creates a plot of the y_keys for the given numeric parameter x_key.
_plot_numeric_vs_nominal(data, result_dir, ...): Plot for comparison of several different values of a nominal parameter
_plot_numeric_vs_numeric(data, result_dir, ...): Contour plot of the value_keys for the two numeric parameters axis_keys.
_scalar_metric(metric, numeric_parameters, ...): Creates the plots for a scalar metric
_sequence_metric(metric, numeric_parameters, ...): Creates the plots for a sequence metric
-
_scalar_metric(metric, numeric_parameters, nominal_parameters)
Creates the plots for a scalar metric.
-
_sequence_metric(metric, numeric_parameters, nominal_parameters, mwa_window_length)
Creates the plots for a sequence metric.
-
_plot_numeric(data, result_dir, x_key, y_key, conditions=[], one_figure=False, show_errors=False)
Creates a plot of the y_keys for the given numeric parameter x_key.
Creates a plot that visualizes the effect of varying one variable on a second one (e.g. the effect of varying the number of features on the accuracy).
Expected arguments
data: A dictionary that maps an attribute (e.g. accuracy) to the list of values taken by that attribute. The i-th entry consists of the i-th values of all lists in the dictionary.
result_dir: The directory in which the plots will be saved.
x_key: The key of the dictionary whose values should be used as values for the x-axis (the independent variable).
y_key: The key of the dictionary whose values should be used as values for the y-axis, i.e. the dependent variable.
conditions: A list of functions that must all be fulfilled for an entry to be used in the plot. Each function has to take two arguments: the data dictionary containing all entries and the index of the entry that should be checked. Each condition must return a boolean value.
one_figure: If true, all curves are plotted in the same figure. Otherwise, for each value of curve_key, a new figure is generated (currently ignored).
show_errors: If true, error bars are plotted.
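A condition, as described for the conditions argument, is a function of the data dictionary and an entry index that returns a boolean. The sketch below shows how such conditions could be written and combined; `value_equals` and `filter_entries` are hypothetical helpers for illustration and are not part of the documented class.

```python
def value_equals(key, value):
    """Build a condition that accepts entries whose data[key] equals value."""
    def condition(data, index):
        return data[key][index] == value
    return condition


def filter_entries(data, conditions):
    """Return the indices of all entries that fulfil every condition."""
    n_entries = len(next(iter(data.values())))
    return [i for i in range(n_entries)
            if all(condition(data, i) for condition in conditions)]


data = {
    "classifier": ["svm", "svm", "lda"],
    "features": [10, 20, 10],
    "accuracy": [0.8, 0.9, 0.7],
}
conditions = [
    value_equals("classifier", "svm"),
    lambda d, i: d["features"][i] >= 20,
]
selected = filter_entries(data, conditions)
```

An empty list of conditions accepts every entry, which matches the default `conditions=[]` in the signature.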
-
_plot_numeric_vs_numeric(data, result_dir, axis_keys, value_key)
Contour plot of the value_keys for the two numeric parameters axis_keys.
Creates a contour plot that visualizes the effect of varying two variables on a third one (e.g. the effect of varying the lower and upper cutoff frequency of a bandpass filter on the accuracy).
Expected arguments
data: A dictionary that maps an attribute (e.g. accuracy) to the list of values taken by that attribute. The i-th entry consists of the i-th values of all lists in the dictionary.
result_dir: The directory in which the plots will be saved.
axis_keys: The two keys of the dictionary that are assumed to have an effect on a third variable (the dependent variable).
value_key: The dependent variable whose values determine the color of the contour plot.
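A contour plot needs the dependent values arranged on a grid spanned by the two axis parameters. The following sketch shows one way to build such a grid from the column-oriented data. Averaging repeated entries per grid cell and marking missing cells with NaN are assumptions of this example; the function `contour_grid` is hypothetical and does not reproduce the pySPACE code.

```python
from collections import defaultdict


def contour_grid(data, axis_keys, value_key):
    """Arrange value_key on a grid: rows follow the second axis key, columns the first."""
    x_key, y_key = axis_keys
    cells = defaultdict(list)
    for x, y, value in zip(data[x_key], data[y_key], data[value_key]):
        cells[(x, y)].append(value)
    xs = sorted(set(data[x_key]))
    ys = sorted(set(data[y_key]))
    grid = []
    for y in ys:
        row = []
        for x in xs:
            if (x, y) in cells:
                values = cells[(x, y)]
                row.append(sum(values) / len(values))  # mean over repetitions
            else:
                row.append(float("nan"))  # combination not present in the data
        grid.append(row)
    return xs, ys, grid


data = {
    "low": [1, 1, 2, 1],
    "high": [10, 20, 20, 10],
    "accuracy": [0.5, 0.75, 1.0, 1.0],
}
xs, ys, grid = contour_grid(data, ("low", "high"), "accuracy")
```

The row-per-y layout matches what plotting functions such as matplotlib's `contourf(X, Y, Z)` expect, where `Z` has one row per y value and one column per x value.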
-
_plot_numeric_vs_nominal(data, result_dir, numeric_key, nominal_key, value_key)
Plot for comparison of several different values of a nominal parameter.
Creates a plot that visualizes the effect of varying one numeric parameter on the performance for several different values of a nominal parameter.
Expected arguments
data: A dictionary that maps an attribute (e.g. accuracy) to the list of values taken by that attribute. The i-th entry consists of the i-th values of all lists in the dictionary.
result_dir: The directory in which the plots will be saved.
numeric_key: The numeric parameter whose effect (together with the nominal parameter) on the dependent variable should be investigated.
nominal_key: The nominal parameter whose effect (together with the numeric parameter) on the dependent variable should be investigated.
value_key: The dependent variables whose values determine the color of the contour plot
-
_plot_nominal(data, result_dir, x_key, y_key)
Creates a boxplot of the y_keys for the given nominal parameter x_key.
Creates a plot that visualizes the effect of varying one nominal variable on a second one (e.g. the effect of varying the classifier on the accuracy).
Expected arguments
data: A dictionary that maps an attribute (e.g. accuracy) to the list of values taken by that attribute. The i-th entry consists of the i-th values of all lists in the dictionary.
result_dir: The directory in which the plots will be saved.
x_key: The key of the dictionary whose values should be used as values for the x-axis (the independent variable).
y_key: The key of the dictionary whose values should be used as values for the y-axis, i.e. the dependent variable.
-