models._feature_management

Module Contents

Functions

process_or_validate_classifier_output_features(output_features, class_labels, supports_class_scores=True) Given a list of class labels and a list of output_features, validate the list and return a valid version of output_features.
is_valid_feature_list(features)
dimension_of_array_features(features)
process_or_validate_features(features, num_dimensions=None, feature_type_map=dict) Puts features into a standard form from a number of different possible forms.
process_or_validate_classifier_output_features(output_features, class_labels, supports_class_scores=True)

Given a list of class labels and a list of output_features, validate the list and return a valid version of output_features with all the correct data type information included.

is_valid_feature_list(features)
dimension_of_array_features(features)
process_or_validate_features(features, num_dimensions=None, feature_type_map=dict)

Puts features into a standard form from a number of different possible forms.

The standard form is a list of (name, datatype) 2-tuples. The name is a string and the datatype is an object as defined in the _datatype module.

The possible input forms are as follows:

  • A list of strings. In this case, the overall dimension is assumed to be the length of the list. If neighboring names are identical, they are treated as a single input array whose length is the number of repeated names. For example:

    ["a", "b", "c"]

    resolves to

    [("a", Double), ("b", Double), ("c", Double)].

    And:

    ["a", "a", "b"]

    resolves to

    [("a", Array(2)), ("b", Double)].

  • A dictionary mapping keys to feature indices or to ranges of contiguous feature indices. For example,

    {"a" : 0, "b" : [2,3], "c" : 1}

    resolves to

    [("a", Double), ("c", Double), ("b", Array(2))].

    Note that the ordering is determined by the indices.

  • A single string. In this case, the input is assumed to be a single array, with the number of dimensions set using num_dimensions.
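The three input forms above can be illustrated with a minimal sketch in plain Python. It is not the library's implementation: the function name sketch_standard_form is hypothetical, and the strings "Double" and "Array(n)" are stand-ins for the datatype objects of the _datatype module. The sketch also assumes that a dictionary value given as a bare integer is a scalar and that any list or range is an array.

```python
from itertools import groupby


def sketch_standard_form(features, num_dimensions=None):
    """Hypothetical illustration of the documented normalization rules."""
    # Single string: one array whose size comes from num_dimensions.
    if isinstance(features, str):
        if num_dimensions is None:
            raise ValueError("num_dimensions is required for a single string")
        return [(features, "Array({})".format(num_dimensions))]

    # Dictionary: a bare index is a scalar, a list or range is an array
    # (assumption); the output order follows the lowest index of each key.
    if isinstance(features, dict):
        spans = []
        for name, idx in features.items():
            if isinstance(idx, int):
                spans.append((idx, name, "Double"))
            else:
                indices = sorted(idx)
                spans.append((indices[0], name, "Array({})".format(len(indices))))
        return [(name, datatype) for _, name, datatype in sorted(spans)]

    # List of strings: each run of identical neighboring names is one feature.
    result = []
    for name, run in groupby(features):
        length = len(list(run))
        datatype = "Double" if length == 1 else "Array({})".format(length)
        result.append((name, datatype))
    return result
```

Under these assumptions, sketch_standard_form({"a": 0, "b": [2, 3], "c": 1}) orders the output by index, so "c" comes before "b", as in the dictionary example above.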

Notes:

If the features variable is in the standard form, it is simply checked and returned.

If num_dimensions is given, it is checked against the total dimension of the existing features, or used to fill in the missing dimension when features is a single string.
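The dimension check described in the notes might look roughly like the following sketch. The Double and Array classes here are hypothetical stand-ins for the real datatype objects, and check_dimensions is an illustrative name, not a function of this module. The sketch assumes each datatype exposes its size as num_elements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Double:
    """Stand-in scalar datatype (hypothetical); occupies one dimension."""
    num_elements: int = 1


@dataclass(frozen=True)
class Array:
    """Stand-in array datatype (hypothetical) with a fixed length."""
    num_elements: int


def check_dimensions(features, num_dimensions=None):
    """Return features unchanged if their total size matches num_dimensions."""
    total = sum(datatype.num_elements for _, datatype in features)
    if num_dimensions is not None and total != num_dimensions:
        raise ValueError(
            "features span {} dimensions, expected {}".format(total, num_dimensions)
        )
    return features


# A list already in standard form is checked and returned as-is.
standard = [("a", Array(2)), ("b", Double())]
checked = check_dimensions(standard, num_dimensions=3)
```

Passing a num_dimensions that disagrees with the features (for example 4 for the list above) raises ValueError in this sketch; how the library reports such a mismatch is not stated here.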