LBJ2.learn
Class LinearThresholdUnit

java.lang.Object
  extended by LBJ2.classify.Classifier
      extended by LBJ2.learn.Learner
          extended by LBJ2.learn.LinearThresholdUnit
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable
Direct Known Subclasses:
PassiveAggressive, SparseConfidenceWeighted, SparsePerceptron, SparseWinnow

public abstract class LinearThresholdUnit
extends Learner

A LinearThresholdUnit is a Learner for binary classification in which a score is computed as a linear function of a weight vector and the input example, and the decision is made by comparing the score to some threshold quantity. Deriving a linear threshold algorithm from this class gives the programmer more flexible access to the score it computes as well as its promotion and demotion methods (if it's on-line).
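
As a rough illustration of the scoring step described above, the following standalone sketch computes a sparse dot product between a weight array and an example given as parallel arrays of feature indices and values. The class name SparseScoreSketch, the dense double[] standing in for the weight vector, and the choice to add the bias to the dot product are assumptions made for illustration; none of them is taken from the LBJ2 source.

```java
// Hypothetical sketch; not part of LBJ2.
class SparseScoreSketch {
    /** Returns bias + sum over i of weights[exampleFeatures[i]] * exampleValues[i]. */
    static double score(double[] weights, double bias,
                        int[] exampleFeatures, double[] exampleValues) {
        double s = bias;
        for (int i = 0; i < exampleFeatures.length; i++) {
            int f = exampleFeatures[i];
            // Assumption: features outside the weight array contribute weight 0.
            double w = f < weights.length ? weights[f] : 0.0;
            s += w * exampleValues[i];
        }
        return s;
    }

    public static void main(String[] args) {
        double[] weights = {0.5, -1.0, 2.0};
        int[] features = {0, 2};          // only features 0 and 2 are active
        double[] values = {1.0, 3.0};
        double s = score(weights, 0.25, features, values);
        System.out.println(s >= 5.0 ? "positive" : "negative");
    }
}
```

The score is kept separate from the comparison so that a caller can inspect the raw value before it is thresholded.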

On-line, mistake-driven algorithms derived from this class need only override the promote(int[],double[],double) and demote(int[],double[],double) methods, assuming the score returned by the score(Object) method need only be compared with threshold to make a prediction. Otherwise, the Learner.classify(Object) method also needs to be overridden. If the algorithm is not mistake driven, the Learner.learn(Object) method needs to be overridden as well.
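
The division of labour described above can be sketched without the LBJ2 library as a small template-method hierarchy: the base class decides when an update is needed, and the subclass supplies only the promotion and demotion rules. MistakeDrivenSketch and AdditiveSketch are hypothetical names, the dense weight array is a simplification, and the additive (perceptron-style) update is just one possible choice of rule.

```java
// Hypothetical sketch of a mistake-driven learner; names are illustrative, not LBJ2's.
abstract class MistakeDrivenSketch {
    protected double[] weights;
    protected double threshold;
    protected double learningRate;

    MistakeDrivenSketch(int numFeatures, double rate, double threshold) {
        this.weights = new double[numFeatures];
        this.learningRate = rate;
        this.threshold = threshold;
    }

    double score(int[] f, double[] v) {
        double s = 0.0;
        for (int i = 0; i < f.length; i++) s += weights[f[i]] * v[i];
        return s;
    }

    /** Updates only when the thresholded prediction disagrees with the label. */
    void learn(int[] f, double[] v, boolean label) {
        boolean predicted = score(f, v) >= threshold;
        if (label && !predicted) promote(f, v, learningRate);
        if (!label && predicted) demote(f, v, learningRate);
    }

    abstract void promote(int[] f, double[] v, double rate);
    abstract void demote(int[] f, double[] v, double rate);
}

/** Additive (perceptron-style) updates, chosen here only as an example. */
class AdditiveSketch extends MistakeDrivenSketch {
    AdditiveSketch(int n, double rate, double threshold) { super(n, rate, threshold); }

    @Override
    void promote(int[] f, double[] v, double rate) {
        for (int i = 0; i < f.length; i++) weights[f[i]] += rate * v[i];
    }

    @Override
    void demote(int[] f, double[] v, double rate) {
        for (int i = 0; i < f.length; i++) weights[f[i]] -= rate * v[i];
    }
}
```

A multiplicative rule could be substituted by changing only the two overridden methods, which is the flexibility the paragraph above refers to.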

It is assumed that Learner.labeler is a single discrete classifier that produces the same feature for every example object and that the values that feature may take are available through the Classifier.allowableValues() method. The first value returned from Classifier.allowableValues() is treated as "negative", and it is assumed there are exactly 2 allowable values. Assertions will produce error messages if these assumptions do not hold.

Fitting a "thick separator" instead of just a hyperplane is also supported through this class.
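
One way to read the "thick separator" idea is that an example triggers an update not only when it is misclassified but also when it falls inside a margin around the threshold. The sketch below expresses that reading with two predicates whose parameter lists mirror shouldPromote and shouldDemote. The exact inequalities are an assumption made for illustration and may differ from what LBJ2 implements.

```java
// Hypothetical sketch; the inequalities below are an assumption, not quoted from LBJ2.
class ThickSeparatorSketch {
    /** A positive example is left alone only when it scores at least threshold + positiveThickness. */
    static boolean shouldPromote(boolean label, double s,
                                 double threshold, double positiveThickness) {
        return label && s < threshold + positiveThickness;
    }

    /** A negative example is left alone only when it scores below threshold - negativeThickness. */
    static boolean shouldDemote(boolean label, double s,
                                double threshold, double negativeThickness) {
        return !label && s >= threshold - negativeThickness;
    }
}
```

With both thicknesses set to 0, these predicates reduce to the plain mistake-driven condition of comparing the score with the threshold.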

This algorithm's user-configurable parameters are stored in member fields of this class. They may be set via either a constructor that names each parameter explicitly or a constructor that takes an instance of Parameters as input. The documentation in each member field in this class indicates the default value of the associated parameter when using the former type of constructor. The documentation of the associated member field in the Parameters class indicates the default value of the parameter when using the latter type of constructor.

See Also:
Serialized Form

Nested Class Summary
static class LinearThresholdUnit.Parameters
          Simply a container for all of LinearThresholdUnit's configurable parameters.
 
Field Summary
protected  java.lang.String[] allowableValues
          The label producing classifier's allowable values.
protected  double bias
          The bias is stored here rather than as an element of the weight vector.
static double defaultInitialWeight
          Default for initialWeight.
static double defaultLearningRate
          Default value for learningRate.
static double defaultThickness
          Default for positiveThickness.
static double defaultThreshold
          Default for threshold.
static SparseWeightVector defaultWeightVector
          Default for weightVector.
protected  double initialWeight
          The weight associated with a feature when first added to the vector; default defaultInitialWeight.
protected  double learningRate
          The rate at which weights are updated; default defaultLearningRate.
protected  double negativeThickness
          The thickness of the hyperplane on the negative side; default equal to positiveThickness.
protected  double positiveThickness
          The thickness of the hyperplane on the positive side; default defaultThickness.
protected  double threshold
          The score is compared against this value to make predictions; default defaultThreshold.
protected  SparseWeightVector weightVector
          The LTU's weight vector; default is an empty vector.
 
Fields inherited from class LBJ2.learn.Learner
encoding, extractor, labeler, labelLexicon, lcFilePath, lexFilePath, lexicon, predictions, readLexiconOnDemand
 
Fields inherited from class LBJ2.classify.Classifier
containingPackage, name
 
Constructor Summary
  LinearThresholdUnit()
          Default constructor.
  LinearThresholdUnit(double r)
          Initializing constructor.
  LinearThresholdUnit(double r, double t)
          Sets the learning rate and threshold to the specified values, while the name of the classifier gets the empty string.
  LinearThresholdUnit(double r, double t, double pt)
          Use this constructor to fit a thick separator, where both the positive and negative sides of the hyperplane will be given the specified thickness, while the name of the classifier gets the empty string.
  LinearThresholdUnit(double r, double t, double pt, double nt)
          Use this constructor to fit a thick separator, where the positive and negative sides of the hyperplane will be given the specified separate thicknesses, while the name of the classifier gets the empty string.
protected LinearThresholdUnit(LinearThresholdUnit.Parameters p)
          Initializing constructor.
protected LinearThresholdUnit(java.lang.String n)
          Initializing constructor.
protected LinearThresholdUnit(java.lang.String n, double r)
          Initializing constructor.
protected LinearThresholdUnit(java.lang.String n, double r, double t)
          Initializing constructor.
protected LinearThresholdUnit(java.lang.String n, double r, double t, double pt)
          Initializing constructor.
protected LinearThresholdUnit(java.lang.String n, double r, double t, double pt, double nt)
          Initializing constructor.
protected LinearThresholdUnit(java.lang.String n, double r, double t, double pt, double nt, SparseWeightVector v)
          Initializing constructor.
protected LinearThresholdUnit(java.lang.String n, LinearThresholdUnit.Parameters p)
          Initializing constructor.
 
Method Summary
 java.lang.String[] allowableValues()
          Returns the array of allowable values that a feature returned by this classifier may take.
 FeatureVector classify(int[] exampleFeatures, double[] exampleValues)
          The default evaluation method simply computes the score for the example and returns a DiscretePrimitiveStringFeature set to the second value from the label classifier's array of allowable values if the score is greater than or equal to threshold, or to the first value otherwise.
 java.lang.Object clone()
          Returns a deep clone of this learning algorithm.
 double computeLearningRate(int[] exampleFeatures, double[] exampleValues, double s, boolean label)
          Computes the value of the learningRate variable if needed and returns the value.
abstract  void demote(int[] exampleFeatures, double[] exampleValues, double rate)
          If the LinearThresholdUnit is mistake driven, this method should be overridden and used to update the internal representation when a mistake is made on a negative example.
 java.lang.String discreteValue(int[] exampleFeatures, double[] exampleValues)
          The default evaluation method simply computes the score for the example and returns the second value from the label classifier's array of allowable values if the score is greater than or equal to threshold, or the first value otherwise.
 Feature featureValue(int[] f, double[] v)
          Returns the classification of the given example as a single feature instead of a FeatureVector.
 void forget()
          Resets the weight vector to associate the default weight with all features.
 double getInitialWeight()
          Returns the current value of the initialWeight variable.
 double getNegativeThickness()
          Returns the current value of the negativeThickness variable.
 Learner.Parameters getParameters()
          Retrieves the parameters that are set in this learner.
 double getPositiveThickness()
          Returns the current value of the positiveThickness variable.
 double getThreshold()
          Returns the current value of the threshold variable.
 void initialize(int numExamples, int numFeatures)
          Initializes the weight vector array to the size of the specified number of features, setting each weight equal to initialWeight.
 void learn(int[] exampleFeatures, double[] exampleValues, int[] exampleLabels, double[] labelValues)
          The default training algorithm for a linear threshold unit consists of evaluating the example object with the score(Object) method and threshold, checking the result of evaluation against the label, and, if they are different, promoting when the label is positive or demoting when the label is negative.
abstract  void promote(int[] exampleFeatures, double[] exampleValues, double rate)
          If the LinearThresholdUnit is mistake driven, this method should be overridden and used to update the internal representation when a mistake is made on a positive example.
 void read(ExceptionlessInputStream in)
          Reads the binary representation of a learner with this object's run-time type, overwriting any and all learned or manually specified parameters as well as the label lexicon but without modifying the feature lexicon.
 double score(int[] exampleFeatures, double[] exampleValues)
          Computes the score for the specified example vector which will be thresholded to make the binary classification.
 double score(java.lang.Object example)
          Computes the score for the specified example vector which will be thresholded to make the binary classification.
 ScoreSet scores(int[] exampleFeatures, double[] exampleValues)
          An LTU returns two scores: one for the negative classification and one for the positive classification.
 void setInitialWeight(double w)
          Sets the initialWeight member variable to the specified value.
 void setLabeler(Classifier l)
          Sets the label producing classifier.
 void setNegativeThickness(double t)
          Sets the negativeThickness member variable to the specified value.
 void setParameters(LinearThresholdUnit.Parameters p)
          Sets the values of parameters that control the behavior of this learning algorithm.
 void setPositiveThickness(double t)
          Sets the positiveThickness member variable to the specified value.
 void setThickness(double t)
          Sets the positiveThickness and negativeThickness member variables to the specified value.
 void setThreshold(double t)
          Sets the threshold member variable to the specified value.
 boolean shouldDemote(boolean label, double s, double threshold, double negativeThickness)
          Determines if the weights should be demoted.
 boolean shouldPromote(boolean label, double s, double threshold, double positiveThickness)
          Determines if the weights should be promoted.
 void write(ExceptionlessOutputStream out)
          Writes the learned function's internal representation in binary form.
 
Methods inherited from class LBJ2.learn.Learner
classify, classify, classify, classify, countFeatures, createPrediction, createPrediction, demandLexicon, discreteValue, discreteValue, doneLearning, doneWithRound, emptyClone, featureValue, featureValue, getExampleArray, getExampleArray, getExtractor, getLabeler, getLabelLexicon, getLexicon, getLexiconDiscardCounts, getLexiconLocation, getModelLocation, getPrunedLexiconSize, learn, learn, learn, learn, read, readLabelLexicon, readLearner, readLearner, readLearner, readLearner, readLearner, readLearner, readLexicon, readLexicon, readLexiconOnDemand, readLexiconOnDemand, readModel, readModel, readParameters, realValue, realValue, realValue, save, saveLexicon, saveModel, scores, scores, setEncoding, setExtractor, setLabelLexicon, setLexicon, setLexiconLocation, setLexiconLocation, setModelLocation, setModelLocation, setParameters, unclone, write, write, writeLexicon, writeModel, writeParameters
 
Methods inherited from class LBJ2.classify.Classifier
classify, discreteValueArray, getCompositeChildren, getInputType, getOutputType, realValueArray, test, toString, valueIndexOf
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

defaultInitialWeight

public static final double defaultInitialWeight
Default for initialWeight.

See Also:
Constant Field Values

defaultThreshold

public static final double defaultThreshold
Default for threshold.

See Also:
Constant Field Values

defaultThickness

public static final double defaultThickness
Default for positiveThickness.

See Also:
Constant Field Values

defaultLearningRate

public static final double defaultLearningRate
Default value for learningRate.

See Also:
Constant Field Values

defaultWeightVector

public static final SparseWeightVector defaultWeightVector
Default for weightVector.


learningRate

protected double learningRate
The rate at which weights are updated; default defaultLearningRate.


weightVector

protected SparseWeightVector weightVector
The LTU's weight vector; default is an empty vector.


initialWeight

protected double initialWeight
The weight associated with a feature when first added to the vector; default defaultInitialWeight.


threshold

protected double threshold
The score is compared against this value to make predictions; default defaultThreshold.


bias

protected double bias
The bias is stored here rather than as an element of the weight vector.


positiveThickness

protected double positiveThickness
The thickness of the hyperplane on the positive side; default defaultThickness.


negativeThickness

protected double negativeThickness
The thickness of the hyperplane on the negative side; default equal to positiveThickness.


allowableValues

protected java.lang.String[] allowableValues
The label producing classifier's allowable values.

Constructor Detail

LinearThresholdUnit

public LinearThresholdUnit()
Default constructor. The learning rate and threshold take default values, while the name of the classifier gets the empty string.


LinearThresholdUnit

public LinearThresholdUnit(double r)
Initializing constructor. Sets the learning rate to the specified value, and the threshold and thickness take the default, while the name of the classifier gets the empty string.

Parameters:
r - The desired learning rate.

LinearThresholdUnit

public LinearThresholdUnit(double r,
                           double t)
Sets the learning rate and threshold to the specified values, while the name of the classifier gets the empty string.

Parameters:
r - The desired learning rate value.
t - The desired threshold value.

LinearThresholdUnit

public LinearThresholdUnit(double r,
                           double t,
                           double pt)
Use this constructor to fit a thick separator, where both the positive and negative sides of the hyperplane will be given the specified thickness, while the name of the classifier gets the empty string.

Parameters:
r - The desired learning rate value.
t - The desired threshold value.
pt - The desired thickness.

LinearThresholdUnit

public LinearThresholdUnit(double r,
                           double t,
                           double pt,
                           double nt)
Use this constructor to fit a thick separator, where the positive and negative sides of the hyperplane will be given the specified separate thicknesses, while the name of the classifier gets the empty string.

Parameters:
r - The desired learning rate value.
t - The desired threshold value.
pt - The desired positive thickness.
nt - The desired negative thickness.

LinearThresholdUnit

protected LinearThresholdUnit(java.lang.String n)
Initializing constructor. Sets the threshold, positive thickness, and negative thickness to their default values.

Parameters:
n - The name of the classifier.

LinearThresholdUnit

protected LinearThresholdUnit(java.lang.String n,
                              double r)
Initializing constructor. Sets the threshold, positive thickness, and negative thickness to their default values.

Parameters:
n - The name of the classifier.
r - The desired learning rate.

LinearThresholdUnit

protected LinearThresholdUnit(java.lang.String n,
                              double r,
                              double t)
Initializing constructor. Sets the threshold to the specified value, while the positive and negative thicknesses get their defaults.

Parameters:
n - The name of the classifier.
r - The desired learning rate.
t - The desired value for the threshold.

LinearThresholdUnit

protected LinearThresholdUnit(java.lang.String n,
                              double r,
                              double t,
                              double pt)
Initializing constructor. Sets the threshold and positive thickness to the specified values, and the negative thickness is set to the same value as the positive thickness.

Parameters:
n - The name of the classifier.
r - The desired learning rate.
t - The desired value for the threshold.
pt - The desired thickness.

LinearThresholdUnit

protected LinearThresholdUnit(java.lang.String n,
                              double r,
                              double t,
                              double pt,
                              double nt)
Initializing constructor. Sets the threshold, positive thickness, and negative thickness to the specified values.

Parameters:
n - The name of the classifier.
r - The desired learning rate.
t - The desired value for the threshold.
pt - The desired positive thickness.
nt - The desired negative thickness.

LinearThresholdUnit

protected LinearThresholdUnit(java.lang.String n,
                              double r,
                              double t,
                              double pt,
                              double nt,
                              SparseWeightVector v)
Initializing constructor. Sets the threshold, positive thickness, and negative thickness to the specified values.

Parameters:
n - The name of the classifier.
r - The desired learning rate.
t - The desired value for the threshold.
pt - The desired positive thickness.
nt - The desired negative thickness.
v - An initial weight vector.

LinearThresholdUnit

protected LinearThresholdUnit(LinearThresholdUnit.Parameters p)
Initializing constructor. Sets all member variables to their associated settings in the LinearThresholdUnit.Parameters object. The name of the classifier is the empty string.

Parameters:
p - The settings of all parameters.

LinearThresholdUnit

protected LinearThresholdUnit(java.lang.String n,
                              LinearThresholdUnit.Parameters p)
Initializing constructor. Sets all member variables to their associated settings in the LinearThresholdUnit.Parameters object.

Parameters:
n - The name of the classifier.
p - The settings of all parameters.
Method Detail

setParameters

public void setParameters(LinearThresholdUnit.Parameters p)
Sets the values of parameters that control the behavior of this learning algorithm.

Parameters:
p - The parameters.

getParameters

public Learner.Parameters getParameters()
Retrieves the parameters that are set in this learner.

Overrides:
getParameters in class Learner
Returns:
An object containing all the values of the parameters that control the behavior of this learning algorithm.

setLabeler

public void setLabeler(Classifier l)
Sets the label producing classifier.

Overrides:
setLabeler in class Learner
Parameters:
l - A new label producing classifier.

getInitialWeight

public double getInitialWeight()
Returns the current value of the initialWeight variable.

Returns:
The value of the initialWeight variable.

setInitialWeight

public void setInitialWeight(double w)
Sets the initialWeight member variable to the specified value.

Parameters:
w - The new value for initialWeight.

getThreshold

public double getThreshold()
Returns the current value of the threshold variable.

Returns:
The value of the threshold variable.

setThreshold

public void setThreshold(double t)
Sets the threshold member variable to the specified value.

Parameters:
t - The new value for threshold.

getPositiveThickness

public double getPositiveThickness()
Returns the current value of the positiveThickness variable.

Returns:
The value of the positiveThickness variable.

setPositiveThickness

public void setPositiveThickness(double t)
Sets the positiveThickness member variable to the specified value.

Parameters:
t - The new value for positiveThickness.

getNegativeThickness

public double getNegativeThickness()
Returns the current value of the negativeThickness variable.

Returns:
The value of the negativeThickness variable.

setNegativeThickness

public void setNegativeThickness(double t)
Sets the negativeThickness member variable to the specified value.

Parameters:
t - The new value for negativeThickness.

setThickness

public void setThickness(double t)
Sets the positiveThickness and negativeThickness member variables to the specified value.

Parameters:
t - The new thickness value.

allowableValues

public java.lang.String[] allowableValues()
Returns the array of allowable values that a feature returned by this classifier may take.

Overrides:
allowableValues in class Classifier
Returns:
If a labeler has not yet been established for this LTU, byte strings equivalent to { "*", "*" } are returned, which indicates to the compiler that classifiers derived from this learner will return features that take one of two values that are specified in the source code. Otherwise, the allowable values of the labeler are returned.
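
The fallback behaviour described in the return value can be pictured with the following standalone sketch. AllowableValuesSketch and its setLabelerValues method are hypothetical stand-ins; in particular, a plain String[] replaces the labeler object, so this shows only the "placeholder until a labeler exists" logic, not LBJ2's actual implementation.

```java
// Hypothetical sketch; a String[] stands in for the label producing classifier.
class AllowableValuesSketch {
    private String[] labelerValues; // null until a labeler has been established

    void setLabelerValues(String[] values) {
        labelerValues = values;
    }

    /** Returns { "*", "*" } as a placeholder when no labeler is available yet. */
    String[] allowableValues() {
        if (labelerValues == null) return new String[] { "*", "*" };
        return labelerValues;
    }
}
```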

learn

public void learn(int[] exampleFeatures,
                  double[] exampleValues,
                  int[] exampleLabels,
                  double[] labelValues)
The default training algorithm for a linear threshold unit consists of evaluating the example object with the score(Object) method and threshold, checking the result of evaluation against the label, and, if they are different, promoting when the label is positive or demoting when the label is negative.

This method does not call Learner.classify(Object); it calls score(Object) directly.

Specified by:
learn in class Learner
Parameters:
exampleFeatures - The example's array of feature indices
exampleValues - The example's array of feature values
exampleLabels - The example's label(s)
labelValues - The labels' values

computeLearningRate

public double computeLearningRate(int[] exampleFeatures,
                                  double[] exampleValues,
                                  double s,
                                  boolean label)
Computes the value of the learningRate variable if needed and returns the value. By default, the current value of learningRate is returned.

Parameters:
exampleFeatures - The example's array of feature indices
exampleValues - The example's array of feature values
s - The score of the example object
label - The label of the example object
Returns:
The computed value of the learningRate variable
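
A subclass might override this method when the step size should depend on the example rather than stay fixed. The sketch below contrasts the default behaviour described above with a hypothetical override that returns a hinge loss divided by the squared norm of the example, in the spirit of margin-based on-line updates. FixedRateSketch, MarginRateSketch, the initial rate of 0.1, and the formula itself are illustrative assumptions and are not claimed to match any LBJ2 subclass.

```java
// Hypothetical sketch; class names and the margin-based formula are illustrative assumptions.
class FixedRateSketch {
    protected double learningRate = 0.1;

    /** Default behaviour: ignore the example and return the stored rate. */
    double computeLearningRate(int[] exampleFeatures, double[] exampleValues,
                               double s, boolean label) {
        return learningRate;
    }
}

class MarginRateSketch extends FixedRateSketch {
    /** Returns max(0, 1 - y*s) divided by the squared norm of the example values. */
    @Override
    double computeLearningRate(int[] exampleFeatures, double[] exampleValues,
                               double s, boolean label) {
        double y = label ? 1.0 : -1.0;
        double loss = Math.max(0.0, 1.0 - y * s);
        double normSquared = 0.0;
        for (double v : exampleValues) normSquared += v * v;
        // An all-zero example cannot be updated, so return a rate of 0.
        return normSquared == 0.0 ? 0.0 : loss / normSquared;
    }
}
```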

shouldPromote

public boolean shouldPromote(boolean label,
                             double s,
                             double threshold,
                             double positiveThickness)
Determines if the weights should be promoted.

Parameters:
label - The label of the example object
s - The score of the example object
threshold - The LTU threshold
positiveThickness - The thickness of the hyperplane on the positive side
Returns:
True if the weights should be promoted, false otherwise.

shouldDemote

public boolean shouldDemote(boolean label,
                            double s,
                            double threshold,
                            double negativeThickness)
Determines if the weights should be demoted.

Parameters:
label - The label of the example object
s - The score of the example object
threshold - The LTU threshold
negativeThickness - The thickness of the hyperplane on the negative side
Returns:
True if the weights should be demoted, false otherwise.

initialize

public void initialize(int numExamples,
                       int numFeatures)
Initializes the weight vector array to the size of the specified number of features, setting each weight equal to initialWeight.

Overrides:
initialize in class Learner
Parameters:
numExamples - The number of examples that will be observed during training.
numFeatures - The number of features that will be observed during training.

scores

public ScoreSet scores(int[] exampleFeatures,
                       double[] exampleValues)
An LTU returns two scores: one for the negative classification and one for the positive classification. By default, the score for the positive classification is the result of score(Object) minus the threshold, and the score for the negative classification is the opposite of the positive classification's score.

Specified by:
scores in class Learner
Parameters:
exampleFeatures - The example's array of feature indices
exampleValues - The example's array of feature values
Returns:
Two scores as described above.
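
The arithmetic behind the two default scores is small enough to show directly. In this sketch a two-element array stands in for LBJ2's ScoreSet, with index 0 for the negative label and index 1 for the positive label; the class name TwoScoresSketch and that array layout are assumptions made for illustration.

```java
// Hypothetical sketch; a double[] stands in for LBJ2's ScoreSet.
class TwoScoresSketch {
    /** Index 0 holds the negative label's score, index 1 the positive label's. */
    static double[] scores(double rawScore, double threshold) {
        double positive = rawScore - threshold;
        return new double[] { -positive, positive };
    }

    public static void main(String[] args) {
        double[] result = scores(2.5, 1.0);
        System.out.println("negative: " + result[0] + ", positive: " + result[1]);
    }
}
```

Because the two scores are negations of each other, the label with the larger score is the positive one whenever the raw score exceeds the threshold.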

featureValue

public Feature featureValue(int[] f,
                            double[] v)
Returns the classification of the given example as a single feature instead of a FeatureVector.

Overrides:
featureValue in class Learner
Parameters:
f - The features array.
v - The values array.
Returns:
The classification of the example as a feature.

discreteValue

public java.lang.String discreteValue(int[] exampleFeatures,
                                      double[] exampleValues)
The default evaluation method simply computes the score for the example and returns the second value from the label classifier's array of allowable values if the score is greater than or equal to threshold, or the first value otherwise.

Overrides:
discreteValue in class Learner
Parameters:
exampleFeatures - The example's array of feature indices
exampleValues - The example's array of feature values
Returns:
The predicted value, which is one of the label classifier's allowable values.

classify

public FeatureVector classify(int[] exampleFeatures,
                              double[] exampleValues)
The default evaluation method simply computes the score for the example and returns a DiscretePrimitiveStringFeature set to the second value from the label classifier's array of allowable values if the score is greater than or equal to threshold, or to the first value otherwise.

Specified by:
classify in class Learner
Parameters:
exampleFeatures - The example's array of feature indices
exampleValues - The example's array of feature values
Returns:
The computed feature (in a vector).
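
The default decision rule shared by discreteValue and classify can be summarised in a few lines. The sketch below returns a plain string rather than a feature object, and ThresholdDecisionSketch is a hypothetical name; it illustrates only the comparison and the first-is-negative, second-is-positive convention stated in the class description.

```java
// Hypothetical sketch; returns a String where LBJ2 would wrap the value in a feature.
class ThresholdDecisionSketch {
    /** Picks allowableValues[1] when score >= threshold, otherwise allowableValues[0]. */
    static String discreteValue(double score, double threshold, String[] allowableValues) {
        return score >= threshold ? allowableValues[1] : allowableValues[0];
    }

    public static void main(String[] args) {
        String[] labels = { "false", "true" }; // first is negative, second is positive
        System.out.println(discreteValue(0.3, 0.0, labels));
        System.out.println(discreteValue(-0.3, 0.0, labels));
    }
}
```

Note that a score exactly equal to the threshold is classified as positive under this rule.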

score

public double score(java.lang.Object example)
Computes the score for the specified example vector which will be thresholded to make the binary classification.

Parameters:
example - The example object.
Returns:
The score for the given example vector.

score

public double score(int[] exampleFeatures,
                    double[] exampleValues)
Computes the score for the specified example vector which will be thresholded to make the binary classification.

Parameters:
exampleFeatures - The example's array of feature indices
exampleValues - The example's array of feature values
Returns:
The score for the given example vector.

forget

public void forget()
Resets the weight vector to associate the default weight with all features.

Overrides:
forget in class Learner

promote

public abstract void promote(int[] exampleFeatures,
                             double[] exampleValues,
                             double rate)
If the LinearThresholdUnit is mistake driven, this method should be overridden and used to update the internal representation when a mistake is made on a positive example.

Parameters:
exampleFeatures - The example's array of feature indices
exampleValues - The example's array of feature values
rate - The learning rate at which the weights are updated.

demote

public abstract void demote(int[] exampleFeatures,
                            double[] exampleValues,
                            double rate)
If the LinearThresholdUnit is mistake driven, this method should be overridden and used to update the internal representation when a mistake is made on a negative example.

Parameters:
exampleFeatures - The example's array of feature indices
exampleValues - The example's array of feature values
rate - The learning rate at which the weights are updated.

write

public void write(ExceptionlessOutputStream out)
Writes the learned function's internal representation in binary form.

Overrides:
write in class Learner
Parameters:
out - The output stream.

read

public void read(ExceptionlessInputStream in)
Reads the binary representation of a learner with this object's run-time type, overwriting any and all learned or manually specified parameters as well as the label lexicon but without modifying the feature lexicon.

Overrides:
read in class Learner
Parameters:
in - The input stream.

clone

public java.lang.Object clone()
Returns a deep clone of this learning algorithm.

Overrides:
clone in class Learner
Returns:
A deep clone of this learning algorithm.