LBJ2.classify
Class DiscretePrimitiveStringFeature

java.lang.Object
  extended by LBJ2.classify.Feature
      extended by LBJ2.classify.DiscreteFeature
          extended by LBJ2.classify.DiscretePrimitiveStringFeature
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, java.lang.Comparable
Direct Known Subclasses:
DiscreteArrayStringFeature

public class DiscretePrimitiveStringFeature
extends DiscreteFeature

This feature is functionally equivalent to DiscretePrimitiveFeature, but its value is stored as a String instead of a ByteString. Discrete classifiers return features of this type (or DiscreteConjunctiveFeatures or DiscreteReferringFeatures that contain features of this type). Before these features are stored in a lexicon, however, they are converted to DiscretePrimitiveFeatures using the specified encoding.

See Also:
Serialized Form

Field Summary
protected  java.lang.String identifier
          The identifier string distinguishes this Feature from other Features.
protected  java.lang.String value
          The discrete value is represented as a string.
 
Fields inherited from class LBJ2.classify.DiscreteFeature
BooleanValues, totalValues, valueIndex
 
Fields inherited from class LBJ2.classify.Feature
containingPackage, generatingClassifier
 
Constructor Summary
protected DiscretePrimitiveStringFeature()
          For internal use only.
  DiscretePrimitiveStringFeature(java.lang.String p, java.lang.String c, java.lang.String i, java.lang.String v)
          Sets both the identifier and the value.
  DiscretePrimitiveStringFeature(java.lang.String p, java.lang.String c, java.lang.String i, java.lang.String v, short vi, short t)
          Sets the identifier, value, value index, and total allowable values.
 
Method Summary
 boolean classEquivalent(Feature f)
          Some features are functionally equivalent, differing only in the encoding of their values; this method returns true iff this feature and f belong to different classes that differ only in how they encode their values.
 int compareTo(java.lang.Object o)
          Used to sort features into an order that is convenient both to page through and for the lexicon to read off disk.
 Feature encode(java.lang.String e)
          Returns a feature object in which any strings that are being used to represent an identifier or value have been encoded in byte strings.
 boolean equals(java.lang.Object o)
          Two DiscretePrimitiveStringFeatures are equivalent when their containing packages, identifiers, and values are equivalent.
 ByteString getByteStringIdentifier()
          Retrieves this feature's identifier as a byte string.
 ByteString getByteStringValue()
          Gives a byte string representation of the value of this feature.
 Feature getFeatureKey(Lexicon lexicon, boolean training, int label)
          Returns the feature that should be used to index this feature into a lexicon.
 java.lang.String getStringIdentifier()
          Retrieves this feature's identifier as a string.
 java.lang.String getStringValue()
          Gives a string representation of the value of this feature.
 int hashCode()
          The hash code of a DiscretePrimitiveStringFeature is the sum of the hash codes of its containing package, identifier, and value.
 boolean hasStringIdentifier()
          Determines if this feature contains a string identifier field.
 boolean isPrimitive()
          Determines if this feature is primitive.
 void lexRead(ExceptionlessInputStream in, Lexicon lex, java.lang.String p, java.lang.String g, java.lang.String si, ByteString bi)
          Reads the representation of a feature with this object's run-time type as stored by a lexicon, overwriting the data in this object.
 java.lang.String lexWrite(ExceptionlessOutputStream out, Lexicon lex, java.lang.String c, java.lang.String p, java.lang.String g, java.lang.String si, ByteString bi)
          Writes a binary representation of the feature intended for use by a lexicon, omitting redundant information when possible.
 RealFeature makeReal()
          Returns a RealPrimitiveFeature whose value field is set to the strength of the current feature, and whose identifier field contains all the information necessary to distinguish this feature from other features.
 void read(ExceptionlessInputStream in)
          Reads the representation of a feature with this object's run-time type from the given stream, overwriting the data in this object.
 boolean valueEquals(java.lang.String v)
          Determines whether or not the parameter is equivalent to the string representation of the value of this feature.
 Feature withStrength(double s)
          Returns a new feature object that's identical to this feature except its strength is given by s.
 void write(ExceptionlessOutputStream out)
          Writes a complete binary representation of the feature.
 void write(java.lang.StringBuffer buffer)
          Writes a string representation of this Feature to the specified buffer.
 void writeNameString(java.lang.StringBuffer buffer)
          Writes a string representation of this Feature's package, generating classifier, and identifier information to the specified buffer.
 
Methods inherited from class LBJ2.classify.DiscreteFeature
conjunction, conjunctWith, getStrength, getValueIndex, isDiscrete, totalValues
 
Methods inherited from class LBJ2.classify.Feature
childLexiconLookup, clone, compareNameStrings, conjunctWith, depth, fromArray, getFeatureKey, getGeneratingClassifier, getPackage, hasByteStringIdentifier, isConjunctive, isReferrer, lexReadFeature, readFeature, removeFromChildLexicon, setArrayLength, toString, toStringNoPackage, writeNoPackage
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

identifier

protected java.lang.String identifier
The identifier string distinguishes this Feature from other Features.


value

protected java.lang.String value
The discrete value is represented as a string.

Constructor Detail

DiscretePrimitiveStringFeature

protected DiscretePrimitiveStringFeature()
For internal use only.

See Also:
Feature.readFeature(ExceptionlessInputStream)

DiscretePrimitiveStringFeature

public DiscretePrimitiveStringFeature(java.lang.String p,
                                      java.lang.String c,
                                      java.lang.String i,
                                      java.lang.String v)
Sets both the identifier and the value. The value index and total allowable values, having not been specified, default to -1 and 0 respectively.

Parameters:
p - The new discrete feature's package.
c - The name of the classifier that produced this feature.
i - The new discrete feature's identifier.
v - The new discrete feature's value.
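
The defaults described above can be pictured with a small stand-in class. This is only a sketch: StringFeatureSketch and its field names are hypothetical, and the constructor chaining is an assumption about how the defaults might be applied, not the LBJ2 source.

```java
// Hypothetical stand-in for illustration; not the LBJ2 class itself.
final class StringFeatureSketch {
    final String pkg;          // containing package
    final String classifier;   // name of the generating classifier
    final String identifier;
    final String value;
    final short valueIndex;    // -1 when unspecified
    final short totalValues;   // 0 when unspecified

    // Four-argument form: delegates with the documented defaults (-1 and 0).
    StringFeatureSketch(String p, String c, String i, String v) {
        this(p, c, i, v, (short) -1, (short) 0);
    }

    // Six-argument form: every field is given explicitly.
    StringFeatureSketch(String p, String c, String i, String v, short vi, short t) {
        pkg = p;
        classifier = c;
        identifier = i;
        value = v;
        valueIndex = vi;
        totalValues = t;
    }

    public static void main(String[] args) {
        StringFeatureSketch f = new StringFeatureSketch("my.pkg", "Tagger", "pos", "NN");
        System.out.println(f.valueIndex + " " + f.totalValues); // prints "-1 0"
    }
}
```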

DiscretePrimitiveStringFeature

public DiscretePrimitiveStringFeature(java.lang.String p,
                                      java.lang.String c,
                                      java.lang.String i,
                                      java.lang.String v,
                                      short vi,
                                      short t)
Sets the identifier, value, value index, and total allowable values.

Parameters:
p - The new discrete feature's package.
c - The name of the classifier that produced this feature.
i - The new discrete feature's identifier.
v - The new discrete feature's value.
vi - The index corresponding to the value.
t - The total allowable values for this feature.

Method Detail

hasStringIdentifier

public boolean hasStringIdentifier()
Determines if this feature contains a string identifier field.

Overrides:
hasStringIdentifier in class Feature
Returns:
true iff this feature contains a string identifier field.

isPrimitive

public boolean isPrimitive()
Determines if this feature is primitive.

Overrides:
isPrimitive in class Feature
Returns:
true iff this is primitive.

getStringIdentifier

public java.lang.String getStringIdentifier()
Retrieves this feature's identifier as a string.

Specified by:
getStringIdentifier in class Feature
Returns:
This feature's identifier as a string.

getByteStringIdentifier

public ByteString getByteStringIdentifier()
Retrieves this feature's identifier as a byte string.

Specified by:
getByteStringIdentifier in class Feature
Returns:
This feature's identifier as a byte string.

getStringValue

public java.lang.String getStringValue()
Gives a string representation of the value of this feature.

Specified by:
getStringValue in class Feature
Returns:
value.

getByteStringValue

public ByteString getByteStringValue()
Gives a byte string representation of the value of this feature.

Specified by:
getByteStringValue in class Feature
Returns:
The byte string encoding of value.

valueEquals

public boolean valueEquals(java.lang.String v)
Determines whether or not the parameter is equivalent to the string representation of the value of this feature.

Specified by:
valueEquals in class Feature
Parameters:
v - The string to compare against.
Returns:
true iff the parameter is equivalent to the string representation of the value of this feature.

getFeatureKey

public Feature getFeatureKey(Lexicon lexicon,
                             boolean training,
                             int label)
Returns the feature that should be used to index this feature into a lexicon. If it is a binary feature, the returned feature has an empty value, so that the feature maps to the same weight whether it is active or not. If the feature can take multiple values, the feature object is returned as-is.

Specified by:
getFeatureKey in class Feature
Parameters:
lexicon - The lexicon into which this feature will be indexed.
training - Whether or not the learner is currently training.
label - The label of the example containing this feature, or -1 if we aren't doing per-class feature counting.
Returns:
A feature object appropriate for use as the key of a map.
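
The keying rule described above might look roughly like the following. FeatureKeySketch is a hypothetical stand-in, and treating "binary" as "exactly two allowable values" is an assumption made for this sketch; the lexicon, training, and label parameters are left out because their role is not described here.

```java
// Hypothetical stand-in for illustration; not the LBJ2 class itself.
final class FeatureKeySketch {
    final String pkg;
    final String classifier;
    final String identifier;
    final String value;
    final short valueIndex;
    final short totalValues;

    FeatureKeySketch(String p, String c, String i, String v, short vi, short t) {
        pkg = p;
        classifier = c;
        identifier = i;
        value = v;
        valueIndex = vi;
        totalValues = t;
    }

    // Assumption: a feature is "binary" when it has exactly two allowable values.
    FeatureKeySketch getFeatureKey() {
        if (totalValues == 2) {
            // Same name, empty value: both settings of the feature share one key.
            return new FeatureKeySketch(pkg, classifier, identifier, "", (short) -1, (short) 2);
        }
        return this; // multi-valued features are their own key
    }

    public static void main(String[] args) {
        FeatureKeySketch on = new FeatureKeySketch("my.pkg", "IsCap", "cap", "true", (short) 1, (short) 2);
        FeatureKeySketch off = new FeatureKeySketch("my.pkg", "IsCap", "cap", "false", (short) 0, (short) 2);
        System.out.println(on.getFeatureKey().value.equals(off.getFeatureKey().value)); // prints "true"

        FeatureKeySketch pos = new FeatureKeySketch("my.pkg", "Tagger", "pos", "NN", (short) 3, (short) 45);
        System.out.println(pos.getFeatureKey() == pos); // prints "true"
    }
}
```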

makeReal

public RealFeature makeReal()
Returns a RealPrimitiveFeature whose value field is set to the strength of the current feature, and whose identifier field contains all the information necessary to distinguish this feature from other features.

Specified by:
makeReal in class Feature

withStrength

public Feature withStrength(double s)
Returns a new feature object that's identical to this feature except its strength is given by s.

Specified by:
withStrength in class Feature
Parameters:
s - The strength of the new feature.
Returns:
A new feature object as above, or null if this feature cannot take the specified strength.

encode

public Feature encode(java.lang.String e)
Returns a feature object in which any strings that are being used to represent an identifier or value have been encoded in byte strings.

Specified by:
encode in class Feature
Parameters:
e - The encoding to use.
Returns:
A feature object as above; possibly this object.
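
As a rough picture of what encoding a string field with a named encoding involves, the sketch below uses a plain byte[] in place of ByteString. EncodeSketch and its methods are hypothetical, and the use of java.nio.charset.Charset is an assumption made for illustration; the way LBJ2 performs the conversion is not described here.

```java
import java.nio.charset.Charset;

// Hypothetical helper for illustration; byte[] stands in for ByteString.
final class EncodeSketch {
    // Converts a string field to its bytes under the named encoding.
    static byte[] encode(String s, String encoding) {
        return s.getBytes(Charset.forName(encoding));
    }

    // Recovers the string from bytes produced with the same encoding.
    static String decode(byte[] bytes, String encoding) {
        return new String(bytes, Charset.forName(encoding));
    }

    public static void main(String[] args) {
        String value = "caf\u00e9"; // four characters, the last one non-ASCII
        System.out.println(encode(value, "UTF-8").length);      // prints "5"
        System.out.println(encode(value, "ISO-8859-1").length); // prints "4"
        System.out.println(decode(encode(value, "UTF-8"), "UTF-8").equals(value)); // prints "true"
    }
}
```

The same string can occupy a different number of bytes under different encodings, so the encoding used for writing must also be used for reading.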

hashCode

public int hashCode()
The hash code of a DiscretePrimitiveStringFeature is the sum of the hash codes of its containing package, identifier, and value.

Overrides:
hashCode in class Feature
Returns:
The hash code of this feature.

equals

public boolean equals(java.lang.Object o)
Two DiscretePrimitiveStringFeatures are equivalent when their containing packages, identifiers, and values are equivalent.

Overrides:
equals in class Feature
Parameters:
o - The object with which to compare this feature.
Returns:
true iff the parameter is an equivalent feature.
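
Read literally, the hashCode and equals descriptions could be sketched as follows. EqualitySketch is a hypothetical stand-in; whether the generating classifier also takes part in the comparison is not stated on this page, so the sketch leaves it out.

```java
// Hypothetical stand-in for illustration; not the LBJ2 class itself.
final class EqualitySketch {
    final String pkg;
    final String classifier; // deliberately ignored by equals and hashCode below
    final String identifier;
    final String value;

    EqualitySketch(String p, String c, String i, String v) {
        pkg = p;
        classifier = c;
        identifier = i;
        value = v;
    }

    // Sum of the hash codes of the containing package, identifier, and value.
    @Override
    public int hashCode() {
        return pkg.hashCode() + identifier.hashCode() + value.hashCode();
    }

    // Equivalent when package, identifier, and value are all equivalent.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof EqualitySketch)) return false;
        EqualitySketch f = (EqualitySketch) o;
        return pkg.equals(f.pkg) && identifier.equals(f.identifier) && value.equals(f.value);
    }

    public static void main(String[] args) {
        EqualitySketch a = new EqualitySketch("my.pkg", "Tagger", "pos", "NN");
        EqualitySketch b = new EqualitySketch("my.pkg", "Tagger", "pos", "NN");
        EqualitySketch c = new EqualitySketch("my.pkg", "Tagger", "pos", "VB");
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode()); // prints "true"
        System.out.println(a.equals(c)); // prints "false"
    }
}
```

Because both methods use the same three fields, equal features always produce equal hash codes, as the java.lang.Object contract requires.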

classEquivalent

public boolean classEquivalent(Feature f)
Some features are functionally equivalent, differing only in the encoding of their values; this method returns true iff this feature and f belong to different classes that differ only in how they encode their values. This method does not compare the values themselves, however.

Overrides:
classEquivalent in class Feature
Parameters:
f - Another feature.
Returns:
See above.

compareTo

public int compareTo(java.lang.Object o)
Used to sort features into an order that is convenient both to page through and for the lexicon to read off disk.

Specified by:
compareTo in interface java.lang.Comparable
Specified by:
compareTo in class Feature
Parameters:
o - An object to compare with.
Returns:
Integers appropriate for sorting features first by package, then by identifier, then by value.
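
The ordering described above (package first, then identifier, then value) can be sketched with a stand-in class. OrderingSketch is hypothetical, and plain String.compareTo is assumed for each field; how the real method compares features of different types is not covered here.

```java
import java.util.Arrays;

// Hypothetical stand-in for illustration; not the LBJ2 class itself.
final class OrderingSketch implements Comparable<OrderingSketch> {
    final String pkg;
    final String identifier;
    final String value;

    OrderingSketch(String p, String i, String v) {
        pkg = p;
        identifier = i;
        value = v;
    }

    // Compare by package, then identifier, then value; the first difference decides.
    @Override
    public int compareTo(OrderingSketch o) {
        int c = pkg.compareTo(o.pkg);
        if (c != 0) return c;
        c = identifier.compareTo(o.identifier);
        if (c != 0) return c;
        return value.compareTo(o.value);
    }

    public static void main(String[] args) {
        OrderingSketch[] features = {
            new OrderingSketch("b.pkg", "pos", "NN"),
            new OrderingSketch("a.pkg", "word", "cat"),
            new OrderingSketch("a.pkg", "pos", "VB"),
            new OrderingSketch("a.pkg", "pos", "NN"),
        };
        Arrays.sort(features);
        for (OrderingSketch f : features) {
            System.out.println(f.pkg + " " + f.identifier + " " + f.value);
        }
    }
}
```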

write

public void write(java.lang.StringBuffer buffer)
Writes a string representation of this Feature to the specified buffer.

Specified by:
write in class Feature
Parameters:
buffer - The buffer to write to.

writeNameString

public void writeNameString(java.lang.StringBuffer buffer)
Writes a string representation of this Feature's package, generating classifier, and identifier information to the specified buffer.

Overrides:
writeNameString in class Feature
Parameters:
buffer - The buffer to write to.

write

public void write(ExceptionlessOutputStream out)
Writes a complete binary representation of the feature.

Overrides:
write in class DiscreteFeature
Parameters:
out - The output stream.

read

public void read(ExceptionlessInputStream in)
Reads the representation of a feature with this object's run-time type from the given stream, overwriting the data in this object.

Overrides:
read in class DiscreteFeature
Parameters:
in - The input stream.

lexWrite

public java.lang.String lexWrite(ExceptionlessOutputStream out,
                                 Lexicon lex,
                                 java.lang.String c,
                                 java.lang.String p,
                                 java.lang.String g,
                                 java.lang.String si,
                                 ByteString bi)
Writes a binary representation of the feature intended for use by a lexicon, omitting redundant information when possible.

Overrides:
lexWrite in class DiscreteFeature
Parameters:
out - The output stream.
lex - The lexicon out of which this feature is being written.
c - The fully qualified name of the assumed class. The runtime class of this feature won't be written if it's equivalent to c.
p - The assumed package string. This feature's package string won't be written if it's equivalent to p.
g - The assumed classifier name string. This feature's classifier name string won't be written if it's equivalent to g.
si - The assumed identifier as a string. If this feature has a string identifier, it won't be written if it's equivalent to si.
bi - The assumed identifier as a byte string. If this feature has a byte string identifier, it won't be written if it's equivalent to bi.
Returns:
The name of the runtime type of this feature.

lexRead

public void lexRead(ExceptionlessInputStream in,
                    Lexicon lex,
                    java.lang.String p,
                    java.lang.String g,
                    java.lang.String si,
                    ByteString bi)
Reads the representation of a feature with this object's run-time type as stored by a lexicon, overwriting the data in this object.

This method is appropriate for reading features as written by lexWrite(ExceptionlessOutputStream,Lexicon,String,String,String,String,ByteString).

Overrides:
lexRead in class DiscreteFeature
Parameters:
in - The input stream.
lex - The lexicon we are reading in to.
p - The assumed package string. If no package name is given in the input stream, the instantiated feature is given this package.
g - The assumed classifier name string. If no classifier name is given in the input stream, the instantiated feature is given this classifier name.
si - The assumed identifier as a string. If the feature being read has a string identifier field and no identifier is given in the input stream, the feature is given this identifier.
bi - The assumed identifier as a byte string. If the feature being read has a byte string identifier field and no identifier is given in the input stream, the feature is given this identifier.
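
The idea behind the "assumed" parameters of lexWrite and lexRead, namely skipping a field on output when it matches what the reader will assume anyway, can be sketched as below. The byte layout (a one-byte presence flag followed by an optional string) and the LexIoSketch class are hypothetical, and java.io data streams stand in for the Exceptionless streams; the actual on-disk format used by LBJ2 is not described on this page.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical format for illustration; not the LBJ2 lexicon format.
final class LexIoSketch {
    // Writes a flag, then the string only if it differs from the assumed value.
    static void writeUnlessAssumed(DataOutputStream out, String s, String assumed) throws IOException {
        boolean present = !s.equals(assumed);
        out.writeBoolean(present);
        if (present) out.writeUTF(s);
    }

    // Reads the flag; falls back to the assumed value when the string was omitted.
    static String readOrAssume(DataInputStream in, String assumed) throws IOException {
        return in.readBoolean() ? in.readUTF() : assumed;
    }

    static byte[] write(String pkg, String id, String assumedPkg, String assumedId) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            writeUnlessAssumed(out, pkg, assumedPkg);
            writeUnlessAssumed(out, id, assumedId);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static String[] read(byte[] data, String assumedPkg, String assumedId) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            String pkg = readOrAssume(in, assumedPkg);
            String id = readOrAssume(in, assumedId);
            return new String[] { pkg, id };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] compact = write("my.pkg", "pos", "my.pkg", "pos");
        System.out.println(compact.length); // prints "2": only the two flags were written
        String[] back = read(compact, "my.pkg", "pos");
        System.out.println(back[0] + " " + back[1]); // prints "my.pkg pos"
    }
}
```

The reader must be given the same assumed values the writer used; otherwise omitted fields are restored incorrectly.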