SNoW Learning Architecture


[ Download | User Guide | Key Publication | Questions/Comments ]

If you wish to cite this work, please use the following.

A. Carlson, C. Cumby, J. Rosen, and D. Roth. The SNoW Learning Architecture. 1999.

The SNoW (Sparse Network of Winnows) learning architecture is a multi-class classifier specifically tailored for large-scale learning tasks and for domains in which the potential number of features taking part in decisions is very large but may be unknown a priori. It learns a sparse network of linear functions in which the target concepts (class labels) are represented as linear functions over a common feature space.
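The representation described above can be sketched as follows. This is an illustrative sketch, not the SNoW implementation: the class name, default threshold, and integer feature ids are assumptions made for the example. The key property shown is sparsity, in that a target only holds weights for features it has actually seen.

```python
class SparseLinearTarget:
    """One target concept (class label): a linear function stored sparsely
    over the common feature space."""

    def __init__(self, threshold=1.0):
        self.weights = {}          # feature id -> weight; allocated lazily
        self.threshold = threshold

    def activation(self, active_features):
        # Dot product restricted to the features active in this example.
        return sum(self.weights.get(f, 0.0) for f in active_features)

    def predict(self, active_features):
        return self.activation(active_features) >= self.threshold


# One target node per class label, all defined over the same feature space.
targets = {label: SparseLinearTarget() for label in ("class_a", "class_b")}
```

Because activations are computed only over an example's active features, evaluation cost is independent of the (possibly unknown) total size of the feature space.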

Several update rules (Winnow, Perceptron, and naive Bayes) can be used within SNoW, and the architecture inherits its generalization properties from the update rule being used. When using Winnow, for example, it is a feature-efficient learning algorithm: its mistake bound grows linearly with the number of relevant features and only logarithmically with the total number of features in the domain, and the cost of processing an example is linear in the number of features active in it.
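A hedged sketch of the positive Winnow update rule mentioned above (the promotion/demotion parameters alpha=2, beta=0.5 and the threshold theta=1 are illustrative choices, not SNoW's defaults). The update is mistake-driven and multiplicative: weights of active features are promoted on a missed positive example and demoted on a false positive, while inactive features are never touched.

```python
def winnow_update(weights, active_features, label,
                  alpha=2.0, beta=0.5, theta=1.0):
    """Mistake-driven multiplicative update over a sparse weight dict.
    `weights` maps feature id -> weight; unseen features start at 1.0."""
    activation = sum(weights.setdefault(f, 1.0) for f in active_features)
    prediction = activation >= theta
    if prediction == label:
        return                       # no mistake, no update
    factor = alpha if label else beta
    for f in active_features:        # only active features are updated
        weights[f] *= factor
```

Touching only active features on each update is what lets the per-example cost stay linear in the number of active features rather than in the size of the whole feature space.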

However, a few differences are worth mentioning relative to simply using the basic update rule. We describe them briefly here in the context of the Winnow update rule, which has been the most successful in applications.

  • Variable input size via the ``infinite attribute domain''.
  • Greater expressiveness than the basic Winnow rule. The basic Winnow update rule makes use of positive weights only. Standard augmentations, e.g., the duplication trick [Littlestone88], are infeasible in high-dimensional spaces since they diminish the gain from using variable-size examples (half of the features become active). More sophisticated approaches, such as the ``balanced'' version of Winnow, apply only to the two-class case, while SNoW is a multi-class classifier.
  • Feature pruning methods.
  • A prediction confidence mechanism [Carlson, Rosen, Roth, 2001].
  • Data driven allocation of features and links.
  • The decision support mechanism.
  • Integration with a relational feature extraction mechanism (FEX) which provides the ability to incorporate external information sources (features) in a flexible way.
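Two of the items above, data-driven allocation and multi-class decision, can be sketched together. This is a simplified illustration under assumed conventions (string labels, integer feature ids, initial link weight 1.0), not SNoW's actual code: a link from a feature to a target is allocated only when that feature first co-occurs with the target in the data, and the decision is winner-take-all over target activations.

```python
def allocate_links(networks, label, active_features, init=1.0):
    """Data-driven allocation: create a link from each active feature to
    the example's target the first time they co-occur."""
    target = networks.setdefault(label, {})
    for f in active_features:
        target.setdefault(f, init)

def decide(networks, active_features):
    """Winner-take-all decision: the target with the highest activation
    over the example's active features is predicted."""
    def activation(weights):
        return sum(weights.get(f, 0.0) for f in active_features)
    return max(networks, key=lambda t: activation(networks[t]))
```

Allocating links lazily keeps each target's network sparse even when the underlying feature space is effectively unbounded, which is what the ``infinite attribute domain'' item above refers to.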

SNoW has been used successfully on a variety of large-scale learning tasks in the natural language domain and, more recently, in the visual processing domain. It can learn and generalize from a small number of examples and thus adapts well to new environments. More details about the architecture, its interpretation as a relational system, the sparse update rules incorporated into it (Winnow, naive Bayes, Perceptron) and their theoretical justification, its relation to Valiant's neuroidal model, and other computational properties are described in the following papers.