NaiveBayes (Spark 2.0.0 JavaDoc)

Object
- org.apache.spark.mllib.classification.NaiveBayes

All Implemented Interfaces:

java.io.Serializable
```
public class NaiveBayes
extends Object
implements scala.Serializable
```
Trains a Naive Bayes model given an RDD of (label, features) pairs.
This is the Multinomial NB (http://tinyurl.com/lsdw6p) which can handle all kinds of discrete data. For example, by converting documents into TF-IDF vectors, it can be used for document classification. By making every vector a 0-1 vector, it can also be used as Bernoulli NB (http://tinyurl.com/p7c96j6). The input feature values must be nonnegative.
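
To make the description above concrete, the following plain-Java sketch shows how a multinomial Naive Bayes model with additive smoothing can be fit and applied. It is an illustration under stated assumptions, not Spark's implementation: the class `MultinomialNbSketch` and its methods are hypothetical names, and the inputs are pre-aggregated label counts and per-label feature sums rather than an RDD.

```java
// Plain-Java sketch of multinomial Naive Bayes with additive smoothing.
// All names are hypothetical; this is not Spark's implementation.
final class MultinomialNbSketch {

    // Log prior per label: log((n_c + lambda) / (N + lambda * numLabels)).
    static double[] logPriors(double[] labelCounts, double lambda) {
        double total = 0.0;
        for (double n : labelCounts) {
            total += n;
        }
        double logDenom = Math.log(total + lambda * labelCounts.length);
        double[] pi = new double[labelCounts.length];
        for (int c = 0; c < pi.length; c++) {
            pi[c] = Math.log(labelCounts[c] + lambda) - logDenom;
        }
        return pi;
    }

    // Log conditional per label c and feature j:
    // log((sum_cj + lambda) / (sum_c + lambda * numFeatures)),
    // where featureSums[c][j] is the summed value of feature j over label c.
    static double[][] logConditionals(double[][] featureSums, double lambda) {
        int numLabels = featureSums.length;
        int numFeatures = featureSums[0].length;
        double[][] theta = new double[numLabels][numFeatures];
        for (int c = 0; c < numLabels; c++) {
            double total = 0.0;
            for (double v : featureSums[c]) {
                total += v;
            }
            double logDenom = Math.log(total + lambda * numFeatures);
            for (int j = 0; j < numFeatures; j++) {
                theta[c][j] = Math.log(featureSums[c][j] + lambda) - logDenom;
            }
        }
        return theta;
    }

    // Returns the index of the label with the highest log posterior score.
    static int predict(double[] pi, double[][] theta, double[] x) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < pi.length; c++) {
            double score = pi[c];
            for (int j = 0; j < x.length; j++) {
                score += x[j] * theta[c][j];
            }
            if (score > bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Two labels with two training points each, two features, lambda = 1.0.
        double[] pi = logPriors(new double[] {2, 2}, 1.0);
        double[][] theta = logConditionals(new double[][] {{3, 0}, {1, 4}}, 1.0);
        System.out.println(predict(pi, theta, new double[] {2, 0}));
        System.out.println(predict(pi, theta, new double[] {0, 3}));
    }
}
```

The sketch works in log space so that scores are sums rather than products of small probabilities. It assumes nonnegative feature values, matching the requirement stated above.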

See Also:
Serialized Form

Constructor Summary

Constructors
Constructor and Description
`NaiveBayes()`
`NaiveBayes(double lambda)`

Method Summary

Methods
Modifier and Type	Method and Description
`double`	`getLambda()` Get the smoothing parameter.
`String`	`getModelType()` Get the model type.
`NaiveBayesModel`	`run(RDD<LabeledPoint> data)` Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries.
`NaiveBayes`	`setLambda(double lambda)` Set the smoothing parameter.
`NaiveBayes`	`setModelType(String modelType)` Set the model type using a string (case-sensitive).
`static NaiveBayesModel`	`train(RDD<LabeledPoint> input)` Trains a Naive Bayes model given an RDD of `(label, features)` pairs.
`static NaiveBayesModel`	`train(RDD<LabeledPoint> input, double lambda)` Trains a Naive Bayes model given an RDD of `(label, features)` pairs.
`static NaiveBayesModel`	`train(RDD<LabeledPoint> input, double lambda, String modelType)` Trains a Naive Bayes model given an RDD of `(label, features)` pairs.

Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - NaiveBayes
```
public NaiveBayes(double lambda)
```
  - NaiveBayes
```
public NaiveBayes()
```
- Method Detail
  - train
```
public static NaiveBayesModel train(RDD<LabeledPoint> input)
```
    Trains a Naive Bayes model given an RDD of (label, features) pairs.
    This is the default Multinomial NB (http://tinyurl.com/lsdw6p) which can handle all kinds of discrete data. For example, by converting documents into TF-IDF vectors, it can be used for document classification.
    This version of the method uses a default smoothing parameter of 1.0.
    
    Parameters:
    input - RDD of (label, array of features) pairs. Every vector should be a frequency vector or a count vector.
    
    Returns:
    The trained NaiveBayesModel.
  - train
```
public static NaiveBayesModel train(RDD<LabeledPoint> input,
                    double lambda)
```
    Trains a Naive Bayes model given an RDD of (label, features) pairs.
    This is the default Multinomial NB (http://tinyurl.com/lsdw6p) which can handle all kinds of discrete data. For example, by converting documents into TF-IDF vectors, it can be used for document classification.
    
    Parameters:
    input - RDD of (label, array of features) pairs. Every vector should be a frequency vector or a count vector.
    lambda - The smoothing parameter
    
    Returns:
    The trained NaiveBayesModel.
  - train
```
public static NaiveBayesModel train(RDD<LabeledPoint> input,
                    double lambda,
                    String modelType)
```
    Trains a Naive Bayes model given an RDD of (label, features) pairs.
    The model type can be set to either Multinomial NB (http://tinyurl.com/lsdw6p) or Bernoulli NB (http://tinyurl.com/p7c96j6). The Multinomial NB can handle discrete count data and is selected by setting the model type to "multinomial". For example, it can be used with word counts or TF-IDF vectors of documents. The Bernoulli model fits presence or absence (0-1) counts. By making every vector a 0-1 vector and setting the model type to "bernoulli", the model fits and predicts as Bernoulli NB.
    
    Parameters:
    input - RDD of (label, array of features) pairs. Every vector should be a frequency vector or a count vector.
    lambda - The smoothing parameter
    modelType - The type of NB model to fit from the enumeration NaiveBayesModels; can be "multinomial" or "bernoulli" (case-sensitive).
    
    Returns:
    The trained NaiveBayesModel.
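
    The difference between the two model types can be sketched in plain Java. The example below is an illustration only: `BernoulliNbSketch`, `binarize`, and `logLikelihood` are hypothetical names, not part of Spark. It shows how count vectors can be turned into 0-1 vectors and how a Bernoulli model scores both the presence and the absence of each feature.

```java
// Plain-Java sketch of the Bernoulli view of the data.
// All names are hypothetical; this is not Spark's implementation.
final class BernoulliNbSketch {

    // Maps every positive value to 1.0 and zero to 0.0.
    // Negative values are rejected because inputs must be nonnegative.
    static double[] binarize(double[] counts) {
        double[] out = new double[counts.length];
        for (int j = 0; j < counts.length; j++) {
            if (counts[j] < 0.0) {
                throw new IllegalArgumentException(
                    "feature values must be nonnegative: " + counts[j]);
            }
            out[j] = counts[j] > 0.0 ? 1.0 : 0.0;
        }
        return out;
    }

    // Log-likelihood of a 0-1 vector x given presenceProb[j] = P(x_j = 1 | label).
    // Present features contribute log(p_j); absent ones contribute log(1 - p_j).
    static double logLikelihood(double[] x, double[] presenceProb) {
        double sum = 0.0;
        for (int j = 0; j < x.length; j++) {
            sum += x[j] == 1.0
                ? Math.log(presenceProb[j])
                : Math.log(1.0 - presenceProb[j]);
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] x = binarize(new double[] {0, 3, 1});
        System.out.println(java.util.Arrays.toString(x));
        System.out.println(logLikelihood(new double[] {1, 0}, new double[] {0.5, 0.25}));
    }
}
```

    Unlike the multinomial score, the Bernoulli score penalizes a label for features that are absent from the vector, which is why the input must be 0-1 for this model type.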
  - setLambda
```
public NaiveBayes setLambda(double lambda)
```
    Set the smoothing parameter. Default: 1.0.
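
    To illustrate why a smoothing parameter is useful, the plain-Java sketch below applies the standard additive smoothing formula to a feature that never occurs with a label. `SmoothingSketch` and `smoothedProbability` are hypothetical names, and the formula is the textbook one, shown here as an assumption rather than as Spark's exact computation.

```java
// Plain-Java sketch of additive smoothing for one label.
// All names are hypothetical; this is not Spark's implementation.
final class SmoothingSketch {

    // Estimates P(feature | label) as (count + lambda) / (total + lambda * numFeatures).
    static double smoothedProbability(double count, double total,
                                      int numFeatures, double lambda) {
        return (count + lambda) / (total + lambda * numFeatures);
    }

    public static void main(String[] args) {
        // A feature with zero occurrences among 10 total, out of 5 features.
        double unsmoothed = smoothedProbability(0, 10, 5, 0.0);
        double smoothed = smoothedProbability(0, 10, 5, 1.0);
        // Without smoothing the log probability is negative infinity,
        // which would rule the label out for any vector containing the feature.
        System.out.println(Math.log(unsmoothed));
        System.out.println(Math.log(smoothed));
    }
}
```

    With lambda set to 0.0 the unseen feature gets probability zero; any positive lambda keeps the estimate finite in log space while the per-label probabilities still sum to one.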
  - getLambda
```
public double getLambda()
```
    Get the smoothing parameter.
  - setModelType
```
public NaiveBayes setModelType(String modelType)
```
    Set the model type using a string (case-sensitive). Supported options: "multinomial" (default) and "bernoulli".
    
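    Parameters:
    modelType - The model type, either "multinomial" or "bernoulli" (case-sensitive).
    
    Returns:
    This NaiveBayes instance, so that setter calls can be chained.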
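
    Because the model type string is case-sensitive, a caller may want to validate it before passing it on. The plain-Java sketch below shows one way to do that. `ModelTypeSketch` and `requireSupported` are hypothetical helpers, and rejecting an unsupported string with an `IllegalArgumentException` is an assumption of this sketch, not behavior documented on this page.

```java
// Plain-Java sketch of case-sensitive model type validation.
// All names are hypothetical; this is not part of Spark's API.
final class ModelTypeSketch {
    static final String MULTINOMIAL = "multinomial";
    static final String BERNOULLI = "bernoulli";

    // Returns the model type unchanged if it exactly matches a supported option.
    // Assumption: anything else, including a different case, is rejected.
    static String requireSupported(String modelType) {
        if (MULTINOMIAL.equals(modelType) || BERNOULLI.equals(modelType)) {
            return modelType;
        }
        throw new IllegalArgumentException("Unknown model type: " + modelType);
    }

    public static void main(String[] args) {
        System.out.println(requireSupported("bernoulli"));
        try {
            requireSupported("Bernoulli");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```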
  - getModelType
```
public String getModelType()
```
    Get the model type.
  - run
```
public NaiveBayesModel run(RDD<LabeledPoint> data)
```
    Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries.
    
    Parameters:
    data - RDD of LabeledPoint.
    
    Returns:
    The trained NaiveBayesModel.
