public class MLLibUtil extends Object

| Constructor and Description |
|---|
| MLLibUtil() |

| Modifier and Type | Method and Description |
|---|---|
| static org.apache.spark.api.java.JavaRDD&lt;org.apache.spark.mllib.regression.LabeledPoint&gt; | fromBinary(org.apache.spark.api.java.JavaPairRDD&lt;String,org.apache.spark.input.PortableDataStream&gt; binaryFiles, org.canova.api.records.reader.RecordReader reader) Convert a traditional sc.binaryFiles result into something usable for machine learning |
| static org.apache.spark.api.java.JavaRDD&lt;org.apache.spark.mllib.regression.LabeledPoint&gt; | fromBinary(org.apache.spark.api.java.JavaRDD&lt;scala.Tuple2&lt;String,org.apache.spark.input.PortableDataStream&gt;&gt; binaryFiles, org.canova.api.records.reader.RecordReader reader) Convert a traditional sc.binaryFiles result into something usable for machine learning |
| static org.apache.spark.api.java.JavaRDD&lt;org.apache.spark.mllib.regression.LabeledPoint&gt; | fromDataSet(org.apache.spark.api.java.JavaSparkContext sc, org.apache.spark.api.java.JavaRDD&lt;org.nd4j.linalg.dataset.DataSet&gt; data) Convert an RDD of data sets into an RDD of labeled points |
| static org.apache.spark.api.java.JavaRDD&lt;org.nd4j.linalg.dataset.DataSet&gt; | fromLabeledPoint(org.apache.spark.api.java.JavaRDD&lt;org.apache.spark.mllib.regression.LabeledPoint&gt; data, int numPossibleLabels, int batchSize) Convert an RDD of labeled points into an RDD of data sets, using the specified batch size |
| static org.apache.spark.api.java.JavaRDD&lt;org.nd4j.linalg.dataset.DataSet&gt; | fromLabeledPoint(org.apache.spark.api.java.JavaSparkContext sc, org.apache.spark.api.java.JavaRDD&lt;org.apache.spark.mllib.regression.LabeledPoint&gt; data, int numPossibleLabels) Convert an RDD of labeled points into an RDD of data sets |
| static org.apache.spark.mllib.regression.LabeledPoint | pointOf(Collection&lt;org.canova.api.writable.Writable&gt; writables) Returns a labeled point of the writables, where the final item is the label and the rest of the items are features |
| static double | toClassifierPrediction(org.apache.spark.mllib.linalg.Vector vector) Handles the edge case where you have a single output layer and need to convert the output to a class index |
| static org.apache.spark.mllib.linalg.Matrix | toMatrix(org.nd4j.linalg.api.ndarray.INDArray arr) Convert an ndarray to an MLlib matrix |
| static org.nd4j.linalg.api.ndarray.INDArray | toMatrix(org.apache.spark.mllib.linalg.Matrix arr) Convert an MLlib matrix to an ndarray |
| static org.apache.spark.mllib.linalg.Vector | toVector(org.nd4j.linalg.api.ndarray.INDArray arr) Convert an ndarray to an MLlib vector |
| static org.nd4j.linalg.api.ndarray.INDArray | toVector(org.apache.spark.mllib.linalg.Vector arr) Convert an MLlib vector to an ndarray |
public static double toClassifierPrediction(org.apache.spark.mllib.linalg.Vector vector)
Parameters: vector - the vector to get the classifier prediction for

public static org.nd4j.linalg.api.ndarray.INDArray toMatrix(org.apache.spark.mllib.linalg.Matrix arr)
Parameters: arr - the array

public static org.nd4j.linalg.api.ndarray.INDArray toVector(org.apache.spark.mllib.linalg.Vector arr)
Parameters: arr - the array

public static org.apache.spark.mllib.linalg.Matrix toMatrix(org.nd4j.linalg.api.ndarray.INDArray arr)
Parameters: arr - the array

public static org.apache.spark.mllib.linalg.Vector toVector(org.nd4j.linalg.api.ndarray.INDArray arr)
Parameters: arr - the array

public static org.apache.spark.api.java.JavaRDD&lt;org.apache.spark.mllib.regression.LabeledPoint&gt; fromBinary(org.apache.spark.api.java.JavaPairRDD&lt;String,org.apache.spark.input.PortableDataStream&gt; binaryFiles, org.canova.api.records.reader.RecordReader reader)
Parameters: binaryFiles - the binary files to convert; reader - the reader to use

public static org.apache.spark.api.java.JavaRDD&lt;org.apache.spark.mllib.regression.LabeledPoint&gt; fromBinary(org.apache.spark.api.java.JavaRDD&lt;scala.Tuple2&lt;String,org.apache.spark.input.PortableDataStream&gt;&gt; binaryFiles, org.canova.api.records.reader.RecordReader reader)
Parameters: binaryFiles - the binary files to convert; reader - the reader to use

public static org.apache.spark.mllib.regression.LabeledPoint pointOf(Collection&lt;org.canova.api.writable.Writable&gt; writables)
Parameters: writables - the writables

public static org.apache.spark.api.java.JavaRDD&lt;org.nd4j.linalg.dataset.DataSet&gt; fromLabeledPoint(org.apache.spark.api.java.JavaRDD&lt;org.apache.spark.mllib.regression.LabeledPoint&gt; data, int numPossibleLabels, int batchSize)
Parameters: data - the data to convert; numPossibleLabels - the number of possible labels; batchSize - the batch size

public static org.apache.spark.api.java.JavaRDD&lt;org.nd4j.linalg.dataset.DataSet&gt; fromLabeledPoint(org.apache.spark.api.java.JavaSparkContext sc, org.apache.spark.api.java.JavaRDD&lt;org.apache.spark.mllib.regression.LabeledPoint&gt; data, int numPossibleLabels)
Parameters: sc - the spark context used for creating the rdd; data - the data to convert; numPossibleLabels - the number of possible labels

public static org.apache.spark.api.java.JavaRDD&lt;org.apache.spark.mllib.regression.LabeledPoint&gt; fromDataSet(org.apache.spark.api.java.JavaSparkContext sc, org.apache.spark.api.java.JavaRDD&lt;org.nd4j.linalg.dataset.DataSet&gt; data)
Parameters: sc - the spark context to use; data - the dataset to convert

Copyright © 2016. All Rights Reserved.
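Taken together, fromDataSet and fromLabeledPoint let a dataset move between the DL4J and MLlib ecosystems. A hypothetical sketch (the package for MLLibUtil and the concrete label counts are assumptions; a live JavaSparkContext and populated RDD are required):

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.regression.LabeledPoint;
import org.deeplearning4j.spark.util.MLLibUtil;
import org.nd4j.linalg.dataset.DataSet;

public class RoundTripSketch {
    public static void demo(JavaSparkContext sc, JavaRDD<DataSet> dataSets) {
        // DataSet RDD -> LabeledPoint RDD, e.g. to feed an MLlib learner
        JavaRDD<LabeledPoint> points = MLLibUtil.fromDataSet(sc, dataSets);

        // ...train or evaluate an MLlib model on `points` here...

        // LabeledPoint RDD -> DataSet RDD; numPossibleLabels sizes the
        // label encoding and batchSize groups examples per DataSet
        int numPossibleLabels = 10; // assumption: a 10-class problem
        int batchSize = 32;         // assumption: examples per DataSet
        JavaRDD<DataSet> back =
                MLLibUtil.fromLabeledPoint(points, numPossibleLabels, batchSize);
    }
}
```

The overload taking a JavaSparkContext instead of a batch size converts one labeled point per DataSet, which is the simpler choice when batching is handled downstream.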