Class CSVFrequencySampler
java.lang.Object
io.nosqlbench.virtdata.library.basics.shared.distributions.CSVFrequencySampler
- All Implemented Interfaces:
java.util.function.LongFunction<java.lang.String>
public class CSVFrequencySampler
extends java.lang.Object
implements java.util.function.LongFunction<java.lang.String>
Takes a CSV with sample data and generates random values based on the
relative frequencies of the values in the file.
The CSV file must have a header row, which is used to look up
columns by name.
For example, take the following imaginary `animals.csv` file:
animal,count,country
puppy,1,usa
puppy,2,colombia
puppy,3,senegal
kitten,2,colombia
`CSVFrequencySampler('animals.csv', animal)` will return `puppy` or `kitten` randomly. `puppy` will be three times as frequent as `kitten`, because it appears in three rows and `kitten` in one.
`CSVFrequencySampler('animals.csv', country)` will return `usa`, `colombia`, or `senegal` randomly. `colombia` will be twice as frequent as either `usa` or `senegal`, because it appears in two rows and each of the others in one.
Use this function to infer frequencies of categorical values from CSVs.
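The frequency rule in the example above can be sketched in plain Java. This is a minimal illustration only, not the NoSQLBench implementation: the class name `ColumnFrequencySketch`, the helper `countOf`, the naive comma splitting, and the choice to derive the random pick from the input value via `SplittableRandom` are all assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.SplittableRandom;
import java.util.function.LongFunction;

// Hypothetical stand-in for illustration; NOT the real CSVFrequencySampler.
class ColumnFrequencySketch implements LongFunction<String> {
    private final Map<String, Integer> counts = new LinkedHashMap<>();
    private final List<String> values = new ArrayList<>();
    private final List<Integer> cumulative = new ArrayList<>();
    private int total = 0;

    // csvLines: header line first, then data rows (naive split, no quoting support).
    ColumnFrequencySketch(List<String> csvLines, String columnName) {
        String[] headers = csvLines.get(0).split(",");
        int col = -1;
        for (int i = 0; i < headers.length; i++) {
            if (headers[i].trim().equals(columnName)) {
                col = i;
                break;
            }
        }
        if (col < 0) {
            throw new IllegalArgumentException("No such column: " + columnName);
        }
        // Count how many rows carry each distinct value in the chosen column.
        for (String line : csvLines.subList(1, csvLines.size())) {
            counts.merge(line.split(",")[col].trim(), 1, Integer::sum);
        }
        // Build running totals so a pick in [0, total) maps to one value.
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            total += e.getValue();
            values.add(e.getKey());
            cumulative.add(total);
        }
    }

    // Number of rows in which the given value occurred (0 if absent).
    int countOf(String value) {
        return counts.getOrDefault(value, 0);
    }

    // Assumption of this sketch: the input seeds the pick, so the same
    // input always yields the same output.
    @Override
    public String apply(long value) {
        int pick = new SplittableRandom(value).nextInt(total);
        for (int i = 0; i < values.size(); i++) {
            if (pick < cumulative.get(i)) {
                return values.get(i);
            }
        }
        throw new IllegalStateException("unreachable: pick is always below total");
    }

    public static void main(String[] args) {
        List<String> csv = Arrays.asList(
            "animal,count,country",
            "puppy,1,usa",
            "puppy,2,colombia",
            "puppy,3,senegal",
            "kitten,2,colombia");
        ColumnFrequencySketch animals = new ColumnFrequencySketch(csv, "animal");
        System.out.println("puppy rows: " + animals.countOf("puppy"));
        System.out.println("kitten rows: " + animals.countOf("kitten"));
        System.out.println("sample for input 42: " + animals.apply(42L));
    }
}
```

In this sketch the `count` column plays no part in the weighting; only the number of rows per value matters, which is what the `animal` and `country` examples describe.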
Constructor Summary
Constructors
Constructor: CSVFrequencySampler(java.lang.String filename, java.lang.String columnName)
Description: Create a sampler of strings from the given CSV file.
Method Summary
Modifier and Type: java.lang.String
Method: apply(long value)
Constructor Details
CSVFrequencySampler
public CSVFrequencySampler(java.lang.String filename, java.lang.String columnName)
Create a sampler of strings from the given CSV file. The CSV file must have plain CSV headers as its first line.
- Parameters:
filename - The name of the file to be read into the sampler buffer
columnName - The name of the column to be sampled
Method Details
apply
public java.lang.String apply(long value)
- Specified by:
apply in interface java.util.function.LongFunction<java.lang.String>