Returns a stratified sample without replacement
Returns a stratified sample without replacement based on the fraction given on each stratum.
Usage
sampleBy(x, col, fractions, seed)
# S4 method for SparkDataFrame,character,list,numeric
sampleBy(x, col, fractions, seed)
Arguments
- x
A SparkDataFrame
- col
column that defines strata
- fractions
A named list giving the sampling fraction for each stratum, keyed by stratum value. A stratum that is not listed is treated as having a fraction of zero, so none of its rows are sampled.
- seed
random seed
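To make the role of `fractions` concrete, here is a minimal base R sketch of per-stratum sampling on an ordinary data.frame. It is not SparkR's implementation: `sample_by_local` is a hypothetical helper, and the sketch assumes each row is kept independently with probability equal to its stratum's fraction, so the sampled size per stratum is approximate rather than exact.

```r
# Hypothetical helper for illustration only; operates on a local data.frame,
# not on a SparkDataFrame.
sample_by_local <- function(df, col, fractions, seed) {
  set.seed(seed)
  strata <- as.character(df[[col]])
  # Look up each row's fraction by stratum name; unlisted strata yield NA.
  p <- unname(unlist(fractions)[strata])
  # Strata missing from `fractions` are treated as fraction zero.
  p[is.na(p)] <- 0
  # Keep each row at most once, with probability p (sampling without replacement).
  df[runif(nrow(df)) < p, , drop = FALSE]
}

# Made-up data: three strata of 100 rows each.
df <- data.frame(key = rep(c("a", "b", "c"), each = 100), value = 1:300)
fractions <- list(a = 1.0, b = 0.5)  # "c" is not listed, so it is dropped
out <- sample_by_local(df, "key", fractions, 36)
table(out$key)
```

With these fractions, every "a" row is kept, roughly half of the "b" rows are kept, and no "c" rows appear in the result.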
See also
Other stat functions:
approxQuantile(),
corr(),
cov(),
crosstab(),
freqItems()
Examples
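if (FALSE) {
df <- read.json("/path/to/file.json")
# Hypothetical strata: assumes the "key" column contains the values "a" and "b".
# Rows whose key is not named here are sampled with fraction zero.
fractions <- list(a = 0.1, b = 0.5)
sample <- sampleBy(df, "key", fractions, 36)
}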