freeSpokenDigits

fun freeSpokenDigits(cacheDirectory: File = File("cache"), maxTestIndex: Int = 5): Pair<OnHeapDataset, OnHeapDataset>

Loads the Free Spoken Digits Dataset, a collection of WAV recordings of the digits 0-9, each spoken many times by several speakers. The official test set is the first 10% of the recordings: recordings numbered 0-4 (inclusive) form the test set, and recordings numbered 5-49 form the training set.
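The index-based split described above can be sketched in a few lines. This is only an illustration, not the library's implementation: the helpers `recordingIndex` and `splitByIndex` are hypothetical, the `digit_speaker_index.wav` file-name pattern is an assumption about how the recordings are named, and treating `maxTestIndex` as an exclusive bound is inferred from the default of 5 together with the 0-4 test range.

```kotlin
// Hypothetical helper: reads the recording index from a file name such as
// "7_jackson_32.wav" (assumed pattern: digit, speaker, index).
fun recordingIndex(fileName: String): Int =
    fileName.removeSuffix(".wav").substringAfterLast('_').toInt()

// Hypothetical helper: returns (train, test) file names.
// Assumption: indices strictly below maxTestIndex go to the test set.
fun splitByIndex(fileNames: List<String>, maxTestIndex: Int = 5): Pair<List<String>, List<String>> {
    val (test, train) = fileNames.partition { recordingIndex(it) < maxTestIndex }
    return Pair(train, test)
}

fun main() {
    val files = listOf("7_jackson_0.wav", "7_jackson_4.wav", "7_jackson_5.wav", "7_jackson_49.wav")
    val (train, test) = splitByIndex(files)
    println("train = $train")
    println("test = $test")
}
```

With the default bound, the recordings with index 0 and 4 land in the test list and those with index 5 and 49 in the training list.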

Because the input files have different numbers of channels, every file is split by channel, and each channel is treated as a separate sample with the same label.
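A minimal sketch of the per-channel split, assuming the audio has already been decoded into one `FloatArray` per channel. The `LabeledSample` class and the `splitChannels` function are hypothetical names used for illustration and are not part of the library.

```kotlin
// Hypothetical container: one mono audio sequence plus its digit label.
class LabeledSample(val audio: FloatArray, val label: Float)

// Turns one decoded recording (one FloatArray per channel) into one
// sample per channel; every sample keeps the recording's label.
fun splitChannels(channels: Array<FloatArray>, label: Float): List<LabeledSample> =
    channels.map { LabeledSample(it, label) }

fun main() {
    // A two-channel recording of the digit 7 becomes two samples labeled 7.
    val stereo = arrayOf(floatArrayOf(0.1f, 0.2f), floatArrayOf(0.3f, 0.4f))
    val samples = splitChannels(stereo, label = 7f)
    println("samples: ${samples.size}, labels: ${samples.map { it.label }}")
}
```

One consequence of this scheme is that the number of samples in the resulting dataset can exceed the number of files.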

The Free Spoken Digits Dataset is made available under the terms of the Creative Commons Attribution-ShareAlike 4.0 International license.

Return

Train and test datasets. Each dataset includes X and Y data. X data are float arrays of sound data with shape (num_samples, FSDD_SOUND_DATA_SIZE), where FSDD_SOUND_DATA_SIZE is at least as long as the longest input sequence and all sequences are padded with zeros to equal length. Y data are float arrays of digit labels (integers in the range 0-9) with shape (num_samples,).
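The zero padding that produces equal-length rows can be sketched as follows. This is an illustration under stated assumptions: `padWithZeros` is a hypothetical helper, `targetSize` stands in for FSDD_SOUND_DATA_SIZE (whose actual value is not given here), and padding at the end of each sequence is assumed.

```kotlin
// Hypothetical helper: pads every sequence with trailing zeros up to targetSize,
// giving a rectangular (num_samples, targetSize) array.
fun padWithZeros(sequences: List<FloatArray>, targetSize: Int): Array<FloatArray> {
    require(sequences.all { it.size <= targetSize }) {
        "targetSize must be at least as long as the longest sequence"
    }
    // FloatArray.copyOf(newSize) fills the added positions with 0.0f.
    return Array(sequences.size) { i -> sequences[i].copyOf(targetSize) }
}

fun main() {
    val x = padWithZeros(listOf(floatArrayOf(1f, 2f, 3f), floatArrayOf(4f)), targetSize = 4)
    println("shape = (${x.size}, ${x[0].size})")
    println(x.joinToString { it.contentToString() })
}
```

A fixed target size, rather than the per-batch maximum, keeps the train and test arrays shaped identically, which is why the target must be no shorter than the longest sequence.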

Parameters

cacheDirectory

Directory in which downloaded models and datasets are cached.

maxTestIndex

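Boundary recording index for the test set. With the default value of 5, recordings numbered 0-4 are selected for the test set and the remaining recordings for the training set.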