src.corpora.tokenization_utils.SeededShufflerIterDataPipe

class SeededShufflerIterDataPipe(datapipe: IterDataPipe[T_co], seed: int, *, buffer_size: int = 10000)

Bases: IterDataPipe[T_co]

Behaves much like ShufflerIterDataPipe, but takes an explicit seed so the shuffle order is reproducible, and it ignores set_shuffle_settings. To disable shuffling, leave the shuffle combinator out of the pipeline…
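The class source is not reproduced on this page, so the following is only a minimal sketch of how a seeded, buffer-based shuffle of this kind could work. It assumes the same scheme as a typical shuffle buffer: fill a buffer of up to buffer_size items, then swap each new item in for a randomly chosen buffered one. The function name seeded_buffer_shuffle is hypothetical, and the sketch uses plain Python iterables rather than IterDataPipe so it runs without PyTorch.

```python
import random
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def seeded_buffer_shuffle(
    source: Iterable[T], seed: int, *, buffer_size: int = 10000
) -> Iterator[T]:
    """Hypothetical sketch of a seeded shuffle buffer (not the library's code)."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")

    # A private generator keeps the order independent of the global random state.
    rng = random.Random(seed)
    buffer: List[T] = []

    for item in source:
        if len(buffer) == buffer_size:
            # Buffer is full: emit a random buffered item and replace it
            # with the incoming one.
            idx = rng.randrange(buffer_size)
            yield buffer[idx]
            buffer[idx] = item
        else:
            buffer.append(item)

    # Source exhausted: emit whatever is left, in shuffled order.
    rng.shuffle(buffer)
    yield from buffer


first = list(seeded_buffer_shuffle(range(100), seed=0, buffer_size=10))
second = list(seeded_buffer_shuffle(range(100), seed=0, buffer_size=10))
```

Under these assumptions, two passes with the same seed yield the same order, and every input item appears exactly once. A small buffer_size only mixes items locally, so a larger buffer gives a more thorough shuffle at the cost of memory.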

Methods

buffer_replace

register_datapipe_as_function

register_function

set_getstate_hook

set_reduce_ex_hook

Attributes

functions

getstate_hook

reduce_ex_hook

datapipe

buffer_size