streaming.text

Natively supported NLP datasets.

Classes

C4

Implementation of the C4 (Colossal Clean Crawled Corpus) dataset using streaming Dataset.

EnWiki

Implementation of the English Wikipedia 2020-01-01 streaming dataset.
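
The following is a minimal sketch, not taken from the library's documentation, of how one might check which of the classes listed above can be imported in a given installation. It assumes only what this page states: a module named `streaming.text` that exposes `C4` and `EnWiki`. The helper `available_text_datasets` and the constant `LISTED_CLASSES` are hypothetical names introduced for illustration. Constructor arguments are not described on this page, so the sketch does not instantiate either class.

```python
import importlib

# Class names as listed on this page (assumption: other releases of the
# package may expose them under different names).
LISTED_CLASSES = ("C4", "EnWiki")


def available_text_datasets():
    """Return {name: class} for each listed class that can be imported.

    Hypothetical helper for illustration; it is not part of the library.
    """
    try:
        module = importlib.import_module("streaming.text")
    except ImportError:
        # The package (or this submodule) is not installed.
        return {}
    return {
        name: getattr(module, name)
        for name in LISTED_CLASSES
        if hasattr(module, name)
    }


if __name__ == "__main__":
    found = available_text_datasets()
    if not found:
        print("streaming.text is not available in this environment")
    for name, cls in found.items():
        print(f"{name}: {cls}")
```

Resolving the names at runtime avoids a hard failure when the package is missing or when a release exposes the classes under other names; in that case the helper returns an empty or partial dictionary.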