TSVWriter

class streaming.TSVWriter(dirname, columns, compression=None, hashes=None, size_limit=67108864, newline='\n')

Writes a streaming TSV dataset.

Parameters
  • dirname (str) – Local dataset directory.

  • columns (Dict[str, str]) – Sample columns, mapping each column name to its encoding.

  • compression (str, optional) – Compression algorithm, given either as a name or as name:level. If None, shards are not compressed. Defaults to None.

  • hashes (List[str], optional) – Optional list of hash algorithms to apply to shard files. Defaults to None.

  • size_limit (int, optional) – Shard size limit in bytes, after which a new shard is started. If None, puts everything in one shard. Defaults to 67108864 (64 MiB).

  • newline (str) – Newline character inserted between samples. Defaults to '\n'.
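
To illustrate how size_limit and newline interact, here is a minimal standard-library sketch of size-limited sharding. It is not the library's implementation: the function shard_tsv is hypothetical, and the header row, the str() encoding of values, and the rule "start a new shard when the next sample would push it past the limit" are all assumptions made for the example.

```python
from typing import Any, Dict, Iterable, List, Optional


def shard_tsv(samples: Iterable[Dict[str, Any]],
              columns: Dict[str, str],
              size_limit: Optional[int] = 1 << 26,
              newline: str = '\n') -> List[str]:
    """Group tab-separated samples into shards no larger than size_limit bytes.

    Hypothetical helper for illustration only; it returns shard contents as
    strings instead of writing files.
    """
    names = list(columns)
    # Assumption: every shard starts with a header row of column names.
    header = '\t'.join(names) + newline

    shards: List[str] = []
    current = header
    current_size = len(header.encode('utf-8'))
    rows = 0

    for sample in samples:
        line = '\t'.join(str(sample[name]) for name in names) + newline
        line_size = len(line.encode('utf-8'))
        # Start a new shard if this sample would exceed the limit.
        # A shard always holds at least one sample, even an oversized one.
        if size_limit is not None and rows and current_size + line_size > size_limit:
            shards.append(current)
            current = header
            current_size = len(header.encode('utf-8'))
            rows = 0
        current += line
        current_size += line_size
        rows += 1

    if rows:
        shards.append(current)
    return shards


if __name__ == '__main__':
    demo_columns = {'number': 'int', 'words': 'str'}
    demo_samples = [{'number': i, 'words': 'x' * 10} for i in range(4)]
    print(len(shard_tsv(demo_samples, demo_columns, size_limit=40)))
    print(len(shard_tsv(demo_samples, demo_columns, size_limit=None)))
```

With size_limit=None the sketch never rolls over, so all samples land in a single shard, matching the behavior described for that parameter.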

get_config()

Get object describing shard-writing configuration.

Returns

Dict[str, Any] – JSON object.