
dump

data_base.IO.LoaderDumper.dask_to_parquet.dump(obj, savedir, schema=None, client=None, repartition=10000)

Save a dask dataframe to one or more parquet files.

One parquet file is created per partition. Each partition is written to a file named ‘pandas_to_parquet.<n_partitions>.<partition>.parquet’. Writing these files is parallelized with the dask client if one is provided, as sketched below.
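A minimal sketch of how such per-partition, client-parallelized writes can be organized is given below. This is not the library's actual implementation; the helper name _write_partition, dump_sketch, and the use of to_delayed()/client.compute() are assumptions for illustration only:

    import os
    import dask

    def _write_partition(part, savedir, n_partitions, partition_idx):
        # Hypothetical helper: write one pandas partition to its own parquet
        # file, following the naming scheme described above.
        fname = "pandas_to_parquet.{}.{}.parquet".format(n_partitions, partition_idx)
        part.to_parquet(os.path.join(savedir, fname))

    def dump_sketch(ddf, savedir, client=None):
        # One delayed write task per partition of the dask dataframe.
        tasks = [
            dask.delayed(_write_partition)(part, savedir, ddf.npartitions, i)
            for i, part in enumerate(ddf.to_delayed())
        ]
        if client is not None:
            # Run the write tasks on the cluster and wait for all of them.
            client.gather(client.compute(tasks))
        else:
            # Fall back to the default (local) scheduler.
            dask.compute(*tasks)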

In addition to the dask dataframe itself, meta information is saved in the form of a JSON file.

See also

save_object_meta() for saving meta information

Parameters:
  • obj (dask.dataframe.DataFrame) – Dask dataframe to save

  • savedir (str) – Directory where the parquet files will be stored

  • client (dask.distributed.Client) – Dask client used to parallelize the writes.

  • repartition (int) – If the original object has more than twice this number of partitions, it is repartitioned. Otherwise, the object is saved with its original partitioning.

Returns:

None

See also

Each individual partition is saved using save_helper().
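
A short usage example is shown below. The output directory, the local cluster setup, and the sample dataframe are illustrative assumptions, not prescribed by the API:

    import os
    import pandas as pd
    import dask.dataframe as dd
    from dask.distributed import Client

    from data_base.IO.LoaderDumper import dask_to_parquet

    client = Client()  # local cluster; writes are distributed across its workers
    ddf = dd.from_pandas(pd.DataFrame({"a": range(100_000)}), npartitions=8)

    # Illustrative output directory; created here in case dump expects it to exist.
    os.makedirs("out_dir", exist_ok=True)

    # Writes one parquet file per partition into 'out_dir', plus the JSON meta file.
    dask_to_parquet.dump(ddf, "out_dir", client=client)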