Mode must be one of 'a', 'w' or 'r+'
Package:
dask
Exception Class:
ValueError
Raise code
if "*" in key:
    single_node = False

if "format" in kwargs and kwargs["format"] not in ["t", "table"]:
    raise ValueError("Dask only support 'table' format in hdf files.")

if mode not in ("a", "w", "r+"):
    raise ValueError("Mode must be one of 'a', 'w' or 'r+'")

if name_function is None:
    name_function = build_name_function(df.npartitions - 1)

# we guarantee partition order is preserved when its saved and read
# so we enforce name_function to maintain the order of its input.
if not (single_file and single_node):
Links to the raise (1)
https://github.com/dask/dask/blob/af34908a3056e7f7f3d52ab7c502e2bb9139ebdc/dask/dataframe/io/hdf.py#L167
Ways to fix
The to_hdf method stores a Dask DataFrame in Hierarchical Data Format (HDF) files. Its mode parameter specifies how the file is opened. The valid values of this parameter are:
'a', 'w', 'r+'
Any other value raises this exception.
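The check can be reproduced without touching any file. The sketch below mirrors the condition quoted in the raise code; the helper name check_hdf_mode and the HDF_MODES dictionary are illustrative assumptions and not part of dask, and the mode descriptions follow the pandas HDFStore documentation rather than anything stated on this page.

```python
# Hypothetical helper: mirrors the `mode not in ("a", "w", "r+")` check
# quoted in the raise code. Neither name below exists in dask itself.
HDF_MODES = {
    "a": "append: open for reading and writing, create the file if missing",
    "w": "write: create a new file, replacing any existing one",
    "r+": "like 'a', but the file must already exist",
}


def check_hdf_mode(mode):
    """Return mode unchanged if it is an accepted value, else raise ValueError."""
    if mode not in HDF_MODES:
        raise ValueError("Mode must be one of 'a', 'w' or 'r+'")
    return mode


# Try a few candidates, including the invalid "aw" used in the reproduction.
for candidate in ("a", "w", "r+", "aw", "r"):
    try:
        check_hdf_mode(candidate)
        print(candidate, "accepted")
    except ValueError as err:
        print(candidate, "rejected:", err)
```

Calling such a helper before to_hdf is optional, since dask performs the same check itself; it is mainly useful when the mode string comes from user input or configuration and a failure should surface early.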
Steps to reproduce the exception.
First install the necessary libraries (writing HDF files also requires PyTables, published on PyPI as tables):
pip install pandas "dask[dataframe]" tables
import pandas as pd
import dask.dataframe as dd
from dask.utils import tmpfile

# Small pandas frame split into two Dask partitions
df = pd.DataFrame({"x": ["a", "b", "c", "d"], "y": [1, 2, 3, 4]}, index=[1.0, 2.0, 3.0, 4.0])
b = dd.from_pandas(df, npartitions=2)

with tmpfile("h5") as fn:
    # "aw" is not a valid mode, so this call raises ValueError
    b.to_hdf(fn, "/data*", mode="aw")
    out = dd.read_hdf(fn, "/data*")
    print(out)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-30-b635f7fb4ee6> in <module>()
      8 b = dd.from_pandas(df, 2)
      9 with tmpfile("h5") as fn:
---> 10     b.to_hdf(fn, "/data*",mode="aw")
     11     out = dd.read_hdf(fn, "/data*")
     12     print(out)

/usr/local/lib/python3.7/dist-packages/dask/dataframe/core.py in to_hdf(self, path_or_buf, key, mode, append, **kwargs)
   1338         from .io import to_hdf
   1339
-> 1340         return to_hdf(self, path_or_buf, key, mode, append, **kwargs)
   1341
   1342     def to_csv(self, filename, **kwargs):

/usr/local/lib/python3.7/dist-packages/dask/dataframe/io/hdf.py in to_hdf(df, path, key, mode, append, scheduler, name_function, compute, lock, dask_kwargs, **kwargs)
    167
    168     if mode not in ("a", "w", "r+"):
--> 169         raise ValueError("Mode must be one of 'a', 'w' or 'r+'")
    170
    171     if name_function is None:

ValueError: Mode must be one of 'a', 'w' or 'r+'
Fixed version of the code:
import pandas as pd
import dask.dataframe as dd
from dask.utils import tmpfile

# Small pandas frame split into two Dask partitions
df = pd.DataFrame({"x": ["a", "b", "c", "d"], "y": [1, 2, 3, 4]}, index=[1.0, 2.0, 3.0, 4.0])
b = dd.from_pandas(df, npartitions=2)

with tmpfile("h5") as fn:
    # "a" is one of the accepted modes, so the write succeeds
    b.to_hdf(fn, "/data*", mode="a")
    out = dd.read_hdf(fn, "/data*")
    print(out)