Bin labels must be one fewer than the number of bin edges

Package: pandas
GitHub stars: 30911
Exception class: ValueError

Raise code

        elif ordered and len(set(labels)) != len(labels):
            raise ValueError(
                "labels must be unique if ordered=True; pass ordered=False for duplicate labels"  # noqa
            )
        else:
            if len(labels) != len(bins) - 1:
                raise ValueError(
                    "Bin labels must be one fewer than the number of bin edges"
                )
        if not is_categorical_dtype(labels):
            labels = Categorical(
                labels,
                categories=labels if len(set(labels)) == len(labels) else None,
                ordered=ordered,
            )
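
The length check above can be restated outside pandas as a small sketch. The helper name expected_label_count is hypothetical (it is not part of pandas), and it assumes bins is either an integer or a plain sequence of edges; other inputs such as an IntervalIndex are not covered.

```python
import numpy as np


def expected_label_count(bins):
    """Hypothetical helper: number of labels pd.cut expects for `bins`.

    Assumes `bins` is an integer bin count or a sequence of bin edges.
    """
    if np.isscalar(bins):
        # An integer asks for that many equal-width bins.
        return int(bins)
    # N edges delimit N - 1 bins, hence "one fewer than the number of bin edges".
    return len(bins) - 1


print(expected_label_count(3))                # 3
print(expected_label_count([0, 2, 4, 6, 8]))  # 4
```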
  

Ways to fix

pandas.cut is used to segment and sort data values into bins. This function is also useful for going from a continuous variable to a categorical variable.

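The number of labels must match the number of bins. When the second parameter, bins, is an integer, it must equal the length of the labels parameter. When bins is a sequence of bin edges, labels must contain exactly one fewer item than bins, because N edges delimit N - 1 bins. If neither holds, this error is raised.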
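
As a minimal sketch of the two forms of bins, the snippet below bins the same values once with an integer and once with explicit edges. The edges [0, 2, 4, 6, 8] are an assumption chosen for illustration; they do not come from the original answer.

```python
import numpy as np
import pandas as pd

values = np.array([1, 7, 5, 4, 6, 3])
labels = ["bad", "medium", "good", "very good"]

# Integer bins: pandas builds len(labels) equal-width bins itself.
by_count = pd.cut(values, bins=len(labels), labels=labels)

# Explicit edges: 5 edges delimit 4 intervals, so 4 labels are needed.
edges = [0, 2, 4, 6, 8]  # hypothetical edges for this example
by_edges = pd.cut(values, bins=edges, labels=labels)

print(list(by_count))
print(list(by_edges))
```

Deriving the integer from len(labels) keeps the two in sync if the label list changes later.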

Reproducing the error:

pipenv install pandas

import pandas as pd
import numpy as np
df=pd.cut(np.array([1, 7, 5, 4, 6, 3]),
          3, # should be 4, because labels has 4 values
          labels=["bad", "medium", "good","very good",],
          ordered=False)
print(df)

The error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-27-cfc67b218327> in <module>()
      4           3,
      5           labels=["bad", "medium", "good","very good",],
----> 6           ordered=False)
      7 print(df)

/usr/local/lib/python3.7/dist-packages/pandas/core/reshape/tile.py in cut(x, bins, right, labels, retbins, precision, include_lowest, duplicates, ordered)
    282         dtype=dtype,
    283         duplicates=duplicates,
--> 284         ordered=ordered,
    285     )
    286 

/usr/local/lib/python3.7/dist-packages/pandas/core/reshape/tile.py in _bins_to_cuts(x, bins, right, labels, precision, include_lowest, dtype, duplicates, ordered)
    433             if len(labels) != len(bins) - 1:
    434                 raise ValueError(
--> 435                     "Bin labels must be one fewer than the number of bin edges"
    436                 )
    437         if not is_categorical_dtype(labels):

ValueError: Bin labels must be one fewer than the number of bin edges

Fixed version of the code:

import pandas as pd
import numpy as np
df=pd.cut(np.array([1, 7, 5, 4, 6, 3]),
          4,  # matches the 4 labels
          labels=["bad", "medium", "good","very good",],
          ordered=False)
print(df)

Output:

['bad', 'very good', 'good', 'medium', 'very good', 'medium']
Categories (4, object): ['bad', 'medium', 'good', 'very good']
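
To see why the message talks about bin edges, one option is to ask pd.cut for the edges it computed by passing retbins=True. This is only a sketch using the same data as above; the variable names binned and edges are illustrative.

```python
import numpy as np
import pandas as pd

values = np.array([1, 7, 5, 4, 6, 3])
labels = ["bad", "medium", "good", "very good"]

# retbins=True makes pd.cut return the computed edges alongside the result.
binned, edges = pd.cut(values, bins=len(labels), labels=labels, retbins=True)

print(edges)
print(len(edges), len(labels))  # 5 edges for 4 labels
```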

Jul 17, 2021 kellemnegasi answer
kellemnegasi 11.7k
