The Box-Cox transformation can only be applied to strictly positive data

Package:
Exception Class:
ValueError

Raise code

                reset=in_fit)

        with np.warnings.catch_warnings():
            np.warnings.filterwarnings(
                'ignore', r'All-NaN (slice|axis) encountered')
            if (check_positive and self.method == 'box-cox' and
                    np.nanmin(X) <= 0):
                raise ValueError("The Box-Cox transformation can only be "
                                 "applied to strictly positive data")

        if check_shape and not X.shape[1] == len(self.lambdas_):
            raise ValueError("Input data has a different number of features "
                             "than fitting data. Should have {n}, data has {m}"
                             .format(n=len(self.lambdas_), m=X.shape[1]))

        valid_me

Ways to fix

sklearn.preprocessing.PowerTransformer applies a power transform feature-wise to make data more Gaussian-like. Its method parameter selects the transform and accepts one of the following values:

'yeo-johnson', 'box-cox'

If method is "box-cox", every element of the input data must be strictly greater than 0. NaN values are ignored by this check.

That is, the data array must satisfy the following condition:

numpy.nanmin(data) > 0
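
The condition above can be checked before calling fit. The following is a minimal sketch; is_strictly_positive is a hypothetical helper name used only for illustration and is not part of scikit-learn.

```python
import numpy as np


def is_strictly_positive(data):
    """Return True if every non-NaN value in data is greater than zero.

    Hypothetical helper (not a scikit-learn function). It applies the same
    np.nanmin comparison that the raise code uses.
    """
    arr = np.asarray(data, dtype=float)
    return bool(np.nanmin(arr) > 0)


print(is_strictly_positive([[1, 2], [3, 2], [0, 5]]))  # contains a 0
print(is_strictly_positive([[1, 2], [3, 2], [4, 5]]))  # all values > 0
```

If the helper returns False, fitting with method="box-cox" on the same data should raise the ValueError described here.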

How to reproduce the error:

pip install --user pipenv

mkdir test_folder

cd test_folder

pipenv shell

pipenv install numpy scikit-learn

import numpy as np
from sklearn.preprocessing import PowerTransformer
pt = PowerTransformer(method = "box-cox")
data = [[1, 2], [3, 2], [0, 5]]
print(pt.fit(data))
print(pt.lambdas_)
print(pt.transform(data))

Output error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-20-6546b0af3328> in <module>()
      3 pt = PowerTransformer(method = "box-cox")
      4 data = [[1, 2], [3, 2], [0, 5]]
----> 5 print(pt.fit(data))
      6 print(pt.lambdas_)
      7 print(pt.transform(data))

/usr/local/lib/python3.7/dist-packages/sklearn/preprocessing/_data.py in fit(self, X, y)
   2791         self : object
   2792         """
-> 2793         self._fit(X, y=y, force_transform=False)
   2794         return self
   2795 

/usr/local/lib/python3.7/dist-packages/sklearn/preprocessing/_data.py in _fit(self, X, y, force_transform)
   2798 
   2799     def _fit(self, X, y=None, force_transform=False):
-> 2800         X = self._check_input(X, check_positive=True, check_method=True)
   2801 
   2802         if not self.copy and not force_transform:  # if call from fit()

/usr/local/lib/python3.7/dist-packages/sklearn/preprocessing/_data.py in _check_input(self, X, check_positive, check_shape, check_method)
   3017             if (check_positive and self.method == 'box-cox' and
   3018                     np.nanmin(X) <= 0):
-> 3019                 raise ValueError("The Box-Cox transformation can only be "
   3020                                  "applied to strictly positive data")
   3021 

ValueError: The Box-Cox transformation can only be applied to strictly positive data

How to fix the error:

If method is "box-cox", make sure every value in the input data is greater than zero. Here the 0 in the last row is replaced with 4:

import numpy as np
from sklearn.preprocessing import PowerTransformer
pt = PowerTransformer(method = "box-cox")
data = [[1, 2], [3, 2], [4, 5]]
print(pt.fit(data))
print(pt.lambdas_)
print(pt.transform(data))

Corrected output:

PowerTransformer(copy=True, method='box-cox', standardize=True)
[ 1.05173074 -2.34546281]
[[-1.33269291 -0.70710678]
 [ 0.25653283 -0.70710678]
 [ 1.07616008  1.41421356]]
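
If the zeros or negative values are real and cannot be changed, one possible alternative is the other method listed above, "yeo-johnson", which does not require strictly positive input. This is a sketch of that option, not a drop-in replacement: the transformed values will differ from Box-Cox output.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

# Same data as in the failing example, including the 0 in the last row.
data = np.array([[1, 2], [3, 2], [0, 5]], dtype=float)

# 'yeo-johnson' accepts zero and negative values, so no ValueError is raised.
pt = PowerTransformer(method="yeo-johnson")
transformed = pt.fit_transform(data)

print(pt.lambdas_)
print(transformed)
```

Whether switching methods is appropriate depends on the downstream model, so treat this as one option rather than the recommended fix.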

Jun 30, 2021 kellemnegasi answer