The Box-Cox transformation can only be applied to strictly positive data

Package: sklearn
Exception Class: ValueError

Raise code

                reset=in_fit)

        with np.warnings.catch_warnings():
            np.warnings.filterwarnings(
                'ignore', r'All-NaN (slice|axis) encountered')
            if (check_positive and self.method == 'box-cox' and
                    np.nanmin(X) <= 0):
                raise ValueError("The Box-Cox transformation can only be "
                                 "applied to strictly positive data")

        if check_shape and not X.shape[1] == len(self.lambdas_):
            raise ValueError("Input data has a different number of features "
                             "than fitting data. Should have {n}, data has {m}"
                             .format(n=len(self.lambdas_), m=X.shape[1]))

        valid_me
Ways to fix

sklearn.preprocessing.PowerTransformer applies a power transform feature-wise to make data more Gaussian-like. The method parameter selects the power transform and accepts one of the following values:

'yeo-johnson', 'box-cox'

If method is "box-cox", every element of the input data must be strictly greater than 0 (a value of exactly 0 is also rejected). In other words, the data array must satisfy the following condition:

numpy.nanmin(data) > 0
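The condition above can be checked before fitting, so that bad input is caught early. The following is a minimal sketch; the helper name is_box_cox_safe is hypothetical and not part of scikit-learn.

```python
import numpy as np


def is_box_cox_safe(data):
    """Return True if every non-NaN value in data is strictly positive.

    Hypothetical helper: mirrors the np.nanmin(X) <= 0 check shown in the
    raise code, but returns a bool instead of raising.
    """
    arr = np.asarray(data, dtype=float)
    return bool(np.nanmin(arr) > 0)


bad = [[1, 2], [3, 2], [0, 5]]   # contains a 0, so box-cox would raise
good = [[1, 2], [3, 2], [4, 5]]  # all values strictly positive

print(is_box_cox_safe(bad))
print(is_box_cox_safe(good))
```

Because the check uses nanmin, NaN entries are ignored, which matches the check in the raise code.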

How to reproduce the error:

pip install --user pipenv

mkdir test_folder

cd test_folder

pipenv shell

pipenv install numpy scikit-learn

import numpy as np
from sklearn.preprocessing import PowerTransformer
pt = PowerTransformer(method = "box-cox")
data = [[1, 2], [3, 2], [0, 5]]
print(pt.fit(data))
print(pt.lambdas_)
print(pt.transform(data))

Output error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-20-6546b0af3328> in <module>()
      3 pt = PowerTransformer(method = "box-cox")
      4 data = [[1, 2], [3, 2], [0, 5]]
----> 5 print(pt.fit(data))
      6 print(pt.lambdas_)
      7 print(pt.transform(data))

/usr/local/lib/python3.7/dist-packages/sklearn/preprocessing/_data.py in fit(self, X, y)
   2791         self : object
   2792         """
-> 2793         self._fit(X, y=y, force_transform=False)
   2794         return self
   2795 

/usr/local/lib/python3.7/dist-packages/sklearn/preprocessing/_data.py in _fit(self, X, y, force_transform)
   2798 
   2799     def _fit(self, X, y=None, force_transform=False):
-> 2800         X = self._check_input(X, check_positive=True, check_method=True)
   2801 
   2802         if not self.copy and not force_transform:  # if call from fit()

/usr/local/lib/python3.7/dist-packages/sklearn/preprocessing/_data.py in _check_input(self, X, check_positive, check_shape, check_method)
   3017             if (check_positive and self.method == 'box-cox' and
   3018                     np.nanmin(X) <= 0):
-> 3019                 raise ValueError("The Box-Cox transformation can only be "
   3020                                  "applied to strictly positive data")
   3021 

ValueError: The Box-Cox transformation can only be applied to strictly positive data

How to fix the error:

If method is "box-cox", make sure every value in the given data is strictly greater than zero. In the example below, the 0 in the third row is replaced with 4.

import numpy as np
from sklearn.preprocessing import PowerTransformer
pt = PowerTransformer(method = "box-cox")
data = [[1, 2], [3, 2], [4, 5]]
print(pt.fit(data))
print(pt.lambdas_)
print(pt.transform(data))

Corrected output:

PowerTransformer(copy=True, method='box-cox', standardize=True)
[ 1.05173074 -2.34546281]
[[-1.33269291 -0.70710678]
 [ 0.25653283 -0.70710678]
 [ 1.07616008  1.41421356]]
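
If the data cannot be changed to be strictly positive, another option is the 'yeo-johnson' method listed earlier, which is also defined for zero and negative values. The sketch below reuses the data from the failing example; it is one possible workaround under that assumption, and the resulting values will differ from a Box-Cox transform.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

# Same data as the failing example: it contains a 0, so "box-cox" rejects it.
data = np.array([[1, 2], [3, 2], [0, 5]], dtype=float)

# "yeo-johnson" accepts zero and negative values as well as positive ones.
pt = PowerTransformer(method="yeo-johnson")
transformed = pt.fit_transform(data)

print(pt.lambdas_)   # one fitted lambda per feature (column)
print(transformed)   # same shape as the input data
```

Whether Yeo-Johnson is an acceptable substitute depends on the downstream use; if Box-Cox is specifically required, the input itself has to be made strictly positive.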

Jun 30, 2021 kellemnegasi answer