The Box-Cox transformation can only be applied to strictly positive data
Package:
scikit-learn

Exception Class:
ValueError
Raise code
    reset=in_fit)
with np.warnings.catch_warnings():
    np.warnings.filterwarnings(
        'ignore', r'All-NaN (slice|axis) encountered')
    if (check_positive and self.method == 'box-cox' and
            np.nanmin(X) <= 0):
        raise ValueError("The Box-Cox transformation can only be "
                         "applied to strictly positive data")
if check_shape and not X.shape[1] == len(self.lambdas_):
    raise ValueError("Input data has a different number of features "
                     "than fitting data. Should have {n}, data has {m}"
                     .format(n=len(self.lambdas_), m=X.shape[1]))
valid_me
Links to the raise (1)
https://github.com/scikit-learn/scikit-learn/blob/c67518350f91072f9d37ed09c5ef7edf555b6cf6/sklearn/preprocessing/_data.py#L3025
Ways to fix
sklearn.preprocessing.PowerTransformer
applies a power transform feature-wise to make data more Gaussian-like. The method parameter
selects the power transform and accepts one of the following values:
‘yeo-johnson’, ‘box-cox’
If "box-cox" is passed as the method, every element of the input data must be greater than 0,
i.e. the data array must satisfy the following condition. Because the check uses numpy.nanmin, NaN entries are ignored.
numpy.nanmin(data) > 0
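The condition above can be checked before calling fit. The following is a minimal sketch, not part of scikit-learn: the helper name is_strictly_positive is hypothetical and simply mirrors the numpy.nanmin comparison from the raise code.

```python
import numpy as np


def is_strictly_positive(data):
    """Return True when every non-NaN entry of `data` is greater than zero.

    Hypothetical helper: it mirrors the np.nanmin(X) <= 0 check that
    PowerTransformer performs for method='box-cox'.
    """
    arr = np.asarray(data, dtype=float)
    return bool(np.nanmin(arr) > 0)


# The 0 in the last row makes this data unsuitable for Box-Cox.
print(is_strictly_positive([[1, 2], [3, 2], [0, 5]]))
# All values are greater than zero, so Box-Cox can be applied.
print(is_strictly_positive([[1, 2], [3, 2], [4, 5]]))
```

Running such a check first lets the caller decide what to do with non-positive values instead of hitting the ValueError inside fit.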
How to reproduce the error:
pip install --user pipenv
mkdir test_folder
cd test_folder
pipenv shell
pipenv install numpy scikit-learn
import numpy as np
from sklearn.preprocessing import PowerTransformer
pt = PowerTransformer(method = "box-cox")
data = [[1, 2], [3, 2], [0, 5]]
print(pt.fit(data))
print(pt.lambdas_)
print(pt.transform(data))
Output error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-20-6546b0af3328> in <module>()
3 pt = PowerTransformer(method = "box-cox")
4 data = [[1, 2], [3, 2], [0, 5]]
----> 5 print(pt.fit(data))
6 print(pt.lambdas_)
7 print(pt.transform(data))
/usr/local/lib/python3.7/dist-packages/sklearn/preprocessing/_data.py in fit(self, X, y)
2791 self : object
2792 """
-> 2793 self._fit(X, y=y, force_transform=False)
2794 return self
2795
/usr/local/lib/python3.7/dist-packages/sklearn/preprocessing/_data.py in _fit(self, X, y, force_transform)
2798
2799 def _fit(self, X, y=None, force_transform=False):
-> 2800 X = self._check_input(X, check_positive=True, check_method=True)
2801
2802 if not self.copy and not force_transform: # if call from fit()
/usr/local/lib/python3.7/dist-packages/sklearn/preprocessing/_data.py in _check_input(self, X, check_positive, check_shape, check_method)
3017 if (check_positive and self.method == 'box-cox' and
3018 np.nanmin(X) <= 0):
-> 3019 raise ValueError("The Box-Cox transformation can only be "
3020 "applied to strictly positive data")
3021
ValueError: The Box-Cox transformation can only be applied to strictly positive data
How to fix the error:
If the method is "box-cox", make sure every value in the input data is strictly greater than zero.
In the example below, the 0 in the last row is replaced with 4.
import numpy as np
from sklearn.preprocessing import PowerTransformer
pt = PowerTransformer(method = "box-cox")
data = [[1, 2], [3, 2], [4, 5]]
print(pt.fit(data))
print(pt.lambdas_)
print(pt.transform(data))
Corrected output:
PowerTransformer(copy=True, method='box-cox', standardize=True)
[ 1.05173074 -2.34546281]
[[-1.33269291 -0.70710678]
[ 0.25653283 -0.70710678]
[ 1.07616008 1.41421356]]
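If the data cannot be changed, another option is to use the other supported method, ‘yeo-johnson’, which also accepts zero and negative values. The sketch below is one possible approach under that assumption; the helper name make_power_transformer is hypothetical, and whether Yeo-Johnson is an acceptable substitute depends on the analysis.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer


def make_power_transformer(data):
    """Pick 'box-cox' for strictly positive data, else 'yeo-johnson'.

    Hypothetical helper, not part of scikit-learn.
    """
    arr = np.asarray(data, dtype=float)
    method = "box-cox" if np.nanmin(arr) > 0 else "yeo-johnson"
    return PowerTransformer(method=method)


# Same data that triggered the ValueError with method="box-cox".
data = [[1, 2], [3, 2], [0, 5]]
pt = make_power_transformer(data)
transformed = pt.fit_transform(data)
print(pt.method)
print(transformed.shape)
```

Note that the two methods generally produce different lambdas_ and different transformed values, so results are not interchangeable between them.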