The Box-Cox transformation can only be applied to strictly positive data
Package:
scikit-learn

Exception Class:
ValueError
Raise code
reset=in_fit)

with np.warnings.catch_warnings():
    np.warnings.filterwarnings(
        'ignore', r'All-NaN (slice|axis) encountered')
    if (check_positive and self.method == 'box-cox' and
            np.nanmin(X) <= 0):
        raise ValueError("The Box-Cox transformation can only be "
                         "applied to strictly positive data")

if check_shape and not X.shape[1] == len(self.lambdas_):
    raise ValueError("Input data has a different number of features "
                     "than fitting data. Should have {n}, data has {m}"
                     .format(n=len(self.lambdas_), m=X.shape[1]))

valid_me
Links to the raise (1)
https://github.com/scikit-learn/scikit-learn/blob/c67518350f91072f9d37ed09c5ef7edf555b6cf6/sklearn/preprocessing/_data.py#L3025
Ways to fix
sklearn.preprocessing.PowerTransformer applies a power transform featurewise to make data more Gaussian-like. The method parameter specifies which power transform to use and accepts one of the following values:
‘yeo-johnson’, ‘box-cox’
If method is "box-cox", every element of the input data must be strictly greater than 0, i.e. the data array must satisfy the following condition:
numpy.nanmin(data) > 0
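The condition above can be checked before fitting. The following is a minimal sketch, not part of scikit-learn: the helper name is_box_cox_safe is hypothetical, and it assumes the data contains at least one non-NaN value.

```python
import numpy as np


def is_box_cox_safe(data):
    """Return True if every non-NaN entry of data is strictly positive.

    Hypothetical helper for illustration; it is not a scikit-learn function.
    """
    arr = np.asarray(data, dtype=float)
    # nanmin ignores NaN entries, like the check in the raise code
    return bool(np.nanmin(arr) > 0)


print(is_box_cox_safe([[1, 2], [3, 2], [4, 5]]))  # True
print(is_box_cox_safe([[1, 2], [3, 2], [0, 5]]))  # False: contains a 0
```

Running such a check first lets you decide how to handle non-positive values before PowerTransformer raises the ValueError.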
How to reproduce the error:
pip install --user pipenv
mkdir test_folder
cd test_folder
pipenv shell
pipenv install numpy scikit-learn
Then run the following Python code:
import numpy as np
from sklearn.preprocessing import PowerTransformer
pt = PowerTransformer(method = "box-cox")
data = [[1, 2], [3, 2], [0, 5]]
print(pt.fit(data))
print(pt.lambdas_)
print(pt.transform(data))
Output error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-20-6546b0af3328> in <module>()
3 pt = PowerTransformer(method = "box-cox")
4 data = [[1, 2], [3, 2], [0, 5]]
----> 5 print(pt.fit(data))
6 print(pt.lambdas_)
7 print(pt.transform(data))
/usr/local/lib/python3.7/dist-packages/sklearn/preprocessing/_data.py in fit(self, X, y)
2791 self : object
2792 """
-> 2793 self._fit(X, y=y, force_transform=False)
2794 return self
2795
/usr/local/lib/python3.7/dist-packages/sklearn/preprocessing/_data.py in _fit(self, X, y, force_transform)
2798
2799 def _fit(self, X, y=None, force_transform=False):
-> 2800 X = self._check_input(X, check_positive=True, check_method=True)
2801
2802 if not self.copy and not force_transform: # if call from fit()
/usr/local/lib/python3.7/dist-packages/sklearn/preprocessing/_data.py in _check_input(self, X, check_positive, check_shape, check_method)
3017 if (check_positive and self.method == 'box-cox' and
3018 np.nanmin(X) <= 0):
-> 3019 raise ValueError("The Box-Cox transformation can only be "
3020 "applied to strictly positive data")
3021
ValueError: The Box-Cox transformation can only be applied to strictly positive data
How to fix the error:
If the method is "box-cox", make sure every value in the given data is strictly greater than zero. In the example below, the 0 in the last row is replaced with 4.
import numpy as np
from sklearn.preprocessing import PowerTransformer
pt = PowerTransformer(method = "box-cox")
data = [[1, 2], [3, 2], [4, 5]]
print(pt.fit(data))
print(pt.lambdas_)
print(pt.transform(data))
Corrected output:
PowerTransformer(copy=True, method='box-cox', standardize=True)
[ 1.05173074 -2.34546281]
[[-1.33269291 -0.70710678]
[ 0.25653283 -0.70710678]
[ 1.07616008 1.41421356]]
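If the zero or negative values are legitimate and cannot be changed, another option is the other method listed above, "yeo-johnson", which accepts zero and negative inputs. The sketch below assumes that switching the transform is acceptable for your use case; the fitted lambdas and transformed values will differ from the Box-Cox results.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

# Same data as in the failing example, including the 0
data = np.array([[1, 2], [3, 2], [0, 5]], dtype=float)

# Yeo-Johnson does not require strictly positive data
pt = PowerTransformer(method="yeo-johnson")
transformed = pt.fit_transform(data)

print(pt.lambdas_)   # one fitted lambda per feature
print(transformed)   # same shape as the input
```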