Saturday, 25 June 2016

How to use the GradientBoostingClassifier init parameter in sklearn

I want to use one sklearn.ensemble.GradientBoostingClassifier as the init estimator of another sklearn.ensemble.GradientBoostingClassifier, but it raises the error "IndexError: too many indices for array". It seems to be a bug in sklearn, and I've also found a pull request about it in sklearn on GitHub. If anyone has tried this and got it working, please let me know.

System info:
MacOS X 10.11.5 (15F34)
python -V : Python 2.7.11 :: Anaconda custom (x86_64)
sklearn.__version__ : '0.17.1'

Below is sample code that reproduces the error.

from sklearn.datasets import load_iris
from sklearn import ensemble
from sklearn.cross_validation import train_test_split

iris = load_iris()
X, y = iris.data, iris.target
X, y = X[y < 2], y[y < 2]  # make it binary

X_train, X_test, y_train, y_test = train_test_split(X, y)

# Fit one GBT, then use it as the init estimator of a second GBT
clf = ensemble.GradientBoostingClassifier()
clf.fit(X_train, y_train)
clf2 = ensemble.GradientBoostingClassifier(init=clf)

clf2.fit(X_train, y_train)
acc = clf2.score(X_test, y_test)
print("Accuracy: {:.4f}".format(acc))
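
For comparison, here is a minimal sketch of the same setup written against a more recent scikit-learn. It rests on two assumptions that are not part of my 0.17.1 environment: that train_test_split is imported from sklearn.model_selection, and that the installed release documents init as needing fit and predict_proba (which GradientBoostingClassifier provides). The IndexError on 0.17.1 appears consistent with the older code indexing the init estimator's predict output as a 2-D array, but I have not confirmed that. The name base, the n_estimators value, and the random_state values are only illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
# Assumption: a release where sklearn.cross_validation has been replaced
# by sklearn.model_selection.
from sklearn.model_selection import train_test_split

iris = load_iris()
X, y = iris.data, iris.target
X, y = X[y < 2], y[y < 2]  # make it binary

# random_state is an illustrative choice, only for reproducibility
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# `base` is the estimator passed as init; it is fitted inside clf2.fit
base = GradientBoostingClassifier(n_estimators=20, random_state=0)
clf2 = GradientBoostingClassifier(init=base, n_estimators=20, random_state=0)
clf2.fit(X_train, y_train)

acc = clf2.score(X_test, y_test)
print("Accuracy: {:.4f}".format(acc))
```

If this still raises on your installation, checking sklearn.__version__ against the documented requirements of init for that release would be the first thing to compare.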
