Using GridSearchCV to Find the Optimal Parameters for a Scikit-Learn Classifier

Muhammad Arslan, 4 January 2017

Sometimes the accuracy of a model falls well short of the target. A poor dataset or weak preprocessing is not the only possible cause: the parameters chosen for the classifier can be responsible too. In Scikit-Learn, you can use GridSearchCV to search for the best parameters for the classifier you want to use. The search is brute force: it tries every combination of the parameter values you supply, scores each one with cross-validation, and reports which combination gives the best accuracy.
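To give a feel for what "brute force" means here, the sketch below scores every combination in a small grid and keeps the best one. It is plain Python rather than Scikit-Learn code: exhaustive_search and toy_score are made-up names, and toy_score is a hypothetical stand-in for "train the classifier and cross-validate it".

```python
from itertools import product


def exhaustive_search(score, grid):
    """Score every combination in `grid` and return (best_params, best_score)."""
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


def toy_score(params):
    """Hypothetical score: highest when alpha is 0.01 and depth is 3."""
    return -(params["alpha"] - 0.01) ** 2 - (params["depth"] - 3) ** 2


toy_grid = {"alpha": (0.001, 0.01, 0.1), "depth": (1, 3, 5)}
best, value = exhaustive_search(toy_score, toy_grid)
print(best, value)
```

GridSearchCV follows the same idea, except that the score of each combination is the mean accuracy over the cross-validation folds.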

For the details, let's walk through the following tutorial :D.

Preparation

The computer specifications needed for this tutorial are:

4 GB of RAM or more
Intel Core i3 (quad-core)
4 GB of swap (if you are using Linux)

As for software, you need the following:

Python
Scikit-Learn
SciPy
NumPy
The 20 newsgroups dataset
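Before going further, it can help to check which of these modules are installed. The snippet below is only a convenience sketch: installed_version is a hypothetical helper, and it relies on each package exposing a __version__ attribute, which Scikit-Learn, SciPy, and NumPy normally do.

```python
import importlib


def installed_version(module_name):
    """Return the module's version string, "unknown", or None if it is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


for name in ("sklearn", "scipy", "numpy"):
    version = installed_version(name)
    print("%s: %s" % (name, version if version else "not installed"))
```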

Preparing the example classifier code

First, create a file named gridsearchcv-demo.py, then put the following code in it:

# Written for Python 2 and a scikit-learn release older than 0.20.
# The sklearn.grid_search module and the grid_scores_ attribute
# were removed in scikit-learn 0.20.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline
from sklearn.grid_search import GridSearchCV

# prepare the dataset
dataset_train = fetch_20newsgroups(subset='train', shuffle=True, random_state=42)

# set up the classifier
clf = Pipeline([
    ('vect', CountVectorizer()),
    ('tfidf', TfidfTransformer()),
    ('clf', SGDClassifier())
])

# parameter values to try; each key is <pipeline step>__<parameter>
params = {
    'vect__max_df': (0.75, 1.0),
    #'vect__max_features': (None, 5000, 10000, 50000),
    #'vect__ngram_range': ((1, 1), (1, 2)),
    #'tfidf__use_idf': (True, False),
    'tfidf__norm': ('l1', 'l2'),
    'clf__alpha': (0.00001, 0.000001),
    #'clf__penalty': ('l2', 'elasticnet'),
    #'clf__n_iter': (10, 50, 80),
}

grid = GridSearchCV(
    clf,
    params,
    n_jobs=4,   # run four fits in parallel
    cv=10,      # 10-fold cross-validation
    verbose=4,  # print progress for every fit
)

# run the search; fit() returns the fitted GridSearchCV object
clf = grid.fit(dataset_train.data, dataset_train.target)

print("\nBest estimator:")
print("")
print(clf.best_estimator_)
print("\nGrid score:")
print("")
for candidate, mean_score, scores in clf.grid_scores_:
    print("%0.3f (+/-%0.03f) for %r" % (mean_score, scores.std() / 2, candidate))
print("")

First we prepare the training data, taken from the 20 newsgroups dataset. Next we set up a pipeline made of CountVectorizer, TfidfTransformer, and SGDClassifier, each with its default settings. Then we define the combinations of parameter values to try on the pipeline to get the best result. Each key has the form step__parameter, so vect__max_df is the max_df parameter of the 'vect' step. In this tutorial we examine max_df, norm, and alpha. Finally we create a GridSearchCV instance that receives the classifier, the parameters to search, n_jobs=4 (four parallel jobs), cv=10 (10-fold cross-validation), and verbose=4 (the verbosity of the console output).
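To see how much work this grid implies, here is a small sketch that expands the three active parameters into every candidate and multiplies by the number of folds. It uses only itertools, not Scikit-Learn itself, so treat it as an illustration of the counting rather than of what GridSearchCV does internally.

```python
from itertools import product

# The three parameters that are not commented out in the script.
params = {
    'vect__max_df': (0.75, 1.0),
    'tfidf__norm': ('l1', 'l2'),
    'clf__alpha': (0.00001, 0.000001),
}
cv = 10

names = sorted(params)
candidates = [
    dict(zip(names, values))
    for values in product(*(params[name] for name in names))
]
total_fits = len(candidates) * cv

for candidate in candidates:
    print(candidate)
print("%d candidates x %d folds = %d fits" % (len(candidates), cv, total_fits))
```

Two values for each of three parameters give 8 candidates, and with 10 folds that makes 80 fits in total.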

After that we pass the dataset to GridSearchCV through fit(), and the report is printed once the parameter search has finished.
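The script imports GridSearchCV from sklearn.grid_search and reads grid_scores_, both of which were removed in scikit-learn 0.20. If you are on a newer release with Python 3, a rough equivalent might look like the sketch below. It assumes sklearn.model_selection.GridSearchCV and its cv_results_ dictionary; build_search and format_scores are made-up helper names, and the progress log a newer release prints will not look the same as the one shown in this article.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline


def build_search(cv=10, n_jobs=4, verbose=4):
    """Return a GridSearchCV over the same pipeline and grid as the script."""
    pipeline = Pipeline([
        ('vect', CountVectorizer()),
        ('tfidf', TfidfTransformer()),
        ('clf', SGDClassifier()),
    ])
    param_grid = {
        'vect__max_df': (0.75, 1.0),
        'tfidf__norm': ('l1', 'l2'),
        'clf__alpha': (0.00001, 0.000001),
    }
    return GridSearchCV(pipeline, param_grid, cv=cv, n_jobs=n_jobs, verbose=verbose)


def format_scores(search):
    """Build one report line per candidate from cv_results_."""
    results = search.cv_results_
    rows = zip(results['mean_test_score'],
               results['std_test_score'],
               results['params'])
    return ["%0.3f (+/-%0.03f) for %r" % (mean, std / 2, candidate)
            for mean, std, candidate in rows]


# Usage with the dataset from the script (downloads 20 newsgroups on first run):
#   from sklearn.datasets import fetch_20newsgroups
#   train = fetch_20newsgroups(subset='train', shuffle=True, random_state=42)
#   search = build_search().fit(train.data, train.target)
#   print(search.best_estimator_)
#   print("\n".join(format_scores(search)))
```

The std / 2 in the report line mirrors the scores.std() / 2 used in the script, so the numbers stay comparable.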

Running GridSearchCV

Now let's run the script. It ran for about 6 minutes. Below is the output GridSearchCV produced while searching for parameters for the classifier we are building:

$ python gridsearchcv-demo.py 
Fitting 10 folds for each of 8 candidates, totalling 80 fits
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75 .............
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75 .............
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75 .............
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75, score=0.888596 -  13.4s
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75, score=0.890158 -  14.0s
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75, score=0.886643 -  15.1s
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75, score=0.901408 -  16.2s
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75, score=0.894876 -  14.6s
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75, score=0.877768 -  15.1s
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75, score=0.889184 -  14.8s
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75 .............
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75, score=0.888099 -  13.7s
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75, score=0.887011 -  15.1s
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=0.75, score=0.888691 -  13.6s
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0, score=0.886842 -  16.0s
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0, score=0.891916 -  15.2s
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0, score=0.894552 -  13.3s
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0, score=0.893486 -  14.2s
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0, score=0.877768 -  14.4s
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0, score=0.886042 -  15.3s
[CV] clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0, score=0.887411 -  13.9s
[Parallel(n_jobs=4)]: Done  17 tasks      | elapsed:  1.2min
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0, score=0.886323 -  14.3s
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0, score=0.882562 -  13.7s
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-05, tfidf__norm=l1, vect__max_df=1.0, score=0.888691 -  14.0s
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75, score=0.907895 -  15.4s
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75, score=0.913884 -  14.3s
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75, score=0.920035 -  13.9s
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75, score=0.930458 -  14.7s
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75, score=0.904340 -  14.0s
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75, score=0.939929 -  15.4s
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75, score=0.921099 -  15.1s
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75, score=0.909414 -  16.1s
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75, score=0.919858 -  14.6s
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=0.75, score=0.922598 -  17.3s
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0, score=0.911404 -  15.0s
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0, score=0.925308 -  14.6s
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0, score=0.918278 -  11.8s
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0, score=0.923415 -  13.1s
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0, score=0.936396 -  13.0s
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0, score=0.908769 -  12.5s
[CV] clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0, score=0.921099 -  12.5s
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0, score=0.922735 -  12.7s
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0, score=0.920819 -  12.3s
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75 .............
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-05, tfidf__norm=l2, vect__max_df=1.0, score=0.918967 -  12.2s
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75, score=0.912281 -  14.4s
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75, score=0.915641 -  13.2s
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75, score=0.914763 -  13.3s
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75, score=0.929577 -  13.5s
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75, score=0.931095 -  12.1s
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75, score=0.906997 -  11.4s
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75, score=0.921099 -  12.0s
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75, score=0.917407 -  12.2s
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75, score=0.923488 -  11.4s
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=0.75, score=0.913624 -  11.5s
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0, score=0.907895 -  11.7s
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0, score=0.921793 -  12.1s
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0, score=0.907733 -  11.4s
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0, score=0.926056 -  12.5s
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0, score=0.929329 -  12.1s
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0, score=0.902569 -  11.3s
[CV] clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0, score=0.909574 -  11.4s
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0, score=0.912966 -  12.3s
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0, score=0.925267 -  11.7s
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75 .............
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-06, tfidf__norm=l1, vect__max_df=1.0, score=0.911843 -  11.4s
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75, score=0.890351 -  11.6s
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75, score=0.901582 -  11.6s
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75, score=0.903339 -  12.2s
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75 .............
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75, score=0.916373 -  11.9s
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75, score=0.921378 -  11.0s
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75, score=0.885740 -  11.2s
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75 .............
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75, score=0.898936 -  11.9s
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75, score=0.899645 -  11.9s
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75, score=0.903025 -  11.4s
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=0.75, score=0.910062 -  11.8s
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0, score=0.895614 -  11.9s
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0, score=0.898946 -  11.7s
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0, score=0.905097 -  11.6s
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0, score=0.919014 -  11.3s
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0, score=0.917845 -  11.6s
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0, score=0.888397 -  11.8s
[CV] clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0 ..............
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0, score=0.901596 -  11.3s
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0, score=0.894316 -  10.5s
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0, score=0.903915 -   9.5s
[CV]  clf__alpha=1e-06, tfidf__norm=l2, vect__max_df=1.0, score=0.910062 -   9.6s
[Parallel(n_jobs=4)]: Done  80 out of  80 | elapsed:  4.5min finished
Best estimator:
Pipeline(steps=[('vect', CountVectorizer(analyzer=u'word', binary=False, decode_error=u'strict',
dtype=<type 'numpy.int64'>, encoding=u'utf-8', input=u'content',
lowercase=True, max_df=1.0, max_features=None, min_df=1,
ngram_range=(1, 1), preprocessor=None, stop_words=None,
st...   penalty='l2', power_t=0.5, random_state=None, shuffle=True,
verbose=0, warm_start=False))])
Grid score:
0.889 (+/-0.003) for {'vect__max_df': 0.75, 'tfidf__norm': 'l1', 'clf__alpha': 1e-05}
0.888 (+/-0.002) for {'vect__max_df': 1.0, 'tfidf__norm': 'l1', 'clf__alpha': 1e-05}
0.919 (+/-0.005) for {'vect__max_df': 0.75, 'tfidf__norm': 'l2', 'clf__alpha': 1e-05}
0.921 (+/-0.004) for {'vect__max_df': 1.0, 'tfidf__norm': 'l2', 'clf__alpha': 1e-05}
0.919 (+/-0.004) for {'vect__max_df': 0.75, 'tfidf__norm': 'l1', 'clf__alpha': 1e-06}
0.916 (+/-0.004) for {'vect__max_df': 1.0, 'tfidf__norm': 'l1', 'clf__alpha': 1e-06}
0.903 (+/-0.005) for {'vect__max_df': 0.75, 'tfidf__norm': 'l2', 'clf__alpha': 1e-06}
0.903 (+/-0.005) for {'vect__max_df': 1.0, 'tfidf__norm': 'l2', 'clf__alpha': 1e-06}

Once GridSearchCV has finished, we can pick the best parameters. From the results above we can choose {'vect__max_df': 1.0, 'tfidf__norm': 'l2', 'clf__alpha': 1e-05}, which has the highest mean score (0.921), and pass it into the pipeline we built: max_df=1.0 to CountVectorizer(), norm='l2' to TfidfTransformer(), and alpha=1e-05 to SGDClassifier(). Note that max_df=1.0 and norm='l2' are already the defaults, so in this run alpha is the only value that actually changes. Tuning this way can give better accuracy than leaving the parameters untouched, although the top few candidates here (0.921, 0.919, 0.919) are very close to each other.
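To make that last step concrete, here is a minimal sketch of feeding the chosen combination back into the pipeline with set_params. The dictionary is copied by hand from the result line above purely for illustration; on a fitted search object the same dictionary should be available as best_params_.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

# Best combination reported by the grid search in this article.
best_params = {'vect__max_df': 1.0, 'tfidf__norm': 'l2', 'clf__alpha': 1e-05}

clf = Pipeline([
    ('vect', CountVectorizer()),
    ('tfidf', TfidfTransformer()),
    ('clf', SGDClassifier()),
])

# Keys of the form step__parameter are routed to the matching pipeline step.
clf.set_params(**best_params)

print(clf.named_steps['vect'].max_df)
print(clf.named_steps['tfidf'].norm)
print(clf.named_steps['clf'].alpha)
```

From here clf can be fitted on the training data as usual. Writing the values directly into the constructors, for example SGDClassifier(alpha=1e-05), has the same effect.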

If you have more powerful hardware, you can uncomment all the lines in params. That lets you try many more combinations in search of higher accuracy.
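Before uncommenting everything, it is worth counting how large the full grid becomes. The sketch below only does the arithmetic, using the values from the commented-out lines exactly as they appear in the script. As far as I know, newer scikit-learn releases call clf__n_iter max_iter instead, so take that key as belonging to the older API.

```python
from functools import reduce
from operator import mul

# Every entry of params from the script, with the commented lines enabled.
full_params = {
    'vect__max_df': (0.75, 1.0),
    'vect__max_features': (None, 5000, 10000, 50000),
    'vect__ngram_range': ((1, 1), (1, 2)),
    'tfidf__use_idf': (True, False),
    'tfidf__norm': ('l1', 'l2'),
    'clf__alpha': (0.00001, 0.000001),
    'clf__penalty': ('l2', 'elasticnet'),
    'clf__n_iter': (10, 50, 80),
}
cv = 10
fits_in_this_article = 80  # from the log: 8 candidates x 10 folds

candidates = reduce(mul, (len(values) for values in full_params.values()), 1)
fits = candidates * cv

print("%d candidates, %d fits" % (candidates, fits))
print("%d times as many fits as the run above" % (fits // fits_in_this_article))
```

That works out to 768 candidates and 7680 fits, 96 times as many as the run in this article, and individual fits with bigrams or more iterations will probably be slower too.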
