Jul-31-2020, 01:40 PM
Hello
I am implementing a scikit-learn pipeline with GridSearchCV, using the Boston housing dataset.
Here is my code:
import numpy as np
import pandas as pd
from sklearn.datasets import load_boston
from sklearn.linear_model import Lasso, Ridge
from sklearn.metrics import make_scorer, mean_squared_error
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import SVR

X, y = load_boston(return_X_y=True)
poly_params = {"degree": 2,
               "interaction_only": False,
               "include_bias": True
               }
# pre-instantiation
ridge_shrinkage = np.linspace(0.00001, 0.4, num=200)
df_metrics = pd.DataFrame(index=[0], columns=["Fold", "Shrinkage", "Metric", "Train", "Test"])
# main loop (rkf is a cross-validation splitter defined earlier, not shown here)
f = 0
for (train, test) in rkf.split(X):
    f += 1
    print(f)
    # separate variables and folds (X and y are NumPy arrays, so they are indexed directly)
    x_train = X[train]
    x_test = X[test]
    y_train = y[train]
    y_test = y[test]
    # fit model
    model_ridge = make_pipeline(StandardScaler(), PolynomialFeatures(**poly_params), Ridge())  # poly_params is defined above
    model_lasso = make_pipeline(StandardScaler(), PolynomialFeatures(**poly_params), Lasso())
    model_SVR = make_pipeline(StandardScaler(), SVR())
    ## List of pipelines
    pipelines = [model_ridge, model_lasso, model_SVR]
    pipe_dict = {1: 'Ridge', 2: 'Lasso', 3: 'SVR'}
    # Apply the fit method to the pipelines
    for pipe in pipelines:  # pipe can be replaced by any other word
        pipe.fit(x_train, y_train)
        pipe.predict(x_train)
        pipe.predict(x_test)
    # Report each pipeline's score (R^2) on the held-out fold
    for i, model in enumerate(pipelines, start=1):
        print('Model score {}: {}'.format(pipe_dict[i], model.score(x_test, y_test)))
# I am not sure whether this specification would work.
parameters = [{'model-ridge__alpha': np.arange(0, 0.5, 0.01)},
              {'model-lasso__alpha': np.arange(0, 0.5, 0.01)},
              {'model-SVR__'
               'C': [0.1, 1, 100, 1000],
               'epsilon': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10],
               'gamma': [0.0001, 0.001, 0.005, 0.1, 1, 3, 5]
               }]
scoring_func = make_scorer(mean_squared_error)
# I would like to have the best model for each model in the pipelines
grid_search = GridSearchCV(estimator = pipe,
                           param_grid = parameters,
                           scoring = scoring_func,
                           cv = 10,
                           n_jobs = -1)
best_params = grid_result.best_params_
best_svr = SVR(kernel='rbf', C=best_params["C"], epsilon=best_params["epsilon"], gamma=best_params["gamma"],
               coef0=0.1, shrinking=True,
               tol=0.001, cache_size=200, verbose=False, max_iter=-1)
grid_search = grid_search.fit(X_train, y_train)

I don't know how to get the best model for each element of the pipelines. Thank you for your help!

First error message:
File "<tokenize>", line 58
grid_search = GridSearchCV(estimator = pipe,
^
IndentationError: unindent does not match any outer indentation level
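
One possible way to get a best model per pipeline is to run a separate GridSearchCV for each one, since a single search can only tune the steps of the one estimator it is given. The sketch below is a minimal illustration under several assumptions, not a definitive fix: load_boston was removed in scikit-learn 1.2, so load_diabetes stands in for it; a single train_test_split replaces the fold loop; the grids are shrunk so it runs quickly; and the names searches and best_models are made up for this example. The parameter keys use the step names that make_pipeline generates (the lowercased class name, e.g. ridge__alpha, svr__C), and the built-in "neg_mean_squared_error" scorer is used so that the lowest MSE is selected.

```python
import numpy as np
from sklearn.datasets import load_diabetes  # stand-in dataset (assumption)
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import SVR

X, y = load_diabetes(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Each entry pairs a pipeline with its own grid.
# Keys follow "<step name>__<parameter>"; make_pipeline names each
# step after its lowercased class name.
searches = {
    "Ridge": (
        make_pipeline(StandardScaler(), PolynomialFeatures(degree=2), Ridge()),
        {"ridge__alpha": np.linspace(0.01, 0.5, 5)},
    ),
    "Lasso": (
        make_pipeline(StandardScaler(), PolynomialFeatures(degree=2),
                      Lasso(max_iter=10000)),
        {"lasso__alpha": np.linspace(0.01, 0.5, 5)},
    ),
    "SVR": (
        make_pipeline(StandardScaler(), SVR(kernel="rbf")),
        {"svr__C": [0.1, 1, 100],
         "svr__epsilon": [0.01, 0.1, 1],
         "svr__gamma": [0.001, 0.1, 1]},
    ),
}

best_models = {}
for name, (pipe, grid) in searches.items():
    # Scores are negated MSE: higher is better, so the lowest MSE wins.
    search = GridSearchCV(pipe, param_grid=grid,
                          scoring="neg_mean_squared_error", cv=5)
    search.fit(x_train, y_train)
    # best_estimator_ is the pipeline refit on x_train with the best parameters.
    best_models[name] = search.best_estimator_
    print(name, search.best_params_, "CV MSE:", -search.best_score_)
```

Each tuned pipeline can then be used directly, for example best_models["SVR"].predict(x_test), without rebuilding an SVR by hand from best_params_.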
