I am currently working on a binary classification problem with about 2,000 samples and have tried to implement nested cross-validation. I would like to know whether this implementation is correct.

I will use SVC as an example. First, I run the nested CV to get a realistic performance estimate of the algorithm. (This performance estimate is the selection criterion for building the final ensemble, which will use 5 of about 9 candidate algorithms.)

Next, I use regular cross-validation on all of the training data to find the best hyperparameters, and then form an ensemble of the 3-5 best SVC hyperparameter settings to reduce variance. Finally, different algorithms (SVM, AdaBoost, LogReg, XGBoost) are combined into an ensemble (regular voting and/or stacking). The monetary score is a custom scoring function based on the confusion matrix (fraud detection).
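
To make the idea of a confusion-matrix-based score concrete, here is a minimal sketch of what such a function might look like. The name `monetary_value`, the payoff values, and the per-sample averaging are all placeholder assumptions for illustration, not the actual scoring function used in the code below.

```python
def monetary_value(y_true, y_pred, payoffs=None):
    """Average payoff per sample, computed from the four confusion-matrix cells."""
    if payoffs is None:
        # Hypothetical payoffs per cell (assumption, not the real business values)
        payoffs = {"tp": 5.0, "fp": -25.0, "fn": -5.0, "tn": 0.0}
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == 1:
            counts["tp" if truth == 1 else "fp"] += 1
        else:
            counts["fn" if truth == 1 else "tn"] += 1
    # Weight each cell count by its payoff and normalise by the sample count
    total = sum(payoffs[cell] * n for cell, n in counts.items())
    return total / len(y_true)
```

A function with this `(y_true, y_pred)` signature could be wrapped with `sklearn.metrics.make_scorer` to obtain an object that can be passed as `scoring=`.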

Below you can see my code.

**Nested CV**

```
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_validate
from sklearn.svm import SVC

rs = 17  # random seed (assumed to be the same 17 used for the splitters)
# monetary_score is my custom scorer and is defined elsewhere

#Prepare nested CV
cv_outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=17)
cv_inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=17)
model = SVC(kernel="linear", random_state=rs)
params = {"C": [0.08, 0.09, 0.1, 0.11, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]}
#Define grid search as outer cv
grid = GridSearchCV(estimator=model, param_grid=params, scoring=monetary_score, cv=cv_outer, n_jobs=-1)
#Get the mean nested score with inner cv
nested_cv = cross_validate(estimator=grid, X=X, y=y, scoring=monetary_score, cv=cv_inner, return_estimator=True, n_jobs=-1)
nested_score = nested_cv["test_score"].mean()
```

```
print(nested_score)
0.3361904761904762
```
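
For comparison, the nested procedure can also be written as an explicit loop, which makes it easier to see which splitter does the tuning and which one does the scoring. This is only a sketch under assumptions: it uses synthetic data from `make_classification`, plain accuracy instead of my monetary score, and a shortened grid for `C`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

# Synthetic stand-in data (assumption; not the fraud data set)
X_demo, y_demo = make_classification(n_samples=200, n_features=10, random_state=0)

outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=17)
inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=17)

outer_scores = []
chosen_C = []
for train_idx, test_idx in outer.split(X_demo, y_demo):
    # Tuning step: the grid search only ever sees the current training part
    search = GridSearchCV(
        SVC(kernel="linear"),
        {"C": [0.1, 0.5, 0.9]},
        scoring="accuracy",
        cv=inner,
    )
    search.fit(X_demo[train_idx], y_demo[train_idx])
    chosen_C.append(search.best_params_["C"])
    # Scoring step: the refit best model is evaluated on the held-out part
    outer_scores.append(search.score(X_demo[test_idx], y_demo[test_idx]))

nested_estimate = float(np.mean(outer_scores))
```

As far as I understand, passing the `GridSearchCV` object to `cross_validate` performs the same steps, and `chosen_C` corresponds to what `return_estimator=True` lets me inspect per fold.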

**Regular GridSearch**

```
grid.fit(X, y)
means = grid.cv_results_["mean_test_score"]
stds = grid.cv_results_["std_test_score"]
ranks = grid.cv_results_["rank_test_score"]
```

```
for rank, mean, params in zip(ranks, means, grid.cv_results_["params"]):
    print(rank, "\t", mean, "\t", params)
print(f"\nBest params:\t{grid.best_params_}")
print(f"Best score:\t{grid.best_score_}\n")
9 0.35428571428571426 {'C': 0.08}
9 0.35428571428571426 {'C': 0.09}
12 0.27904761904761904 {'C': 0.1}
11 0.31714285714285717 {'C': 0.11}
7 0.38619047619047614 {'C': 0.2}
6 0.39571428571428574 {'C': 0.3}
2 0.41238095238095235 {'C': 0.4}
2 0.41238095238095235 {'C': 0.5}
2 0.41238095238095235 {'C': 0.6}
2 0.41238095238095235 {'C': 0.7}
8 0.38142857142857145 {'C': 0.8}
1 0.4514285714285714 {'C': 0.9}
Best params: {'C': 0.9}
Best score: 0.4514285714285714
```

**Build an ensemble of the best 5 hyperparameter values**

```
from sklearn.ensemble import VotingClassifier

# VotingClassifier expects a list of (name, estimator) tuples
estimators = [
    ("svc1", SVC(C=0.9, kernel="linear", random_state=17)),
    ("svc2", SVC(C=0.7, kernel="linear", random_state=17)),
    ("svc3", SVC(C=0.6, kernel="linear", random_state=17)),
    ("svc4", SVC(C=0.4, kernel="linear", random_state=17)),
    ("svc5", SVC(C=0.5, kernel="linear", random_state=17)),
]
final_clf = VotingClassifier(estimators, voting="hard")
```
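
To spell out what I expect `voting="hard"` to do: each SVC predicts a label and the majority label wins. The helper below is a hypothetical pure-Python illustration of that rule (the name `hard_vote` is my own), not scikit-learn's implementation. Ties are resolved in favour of the smallest label, which is how I read the documented tie-breaking by ascending class order.

```python
from collections import Counter

def hard_vote(predictions_per_model):
    """Combine per-model label lists by majority vote, one label per sample."""
    voted = []
    # zip(*...) regroups the predictions so each tuple holds one sample's votes
    for sample_preds in zip(*predictions_per_model):
        counts = Counter(sample_preds)
        # Iterate labels in ascending order so ties go to the smallest label
        best = max(sorted(counts), key=lambda label: counts[label])
        voted.append(best)
    return voted
```

With five binary classifiers a tie cannot occur, so the tie-breaking rule only matters if the number of ensemble members is even.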

```
from sklearn.metrics import make_scorer, roc_auc_score

#How well does our model perform based on cross validation?
scoring = {
    "monetary_score": monetary_score,
    "accuracy": "accuracy",
    "f1": "f1",
    # AUC computed from hard class predictions, since hard voting has no scores
    "auc": make_scorer(roc_auc_score),
}
scores = cross_validate(final_clf, X, y, cv=cv_inner, scoring=scoring, n_jobs=-1)
print(scores["test_monetary_score"].mean())
0.41238095238095235
```