Let's import the iris dataset and create a DataFrame x of independent variables and a Series y for the target variable.

In [1]:
import pandas as pd

from sklearn.datasets import load_iris

colnames = ['sepallength', 'sepalwidth', 'petallength', 'petalwidth']

iris = load_iris()

x = iris.data
y = iris.target

x = pd.DataFrame(x, columns=colnames)
y = pd.Series(y, name='class')

iris_data = pd.concat([x, y], axis=1)
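
Before moving on, it can help to sanity-check the combined frame. The sketch below rebuilds the same objects so that it runs on its own, then prints the shape, the number of samples per class, and whether the rows are ordered by class. The checks themselves are an illustrative addition, not part of the original walkthrough.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Rebuild the same objects as in the cell above so this snippet is self-contained
colnames = ['sepallength', 'sepalwidth', 'petallength', 'petalwidth']
iris = load_iris()
x = pd.DataFrame(iris.data, columns=colnames)
y = pd.Series(iris.target, name='class')
iris_data = pd.concat([x, y], axis=1)

print(iris_data.shape)                    # (rows, columns)
print(iris_data['class'].value_counts())  # number of samples per class
print(y.is_monotonic_increasing)          # True when rows are ordered by class label
```

The row ordering matters for any cross-validation scheme that splits the data into consecutive blocks.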

Let's now use the cross_val_score function directly, passing it a LogisticRegression model to get one accuracy score for each of the k folds. Ignore LogisticRegression for now; we have a complete chapter on it.

In [2]:
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, KFold

# Consecutive, unshuffled folds (random_state applies only when shuffle=True)
kfold = KFold(n_splits=5)

model = LogisticRegression(solver='lbfgs', multi_class='ovr')

results = cross_val_score(model, x, y, cv=kfold, scoring='accuracy')

print("Accuracy for each of the 5 folds : ", results)
print("Average accuracy of 5-fold cross validation : ", results.mean())
Accuracy for each of the 5 folds :  [1.         0.9        0.5        0.93333333 0.63333333]
Average accuracy of 5-fold cross validation :  0.7933333333333332
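
To see what cross_val_score does under the hood, and why the per-fold accuracies differ so much, here is a minimal sketch that runs the same 5-fold split by hand. It assumes an unshuffled KFold, and it uses LogisticRegression with default settings plus max_iter=1000 (an assumption, not the 'ovr' configuration from the cell above), so the accuracies it prints need not match the numbers shown earlier. The names fold_scores and fold_class_counts are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

colnames = ['sepallength', 'sepalwidth', 'petallength', 'petalwidth']
iris = load_iris()
x = pd.DataFrame(iris.data, columns=colnames)
y = pd.Series(iris.target, name='class')

# Without shuffling, each test fold is a consecutive block of 30 rows
kfold = KFold(n_splits=5)

fold_scores = []
fold_class_counts = []
for fold, (train_idx, test_idx) in enumerate(kfold.split(x), start=1):
    # Assumption: default LogisticRegression settings, with more iterations for convergence
    model = LogisticRegression(max_iter=1000)
    model.fit(x.iloc[train_idx], y.iloc[train_idx])

    # Accuracy on the held-out fold
    score = model.score(x.iloc[test_idx], y.iloc[test_idx])

    # How many samples of each class ended up in the test fold
    counts = np.bincount(y.iloc[test_idx].to_numpy(), minlength=3).tolist()

    fold_scores.append(score)
    fold_class_counts.append(counts)
    print(f"fold {fold}: test class counts = {counts}, accuracy = {score:.3f}")

print("mean accuracy:", np.mean(fold_scores))
```

Because the iris rows are ordered by class, several test folds contain only one class, and the model is then trained on very few (or no) samples of that class. Passing shuffle=True to KFold, or using StratifiedKFold, would give each fold a more balanced mix of classes.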