.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/classification/plot_digits_classification.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_classification_plot_digits_classification.py>`
        to download the full example code or to run this example in your browser via Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_classification_plot_digits_classification.py:

================================
Recognizing hand-written digits
================================

This example shows how scikit-learn can be used to recognize images of
hand-written digits, from 0 to 9.

.. GENERATED FROM PYTHON SOURCE LINES 10-21

.. code-block:: Python

    # Authors: The scikit-learn developers
    # SPDX-License-Identifier: BSD-3-Clause

    # Standard scientific Python imports
    import matplotlib.pyplot as plt

    # Import datasets, classifiers and performance metrics
    from sklearn import datasets, metrics, svm
    from sklearn.model_selection import train_test_split

.. GENERATED FROM PYTHON SOURCE LINES 22-34

Digits dataset
--------------

The digits dataset consists of 8x8 pixel images of digits. The ``images``
attribute of the dataset stores 8x8 arrays of grayscale values for each image.
We will use these arrays to visualize the first 4 images. The ``target``
attribute of the dataset stores the digit each image represents, and this is
included in the title of the 4 plots below.

Note: if we were working from image files (e.g., 'png' files), we would load
them using :func:`matplotlib.pyplot.imread`.

.. GENERATED FROM PYTHON SOURCE LINES 34-43

.. code-block:: Python

    digits = datasets.load_digits()

    _, axes = plt.subplots(nrows=1, ncols=4, figsize=(10, 3))
    for ax, image, label in zip(axes, digits.images, digits.target):
        ax.set_axis_off()
        ax.imshow(image, cmap=plt.cm.gray_r, interpolation="nearest")
        ax.set_title("Training: %i" % label)

.. image-sg:: /auto_examples/classification/images/sphx_glr_plot_digits_classification_001.png
   :alt: Training: 0, Training: 1, Training: 2, Training: 3
   :srcset: /auto_examples/classification/images/sphx_glr_plot_digits_classification_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 44-57

Classification
--------------

To apply a classifier on this data, we need to flatten the images, turning
each 2-D array of grayscale values from shape ``(8, 8)`` into shape
``(64,)``. Subsequently, the entire dataset will be of shape
``(n_samples, n_features)``, where ``n_samples`` is the number of images and
``n_features`` is the total number of pixels in each image.

We can then split the data into train and test subsets and fit a support
vector classifier on the train samples. The fitted classifier can
subsequently be used to predict the value of the digit for the samples in
the test subset.

.. GENERATED FROM PYTHON SOURCE LINES 57-76

.. code-block:: Python

    # flatten the images
    n_samples = len(digits.images)
    data = digits.images.reshape((n_samples, -1))

    # Create a classifier: a support vector classifier
    clf = svm.SVC(gamma=0.001)

    # Split data into 50% train and 50% test subsets
    X_train, X_test, y_train, y_test = train_test_split(
        data, digits.target, test_size=0.5, shuffle=False
    )

    # Learn the digits on the train subset
    clf.fit(X_train, y_train)

    # Predict the value of the digit on the test subset
    predicted = clf.predict(X_test)
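As a quick aside (not part of the generated example), the mean accuracy of the
fitted classifier on the test subset could also be obtained directly with the
estimator's standard ``score`` method, before looking at the more detailed
metrics below:

.. code-block:: Python

    # Minimal sketch, not generated from the example source: ``score`` returns
    # the mean accuracy of ``clf`` on the held-out test subset, equivalent to
    # metrics.accuracy_score(y_test, clf.predict(X_test)).
    print(f"Test-set accuracy: {clf.score(X_test, y_test):.3f}")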
.. GENERATED FROM PYTHON SOURCE LINES 77-79

Below we visualize the first 4 test samples and show their predicted
digit value in the title.

.. GENERATED FROM PYTHON SOURCE LINES 79-87

.. code-block:: Python

    _, axes = plt.subplots(nrows=1, ncols=4, figsize=(10, 3))
    for ax, image, prediction in zip(axes, X_test, predicted):
        ax.set_axis_off()
        image = image.reshape(8, 8)
        ax.imshow(image, cmap=plt.cm.gray_r, interpolation="nearest")
        ax.set_title(f"Prediction: {prediction}")

.. image-sg:: /auto_examples/classification/images/sphx_glr_plot_digits_classification_002.png
   :alt: Prediction: 8, Prediction: 8, Prediction: 4, Prediction: 9
   :srcset: /auto_examples/classification/images/sphx_glr_plot_digits_classification_002.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 88-90

:func:`~sklearn.metrics.classification_report` builds a text report showing
the main classification metrics.

.. GENERATED FROM PYTHON SOURCE LINES 90-96

.. code-block:: Python

    print(
        f"Classification report for classifier {clf}:\n"
        f"{metrics.classification_report(y_test, predicted)}\n"
    )

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Classification report for classifier SVC(gamma=0.001):
                  precision    recall  f1-score   support

               0       1.00      0.99      0.99        88
               1       0.99      0.97      0.98        91
               2       0.99      0.99      0.99        86
               3       0.98      0.87      0.92        91
               4       0.99      0.96      0.97        92
               5       0.95      0.97      0.96        91
               6       0.99      0.99      0.99        91
               7       0.96      0.99      0.97        89
               8       0.94      1.00      0.97        88
               9       0.93      0.98      0.95        92

        accuracy                           0.97       899
       macro avg       0.97      0.97      0.97       899
    weighted avg       0.97      0.97      0.97       899

.. GENERATED FROM PYTHON SOURCE LINES 97-99

We can also plot a :ref:`confusion matrix <confusion_matrix>` of the true
digit values and the predicted digit values.

.. GENERATED FROM PYTHON SOURCE LINES 99-106

.. code-block:: Python

    disp = metrics.ConfusionMatrixDisplay.from_predictions(y_test, predicted)
    disp.figure_.suptitle("Confusion Matrix")
    print(f"Confusion matrix:\n{disp.confusion_matrix}")

    plt.show()

.. image-sg:: /auto_examples/classification/images/sphx_glr_plot_digits_classification_003.png
   :alt: Confusion Matrix
   :srcset: /auto_examples/classification/images/sphx_glr_plot_digits_classification_003.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Confusion matrix:
    [[87  0  0  0  1  0  0  0  0  0]
     [ 0 88  1  0  0  0  0  0  1  1]
     [ 0  0 85  1  0  0  0  0  0  0]
     [ 0  0  0 79  0  3  0  4  5  0]
     [ 0  0  0  0 88  0  0  0  0  4]
     [ 0  0  0  0  0 88  1  0  0  2]
     [ 0  1  0  0  0  0 90  0  0  0]
     [ 0  0  0  0  0  1  0 88  0  0]
     [ 0  0  0  0  0  0  0  0 88  0]
     [ 0  0  0  1  0  1  0  0  0 90]]

.. GENERATED FROM PYTHON SOURCE LINES 107-111

If the results from evaluating a classifier are stored in the form of a
:ref:`confusion matrix <confusion_matrix>` and not in terms of `y_true` and
`y_pred`, one can still build a :func:`~sklearn.metrics.classification_report`
as follows:

.. GENERATED FROM PYTHON SOURCE LINES 111-129

.. code-block:: Python

    # The ground truth and predicted lists
    y_true = []
    y_pred = []
    cm = disp.confusion_matrix

    # For each cell in the confusion matrix, add the corresponding ground truths
    # and predictions to the lists
    for gt in range(len(cm)):
        for pred in range(len(cm)):
            y_true += [gt] * cm[gt][pred]
            y_pred += [pred] * cm[gt][pred]

    print(
        "Classification report rebuilt from confusion matrix:\n"
        f"{metrics.classification_report(y_true, y_pred)}\n"
    )

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Classification report rebuilt from confusion matrix:
                  precision    recall  f1-score   support

               0       1.00      0.99      0.99        88
               1       0.99      0.97      0.98        91
               2       0.99      0.99      0.99        86
               3       0.98      0.87      0.92        91
               4       0.99      0.96      0.97        92
               5       0.95      0.97      0.96        91
               6       0.99      0.99      0.99        91
               7       0.96      0.99      0.97        89
               8       0.94      1.00      0.97        88
               9       0.93      0.98      0.95        92

        accuracy                           0.97       899
       macro avg       0.97      0.97      0.97       899
    weighted avg       0.97      0.97      0.97       899
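Because the confusion matrix preserves every per-class count, the rebuilt
``y_true`` and ``y_pred`` lists carry the same information as the original
predictions, which is why the two reports are identical. As a minimal sketch
(not part of the generated example), this can be double-checked by recomputing
the confusion matrix from the rebuilt lists and comparing it with the original:

.. code-block:: Python

    # Sketch only: recompute the confusion matrix from the rebuilt label lists
    # and confirm it matches the matrix taken from ``disp.confusion_matrix``.
    import numpy as np

    rebuilt_cm = metrics.confusion_matrix(y_true, y_pred)
    assert np.array_equal(rebuilt_cm, cm)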
.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.367 seconds)


.. _sphx_glr_download_auto_examples_classification_plot_digits_classification.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/main?urlpath=lab/tree/notebooks/auto_examples/classification/plot_digits_classification.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_digits_classification.ipynb <plot_digits_classification.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_digits_classification.py <plot_digits_classification.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_digits_classification.zip <plot_digits_classification.zip>`

.. include:: plot_digits_classification.recommendations

.. only:: html

  .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_