.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/datasets/plot_iris_dataset.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end `
        to download the full example code or to run this example in your browser via Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_datasets_plot_iris_dataset.py:


================
The Iris Dataset
================

This dataset consists of the petal and sepal lengths and widths of three
types of irises (Setosa, Versicolour, and Virginica), stored in a 150x4
``numpy.ndarray``.

The rows are the samples and the columns are:
Sepal Length, Sepal Width, Petal Length and Petal Width.

The plot below uses the first two features.
See `here `_ for more
information on this dataset.

.. GENERATED FROM PYTHON SOURCE LINES 17-21

.. code-block:: Python


    # Authors: The scikit-learn developers
    # SPDX-License-Identifier: BSD-3-Clause


.. GENERATED FROM PYTHON SOURCE LINES 22-24

Loading the iris dataset
------------------------

.. GENERATED FROM PYTHON SOURCE LINES 24-29

.. code-block:: Python

    from sklearn import datasets

    iris = datasets.load_iris()


.. GENERATED FROM PYTHON SOURCE LINES 30-32

Scatter Plot of the Iris dataset
--------------------------------

.. GENERATED FROM PYTHON SOURCE LINES 32-41

.. code-block:: Python

    import matplotlib.pyplot as plt

    _, ax = plt.subplots()
    scatter = ax.scatter(iris.data[:, 0], iris.data[:, 1], c=iris.target)
    ax.set(xlabel=iris.feature_names[0], ylabel=iris.feature_names[1])
    _ = ax.legend(
        scatter.legend_elements()[0], iris.target_names, loc="lower right", title="Classes"
    )


.. image-sg:: /auto_examples/datasets/images/sphx_glr_plot_iris_dataset_001.png
   :alt: plot iris dataset
   :srcset: /auto_examples/datasets/images/sphx_glr_plot_iris_dataset_001.png
   :class: sphx-glr-single-img

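
The class pattern in the scatter plot can also be summarized numerically.
The following is a minimal sketch, not part of the original example: it
checks the 150x4 layout and computes the mean sepal length and width for
each class. The name ``class_means`` is introduced here purely for
illustration.

.. code-block:: Python

    from sklearn import datasets

    iris = datasets.load_iris()
    X, y = iris.data, iris.target

    # Confirm the layout: one row per flower, one column per measurement.
    print(X.shape)

    # Mean of the two plotted features (sepal length, sepal width) per class.
    # ``class_means`` is an illustrative helper name, not part of scikit-learn.
    class_means = {}
    for label, name in enumerate(iris.target_names):
        class_means[str(name)] = X[y == label, :2].mean(axis=0)
        mean_length, mean_width = class_means[str(name)]
        print(f"{name}: sepal length {mean_length:.2f}, sepal width {mean_width:.2f}")

Comparing the printed means per class gives a rough numerical counterpart
to the clusters visible in the plot.
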
.. GENERATED FROM PYTHON SOURCE LINES 42-49

Each point in the scatter plot represents one of the 150 iris flowers in
the dataset, with the color indicating its type (Setosa, Versicolour, or
Virginica). A pattern is already visible for the Setosa type, which is
easily identifiable by its short and wide sepal. Considering only these
two dimensions, sepal length and width, there is still overlap between
the Versicolour and Virginica types.

.. GENERATED FROM PYTHON SOURCE LINES 51-56

Plot a PCA representation
-------------------------
Let's apply a Principal Component Analysis (PCA) to the iris dataset
and then plot the irises across the first three PCA dimensions.
This will allow us to better differentiate between the three types!

.. GENERATED FROM PYTHON SOURCE LINES 56-84

.. code-block:: Python

    # unused but required import for doing 3d projections with matplotlib < 3.2
    import mpl_toolkits.mplot3d  # noqa: F401

    from sklearn.decomposition import PCA

    fig = plt.figure(1, figsize=(8, 6))
    ax = fig.add_subplot(111, projection="3d", elev=-150, azim=110)

    X_reduced = PCA(n_components=3).fit_transform(iris.data)
    ax.scatter(
        X_reduced[:, 0],
        X_reduced[:, 1],
        X_reduced[:, 2],
        c=iris.target,
        s=40,
    )

    ax.set_title("First three PCA dimensions")
    ax.set_xlabel("1st Eigenvector")
    ax.xaxis.set_ticklabels([])
    ax.set_ylabel("2nd Eigenvector")
    ax.yaxis.set_ticklabels([])
    ax.set_zlabel("3rd Eigenvector")
    ax.zaxis.set_ticklabels([])

    plt.show()


.. image-sg:: /auto_examples/datasets/images/sphx_glr_plot_iris_dataset_002.png
   :alt: First three PCA dimensions
   :srcset: /auto_examples/datasets/images/sphx_glr_plot_iris_dataset_002.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 85-89

PCA creates 3 new features, each a linear combination of the 4 original
features. The components are ordered so that each one captures as much of
the remaining variance as possible. With this transformation, the species
can be told apart reasonably well using only the first feature (i.e. the
projection onto the first eigenvector).

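
The statements about variance and about the first PCA feature can be
checked directly. The sketch below is an addition to the original example,
written under the assumption that only ``load_iris`` and ``PCA`` are
available. It prints the fraction of variance captured by each component
and the range each class covers along the first one. The name
``pc1_ranges`` is hypothetical and used only for illustration.

.. code-block:: Python

    from sklearn import datasets
    from sklearn.decomposition import PCA

    iris = datasets.load_iris()
    pca = PCA(n_components=3)
    X_reduced = pca.fit_transform(iris.data)

    # Fraction of the total variance captured by each component,
    # reported by scikit-learn in decreasing order.
    print(pca.explained_variance_ratio_)

    # Range covered by each class along the first component only.
    # ``pc1_ranges`` is an illustrative name, not a scikit-learn attribute.
    pc1_ranges = {}
    for label, name in enumerate(iris.target_names):
        pc1 = X_reduced[iris.target == label, 0]
        pc1_ranges[str(name)] = (pc1.min(), pc1.max())
        print(f"{name}: first component from {pc1.min():.2f} to {pc1.max():.2f}")

Ranges that do not overlap indicate classes the first component separates
on its own. Any overlap shows where the second and third components are
still needed.
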
.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.161 seconds)


.. _sphx_glr_download_auto_examples_datasets_plot_iris_dataset.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/main?urlpath=lab/tree/notebooks/auto_examples/datasets/plot_iris_dataset.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_iris_dataset.ipynb `

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_iris_dataset.py `

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_iris_dataset.zip `


.. include:: plot_iris_dataset.recommendations


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_