.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "auto_examples/manifold/plot_compare_methods.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note :ref:`Go to the end ` to download the full example code. or to run this example in your browser via Binder .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_examples_manifold_plot_compare_methods.py: ========================================= Comparison of Manifold Learning methods ========================================= An illustration of dimensionality reduction on the S-curve dataset with various manifold learning methods. For a discussion and comparison of these algorithms, see the :ref:`manifold module page ` For a similar example, where the methods are applied to a sphere dataset, see :ref:`sphx_glr_auto_examples_manifold_plot_manifold_sphere.py` Note that the purpose of the MDS is to find a low-dimensional representation of the data (here 2D) in which the distances respect well the distances in the original high-dimensional space, unlike other manifold-learning algorithms, it does not seeks an isotropic representation of the data in the low-dimensional space. .. GENERATED FROM PYTHON SOURCE LINES 22-26 .. code-block:: Python # Authors: The scikit-learn developers # SPDX-License-Identifier: BSD-3-Clause .. GENERATED FROM PYTHON SOURCE LINES 27-31 Dataset preparation ------------------- We start by generating the S-curve dataset. .. GENERATED FROM PYTHON SOURCE LINES 31-43 .. code-block:: Python import matplotlib.pyplot as plt # unused but required import for doing 3d projections with matplotlib < 3.2 import mpl_toolkits.mplot3d # noqa: F401 from matplotlib import ticker from sklearn import datasets, manifold n_samples = 1500 S_points, S_color = datasets.make_s_curve(n_samples, random_state=0) .. GENERATED FROM PYTHON SOURCE LINES 44-46 Let's look at the original data. Also define some helping functions, which we will use further on. .. GENERATED FROM PYTHON SOURCE LINES 46-85 .. code-block:: Python def plot_3d(points, points_color, title): x, y, z = points.T fig, ax = plt.subplots( figsize=(6, 6), facecolor="white", tight_layout=True, subplot_kw={"projection": "3d"}, ) fig.suptitle(title, size=16) col = ax.scatter(x, y, z, c=points_color, s=50, alpha=0.8) ax.view_init(azim=-60, elev=9) ax.xaxis.set_major_locator(ticker.MultipleLocator(1)) ax.yaxis.set_major_locator(ticker.MultipleLocator(1)) ax.zaxis.set_major_locator(ticker.MultipleLocator(1)) fig.colorbar(col, ax=ax, orientation="horizontal", shrink=0.6, aspect=60, pad=0.01) plt.show() def plot_2d(points, points_color, title): fig, ax = plt.subplots(figsize=(3, 3), facecolor="white", constrained_layout=True) fig.suptitle(title, size=16) add_2d_scatter(ax, points, points_color) plt.show() def add_2d_scatter(ax, points, points_color, title=None): x, y = points.T ax.scatter(x, y, c=points_color, s=50, alpha=0.8) ax.set_title(title) ax.xaxis.set_major_formatter(ticker.NullFormatter()) ax.yaxis.set_major_formatter(ticker.NullFormatter()) plot_3d(S_points, S_color, "Original S-curve samples") .. image-sg:: /auto_examples/manifold/images/sphx_glr_plot_compare_methods_001.png :alt: Original S-curve samples :srcset: /auto_examples/manifold/images/sphx_glr_plot_compare_methods_001.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 86-94 Define algorithms for the manifold learning ------------------------------------------- Manifold learning is an approach to non-linear dimensionality reduction. Algorithms for this task are based on the idea that the dimensionality of many data sets is only artificially high. Read more in the :ref:`User Guide `. .. GENERATED FROM PYTHON SOURCE LINES 94-98 .. code-block:: Python n_neighbors = 12 # neighborhood which is used to recover the locally linear structure n_components = 2 # number of coordinates for the manifold .. GENERATED FROM PYTHON SOURCE LINES 99-106 Locally Linear Embeddings ^^^^^^^^^^^^^^^^^^^^^^^^^ Locally linear embedding (LLE) can be thought of as a series of local Principal Component Analyses which are globally compared to find the best non-linear embedding. Read more in the :ref:`User Guide `. .. GENERATED FROM PYTHON SOURCE LINES 106-126 .. code-block:: Python params = { "n_neighbors": n_neighbors, "n_components": n_components, "eigen_solver": "auto", "random_state": 0, } lle_standard = manifold.LocallyLinearEmbedding(method="standard", **params) S_standard = lle_standard.fit_transform(S_points) lle_ltsa = manifold.LocallyLinearEmbedding(method="ltsa", **params) S_ltsa = lle_ltsa.fit_transform(S_points) lle_hessian = manifold.LocallyLinearEmbedding(method="hessian", **params) S_hessian = lle_hessian.fit_transform(S_points) lle_mod = manifold.LocallyLinearEmbedding(method="modified", **params) S_mod = lle_mod.fit_transform(S_points) .. GENERATED FROM PYTHON SOURCE LINES 127-144 .. code-block:: Python fig, axs = plt.subplots( nrows=2, ncols=2, figsize=(7, 7), facecolor="white", constrained_layout=True ) fig.suptitle("Locally Linear Embeddings", size=16) lle_methods = [ ("Standard locally linear embedding", S_standard), ("Local tangent space alignment", S_ltsa), ("Hessian eigenmap", S_hessian), ("Modified locally linear embedding", S_mod), ] for ax, method in zip(axs.flat, lle_methods): name, points = method add_2d_scatter(ax, points, S_color, name) plt.show() .. image-sg:: /auto_examples/manifold/images/sphx_glr_plot_compare_methods_002.png :alt: Locally Linear Embeddings, Standard locally linear embedding, Local tangent space alignment, Hessian eigenmap, Modified locally linear embedding :srcset: /auto_examples/manifold/images/sphx_glr_plot_compare_methods_002.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 145-151 Isomap Embedding ^^^^^^^^^^^^^^^^ Non-linear dimensionality reduction through Isometric Mapping. Isomap seeks a lower-dimensional embedding which maintains geodesic distances between all points. Read more in the :ref:`User Guide `. .. GENERATED FROM PYTHON SOURCE LINES 151-157 .. code-block:: Python isomap = manifold.Isomap(n_neighbors=n_neighbors, n_components=n_components, p=1) S_isomap = isomap.fit_transform(S_points) plot_2d(S_isomap, S_color, "Isomap Embedding") .. image-sg:: /auto_examples/manifold/images/sphx_glr_plot_compare_methods_003.png :alt: Isomap Embedding :srcset: /auto_examples/manifold/images/sphx_glr_plot_compare_methods_003.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 158-165 Multidimensional scaling ^^^^^^^^^^^^^^^^^^^^^^^^ Multidimensional scaling (MDS) seeks a low-dimensional representation of the data in which the distances respect well the distances in the original high-dimensional space. Read more in the :ref:`User Guide `. .. GENERATED FROM PYTHON SOURCE LINES 165-177 .. code-block:: Python md_scaling = manifold.MDS( n_components=n_components, max_iter=50, n_init=4, random_state=0, normalized_stress=False, ) S_scaling = md_scaling.fit_transform(S_points) plot_2d(S_scaling, S_color, "Multidimensional scaling") .. image-sg:: /auto_examples/manifold/images/sphx_glr_plot_compare_methods_004.png :alt: Multidimensional scaling :srcset: /auto_examples/manifold/images/sphx_glr_plot_compare_methods_004.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 178-184 Spectral embedding for non-linear dimensionality reduction ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This implementation uses Laplacian Eigenmaps, which finds a low dimensional representation of the data using a spectral decomposition of the graph Laplacian. Read more in the :ref:`User Guide `. .. GENERATED FROM PYTHON SOURCE LINES 184-192 .. code-block:: Python spectral = manifold.SpectralEmbedding( n_components=n_components, n_neighbors=n_neighbors, random_state=42 ) S_spectral = spectral.fit_transform(S_points) plot_2d(S_spectral, S_color, "Spectral Embedding") .. image-sg:: /auto_examples/manifold/images/sphx_glr_plot_compare_methods_005.png :alt: Spectral Embedding :srcset: /auto_examples/manifold/images/sphx_glr_plot_compare_methods_005.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 193-201 T-distributed Stochastic Neighbor Embedding ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ It converts similarities between data points to joint probabilities and tries to minimize the Kullback-Leibler divergence between the joint probabilities of the low-dimensional embedding and the high-dimensional data. t-SNE has a cost function that is not convex, i.e. with different initializations we can get different results. Read more in the :ref:`User Guide `. .. GENERATED FROM PYTHON SOURCE LINES 201-213 .. code-block:: Python t_sne = manifold.TSNE( n_components=n_components, perplexity=30, init="random", max_iter=250, random_state=0, ) S_t_sne = t_sne.fit_transform(S_points) plot_2d(S_t_sne, S_color, "T-distributed Stochastic \n Neighbor Embedding") .. image-sg:: /auto_examples/manifold/images/sphx_glr_plot_compare_methods_006.png :alt: T-distributed Stochastic Neighbor Embedding :srcset: /auto_examples/manifold/images/sphx_glr_plot_compare_methods_006.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-timing **Total running time of the script:** (0 minutes 11.966 seconds) .. _sphx_glr_download_auto_examples_manifold_plot_compare_methods.py: .. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: binder-badge .. image:: images/binder_badge_logo.svg :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/main?urlpath=lab/tree/notebooks/auto_examples/manifold/plot_compare_methods.ipynb :alt: Launch binder :width: 150 px .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: plot_compare_methods.ipynb ` .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: plot_compare_methods.py ` .. container:: sphx-glr-download sphx-glr-download-zip :download:`Download zipped: plot_compare_methods.zip ` .. include:: plot_compare_methods.recommendations .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_