.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/model_selection/plot_grid_search_refit_callable.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_model_selection_plot_grid_search_refit_callable.py>`
        to download the full example code or to run this example in your browser via Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_model_selection_plot_grid_search_refit_callable.py:


==================================================
Balance model complexity and cross-validated score
==================================================

This example demonstrates how to balance model complexity and cross-validated score by
finding a decent accuracy within 1 standard deviation of the best accuracy score while
minimising the number of :class:`~sklearn.decomposition.PCA` components [1]. It uses
:class:`~sklearn.model_selection.GridSearchCV` with a custom refit callable to select
the optimal model.

The figure shows the trade-off between cross-validated score and the number
of PCA components. The balanced case is when `n_components=10` and `accuracy=0.88`,
which falls into the range within 1 standard deviation of the best accuracy
score.

[1] Hastie, T., Tibshirani, R., Friedman, J. (2001). Model Assessment and
Selection. The Elements of Statistical Learning (pp. 219-260). New York,
NY, USA: Springer New York Inc.

.. GENERATED FROM PYTHON SOURCE LINES 21-35

.. code-block:: Python


    # Authors: The scikit-learn developers
    # SPDX-License-Identifier: BSD-3-Clause

    import matplotlib.pyplot as plt
    import numpy as np
    import polars as pl

    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV, ShuffleSplit
    from sklearn.pipeline import Pipeline


.. GENERATED FROM PYTHON SOURCE LINES 36-44

Introduction
------------

When tuning hyperparameters, we often want to balance model complexity and
performance. The "one-standard-error" rule is a common approach: select the simplest
model whose performance is within one standard error of the best model's performance.
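This helps to avoid overfitting by preferring simpler models when their performance is
statistically comparable to more complex ones. This example applies the same idea,
using the standard deviation of the cross-validated test scores as the tolerance.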
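
As a minimal numeric sketch of this rule, the snippet below applies it to made-up
scores for five hypothetical candidates ordered by increasing complexity. The arrays
`complexity`, `mean_score` and `std_score` are illustrative assumptions, not results
from this example, and the standard deviation is used as the tolerance.

.. code-block:: Python

    import numpy as np

    # Made-up cross-validation summary (assumed values, for illustration only).
    complexity = np.array([2, 4, 8, 16, 32])
    mean_score = np.array([0.80, 0.88, 0.90, 0.91, 0.905])
    std_score = np.array([0.03, 0.02, 0.02, 0.025, 0.02])

    # Threshold: best mean score minus the standard deviation of that candidate.
    best = np.argmax(mean_score)
    threshold = mean_score[best] - std_score[best]

    # Keep every candidate at or above the threshold, then take the simplest one.
    candidates = np.flatnonzero(mean_score >= threshold)
    chosen = candidates[np.argmin(complexity[candidates])]

    print(complexity[chosen])  # 8

The best mean score is 0.91 with a standard deviation of 0.025, so the threshold is
0.885. The candidates with complexity 8, 16 and 32 pass it, and the simplest of them
is kept.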

.. GENERATED FROM PYTHON SOURCE LINES 46-54

Helper functions
----------------

We define two helper functions:

1. `lower_bound`: Calculates the threshold for acceptable performance
   (best score - 1 std)
2. `best_low_complexity`: Selects the model with the fewest PCA components that
   meets or exceeds this threshold

.. GENERATED FROM PYTHON SOURCE LINES 54-104

.. code-block:: Python


    def lower_bound(cv_results):
        """
        Calculate the lower bound within 1 standard deviation
        of the best `mean_test_score`.

        Parameters
        ----------
        cv_results : dict of numpy (masked) ndarrays
            See attribute cv_results_ of `GridSearchCV`.

        Returns
        -------
        float
            Lower bound within 1 standard deviation of the
            best `mean_test_score`.
        """
        best_score_idx = np.argmax(cv_results["mean_test_score"])

        return (
            cv_results["mean_test_score"][best_score_idx]
            - cv_results["std_test_score"][best_score_idx]
        )


    def best_low_complexity(cv_results):
        """
        Balance model complexity with cross-validated score.

        Parameters
        ----------
        cv_results : dict of numpy (masked) ndarrays
            See attribute cv_results_ of `GridSearchCV`.

        Returns
        -------
        int
            Index of the model that has the fewest PCA components
            while keeping its test score within 1 standard deviation of the best
            `mean_test_score`.
        """
        threshold = lower_bound(cv_results)
        candidate_idx = np.flatnonzero(cv_results["mean_test_score"] >= threshold)
        best_idx = candidate_idx[
            cv_results["param_reduce_dim__n_components"][candidate_idx].argmin()
        ]
        return best_idx


.. GENERATED FROM PYTHON SOURCE LINES 105-113

Set up the pipeline and parameter grid
--------------------------------------

We create a pipeline with two steps:

1. Dimensionality reduction using PCA
2. Classification using LogisticRegression

We'll search over different numbers of PCA components to find the optimal complexity.

.. GENERATED FROM PYTHON SOURCE LINES 113-123

.. code-block:: Python


    pipe = Pipeline(
        [
            ("reduce_dim", PCA(random_state=42)),
            ("classify", LogisticRegression(random_state=42, C=0.01, max_iter=1000)),
        ]
    )

    param_grid = {"reduce_dim__n_components": [6, 8, 10, 15, 20, 25, 35, 45, 55]}
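
The grid key `reduce_dim__n_components` follows the `step__parameter` naming used by
`Pipeline`: the step name, two underscores, then the parameter of that step. The
following sketch, which builds a separate throwaway pipeline with default estimators
(an assumption made only for illustration), shows how such a key is resolved.

.. code-block:: Python

    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline

    # Throwaway pipeline using the same step names as the example.
    demo_pipe = Pipeline(
        [
            ("reduce_dim", PCA()),
            ("classify", LogisticRegression()),
        ]
    )

    # Nested parameters are exposed under "<step name>__<parameter name>".
    print("reduce_dim__n_components" in demo_pipe.get_params())

    # Setting the nested key updates the PCA step in place.
    demo_pipe.set_params(reduce_dim__n_components=10)
    print(demo_pipe.named_steps["reduce_dim"].n_components)

This is the same mechanism a grid search relies on when it assigns each value of the
grid to the PCA step before fitting.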


.. GENERATED FROM PYTHON SOURCE LINES 124-130

Perform the search with GridSearchCV
------------------------------------

We use `GridSearchCV` with our custom `best_low_complexity` function as the refit
parameter. This function will select the model with the fewest PCA components that
still performs within one standard deviation of the best model.

.. GENERATED FROM PYTHON SOURCE LINES 130-143

.. code-block:: Python


    grid = GridSearchCV(
        pipe,
        # Use a non-stratified CV strategy to make sure that the inter-fold
        # standard deviation of the test scores is informative.
        cv=ShuffleSplit(n_splits=30, random_state=0),
        n_jobs=1,  # increase this on your machine to use more physical cores
        param_grid=param_grid,
        scoring="accuracy",
        refit=best_low_complexity,
        return_train_score=True,
    )
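
When `refit` is a callable, `GridSearchCV` passes it the `cv_results_` dictionary and
uses the returned integer as `best_index_`. The sketch below illustrates that contract
on a deliberately small, unrelated problem. The iris data, the decision tree, and the
`simplest_within_one_std` helper are assumptions chosen to keep the snippet fast; they
are not part of this example.

.. code-block:: Python

    import numpy as np

    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV
    from sklearn.tree import DecisionTreeClassifier


    def simplest_within_one_std(cv_results):
        """Hypothetical refit callable: smallest max_depth within 1 std of the best."""
        mean = cv_results["mean_test_score"]
        std = cv_results["std_test_score"]
        best = np.argmax(mean)
        candidates = np.flatnonzero(mean >= mean[best] - std[best])
        depths = np.asarray(cv_results["param_max_depth"][candidates], dtype=int)
        return int(candidates[depths.argmin()])


    X_demo, y_demo = load_iris(return_X_y=True)
    search = GridSearchCV(
        DecisionTreeClassifier(random_state=0),
        param_grid={"max_depth": [1, 2, 3, 4, 5]},
        cv=5,
        refit=simplest_within_one_std,
    )
    search.fit(X_demo, y_demo)

    # The index returned by the callable decides which candidate is refitted.
    print(search.best_index_, search.best_params_)

    # With a callable refit, best_score_ is not set; read the score from cv_results_.
    print(search.cv_results_["mean_test_score"][search.best_index_])

Because the callable only sees `cv_results_`, any selection rule that can be expressed
in terms of the stored scores and parameters can be plugged in the same way.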


.. GENERATED FROM PYTHON SOURCE LINES 144-146

Load the digits dataset and fit the model
-----------------------------------------

.. GENERATED FROM PYTHON SOURCE LINES 146-150

.. code-block:: Python


    X, y = load_digits(return_X_y=True)
    grid.fit(X, y)


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <style>#sk-container-id-58 {
      /* Definition of color scheme common for light and dark mode */
      --sklearn-color-text: #000;
      --sklearn-color-text-muted: #666;
      --sklearn-color-line: gray;
      /* Definition of color scheme for unfitted estimators */
      --sklearn-color-unfitted-level-0: #fff5e6;
      --sklearn-color-unfitted-level-1: #f6e4d2;
      --sklearn-color-unfitted-level-2: #ffe0b3;
      --sklearn-color-unfitted-level-3: chocolate;
      /* Definition of color scheme for fitted estimators */
      --sklearn-color-fitted-level-0: #f0f8ff;
      --sklearn-color-fitted-level-1: #d4ebff;
      --sklearn-color-fitted-level-2: #b3dbfd;
      --sklearn-color-fitted-level-3: cornflowerblue;

      /* Specific color for light theme */
      --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));
      --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, white)));
      --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));
      --sklearn-color-icon: #696969;

      @media (prefers-color-scheme: dark) {
        /* Redefinition of color scheme for dark theme */
        --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));
        --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, #111)));
        --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));
        --sklearn-color-icon: #878787;
      }
    }

    #sk-container-id-58 {
      color: var(--sklearn-color-text);
    }

    #sk-container-id-58 pre {
      padding: 0;
    }

    #sk-container-id-58 input.sk-hidden--visually {
      border: 0;
      clip: rect(1px 1px 1px 1px);
      clip: rect(1px, 1px, 1px, 1px);
      height: 1px;
      margin: -1px;
      overflow: hidden;
      padding: 0;
      position: absolute;
      width: 1px;
    }

    #sk-container-id-58 div.sk-dashed-wrapped {
      border: 1px dashed var(--sklearn-color-line);
      margin: 0 0.4em 0.5em 0.4em;
      box-sizing: border-box;
      padding-bottom: 0.4em;
      background-color: var(--sklearn-color-background);
    }

    #sk-container-id-58 div.sk-container {
      /* jupyter's `normalize.less` sets `[hidden] { display: none; }`
         but bootstrap.min.css set `[hidden] { display: none !important; }`
         so we also need the `!important` here to be able to override the
         default hidden behavior on the sphinx rendered scikit-learn.org.
         See: https://github.com/scikit-learn/scikit-learn/issues/21755 */
      display: inline-block !important;
      position: relative;
    }

    #sk-container-id-58 div.sk-text-repr-fallback {
      display: none;
    }

    div.sk-parallel-item,
    div.sk-serial,
    div.sk-item {
      /* draw centered vertical line to link estimators */
      background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));
      background-size: 2px 100%;
      background-repeat: no-repeat;
      background-position: center center;
    }

    /* Parallel-specific style estimator block */

    #sk-container-id-58 div.sk-parallel-item::after {
      content: "";
      width: 100%;
      border-bottom: 2px solid var(--sklearn-color-text-on-default-background);
      flex-grow: 1;
    }

    #sk-container-id-58 div.sk-parallel {
      display: flex;
      align-items: stretch;
      justify-content: center;
      background-color: var(--sklearn-color-background);
      position: relative;
    }

    #sk-container-id-58 div.sk-parallel-item {
      display: flex;
      flex-direction: column;
    }

    #sk-container-id-58 div.sk-parallel-item:first-child::after {
      align-self: flex-end;
      width: 50%;
    }

    #sk-container-id-58 div.sk-parallel-item:last-child::after {
      align-self: flex-start;
      width: 50%;
    }

    #sk-container-id-58 div.sk-parallel-item:only-child::after {
      width: 0;
    }

    /* Serial-specific style estimator block */

    #sk-container-id-58 div.sk-serial {
      display: flex;
      flex-direction: column;
      align-items: center;
      background-color: var(--sklearn-color-background);
      padding-right: 1em;
      padding-left: 1em;
    }


    /* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is
    clickable and can be expanded/collapsed.
    - Pipeline and ColumnTransformer use this feature and define the default style
    - Estimators will overwrite some part of the style using the `sk-estimator` class
    */

    /* Pipeline and ColumnTransformer style (default) */

    #sk-container-id-58 div.sk-toggleable {
      /* Default theme specific background. It is overwritten whether we have a
      specific estimator or a Pipeline/ColumnTransformer */
      background-color: var(--sklearn-color-background);
    }

    /* Toggleable label */
    #sk-container-id-58 label.sk-toggleable__label {
      cursor: pointer;
      display: flex;
      width: 100%;
      margin-bottom: 0;
      padding: 0.5em;
      box-sizing: border-box;
      text-align: center;
      align-items: start;
      justify-content: space-between;
      gap: 0.5em;
    }

    #sk-container-id-58 label.sk-toggleable__label .caption {
      font-size: 0.6rem;
      font-weight: lighter;
      color: var(--sklearn-color-text-muted);
    }

    #sk-container-id-58 label.sk-toggleable__label-arrow:before {
      /* Arrow on the left of the label */
      content: "▸";
      float: left;
      margin-right: 0.25em;
      color: var(--sklearn-color-icon);
    }

    #sk-container-id-58 label.sk-toggleable__label-arrow:hover:before {
      color: var(--sklearn-color-text);
    }

    /* Toggleable content - dropdown */

    #sk-container-id-58 div.sk-toggleable__content {
      display: none;
      text-align: left;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-58 div.sk-toggleable__content.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-58 div.sk-toggleable__content pre {
      margin: 0.2em;
      border-radius: 0.25em;
      color: var(--sklearn-color-text);
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-58 div.sk-toggleable__content.fitted pre {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-58 input.sk-toggleable__control:checked~div.sk-toggleable__content {
      /* Expand drop-down */
      display: block;
      width: 100%;
      overflow: visible;
    }

    #sk-container-id-58 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {
      content: "▾";
    }

    /* Pipeline/ColumnTransformer-specific style */

    #sk-container-id-58 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-58 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator-specific style */

    /* Colorize estimator box */
    #sk-container-id-58 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-58 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    #sk-container-id-58 div.sk-label label.sk-toggleable__label,
    #sk-container-id-58 div.sk-label label {
      /* The background is the default theme color */
      color: var(--sklearn-color-text-on-default-background);
    }

    /* On hover, darken the color of the background */
    #sk-container-id-58 div.sk-label:hover label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    /* Label box, darken color on hover, fitted */
    #sk-container-id-58 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator label */

    #sk-container-id-58 div.sk-label label {
      font-family: monospace;
      font-weight: bold;
      display: inline-block;
      line-height: 1.2em;
    }

    #sk-container-id-58 div.sk-label-container {
      text-align: center;
    }

    /* Estimator-specific */
    #sk-container-id-58 div.sk-estimator {
      font-family: monospace;
      border: 1px dotted var(--sklearn-color-border-box);
      border-radius: 0.25em;
      box-sizing: border-box;
      margin-bottom: 0.5em;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-58 div.sk-estimator.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    /* on hover */
    #sk-container-id-58 div.sk-estimator:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-58 div.sk-estimator.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Specification for estimator info (e.g. "i" and "?") */

    /* Common style for "i" and "?" */

    .sk-estimator-doc-link,
    a:link.sk-estimator-doc-link,
    a:visited.sk-estimator-doc-link {
      float: right;
      font-size: smaller;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-background);
      border-radius: 1em;
      height: 1em;
      width: 1em;
      text-decoration: none !important;
      margin-left: 0.5em;
      text-align: center;
      /* unfitted */
      border: var(--sklearn-color-unfitted-level-1) 1pt solid;
      color: var(--sklearn-color-unfitted-level-1);
    }

    .sk-estimator-doc-link.fitted,
    a:link.sk-estimator-doc-link.fitted,
    a:visited.sk-estimator-doc-link.fitted {
      /* fitted */
      border: var(--sklearn-color-fitted-level-1) 1pt solid;
      color: var(--sklearn-color-fitted-level-1);
    }

    /* On hover */
    div.sk-estimator:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover,
    div.sk-label-container:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      color: var(--sklearn-color-background);
      text-decoration: none;
    }

    div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover,
    div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
      color: var(--sklearn-color-background);
      text-decoration: none;
    }

    /* Span, style for the box shown on hovering the info icon */
    .sk-estimator-doc-link span {
      display: none;
      z-index: 9999;
      position: relative;
      font-weight: normal;
      right: .2ex;
      padding: .5ex;
      margin: .5ex;
      width: min-content;
      min-width: 20ex;
      max-width: 50ex;
      color: var(--sklearn-color-text);
      box-shadow: 2pt 2pt 4pt #999;
      /* unfitted */
      background: var(--sklearn-color-unfitted-level-0);
      border: .5pt solid var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted span {
      /* fitted */
      background: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3);
    }

    .sk-estimator-doc-link:hover span {
      display: block;
    }

    /* "?"-specific style due to the `<a>` HTML tag */

    #sk-container-id-58 a.estimator_doc_link {
      float: right;
      font-size: 1rem;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-background);
      border-radius: 1rem;
      height: 1rem;
      width: 1rem;
      text-decoration: none;
      /* unfitted */
      color: var(--sklearn-color-unfitted-level-1);
      border: var(--sklearn-color-unfitted-level-1) 1pt solid;
    }

    #sk-container-id-58 a.estimator_doc_link.fitted {
      /* fitted */
      border: var(--sklearn-color-fitted-level-1) 1pt solid;
      color: var(--sklearn-color-fitted-level-1);
    }

    /* On hover */
    #sk-container-id-58 a.estimator_doc_link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      color: var(--sklearn-color-background);
      text-decoration: none;
    }

    #sk-container-id-58 a.estimator_doc_link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
    }

    .estimator-table summary {
        padding: .5rem;
        font-family: monospace;
        cursor: pointer;
    }

    .estimator-table details[open] {
        padding-left: 0.1rem;
        padding-right: 0.1rem;
        padding-bottom: 0.3rem;
    }

    .estimator-table .parameters-table {
        margin-left: auto !important;
        margin-right: auto !important;
    }

    .estimator-table .parameters-table tr:nth-child(odd) {
        background-color: #fff;
    }

    .estimator-table .parameters-table tr:nth-child(even) {
        background-color: #f6f6f6;
    }

    .estimator-table .parameters-table tr:hover {
        background-color: #e0e0e0;
    }

    .estimator-table table td {
        border: 1px solid rgba(106, 105, 104, 0.232);
    }

    .user-set td {
        color:rgb(255, 94, 0);
        text-align: left;
    }

    .user-set td.value pre {
        color:rgb(255, 94, 0) !important;
        background-color: transparent !important;
    }

    .default td {
        color: black;
        text-align: left;
    }

    .user-set td i,
    .default td i {
        color: black;
    }

    .copy-paste-icon {
        background-image: url(data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCA0NDggNTEyIj48IS0tIUZvbnQgQXdlc29tZSBGcmVlIDYuNy4yIGJ5IEBmb250YXdlc29tZSAtIGh0dHBzOi8vZm9udGF3ZXNvbWUuY29tIExpY2Vuc2UgLSBodHRwczovL2ZvbnRhd2Vzb21lLmNvbS9saWNlbnNlL2ZyZWUgQ29weXJpZ2h0IDIwMjUgRm9udGljb25zLCBJbmMuLS0+PHBhdGggZD0iTTIwOCAwTDMzMi4xIDBjMTIuNyAwIDI0LjkgNS4xIDMzLjkgMTQuMWw2Ny45IDY3LjljOSA5IDE0LjEgMjEuMiAxNC4xIDMzLjlMNDQ4IDMzNmMwIDI2LjUtMjEuNSA0OC00OCA0OGwtMTkyIDBjLTI2LjUgMC00OC0yMS41LTQ4LTQ4bDAtMjg4YzAtMjYuNSAyMS41LTQ4IDQ4LTQ4ek00OCAxMjhsODAgMCAwIDY0LTY0IDAgMCAyNTYgMTkyIDAgMC0zMiA2NCAwIDAgNDhjMCAyNi41LTIxLjUgNDgtNDggNDhMNDggNTEyYy0yNi41IDAtNDgtMjEuNS00OC00OEwwIDE3NmMwLTI2LjUgMjEuNS00OCA0OC00OHoiLz48L3N2Zz4=);
        background-repeat: no-repeat;
        background-size: 14px 14px;
        background-position: 0;
        display: inline-block;
        width: 14px;
        height: 14px;
        cursor: pointer;
    }
    </style><body><div id="sk-container-id-58" class="sk-top-container"><div class="sk-text-repr-fallback"><pre>GridSearchCV(cv=ShuffleSplit(n_splits=30, random_state=0, test_size=None, train_size=None),
                 estimator=Pipeline(steps=[(&#x27;reduce_dim&#x27;, PCA(random_state=42)),
                                           (&#x27;classify&#x27;,
                                            LogisticRegression(C=0.01,
                                                               max_iter=1000,
                                                               random_state=42))]),
                 n_jobs=1,
                 param_grid={&#x27;reduce_dim__n_components&#x27;: [6, 8, 10, 15, 20, 25, 35,
                                                          45, 55]},
                 refit=&lt;function best_low_complexity at 0x7fe6dc971480&gt;,
                 return_train_score=True, scoring=&#x27;accuracy&#x27;)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-246" type="checkbox" ><label for="sk-estimator-id-246" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>GridSearchCV</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/dev/modules/generated/sklearn.model_selection.GridSearchCV.html">?<span>Documentation for GridSearchCV</span></a><span class="sk-estimator-doc-link fitted">i<span>Fitted</span></span></div></label><div class="sk-toggleable__content fitted" data-param-prefix="">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('estimator',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">estimator&nbsp;</td>
                <td class="value">Pipeline(step...m_state=42))])</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('param_grid',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">param_grid&nbsp;</td>
                <td class="value">{&#x27;reduce_dim__n_components&#x27;: [6, 8, ...]}</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('scoring',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">scoring&nbsp;</td>
                <td class="value">&#x27;accuracy&#x27;</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">n_jobs&nbsp;</td>
                <td class="value">1</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('refit',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">refit&nbsp;</td>
                <td class="value">&lt;function bes...x7fe6dc971480&gt;</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('cv',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">cv&nbsp;</td>
                <td class="value">ShuffleSplit(...ain_size=None)</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">verbose&nbsp;</td>
                <td class="value">0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('pre_dispatch',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">pre_dispatch&nbsp;</td>
                <td class="value">&#x27;2*n_jobs&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('error_score',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">error_score&nbsp;</td>
                <td class="value">nan</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('return_train_score',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">return_train_score&nbsp;</td>
                <td class="value">True</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-247" type="checkbox" ><label for="sk-estimator-id-247" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>best_estimator_: Pipeline</div></div></label><div class="sk-toggleable__content fitted" data-param-prefix="best_estimator___"></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-serial"><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-248" type="checkbox" ><label for="sk-estimator-id-248" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>PCA</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/dev/modules/generated/sklearn.decomposition.PCA.html">?<span>Documentation for PCA</span></a></div></label><div class="sk-toggleable__content fitted" data-param-prefix="best_estimator___reduce_dim__">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_components',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">n_components&nbsp;</td>
                <td class="value">25</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('copy',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">copy&nbsp;</td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('whiten',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">whiten&nbsp;</td>
                <td class="value">False</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('svd_solver',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">svd_solver&nbsp;</td>
                <td class="value">&#x27;auto&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('tol',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">tol&nbsp;</td>
                <td class="value">0.0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('iterated_power',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">iterated_power&nbsp;</td>
                <td class="value">&#x27;auto&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_oversamples',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">n_oversamples&nbsp;</td>
                <td class="value">10</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('power_iteration_normalizer',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">power_iteration_normalizer&nbsp;</td>
                <td class="value">&#x27;auto&#x27;</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('random_state',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">random_state&nbsp;</td>
                <td class="value">42</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-249" type="checkbox" ><label for="sk-estimator-id-249" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>LogisticRegression</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/dev/modules/generated/sklearn.linear_model.LogisticRegression.html">?<span>Documentation for LogisticRegression</span></a></div></label><div class="sk-toggleable__content fitted" data-param-prefix="best_estimator___classify__">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('penalty',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">penalty&nbsp;</td>
                <td class="value">&#x27;l2&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('dual',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">dual&nbsp;</td>
                <td class="value">False</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('tol',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">tol&nbsp;</td>
                <td class="value">0.0001</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('C',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">C&nbsp;</td>
                <td class="value">0.01</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('fit_intercept',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">fit_intercept&nbsp;</td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('intercept_scaling',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">intercept_scaling&nbsp;</td>
                <td class="value">1</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('class_weight',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">class_weight&nbsp;</td>
                <td class="value">None</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('random_state',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">random_state&nbsp;</td>
                <td class="value">42</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('solver',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">solver&nbsp;</td>
                <td class="value">&#x27;lbfgs&#x27;</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('max_iter',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">max_iter&nbsp;</td>
                <td class="value">1000</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('multi_class',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">multi_class&nbsp;</td>
                <td class="value">&#x27;deprecated&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">verbose&nbsp;</td>
                <td class="value">0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('warm_start',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">warm_start&nbsp;</td>
                <td class="value">False</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">n_jobs&nbsp;</td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('l1_ratio',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">l1_ratio&nbsp;</td>
                <td class="value">None</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div></div></div></div></div></div></div></div></div></div><script>function copyToClipboard(text, element) {
        // Get the parameter prefix from the closest toggleable content
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const fullParamName = paramPrefix ? `${paramPrefix}${text}` : text;

        const originalStyle = element.style;
        const computedStyle = window.getComputedStyle(element);
        const originalWidth = computedStyle.width;
        const originalHTML = element.innerHTML.replace('Copied!', '');

        navigator.clipboard.writeText(fullParamName)
            .then(() => {
                element.style.width = originalWidth;
                element.style.color = 'green';
                element.innerHTML = "Copied!";

                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            })
            .catch(err => {
                console.error('Failed to copy:', err);
                element.style.color = 'red';
                element.innerHTML = "Failed!";
                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            });
        return false;
    }

    document.querySelectorAll('.fa-regular.fa-copy').forEach(function(element) {
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const paramName = element.parentElement.nextElementSibling.textContent.trim();
        const fullParamName = paramPrefix ? `${paramPrefix}${paramName}` : paramName;

        element.setAttribute('title', fullParamName);
    });
    </script></body>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 151-157

Visualize the results
---------------------

We'll create a line plot showing the mean train and test scores for different
numbers of PCA components, with the individual per-split scores drawn as points,
along with horizontal lines indicating the best score and the
one-standard-deviation threshold.

.. GENERATED FROM PYTHON SOURCE LINES 157-322

.. code-block:: Python


    n_components = grid.cv_results_["param_reduce_dim__n_components"]
    test_scores = grid.cv_results_["mean_test_score"]

    # Create a polars DataFrame for better data manipulation and visualization
    results_df = pl.DataFrame(
        {
            "n_components": n_components,
            "mean_test_score": test_scores,
            "std_test_score": grid.cv_results_["std_test_score"],
            "mean_train_score": grid.cv_results_["mean_train_score"],
            "std_train_score": grid.cv_results_["std_train_score"],
            "mean_fit_time": grid.cv_results_["mean_fit_time"],
            "rank_test_score": grid.cv_results_["rank_test_score"],
        }
    )

    # Sort by number of components
    results_df = results_df.sort("n_components")

    # Calculate the lower bound threshold
    lower = lower_bound(grid.cv_results_)

    # Get the best model information
    best_index_ = grid.best_index_
    best_components = n_components[best_index_]
    best_score = grid.cv_results_["mean_test_score"][best_index_]

    # Add a column to mark the selected model
    results_df = results_df.with_columns(
        pl.when(pl.col("n_components") == best_components)
        .then(pl.lit("Selected"))
        .otherwise(pl.lit("Regular"))
        .alias("model_type")
    )

    # Get the number of CV splits from the results
    n_splits = sum(
        1
        for key in grid.cv_results_.keys()
        if key.startswith("split") and key.endswith("test_score")
    )

    # Extract individual scores for each split
    test_scores = np.array(
        [
            [grid.cv_results_[f"split{i}_test_score"][j] for i in range(n_splits)]
            for j in range(len(n_components))
        ]
    )
    train_scores = np.array(
        [
            [grid.cv_results_[f"split{i}_train_score"][j] for i in range(n_splits)]
            for j in range(len(n_components))
        ]
    )

    # Calculate mean and std of test scores
    mean_test_scores = np.mean(test_scores, axis=1)
    std_test_scores = np.std(test_scores, axis=1)

    # Find best score and threshold
    best_mean_score = np.max(mean_test_scores)
    threshold = best_mean_score - std_test_scores[np.argmax(mean_test_scores)]

    # Create a single figure for visualization
    fig, ax = plt.subplots(figsize=(12, 8))

    # Plot individual points
    for i, comp in enumerate(n_components):
        # Plot individual test points
        plt.scatter(
            [comp] * n_splits,
            test_scores[i],
            alpha=0.2,
            color="blue",
            s=20,
            label="Individual test scores" if i == 0 else "",
        )
        # Plot individual train points
        plt.scatter(
            [comp] * n_splits,
            train_scores[i],
            alpha=0.2,
            color="green",
            s=20,
            label="Individual train scores" if i == 0 else "",
        )

    # Plot mean lines with error bands
    plt.plot(
        n_components,
        np.mean(test_scores, axis=1),
        "-",
        color="blue",
        linewidth=2,
        label="Mean test score",
    )
    plt.fill_between(
        n_components,
        np.mean(test_scores, axis=1) - np.std(test_scores, axis=1),
        np.mean(test_scores, axis=1) + np.std(test_scores, axis=1),
        alpha=0.15,
        color="blue",
    )

    plt.plot(
        n_components,
        np.mean(train_scores, axis=1),
        "-",
        color="green",
        linewidth=2,
        label="Mean train score",
    )
    plt.fill_between(
        n_components,
        np.mean(train_scores, axis=1) - np.std(train_scores, axis=1),
        np.mean(train_scores, axis=1) + np.std(train_scores, axis=1),
        alpha=0.15,
        color="green",
    )

    # Add threshold lines
    plt.axhline(
        best_mean_score,
        color="#9b59b6",  # Purple
        linestyle="--",
        label="Best score",
        linewidth=2,
    )
    plt.axhline(
        threshold,
        color="#e67e22",  # Orange
        linestyle="--",
        label="Best score - 1 std",
        linewidth=2,
    )

    # Highlight selected model
    plt.axvline(
        best_components,
        color="#9b59b6",  # Purple
        alpha=0.2,
        linewidth=8,
        label="Selected model",
    )

    # Set titles and labels
    plt.xlabel("Number of PCA components", fontsize=12)
    plt.ylabel("Score", fontsize=12)
    plt.title("Model Selection: Balancing Complexity and Performance", fontsize=14)
    plt.grid(True, linestyle="--", alpha=0.7)
    plt.legend(
        bbox_to_anchor=(1.02, 1),
        loc="upper left",
        borderaxespad=0,
    )

    # Set axis properties
    plt.xticks(n_components)
    plt.ylim((0.85, 1.0))

    # Adjust layout
    plt.tight_layout()


.. image-sg:: /auto_examples/model_selection/images/sphx_glr_plot_grid_search_refit_callable_001.png
   :alt: Model Selection: Balancing Complexity and Performance
   :srcset: /auto_examples/model_selection/images/sphx_glr_plot_grid_search_refit_callable_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 323-328

Print the results
-----------------

We print information about the selected model, including its complexity and
performance. We also show a summary table of all models using polars.

.. GENERATED FROM PYTHON SOURCE LINES 328-357

.. code-block:: Python


    print("Best model selected by the one-standard-error rule:")
    print(f"Number of PCA components: {best_components}")
    print(f"Accuracy score: {best_score:.4f}")
    print(f"Highest single-split test accuracy: {np.max(test_scores):.4f}")
    print(f"Accuracy threshold (best - 1 std): {lower:.4f}")

    # Create a summary table with polars
    summary_df = results_df.select(
        pl.col("n_components"),
        pl.col("mean_test_score").round(4).alias("test_score"),
        pl.col("std_test_score").round(4).alias("test_std"),
        pl.col("mean_train_score").round(4).alias("train_score"),
        pl.col("std_train_score").round(4).alias("train_std"),
        pl.col("mean_fit_time").round(3).alias("fit_time"),
        pl.col("rank_test_score").alias("rank"),
    )

    # Add a column to mark the selected model
    summary_df = summary_df.with_columns(
        pl.when(pl.col("n_components") == best_components)
        .then(pl.lit("*"))
        .otherwise(pl.lit(""))
        .alias("selected")
    )

    print("\nModel comparison table:")
    print(summary_df)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Best model selected by the one-standard-error rule:
    Number of PCA components: 25
    Accuracy score: 0.9643
    Highest single-split test accuracy: 0.9944
    Accuracy threshold (best - 1 std): 0.9623

    Model comparison table:
    shape: (9, 8)
    ┌──────────────┬────────────┬──────────┬─────────────┬───────────┬──────────┬──────┬──────────┐
    │ n_components ┆ test_score ┆ test_std ┆ train_score ┆ train_std ┆ fit_time ┆ rank ┆ selected │
    │ ---          ┆ ---        ┆ ---      ┆ ---         ┆ ---       ┆ ---      ┆ ---  ┆ ---      │
    │ i64          ┆ f64        ┆ f64      ┆ f64         ┆ f64       ┆ f64      ┆ i32  ┆ str      │
    ╞══════════════╪════════════╪══════════╪═════════════╪═══════════╪══════════╪══════╪══════════╡
    │ 6            ┆ 0.8631     ┆ 0.0241   ┆ 0.8697      ┆ 0.0047    ┆ 0.196    ┆ 9    ┆          │
    │ 8            ┆ 0.9039     ┆ 0.0193   ┆ 0.9146      ┆ 0.0028    ┆ 0.198    ┆ 8    ┆          │
    │ 10           ┆ 0.9341     ┆ 0.0148   ┆ 0.9493      ┆ 0.0023    ┆ 0.172    ┆ 7    ┆          │
    │ 15           ┆ 0.95       ┆ 0.0162   ┆ 0.9662      ┆ 0.0022    ┆ 0.18     ┆ 6    ┆          │
    │ 20           ┆ 0.9563     ┆ 0.0144   ┆ 0.9759      ┆ 0.0019    ┆ 0.177    ┆ 5    ┆          │
    │ 25           ┆ 0.9643     ┆ 0.0126   ┆ 0.9836      ┆ 0.0014    ┆ 0.161    ┆ 4    ┆ *        │
    │ 35           ┆ 0.9685     ┆ 0.0115   ┆ 0.9903      ┆ 0.0013    ┆ 0.474    ┆ 3    ┆          │
    │ 45           ┆ 0.9711     ┆ 0.0093   ┆ 0.9926      ┆ 0.001     ┆ 0.498    ┆ 2    ┆          │
    │ 55           ┆ 0.9717     ┆ 0.0093   ┆ 0.993       ┆ 0.001     ┆ 0.834    ┆ 1    ┆          │
    └──────────────┴────────────┴──────────┴─────────────┴───────────┴──────────┴──────┴──────────┘


.. GENERATED FROM PYTHON SOURCE LINES 358-378

Conclusion
----------

The one-standard-error rule helps us select a simpler model (fewer PCA components)
while maintaining performance statistically comparable to the best model.
This approach can help prevent overfitting and improve model interpretability
and efficiency.
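
As a rough check, the selection can be reproduced by hand from the model comparison
table above. The sketch below copies the rounded ``test_score`` and ``test_std``
columns into plain Python lists; the variable names are chosen here for illustration
and are not part of the example script. Because the inputs are rounded to four
decimals, the threshold may differ in the last digit from the value printed above.

.. code-block:: Python

    # Values copied from the model comparison table (rounded to 4 decimals).
    n_components = [6, 8, 10, 15, 20, 25, 35, 45, 55]
    test_score = [0.8631, 0.9039, 0.9341, 0.95, 0.9563, 0.9643, 0.9685, 0.9711, 0.9717]
    test_std = [0.0241, 0.0193, 0.0148, 0.0162, 0.0144, 0.0126, 0.0115, 0.0093, 0.0093]

    # Index of the candidate with the best mean test score.
    best_idx = max(range(len(test_score)), key=test_score.__getitem__)

    # Threshold: best mean score minus its standard deviation across splits.
    threshold = test_score[best_idx] - test_std[best_idx]

    # Keep every candidate at or above the threshold, then take the simplest one.
    candidates = [n for n, s in zip(n_components, test_score) if s >= threshold]
    selected = min(candidates)

    print(f"candidates within one std of the best: {candidates}")
    print(f"selected n_components: {selected}")

With these values, the candidates are 25, 35, 45 and 55 components, and the smallest
of them, 25, is the one marked as selected in the table.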

In this example, we've seen how to implement this rule using a custom refit
callable with :class:`~sklearn.model_selection.GridSearchCV`.

Key takeaways:

1. The one-standard-error rule provides a good rule of thumb to select simpler models.
2. Custom refit callables in :class:`~sklearn.model_selection.GridSearchCV` allow for
   flexible model selection strategies.
3. Visualizing both train and test scores helps identify potential overfitting.

This approach can be applied to other model selection scenarios where balancing
complexity and performance is important, or in cases where a use-case specific
selection of the "best" model is desired.
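
To illustrate a different use-case specific criterion, here is a minimal sketch of a
refit callable that prefers the fastest candidate whose mean test score is within a
fixed tolerance of the best. The function name ``fastest_within_tolerance``, the
``tolerance`` default, and the toy values are assumptions made for this sketch; only
the ``cv_results_`` keys ``mean_test_score`` and ``mean_fit_time`` come from
:class:`~sklearn.model_selection.GridSearchCV`.

.. code-block:: Python

    import numpy as np


    def fastest_within_tolerance(cv_results, tolerance=0.01):
        """Hypothetical refit callable.

        Return the index of the candidate with the lowest mean fit time among
        those whose mean test score is within `tolerance` of the best score.
        """
        scores = np.asarray(cv_results["mean_test_score"], dtype=float)
        fit_times = np.asarray(cv_results["mean_fit_time"], dtype=float)

        # Indices of candidates that are "good enough".
        eligible = np.flatnonzero(scores >= scores.max() - tolerance)

        # Among those, pick the one that was fastest to fit.
        return int(eligible[np.argmin(fit_times[eligible])])


    # Made-up values standing in for `grid.cv_results_`, for illustration only.
    toy_results = {
        "mean_test_score": [0.90, 0.955, 0.96, 0.962],
        "mean_fit_time": [0.10, 0.30, 0.20, 0.80],
    }
    chosen = fastest_within_tolerance(toy_results)
    print(f"chosen candidate index: {chosen}")

Such a function could be passed as ``refit=fastest_within_tolerance`` when
constructing the grid search, which then uses the returned integer as
``best_index_``. Measured fit times vary between runs, so a selection based on them
is not guaranteed to be reproducible.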

.. GENERATED FROM PYTHON SOURCE LINES 378-381

.. code-block:: Python


    # Display the figure
    plt.show()


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (1 minutes 29.482 seconds)


.. _sphx_glr_download_auto_examples_model_selection_plot_grid_search_refit_callable.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/main?urlpath=lab/tree/notebooks/auto_examples/model_selection/plot_grid_search_refit_callable.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_grid_search_refit_callable.ipynb <plot_grid_search_refit_callable.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_grid_search_refit_callable.py <plot_grid_search_refit_callable.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_grid_search_refit_callable.zip <plot_grid_search_refit_callable.zip>`


.. include:: plot_grid_search_refit_callable.recommendations


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_