.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/cluster/plot_feature_agglomeration_vs_univariate_selection.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_cluster_plot_feature_agglomeration_vs_univariate_selection.py>`
        to download the full example code or to run this example in your browser via JupyterLite or Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_cluster_plot_feature_agglomeration_vs_univariate_selection.py:


Feature agglomeration vs. univariate selection
==============================================

This example compares two dimensionality reduction strategies:

- univariate feature selection with Anova
- feature agglomeration with Ward hierarchical clustering

Both methods are compared in a regression problem using a BayesianRidge estimator.

.. GENERATED FROM PYTHON SOURCE LINES 12-15

.. code-block:: Python

    # Authors: The scikit-learn developers
    # SPDX-License-Identifier: BSD-3-Clause


.. GENERATED FROM PYTHON SOURCE LINES 16-31

.. code-block:: Python

    import shutil
    import tempfile

    import matplotlib.pyplot as plt
    import numpy as np
    from joblib import Memory
    from scipy import linalg, ndimage

    from sklearn import feature_selection
    from sklearn.cluster import FeatureAgglomeration
    from sklearn.feature_extraction.image import grid_to_graph
    from sklearn.linear_model import BayesianRidge
    from sklearn.model_selection import GridSearchCV, KFold
    from sklearn.pipeline import Pipeline


.. GENERATED FROM PYTHON SOURCE LINES 32-33

Set parameters

.. GENERATED FROM PYTHON SOURCE LINES 33-39

.. code-block:: Python

    n_samples = 200
    size = 40  # image size
    roi_size = 15
    snr = 5.0
    np.random.seed(0)


.. GENERATED FROM PYTHON SOURCE LINES 40-41

Generate data

.. GENERATED FROM PYTHON SOURCE LINES 41-53

.. code-block:: Python

    coef = np.zeros((size, size))
    coef[0:roi_size, 0:roi_size] = -1.0
    coef[-roi_size:, -roi_size:] = 1.0

    X = np.random.randn(n_samples, size**2)
    for x in X:  # smooth the data
        x[:] = ndimage.gaussian_filter(x.reshape(size, size), sigma=1.0).ravel()
    X -= X.mean(axis=0)
    X /= X.std(axis=0)

    y = np.dot(X, coef.ravel())


.. GENERATED FROM PYTHON SOURCE LINES 54-55

Add noise

.. GENERATED FROM PYTHON SOURCE LINES 55-59

.. code-block:: Python

    noise = np.random.randn(y.shape[0])
    noise_coef = (linalg.norm(y, 2) / np.exp(snr / 20.0)) / linalg.norm(noise, 2)
    y += noise_coef * noise


.. GENERATED FROM PYTHON SOURCE LINES 60-61

Compute the coefficients of a Bayesian Ridge with GridSearch

.. GENERATED FROM PYTHON SOURCE LINES 61-66

.. code-block:: Python

    cv = KFold(2)  # cross-validation generator for model selection
    ridge = BayesianRidge()
    cachedir = tempfile.mkdtemp()
    mem = Memory(location=cachedir, verbose=1)


.. GENERATED FROM PYTHON SOURCE LINES 67-68

Ward agglomeration followed by BayesianRidge

.. GENERATED FROM PYTHON SOURCE LINES 68-78

.. code-block:: Python

    connectivity = grid_to_graph(n_x=size, n_y=size)
    ward = FeatureAgglomeration(n_clusters=10, connectivity=connectivity, memory=mem)
    clf = Pipeline([("ward", ward), ("ridge", ridge)])
    # Select the optimal number of parcels with grid search
    clf = GridSearchCV(clf, {"ward__n_clusters": [10, 20, 30]}, n_jobs=1, cv=cv)
    clf.fit(X, y)  # set the best parameters
    coef_ = clf.best_estimator_.steps[-1][1].coef_
    coef_ = clf.best_estimator_.steps[0][1].inverse_transform(coef_)
    coef_agglomeration_ = coef_.reshape(size, size)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ________________________________________________________________________________
    [Memory] Calling sklearn.cluster._agglomerative.ward_tree...
    ward_tree(array([[-0.451933, ..., -0.675318],
           ...,
           [ 0.275706, ..., -1.085711]]), connectivity=, n_clusters=None, return_distance=False)
    ________________________________________________________ward_tree - 0.1s, 0.0min
    ________________________________________________________________________________
    [Memory] Calling sklearn.cluster._agglomerative.ward_tree...
    ward_tree(array([[ 0.905206, ..., 0.161245],
           ...,
           [-0.849835, ..., -1.091621]]), connectivity=, n_clusters=None, return_distance=False)
    ________________________________________________________ward_tree - 0.0s, 0.0min
    ________________________________________________________________________________
    [Memory] Calling sklearn.cluster._agglomerative.ward_tree...
    ward_tree(array([[ 0.905206, ..., -0.675318],
           ...,
           [-0.849835, ..., -1.085711]]), connectivity=, n_clusters=None, return_distance=False)
    ________________________________________________________ward_tree - 0.1s, 0.0min

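
As a brief aside that is not part of the generated example: ``inverse_transform`` on the
fitted ``FeatureAgglomeration`` step broadcasts the single coefficient learned for each
cluster back to every pixel belonging to that cluster, which is why the agglomerated
weight map shown later is piecewise constant. A minimal sketch to inspect this, assuming
the fitted ``clf`` from the cell above is still in scope:

.. code-block:: Python

    # Hedged sketch (not in the original script): inspect the fitted agglomeration.
    agglo = clf.best_estimator_.steps[0][1]  # the fitted FeatureAgglomeration step
    print(agglo.labels_.shape)               # (size ** 2,): one cluster label per pixel
    print(np.unique(agglo.labels_).size)     # number of clusters chosen by the grid search
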
.. GENERATED FROM PYTHON SOURCE LINES 79-80

Anova univariate feature selection followed by BayesianRidge

.. GENERATED FROM PYTHON SOURCE LINES 80-90

.. code-block:: Python

    f_regression = mem.cache(feature_selection.f_regression)  # caching function
    anova = feature_selection.SelectPercentile(f_regression)
    clf = Pipeline([("anova", anova), ("ridge", ridge)])
    # Select the optimal percentage of features with grid search
    clf = GridSearchCV(clf, {"anova__percentile": [5, 10, 20]}, cv=cv)
    clf.fit(X, y)  # set the best parameters
    coef_ = clf.best_estimator_.steps[-1][1].coef_
    coef_ = clf.best_estimator_.steps[0][1].inverse_transform(coef_.reshape(1, -1))
    coef_selection_ = coef_.reshape(size, size)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ________________________________________________________________________________
    [Memory] Calling sklearn.feature_selection._univariate_selection.f_regression...
    f_regression(array([[-0.451933, ..., 0.275706],
           ...,
           [-0.675318, ..., -1.085711]]), array([ 25.267703, ..., -25.026711]))
    _____________________________________________________f_regression - 0.0s, 0.0min
    ________________________________________________________________________________
    [Memory] Calling sklearn.feature_selection._univariate_selection.f_regression...
    f_regression(array([[ 0.905206, ..., -0.849835],
           ...,
           [ 0.161245, ..., -1.091621]]), array([ -27.447268, ..., -112.638768]))
    _____________________________________________________f_regression - 0.0s, 0.0min
    ________________________________________________________________________________
    [Memory] Calling sklearn.feature_selection._univariate_selection.f_regression...
    f_regression(array([[ 0.905206, ..., -0.849835],
           ...,
           [-0.675318, ..., -1.085711]]), array([-27.447268, ..., -25.026711]))
    _____________________________________________________f_regression - 0.0s, 0.0min

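
As a hedged aside (not part of the generated example), the visual comparison in the next
figure can be complemented by a quick quantitative check, for instance the mean squared
error of each reconstructed weight map against the true weights. A minimal sketch,
assuming ``coef``, ``coef_selection_`` and ``coef_agglomeration_`` are defined as above:

.. code-block:: Python

    # Hedged sketch (not in the original script): compare both reconstructions
    # of the weight map against the true coefficients.
    mse_selection = np.mean((coef - coef_selection_) ** 2)
    mse_agglomeration = np.mean((coef - coef_agglomeration_) ** 2)
    print(f"MSE, univariate selection:  {mse_selection:.4f}")
    print(f"MSE, feature agglomeration: {mse_agglomeration:.4f}")
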
.. GENERATED FROM PYTHON SOURCE LINES 91-92

Inverse the transformation to plot the results on an image

.. GENERATED FROM PYTHON SOURCE LINES 92-106

.. code-block:: Python

    plt.close("all")
    plt.figure(figsize=(7.3, 2.7))
    plt.subplot(1, 3, 1)
    plt.imshow(coef, interpolation="nearest", cmap=plt.cm.RdBu_r)
    plt.title("True weights")
    plt.subplot(1, 3, 2)
    plt.imshow(coef_selection_, interpolation="nearest", cmap=plt.cm.RdBu_r)
    plt.title("Feature Selection")
    plt.subplot(1, 3, 3)
    plt.imshow(coef_agglomeration_, interpolation="nearest", cmap=plt.cm.RdBu_r)
    plt.title("Feature Agglomeration")
    plt.subplots_adjust(0.04, 0.0, 0.98, 0.94, 0.16, 0.26)
    plt.show()


.. image-sg:: /auto_examples/cluster/images/sphx_glr_plot_feature_agglomeration_vs_univariate_selection_001.png
   :alt: True weights, Feature Selection, Feature Agglomeration
   :srcset: /auto_examples/cluster/images/sphx_glr_plot_feature_agglomeration_vs_univariate_selection_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 107-108

Attempt to remove the temporary folder; don't worry if it fails

.. GENERATED FROM PYTHON SOURCE LINES 108-108

.. code-block:: Python

    shutil.rmtree(cachedir, ignore_errors=True)


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.839 seconds)


.. _sphx_glr_download_auto_examples_cluster_plot_feature_agglomeration_vs_univariate_selection.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/main?urlpath=lab/tree/notebooks/auto_examples/cluster/plot_feature_agglomeration_vs_univariate_selection.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: lite-badge

      .. image:: images/jupyterlite_badge_logo.svg
        :target: ../../lite/lab/index.html?path=auto_examples/cluster/plot_feature_agglomeration_vs_univariate_selection.ipynb
        :alt: Launch JupyterLite
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_feature_agglomeration_vs_univariate_selection.ipynb <plot_feature_agglomeration_vs_univariate_selection.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_feature_agglomeration_vs_univariate_selection.py <plot_feature_agglomeration_vs_univariate_selection.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_feature_agglomeration_vs_univariate_selection.zip <plot_feature_agglomeration_vs_univariate_selection.zip>`

.. include:: plot_feature_agglomeration_vs_univariate_selection.recommendations

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_