.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/cluster/plot_feature_agglomeration_vs_univariate_selection.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_cluster_plot_feature_agglomeration_vs_univariate_selection.py>`
        to download the full example code or to run this example in your browser via JupyterLite or Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_cluster_plot_feature_agglomeration_vs_univariate_selection.py:


Feature agglomeration vs. univariate selection
==============================================

This example compares two dimensionality reduction strategies:

- univariate feature selection with Anova
- feature agglomeration with Ward hierarchical clustering

Both methods are compared in a regression problem using a BayesianRidge estimator.

.. GENERATED FROM PYTHON SOURCE LINES 12-15

.. code-block:: Python

    # Authors: The scikit-learn developers
    # SPDX-License-Identifier: BSD-3-Clause


.. GENERATED FROM PYTHON SOURCE LINES 16-31

.. code-block:: Python

    import shutil
    import tempfile

    import matplotlib.pyplot as plt
    import numpy as np
    from joblib import Memory
    from scipy import linalg, ndimage

    from sklearn import feature_selection
    from sklearn.cluster import FeatureAgglomeration
    from sklearn.feature_extraction.image import grid_to_graph
    from sklearn.linear_model import BayesianRidge
    from sklearn.model_selection import GridSearchCV, KFold
    from sklearn.pipeline import Pipeline


.. GENERATED FROM PYTHON SOURCE LINES 32-33

Set parameters

.. GENERATED FROM PYTHON SOURCE LINES 33-39

.. code-block:: Python

    n_samples = 200
    size = 40  # image size
    roi_size = 15
    snr = 5.0
    np.random.seed(0)


.. GENERATED FROM PYTHON SOURCE LINES 40-41

Generate data

.. GENERATED FROM PYTHON SOURCE LINES 41-53

.. code-block:: Python

    coef = np.zeros((size, size))
    coef[0:roi_size, 0:roi_size] = -1.0
    coef[-roi_size:, -roi_size:] = 1.0

    X = np.random.randn(n_samples, size**2)
    for x in X:  # smooth the data
        x[:] = ndimage.gaussian_filter(x.reshape(size, size), sigma=1.0).ravel()
    X -= X.mean(axis=0)
    X /= X.std(axis=0)

    y = np.dot(X, coef.ravel())


.. GENERATED FROM PYTHON SOURCE LINES 54-55

Add noise

.. GENERATED FROM PYTHON SOURCE LINES 55-59

.. code-block:: Python

    noise = np.random.randn(y.shape[0])
    noise_coef = (linalg.norm(y, 2) / np.exp(snr / 20.0)) / linalg.norm(noise, 2)
    y += noise_coef * noise


.. GENERATED FROM PYTHON SOURCE LINES 60-61

Compute the coefficients of a Bayesian Ridge with GridSearch

.. GENERATED FROM PYTHON SOURCE LINES 61-66

.. code-block:: Python

    cv = KFold(2)  # cross-validation generator for model selection
    ridge = BayesianRidge()
    cachedir = tempfile.mkdtemp()
    mem = Memory(location=cachedir, verbose=1)


.. GENERATED FROM PYTHON SOURCE LINES 67-68

Ward agglomeration followed by BayesianRidge

.. GENERATED FROM PYTHON SOURCE LINES 68-78

.. code-block:: Python

    connectivity = grid_to_graph(n_x=size, n_y=size)
    ward = FeatureAgglomeration(n_clusters=10, connectivity=connectivity, memory=mem)
    clf = Pipeline([("ward", ward), ("ridge", ridge)])
    # Select the optimal number of parcels with grid search
    clf = GridSearchCV(clf, {"ward__n_clusters": [10, 20, 30]}, n_jobs=1, cv=cv)
    clf.fit(X, y)  # set the best parameters
    coef_ = clf.best_estimator_.steps[-1][1].coef_
    coef_ = clf.best_estimator_.steps[0][1].inverse_transform(coef_)
    coef_agglomeration_ = coef_.reshape(size, size)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ________________________________________________________________________________
    [Memory] Calling sklearn.cluster._agglomerative.ward_tree...
    ward_tree(array([[-0.451933, ..., -0.675318],
           ...,
           [ 0.275706, ..., -1.085711]]), connectivity=, n_clusters=None, return_distance=False)
    ________________________________________________________ward_tree - 0.1s, 0.0min
    ________________________________________________________________________________
    [Memory] Calling sklearn.cluster._agglomerative.ward_tree...
    ward_tree(array([[ 0.905206, ..., 0.161245],
           ...,
           [-0.849835, ..., -1.091621]]), connectivity=, n_clusters=None, return_distance=False)
    ________________________________________________________ward_tree - 0.0s, 0.0min
    ________________________________________________________________________________
    [Memory] Calling sklearn.cluster._agglomerative.ward_tree...
    ward_tree(array([[ 0.905206, ..., -0.675318],
           ...,
           [-0.849835, ..., -1.085711]]), connectivity=, n_clusters=None, return_distance=False)
    ________________________________________________________ward_tree - 0.1s, 0.0min

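
As a brief aside that is not part of the generated example: ``inverse_transform`` on the
fitted ``FeatureAgglomeration`` step broadcasts the single coefficient learned for each
cluster back to every pixel belonging to that cluster, which is why the agglomerated
weight map shown later is piecewise constant. A minimal sketch to inspect this, assuming
the fitted ``clf`` from the cell above is still in scope:

.. code-block:: Python

    # Hedged sketch (not in the original script): inspect the fitted agglomeration.
    agglo = clf.best_estimator_.steps[0][1]  # the fitted FeatureAgglomeration step
    print(agglo.labels_.shape)               # (size ** 2,): one cluster label per pixel
    print(np.unique(agglo.labels_).size)     # number of clusters chosen by the grid search
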
.. GENERATED FROM PYTHON SOURCE LINES 79-80

Anova univariate feature selection followed by BayesianRidge

.. GENERATED FROM PYTHON SOURCE LINES 80-90

.. code-block:: Python

    f_regression = mem.cache(feature_selection.f_regression)  # caching function
    anova = feature_selection.SelectPercentile(f_regression)
    clf = Pipeline([("anova", anova), ("ridge", ridge)])
    # Select the optimal percentage of features with grid search
    clf = GridSearchCV(clf, {"anova__percentile": [5, 10, 20]}, cv=cv)
    clf.fit(X, y)  # set the best parameters
    coef_ = clf.best_estimator_.steps[-1][1].coef_
    coef_ = clf.best_estimator_.steps[0][1].inverse_transform(coef_.reshape(1, -1))
    coef_selection_ = coef_.reshape(size, size)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ________________________________________________________________________________
    [Memory] Calling sklearn.feature_selection._univariate_selection.f_regression...
    f_regression(array([[-0.451933, ..., 0.275706],
           ...,
           [-0.675318, ..., -1.085711]]), array([ 25.267703, ..., -25.026711]))
    _____________________________________________________f_regression - 0.0s, 0.0min
    ________________________________________________________________________________
    [Memory] Calling sklearn.feature_selection._univariate_selection.f_regression...
    f_regression(array([[ 0.905206, ..., -0.849835],
           ...,
           [ 0.161245, ..., -1.091621]]), array([ -27.447268, ..., -112.638768]))
    _____________________________________________________f_regression - 0.0s, 0.0min
    ________________________________________________________________________________
    [Memory] Calling sklearn.feature_selection._univariate_selection.f_regression...
    f_regression(array([[ 0.905206, ..., -0.849835],
           ...,
           [-0.675318, ..., -1.085711]]), array([-27.447268, ..., -25.026711]))
    _____________________________________________________f_regression - 0.0s, 0.0min

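
As a hedged aside (not part of the generated example), the visual comparison in the next
figure can be complemented by a quick quantitative check, for instance the mean squared
error of each reconstructed weight map against the true weights. A minimal sketch,
assuming ``coef``, ``coef_selection_`` and ``coef_agglomeration_`` are defined as above:

.. code-block:: Python

    # Hedged sketch (not in the original script): compare both reconstructions
    # of the weight map against the true coefficients.
    mse_selection = np.mean((coef - coef_selection_) ** 2)
    mse_agglomeration = np.mean((coef - coef_agglomeration_) ** 2)
    print(f"MSE, univariate selection:  {mse_selection:.4f}")
    print(f"MSE, feature agglomeration: {mse_agglomeration:.4f}")
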
.. GENERATED FROM PYTHON SOURCE LINES 91-92

Inverse the transformation to plot the results on an image

.. GENERATED FROM PYTHON SOURCE LINES 92-106

.. code-block:: Python

    plt.close("all")
    plt.figure(figsize=(7.3, 2.7))
    plt.subplot(1, 3, 1)
    plt.imshow(coef, interpolation="nearest", cmap=plt.cm.RdBu_r)
    plt.title("True weights")
    plt.subplot(1, 3, 2)
    plt.imshow(coef_selection_, interpolation="nearest", cmap=plt.cm.RdBu_r)
    plt.title("Feature Selection")
    plt.subplot(1, 3, 3)
    plt.imshow(coef_agglomeration_, interpolation="nearest", cmap=plt.cm.RdBu_r)
    plt.title("Feature Agglomeration")
    plt.subplots_adjust(0.04, 0.0, 0.98, 0.94, 0.16, 0.26)
    plt.show()


.. image-sg:: /auto_examples/cluster/images/sphx_glr_plot_feature_agglomeration_vs_univariate_selection_001.png
   :alt: True weights, Feature Selection, Feature Agglomeration
   :srcset: /auto_examples/cluster/images/sphx_glr_plot_feature_agglomeration_vs_univariate_selection_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 107-108

Attempt to remove the temporary folder; don't worry if it fails

.. GENERATED FROM PYTHON SOURCE LINES 108-108

.. code-block:: Python

    shutil.rmtree(cachedir, ignore_errors=True)


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.839 seconds)


.. _sphx_glr_download_auto_examples_cluster_plot_feature_agglomeration_vs_univariate_selection.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/main?urlpath=lab/tree/notebooks/auto_examples/cluster/plot_feature_agglomeration_vs_univariate_selection.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: lite-badge

      .. image:: images/jupyterlite_badge_logo.svg
        :target: ../../lite/lab/index.html?path=auto_examples/cluster/plot_feature_agglomeration_vs_univariate_selection.ipynb
        :alt: Launch JupyterLite
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_feature_agglomeration_vs_univariate_selection.ipynb <plot_feature_agglomeration_vs_univariate_selection.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_feature_agglomeration_vs_univariate_selection.py <plot_feature_agglomeration_vs_univariate_selection.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_feature_agglomeration_vs_univariate_selection.zip <plot_feature_agglomeration_vs_univariate_selection.zip>`

.. include:: plot_feature_agglomeration_vs_univariate_selection.recommendations

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_