{
  "original_problem": {
    "instance_id": "scikit-learn__scikit-learn-14983",
    "repo": "scikit-learn/scikit-learn",
    "created_at": "2019-09-14T15:31:18Z",
    "problem_statement": "RepeatedKFold and RepeatedStratifiedKFold do not show correct __repr__ string\n#### Description\r\n\r\n`RepeatedKFold` and `RepeatedStratifiedKFold` do not show correct \\_\\_repr\\_\\_ string.\r\n\r\n#### Steps/Code to Reproduce\r\n\r\n```python\r\n>>> from sklearn.model_selection import RepeatedKFold, RepeatedStratifiedKFold\r\n>>> repr(RepeatedKFold())\r\n>>> repr(RepeatedStratifiedKFold())\r\n```\r\n\r\n#### Expected Results\r\n\r\n```python\r\n>>> repr(RepeatedKFold())\r\nRepeatedKFold(n_splits=5, n_repeats=10, random_state=None)\r\n>>> repr(RepeatedStratifiedKFold())\r\nRepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=None)\r\n```\r\n\r\n#### Actual Results\r\n\r\n```python\r\n>>> repr(RepeatedKFold())\r\n'<sklearn.model_selection._split.RepeatedKFold object at 0x0000016421AA4288>'\r\n>>> repr(RepeatedStratifiedKFold())\r\n'<sklearn.model_selection._split.RepeatedStratifiedKFold object at 0x0000016420E115C8>'\r\n```\r\n\r\n#### Versions\r\n```\r\nSystem:\r\n    python: 3.7.4 (default, Aug  9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]\r\nexecutable: D:\\anaconda3\\envs\\xyz\\python.exe\r\n   machine: Windows-10-10.0.16299-SP0\r\n\r\nBLAS:\r\n    macros:\r\n  lib_dirs:\r\ncblas_libs: cblas\r\n\r\nPython deps:\r\n       pip: 19.2.2\r\nsetuptools: 41.0.1\r\n   sklearn: 0.21.2\r\n     numpy: 1.16.4\r\n     scipy: 1.3.1\r\n    Cython: None\r\n    pandas: 0.24.2\r\n```\n",
    "patch": "diff --git a/sklearn/model_selection/_split.py b/sklearn/model_selection/_split.py\n--- a/sklearn/model_selection/_split.py\n+++ b/sklearn/model_selection/_split.py\n@@ -1163,6 +1163,9 @@ def get_n_splits(self, X=None, y=None, groups=None):\n                      **self.cvargs)\n         return cv.get_n_splits(X, y, groups) * self.n_repeats\n \n+    def __repr__(self):\n+        return _build_repr(self)\n+\n \n class RepeatedKFold(_RepeatedSplits):\n     \"\"\"Repeated K-Fold cross validator.\n@@ -2158,6 +2161,8 @@ def _build_repr(self):\n         try:\n             with warnings.catch_warnings(record=True) as w:\n                 value = getattr(self, key, None)\n+                if value is None and hasattr(self, 'cvargs'):\n+                    value = self.cvargs.get(key, None)\n             if len(w) and w[0].category == DeprecationWarning:\n                 # if the parameter is deprecated, don't show it\n                 continue\n"
  },
  "candidates_evaluated": 5,
  "judgment_result": {
    "candidates": [
      {
        "idx": 1,
        "id": "similar_11034",
        "decision": "Not useful",
        "confidence": "Medium",
        "reason": "The issue is about data type handling, which is unrelated to the representation string problem."
      },
      {
        "idx": 2,
        "id": "similar_5482",
        "decision": "Not useful",
        "confidence": "Medium",
        "reason": "The issue involves input handling with custom analyzers, which does not relate to representation string issues."
      },
      {
        "idx": 3,
        "id": "similar_13051",
        "decision": "Not useful",
        "confidence": "Medium",
        "reason": "The issue is about numerical precision in tests, unrelated to representation string problems."
      },
      {
        "idx": 4,
        "id": "similar_6848",
        "decision": "Not useful",
        "confidence": "Medium",
        "reason": "The issue involves incorrect probability calculations, which is not related to representation string issues."
      },
      {
        "idx": 5,
        "id": "similar_7346",
        "decision": "Not useful",
        "confidence": "Medium",
        "reason": "The issue is about list operations causing an error, unrelated to representation string problems."
      }
    ]
  },
  "raw_summaries": [
    {
      "similar_issue": {
        "issue_title": "OneHotEncoder does not output scipy sparse matrix of given dtype",
        "issue_body": "#### Description\r\nOneHotEncoder ignores the specified dtype in the construction of the sparse array when mixed input data are passed, i.e with both categorical and real data type\r\n\r\n#### Steps/Code to Reproduce\r\n```python\r\nimport numpy as np\r\n\r\nfrom sklearn.preprocessing import OneHotEncoder\r\nenc = OneHotEncoder(dtype=np.float32, categorical_features=[0, 1])\r\n\r\nx = np.array([[0, 1, 0, 0], [1, 2, 0, 0]], dtype=int)\r\nsparse = enc.fit(x).transform(x)\r\n```\r\n\r\n#### Expected Results\r\n```python\r\nsparse: <2x6 sparse matrix of type '<class 'numpy.float32'>'\r\n\twith 4 stored elements in COOrdinate format>\r\n```\r\n\r\n#### Actual Results\r\n```python\r\nsparse: <2x6 sparse matrix of type '<class 'numpy.float64'>'\r\n\twith 4 stored elements in COOrdinate format>\r\n```\r\n\r\n#### Versions\r\n__Platform__: Linux-4.13.0-38-generic-x86_64-with-debian-stretch-sid\r\n__Python__: 3.6.3 |Anaconda custom (64-bit)| (default, Oct 13 2017, 12:02:49) [GCC 7.2.0]\r\n__NumPy__: NumPy \r\n__SciPy__: SciPy 1.0.1\r\n__Scikit-Learn__: Scikit-Learn 0.19.1\r\n",
        "issue_id": 11034,
        "pr_number": 11042,
        "pr_title": "[MRG + 1] Ensuring that the OneHotEncoder outputs sparse matrix with given dtype #11034",
        "pr_body": "#### Reference Issues/PRs\r\nOriginal discussion at #11034\r\n\r\n#### What does this implement/fix? Explain your changes.\r\n",
        "issue_closed_at": "2018-06-06T09:03:02Z",
        "base_commit": "f049ec72eb70443ec8d7826066c4246035677c11"
      },
      "summary": "### Summary:\n\nThis issue is related to the `OneHotEncoder` functionality within the `scikit-learn` library, specifically concerning the handling of data types in the output sparse matrix. The problem arises when the `OneHotEncoder` is configured to produce a sparse matrix with a specified data type (`dtype`), but instead, it outputs a matrix with a default data type (`numpy.float64`), regardless of the user's specification. This behavior is observed when the input data contains both categorical and real data types, leading to the encoder ignoring the specified `dtype`.\n\nKey symptoms include the mismatch between the expected and actual data types of the output sparse matrix, as seen in the provided reproduction steps. The expected result is a sparse matrix of type `numpy.float32`, but the actual output is of type `numpy.float64`.\n\nThe affected component is the `OneHotEncoder` class within the `scikit-learn` library's preprocessing module, specifically its methods for transforming data and handling categorical features.\n\nThe potential impact of this issue is significant for users requiring control over the memory usage and precision of their data transformations, as the choice of `dtype` can affect both. The severity is tied to the necessity of precise data type control in various machine learning workflows, potentially leading to inefficient memory usage or unexpected behavior in model training and prediction.\n\nRelevant technical details include the need for the encoder to respect the `dtype` parameter during transformation, particularly when handling mixed data types, to ensure consistency and user expectations are met. The issue highlights a gap in the encoder's functionality that necessitates a patch to correct the handling of the `dtype` parameter across its methods.",
      "prompt_used": "You are an expert in software issue reasoning analysis.\nGiven the following problem report and its fixed code elements, generate a comprehensive summary based on the entire document. Your goal is to abstract the information in the problem description into a more general description.\n\n## Original Issue Report:\nTitle: OneHotEncoder does not output scipy sparse matrix of given dtype\n\nBody:\n#### Description\r\nOneHotEncoder ignores the specified dtype in the construction of the sparse array when mixed input data are passed, i.e with both categorical and real data type\r\n\r\n#### Steps/Code to Reproduce\r\n```python\r\nimport numpy as np\r\n\r\nfrom sklearn.preprocessing import OneHotEncoder\r\nenc = OneHotEncoder(dtype=np.float32, categorical_features=[0, 1])\r\n\r\nx = np.array([[0, 1, 0, 0], [1, 2, 0, 0]], dtype=int)\r\nsparse = enc.fit(x).transform(x)\r\n```\r\n\r\n#### Expected Results\r\n```python\r\nsparse: <2x6 sparse matrix of type '<class 'numpy.float32'>'\r\n\twith 4 stored elements in COOrdinate format>\r\n```\r\n\r\n#### Actual Results\r\n```python\r\nsparse: <2x6 sparse matrix of type '<class 'numpy.float64'>'\r\n\twith 4 stored elements in COOrdinate format>\r\n```\r\n\r\n#### Versions\r\n__Platform__: Linux-4.13.0-38-generic-x86_64-with-debian-stretch-sid\r\n__Python__: 3.6.3 |Anaconda custom (64-bit)| (default, Oct 13 2017, 12:02:49) [GCC 7.2.0]\r\n__NumPy__: NumPy \r\n__SciPy__: SciPy 1.0.1\r\n__Scikit-Learn__: Scikit-Learn 0.19.1\r\n\n\n## Code elements fixed by the patch:\n{FIXED_CODE_ELEMENTS}\n\nPlease analyze the above issue report and provide a structured summary that includes:\n1. Problem description in general terms\n2. Key symptoms and behaviors observed\n3. Affected components or systems\n4. Potential impact or severity\n5. 
Any relevant technical details abstracted for broader understanding\n\nPlease return the summary with “### Summary:\", For example:\n### Summary: This issue is ...\n\nChanges Summary:\nsklearn/preprocessing/data.py\n  function: add_dummy_feature\n  function: _transform_selected\n  function: _transform_selected\n  function: OneHotEncoder.fit_transform\n  function: CategoricalEncoder.transform\n"
    },
    {
      "similar_issue": {
        "issue_title": "CountVectorizer with custom analyzer ignores input argument",
        "issue_body": "Example:\n\n``` py\ncv = CountVectorizer(analyzer=lambda x: x.split(), input='filename')\ncv.fit(['hello world']).vocabulary_\n```\n\nSame for `input=\"file\"`. Not sure if this should be fixed or just documented; I don't like changing the behavior of the vectorizers yet again...\n",
        "issue_id": 5482,
        "pr_number": 13641,
        "pr_title": "[MRG+1] API make sure vectorizers read data from file before analyzing",
        "pr_body": "Fixes #5482\r\n\r\nIf the given analyzer is a callable, it seems reasonable to assume if `input='file'` or `input='filename'`, the data should be read from the file first, and then passed to the analyzer, the same way as it's done for non-callable analyzers.\r\n\r\nThis PR clarifies this in the docstrings, and passes the \"decoded\" input to the analyzer. It should be less of a concern regarding the input on the bytes vs str since we don't support python2 anymore.\r\n\r\nI'm not entirely sure if this is what we wanna do, it's more of a proposal to move it forward.\r\n\r\nAfter this PR, the following would result in a `FileNotFoundError` exception:\r\n\r\n```python\r\ncv = CountVectorizer(analyzer=lambda x: x.split(), input='filename')\r\ncv.fit(['hello world']).vocabulary_\r\n```",
        "issue_closed_at": "2019-04-23T03:50:24Z",
        "base_commit": "badaa153e67ffa56fb1a413b3b7b5b8507024291"
      },
      "summary": "### Summary:\nThis issue is centered around the behavior of the `CountVectorizer` class within the scikit-learn library, specifically when using a custom analyzer in conjunction with the `input` parameter set to 'filename' or 'file'. In general terms, the problem arises because the `CountVectorizer` fails to appropriately handle the 'input' argument when a custom analyzer function is provided. As a result, the vectorizer does not process input data as expected, which can lead to incorrect or unexpected outcomes when users attempt to fit data using filenames as input.\n\nKey symptoms and behaviors observed include the `CountVectorizer` not recognizing and processing the input files correctly when a custom analyzer is specified. The user provided an example where input data was intended to be read from a file, but the vectorizer did not execute as anticipated, suggesting a possible oversight in handling the input parameter.\n\nThe affected components are primarily the scikit-learn text vectorization classes, including `CountVectorizer`, and potentially related classes such as `HashingVectorizer` and `TfidfVectorizer`, as they share common functionalities and configurations.\n\nThe potential impact of this issue is moderate, considering it might lead to misinterpretation of input data when using custom analyzers with file inputs, thus affecting data preprocessing and subsequent machine learning model accuracy. The severity depends on the reliance of users on custom analyzers with file inputs in their workflows.\n\nRelevant technical details include the need for consistency checks within the vectorizer classes to ensure that input specifications are honored correctly, especially when custom logic via analyzers is applied. 
The changes in the file `sklearn/feature_extraction/text.py` indicate updates to the handling logic to address these inconsistencies, specifically within the `VectorizerMixin._check_stop_words_consistency` function and associated vectorizer classes.",
      "prompt_used": "You are an expert in software issue reasoning analysis.\nGiven the following problem report and its fixed code elements, generate a comprehensive summary based on the entire document. Your goal is to abstract the information in the problem description into a more general description.\n\n## Original Issue Report:\nTitle: CountVectorizer with custom analyzer ignores input argument\n\nBody:\nExample:\n\n``` py\ncv = CountVectorizer(analyzer=lambda x: x.split(), input='filename')\ncv.fit(['hello world']).vocabulary_\n```\n\nSame for `input=\"file\"`. Not sure if this should be fixed or just documented; I don't like changing the behavior of the vectorizers yet again...\n\n\n## Code elements fixed by the patch:\n{FIXED_CODE_ELEMENTS}\n\nPlease analyze the above issue report and provide a structured summary that includes:\n1. Problem description in general terms\n2. Key symptoms and behaviors observed\n3. Affected components or systems\n4. Potential impact or severity\n5. Any relevant technical details abstracted for broader understanding\n\nPlease return the summary with “### Summary:\", For example:\n### Summary: This issue is ...\n\nChanges Summary:\nsklearn/feature_extraction/text.py\n  line: line 31\n  function: VectorizerMixin._check_stop_words_consistency\n  class: HashingVectorizer\n  class: CountVectorizer\n  class: TfidfVectorizer\n"
    },
    {
      "similar_issue": {
        "issue_title": "[0.20.2] test_non_meta_estimators fails on Powerpc 64 bit little endian",
        "issue_body": "Splitting #13036:\r\nTest failure for `test_non_meta_estimators[KernelPCA-KernelPCA-check_pipeline_consistency]`:\r\n```\r\nname = 'KernelPCA'\r\nEstimator = <class 'sklearn.decomposition.kernel_pca.KernelPCA'>\r\ncheck = <function check_pipeline_consistency at 0x3fff8b056b18>\r\n\r\n    @pytest.mark.parametrize(\r\n            \"name, Estimator, check\",\r\n            _generate_checks_per_estimator(_yield_all_checks,\r\n                                           _tested_non_meta_estimators()),\r\n            ids=_rename_partial\r\n    )\r\n    def test_non_meta_estimators(name, Estimator, check):\r\n        # Common tests for non-meta estimators\r\n        with ignore_warnings(category=(DeprecationWarning, ConvergenceWarning,\r\n                                       UserWarning, FutureWarning)):\r\n            estimator = Estimator()\r\n            set_checking_parameters(estimator)\r\n>           check(name, estimator)\r\n\r\nsklearn/tests/test_common.py:101: \r\n_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \r\nsklearn/utils/testing.py:350: in wrapper\r\n    return fn(*args, **kwargs)\r\nsklearn/utils/estimator_checks.py:1048: in check_pipeline_consistency\r\n    assert_allclose_dense_sparse(result, result_pipe)\r\n_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \r\n\r\nx = array([[ 9.17543358e-01,  4.04410079e-02,  2.21585378e-02,\r\n        -2.69563828...,  2.47000231e-11,\r\n         2.81623715e-09,  1.36863809e-09,  1.28803077e-09]])\r\ny = array([[ 9.17543358e-01,  4.04410079e-02,  2.21585378e-02,\r\n        -2.69560026..., -6.90266524e-10,\r\n        -8.33997668e-11,  1.37018888e-09,  2.03087265e-09]])\r\nrtol = 1e-07, atol = 1e-09, err_msg = ''\r\n\r\n    def assert_allclose_dense_sparse(x, y, rtol=1e-07, atol=1e-9, err_msg=''):\r\n        \"\"\"[...]\"\"\"\r\n        if sp.sparse.issparse(x) and sp.sparse.issparse(y):\r\n            x = x.tocsr()\r\n            y = 
y.tocsr()\r\n            x.sum_duplicates()\r\n            y.sum_duplicates()\r\n            assert_array_equal(x.indices, y.indices, err_msg=err_msg)\r\n            assert_array_equal(x.indptr, y.indptr, err_msg=err_msg)\r\n            assert_allclose(x.data, y.data, rtol=rtol, atol=atol, err_msg=err_msg)\r\n        elif not sp.sparse.issparse(x) and not sp.sparse.issparse(y):\r\n            # both dense\r\n>           assert_allclose(x, y, rtol=rtol, atol=atol, err_msg=err_msg)\r\nE           AssertionError: \r\nE           Not equal to tolerance rtol=1e-07, atol=1e-09\r\nE           \r\nE           (mismatch 13.7777777778%)\r\nE            x: array([[ 9.175434e-01,  4.044101e-02,  2.215854e-02, -2.695638e-08,\r\nE                    3.128852e-08,  0.000000e+00,  0.000000e+00,  0.000000e+00,\r\nE                    0.000000e+00,  0.000000e+00,  0.000000e+00,  0.000000e+00,...\r\nE            y: array([[ 9.175434e-01,  4.044101e-02,  2.215854e-02, -2.695600e-08,\r\nE                    3.128926e-08,  0.000000e+00,  0.000000e+00,  0.000000e+00,\r\nE                    0.000000e+00,  0.000000e+00,  0.000000e+00,  0.000000e+00,...\r\n\r\nsklearn/utils/testing.py:464: AssertionError\r\n```\r\nSimilarly for  `test_non_meta_estimators[LocallyLinearEmbedding-LocallyLinearEmbedding-check_pipeline_consistency]`:\r\n```\r\n>           assert_allclose(x, y, rtol=rtol, atol=atol, err_msg=err_msg)\r\nE           AssertionError: \r\nE           Not equal to tolerance rtol=1e-07, atol=1e-09\r\nE           \r\nE           (mismatch 50.0%)\r\nE            x: array([[-2.011707e-01,  2.825862e-01],\r\nE                  [-1.618549e-01,  1.102424e-12],\r\nE                  [-1.618549e-01,  3.172043e-11],...\r\nE            y: array([[-1.753929e-01,  2.825862e-01],\r\nE                  [-1.894835e-01, -1.533793e-11],\r\nE                  [-1.894835e-01,  3.785336e-11],...\r\n```\r\nand for 
`test_non_meta_estimators[LocallyLinearEmbedding-LocallyLinearEmbedding-check_transformer_data_not_an_array]`:\r\n```\r\n>           assert_allclose(x, y, rtol=rtol, atol=atol, err_msg=err_msg)\r\nE           AssertionError: \r\nE           Not equal to tolerance rtol=1e-07, atol=0.01\r\nE           fit_transform and transform outcomes not consistent in LocallyLinearEmbedding(eigen_solver='auto', hessian_tol=0.0001, max_iter=5,\r\nE                       method='standard', modified_tol=1e-12, n_components=2,\r\nE                       n_jobs=None, n_neighbors=5, neighbors_algorithm='auto',\r\nE                       random_state=0, reg=0.001, tol=1e-06)\r\nE           (mismatch 50.0%)\r\nE            x: array([[-8.953081e-02,  2.983824e-01],\r\nE                  [-2.421795e-01,  4.555523e-13],\r\nE                  [-2.421795e-01,  3.751804e-12],...\r\nE            y: array([[-1.897519e-01,  2.981428e-01],\r\nE                  [-1.751025e-01,  6.989894e-12],\r\nE                  [-1.751025e-01,  9.856396e-12],...\r\n```\r\nand for `test_non_meta_estimators[LocallyLinearEmbedding-LocallyLinearEmbedding-check_transformer_general]`:\r\n```\r\n>           assert_allclose(x, y, rtol=rtol, atol=atol, err_msg=err_msg)\r\nE           AssertionError: \r\nE           Not equal to tolerance rtol=1e-07, atol=0.01\r\nE           fit_transform and transform outcomes not consistent in LocallyLinearEmbedding(eigen_solver='auto', hessian_tol=0.0001, max_iter=5,\r\nE                       method='standard', modified_tol=1e-12, n_components=2,\r\nE                       n_jobs=None, n_neighbors=5, neighbors_algorithm='auto',\r\nE                       random_state=0, reg=0.001, tol=1e-06)\r\nE           (mismatch 50.0%)\r\nE            x: array([[ 4.698100e-02, -2.983824e-01],\r\nE                  [ 2.538887e-01, -1.711825e-12],\r\nE                  [ 2.538887e-01, -8.653966e-12],...\r\nE            y: array([[-2.546916e-01, -2.981428e-01],\r\nE                  [ 4.241318e-02, 
-6.757535e-12],\r\nE                  [ 4.241318e-02, -2.107003e-11],...\r\n```",
        "issue_id": 13051,
        "pr_number": 13323,
        "pr_title": "Hotfix Skip non deterministic tests on PowerPC",
        "pr_body": "An attempt to work around https://github.com/scikit-learn/scikit-learn/issues/13051 by extending the tests currently considered non-deterministic on 32 bit arch to PowerPC as well.\r\n\r\nCloses #13051\r\n\r\nThe `platform.machine` settings can be seen e.g. [here](https://github.com/python/cpython/blob/f14c28f39766855420dd58d209da4ad847f3030e/Lib/test/test_sysconfig.py#L402).\r\n\r\nIdeally, it would be good to understand why these are non-deterministic (on 32 bit arch as well) but meanwhile since we are not officially supporting PowerPC this aims to avoid new bug reports from debian maintainers while packaging scikit-learn, for tests that we deem acceptable to skip on 32 bit arch.",
        "issue_closed_at": "2019-02-28T10:16:53Z",
        "base_commit": "9d211978741a7e23f8b7c0bf1315d7ac7a259861"
      },
      "summary": "### Summary:\nThis issue pertains to a test failure in the scikit-learn library specifically observed on PowerPC 64-bit little-endian architecture. The problem is identified in the `test_non_meta_estimators` test suite, which is designed to verify the consistency and correctness of non-meta estimators within scikit-learn, a popular machine learning library.\n\n1. **Problem Description in General Terms**: \n   The issue arises from inconsistencies in the test results of various non-meta estimators, notably `KernelPCA` and `LocallyLinearEmbedding`. These inconsistencies are rooted in discrepancies between expected and actual outcomes when the tests are run, leading to assertion errors.\n\n2. **Key Symptoms and Behaviors Observed**:\n   The tests fail due to mismatches in numerical precision between expected and actual results. For instance, the `assert_allclose` function, which checks if two arrays are element-wise equal within a tolerance, throws errors indicating that the results are not equal to the specified relative and absolute tolerances. This mismatch is highlighted by percentage differences that exceed allowable error margins.\n\n3. **Affected Components or Systems**:\n   The components primarily affected are the `KernelPCA` and `LocallyLinearEmbedding` estimators within the scikit-learn library. The issue is observed during the execution of tests related to pipeline consistency and data transformation behaviors.\n\n4. **Potential Impact or Severity**:\n   The severity of this issue is moderate as it impacts the reliability of scikit-learn's functionality on specific architectures, potentially leading to incorrect behavior or performance issues in applications relying on these estimators. This could affect developers and users who deploy machine learning models on the PowerPC 64-bit little-endian systems.\n\n5. 
**Relevant Technical Details Abstracted for Broader Understanding**:\n   - The issue is linked to floating-point precision discrepancies, which are common in numerical computations across different architectures.\n   - The tests use the `assert_allclose_dense_sparse` function to check numerical consistency, which is critical for ensuring the reliability of machine learning model transformations and pipeline operations.\n   - The specific errors highlight the importance of setting appropriate tolerances and understanding the limitations of numerical precision in diverse computational environments.\n\nThe fixed code elements (`_UnstableOn32BitMixin._more_tags` in `sklearn/base.py`, a change in `sklearn/cross_decomposition/cca_.py`, and `locally_linear_embedding` in `sklearn/manifold/locally_linear.py`) are consistent with the linked PR, which works around the failures by extending the set of tests already treated as non-deterministic on 32-bit architectures to PowerPC so that they are skipped there. The root cause of the non-determinism is left open.",
      "prompt_used": "You are an expert in software issue reasoning analysis.\nGiven the following problem report and its fixed code elements, generate a comprehensive summary based on the entire document. Your goal is to abstract the information in the problem description into a more general description.\n\n## Original Issue Report:\nTitle: [0.20.2] test_non_meta_estimators fails on Powerpc 64 bit little endian\n\nBody:\nSplitting #13036:\r\nTest failure for `test_non_meta_estimators[KernelPCA-KernelPCA-check_pipeline_consistency]`:\r\n```\r\nname = 'KernelPCA'\r\nEstimator = <class 'sklearn.decomposition.kernel_pca.KernelPCA'>\r\ncheck = <function check_pipeline_consistency at 0x3fff8b056b18>\r\n\r\n    @pytest.mark.parametrize(\r\n            \"name, Estimator, check\",\r\n            _generate_checks_per_estimator(_yield_all_checks,\r\n                                           _tested_non_meta_estimators()),\r\n            ids=_rename_partial\r\n    )\r\n    def test_non_meta_estimators(name, Estimator, check):\r\n        # Common tests for non-meta estimators\r\n        with ignore_warnings(category=(DeprecationWarning, ConvergenceWarning,\r\n                                       UserWarning, FutureWarning)):\r\n            estimator = Estimator()\r\n            set_checking_parameters(estimator)\r\n>           check(name, estimator)\r\n\r\nsklearn/tests/test_common.py:101: \r\n_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \r\nsklearn/utils/testing.py:350: in wrapper\r\n    return fn(*args, **kwargs)\r\nsklearn/utils/estimator_checks.py:1048: in check_pipeline_consistency\r\n    assert_allclose_dense_sparse(result, result_pipe)\r\n_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ \r\n\r\nx = array([[ 9.17543358e-01,  4.04410079e-02,  2.21585378e-02,\r\n        -2.69563828...,  2.47000231e-11,\r\n         2.81623715e-09,  1.36863809e-09,  1.28803077e-09]])\r\ny = array([[ 9.17543358e-01,  
4.04410079e-02,  2.21585378e-02,\r\n        -2.69560026..., -6.90266524e-10,\r\n        -8.33997668e-11,  1.37018888e-09,  2.03087265e-09]])\r\nrtol = 1e-07, atol = 1e-09, err_msg = ''\r\n\r\n    def assert_allclose_dense_sparse(x, y, rtol=1e-07, atol=1e-9, err_msg=''):\r\n        \"\"\"[...]\"\"\"\r\n        if sp.sparse.issparse(x) and sp.sparse.issparse(y):\r\n            x = x.tocsr()\r\n            y = y.tocsr()\r\n            x.sum_duplicates()\r\n            y.sum_duplicates()\r\n            assert_array_equal(x.indices, y.indices, err_msg=err_msg)\r\n            assert_array_equal(x.indptr, y.indptr, err_msg=err_msg)\r\n            assert_allclose(x.data, y.data, rtol=rtol, atol=atol, err_msg=err_msg)\r\n        elif not sp.sparse.issparse(x) and not sp.sparse.issparse(y):\r\n            # both dense\r\n>           assert_allclose(x, y, rtol=rtol, atol=atol, err_msg=err_msg)\r\nE           AssertionError: \r\nE           Not equal to tolerance rtol=1e-07, atol=1e-09\r\nE           \r\nE           (mismatch 13.7777777778%)\r\nE            x: array([[ 9.175434e-01,  4.044101e-02,  2.215854e-02, -2.695638e-08,\r\nE                    3.128852e-08,  0.000000e+00,  0.000000e+00,  0.000000e+00,\r\nE                    0.000000e+00,  0.000000e+00,  0.000000e+00,  0.000000e+00,...\r\nE            y: array([[ 9.175434e-01,  4.044101e-02,  2.215854e-02, -2.695600e-08,\r\nE                    3.128926e-08,  0.000000e+00,  0.000000e+00,  0.000000e+00,\r\nE                    0.000000e+00,  0.000000e+00,  0.000000e+00,  0.000000e+00,...\r\n\r\nsklearn/utils/testing.py:464: AssertionError\r\n```\r\nSimilarly for  `test_non_meta_estimators[LocallyLinearEmbedding-LocallyLinearEmbedding-check_pipeline_consistency]`:\r\n```\r\n>           assert_allclose(x, y, rtol=rtol, atol=atol, err_msg=err_msg)\r\nE           AssertionError: \r\nE           Not equal to tolerance rtol=1e-07, atol=1e-09\r\nE           \r\nE           (mismatch 50.0%)\r\nE            x: 
array([[-2.011707e-01,  2.825862e-01],\r\nE                  [-1.618549e-01,  1.102424e-12],\r\nE                  [-1.618549e-01,  3.172043e-11],...\r\nE            y: array([[-1.753929e-01,  2.825862e-01],\r\nE                  [-1.894835e-01, -1.533793e-11],\r\nE                  [-1.894835e-01,  3.785336e-11],...\r\n```\r\nand for `test_non_meta_estimators[LocallyLinearEmbedding-LocallyLinearEmbedding-check_transformer_data_not_an_array]`:\r\n```\r\n>           assert_allclose(x, y, rtol=rtol, atol=atol, err_msg=err_msg)\r\nE           AssertionError: \r\nE           Not equal to tolerance rtol=1e-07, atol=0.01\r\nE           fit_transform and transform outcomes not consistent in LocallyLinearEmbedding(eigen_solver='auto', hessian_tol=0.0001, max_iter=5,\r\nE                       method='standard', modified_tol=1e-12, n_components=2,\r\nE                       n_jobs=None, n_neighbors=5, neighbors_algorithm='auto',\r\nE                       random_state=0, reg=0.001, tol=1e-06)\r\nE           (mismatch 50.0%)\r\nE            x: array([[-8.953081e-02,  2.983824e-01],\r\nE                  [-2.421795e-01,  4.555523e-13],\r\nE                  [-2.421795e-01,  3.751804e-12],...\r\nE            y: array([[-1.897519e-01,  2.981428e-01],\r\nE                  [-1.751025e-01,  6.989894e-12],\r\nE                  [-1.751025e-01,  9.856396e-12],...\r\n```\r\nand for `test_non_meta_estimators[LocallyLinearEmbedding-LocallyLinearEmbedding-check_transformer_general]`:\r\n```\r\n>           assert_allclose(x, y, rtol=rtol, atol=atol, err_msg=err_msg)\r\nE           AssertionError: \r\nE           Not equal to tolerance rtol=1e-07, atol=0.01\r\nE           fit_transform and transform outcomes not consistent in LocallyLinearEmbedding(eigen_solver='auto', hessian_tol=0.0001, max_iter=5,\r\nE                       method='standard', modified_tol=1e-12, n_components=2,\r\nE                       n_jobs=None, n_neighbors=5, neighbors_algorithm='auto',\r\nE                      
 random_state=0, reg=0.001, tol=1e-06)\r\nE           (mismatch 50.0%)\r\nE            x: array([[ 4.698100e-02, -2.983824e-01],\r\nE                  [ 2.538887e-01, -1.711825e-12],\r\nE                  [ 2.538887e-01, -8.653966e-12],...\r\nE            y: array([[-2.546916e-01, -2.981428e-01],\r\nE                  [ 4.241318e-02, -6.757535e-12],\r\nE                  [ 4.241318e-02, -2.107003e-11],...\r\n```\n\n## Code elements fixed by the patch:\n{FIXED_CODE_ELEMENTS}\n\nPlease analyze the above issue report and provide a structured summary that includes:\n1. Problem description in general terms\n2. Key symptoms and behaviors observed\n3. Affected components or systems\n4. Potential impact or severity\n5. Any relevant technical details abstracted for broader understanding\n\nPlease return the summary with “### Summary:\", For example:\n### Summary: This issue is ...\n\nChanges Summary:\nsklearn/base.py\n  line: line 6\n  function: _UnstableOn32BitMixin._more_tags\n\nsklearn/cross_decomposition/cca_.py\n  line: line 1\n\nsklearn/manifold/locally_linear.py\n  line: line 9\n  function: locally_linear_embedding\n"
    },
    {
      "similar_issue": {
        "issue_title": "LinearDiscriminantAnalysis predict probability bug",
        "issue_body": "I am pretty confident there is a bug introduced in commit\n7c1101d7c26ba0b77184cce9c0b9be79adb526de\n\nConcretely, line 518 of the current version \nhttps://github.com/scikit-learn/scikit-learn/blob/master/sklearn/discriminant_analysis.py\nshould be removed as it yields wrong results. \n\nThere is no reason why constant 1 should be added to the computed probability after exponentiation and before inversion. \n\nTo verify this, I have run a one-to-one comparison between the outcome of the method and MATLAB's builtin LDA classifier on the Iris dataset. Only after removal of line 518, results match (up to a tolerance).\n\nIf everyone agrees on that, I am happy to submit a PR.\n",
        "issue_id": 6848,
        "pr_number": 11796,
        "pr_title": "[MRG+2] Fix LDA predict_proba() ",
        "pr_body": "<!--\r\nThanks for contributing a pull request! Please ensure you have taken a look at\r\nthe contribution guidelines: https://github.com/scikit-learn/scikit-learn/blob/master/CONTRIBUTING.md#pull-request-checklist\r\n-->\r\n\r\n#### Reference Issues/PRs\r\nFixes #6848\r\ncloses #11727\r\ncloses #5149\r\n<!--\r\nExample: Fixes #1234. See also #3456.\r\nPlease use keywords (e.g., Fixes) to create link to the issues or pull requests\r\nyou resolved, so that they will automatically be closed when your pull request\r\nis merged. See https://github.com/blog/1506-closing-issues-via-pull-requests\r\n-->\r\n\r\n\r\n#### What does this implement/fix? Explain your changes.\r\nFixes the `predict_proba()` method of LinearDiscriminantAnalysis.\r\nAn `if` statement is used to differentiate between the binary and multi-class case, due to the different output format of the `decision_function` method implemented in the `LinearClassifierMixin` class.\r\n\r\n#### Any other comments?\r\nCopying from #6848:\r\nDo we perhaps want to include additional tests checking the output of predict_proba for LDA and QDA both for the binary and multi-class cases?\r\n\r\n<!--\r\nPlease be aware that we are a loose team of volunteers so patience is\r\nnecessary; assistance handling other issues is very welcome. We value\r\nall user contributions, no matter how minor they are. If we are slow to\r\nreview, either the pull request needs some benchmarking, tinkering,\r\nconvincing, etc. or more likely the reviewers are simply busy. In either\r\ncase, we ask for your understanding during the review process.\r\nFor more information, see our FAQ on this topic:\r\nhttp://scikit-learn.org/dev/faq.html#why-is-my-pull-request-not-getting-any-attention.\r\n\r\nThanks for contributing!\r\n-->\r\n",
        "issue_closed_at": "2019-03-07T16:44:18Z",
        "base_commit": "b73a51bcda362d94d8907915a382a8eb403554c8"
      },
      "summary": "### Summary: This issue pertains to an incorrect implementation in the probability prediction method of the Linear Discriminant Analysis (LDA) within the scikit-learn library. Specifically, an unnecessary constant is being added to the computed probability values during processing, leading to inaccurate results. The problem was identified through a discrepancy between the scikit-learn LDA outputs and MATLAB's built-in LDA classifier when tested on the Iris dataset. The inconsistency was traced back to a specific line of code, which, when removed, brought the results into alignment with expected values. This suggests that the presence of the additional constant is erroneous and should be eliminated to ensure accurate probability predictions. The affected component is the `predict_proba` function in the `LinearDiscriminantAnalysis` class, located in the `sklearn.discriminant_analysis.py` file. The issue could have significant implications for users relying on precise probabilistic outputs from the LDA, potentially affecting any downstream analyses or decisions based on these results.",
      "prompt_used": "You are an expert in software issue reasoning analysis.\nGiven the following problem report and its fixed code elements, generate a comprehensive summary based on the entire document. Your goal is to abstract the information in the problem description into a more general description.\n\n## Original Issue Report:\nTitle: LinearDiscriminantAnalysis predict probability bug\n\nBody:\nI am pretty confident there is a bug introduced in commit\n7c1101d7c26ba0b77184cce9c0b9be79adb526de\n\nConcretely, line 518 of the current version \nhttps://github.com/scikit-learn/scikit-learn/blob/master/sklearn/discriminant_analysis.py\nshould be removed as it yields wrong results. \n\nThere is no reason why constant 1 should be added to the computed probability after exponentiation and before inversion. \n\nTo verify this, I have run a one-to-one comparison between the outcome of the method and MATLAB's builtin LDA classifier on the Iris dataset. Only after removal of line 518, results match (up to a tolerance).\n\nIf everyone agrees on that, I am happy to submit a PR.\n\n\n## Code elements fixed by the patch:\n{FIXED_CODE_ELEMENTS}\n\nPlease analyze the above issue report and provide a structured summary that includes:\n1. Problem description in general terms\n2. Key symptoms and behaviors observed\n3. Affected components or systems\n4. Potential impact or severity\n5. Any relevant technical details abstracted for broader understanding\n\nPlease return the summary with “### Summary:\", For example:\n### Summary: This issue is ...\n\nChanges Summary:\nsklearn/discriminant_analysis.py\n  line: line 22\n  function: QuadraticDiscriminantAnalysis.predict_proba\n"
    },
    {
      "similar_issue": {
        "issue_title": "Pop from empty list coming from get_params()",
        "issue_body": "<!--\nIf your issue is a usage question, submit it here instead:\n- StackOverflow with the scikit-learn tag: http://stackoverflow.com/questions/tagged/scikit-learn\n- Mailing List: https://mail.python.org/mailman/listinfo/scikit-learn\nFor more information, see User Questions: http://scikit-learn.org/stable/support.html#user-questions\n-->\n\n<!-- Instructions For Filing a Bug: https://github.com/scikit-learn/scikit-learn/blob/master/CONTRIBUTING.md#filing-bugs -->\n#### Description\n\n I am getting a pop from empty list error from the warnings.filers.pop(0) call in get_params(). I am using Dask to parallelize the computation of fitting a bunch of MeanShift objects. I only get this error on one machine (a remote linux machine), but it works fine on my home compute (running ubuntu 14) \n#### Steps/Code to Reproduce\n\n<!--\n\n-->\n#### Expected Results\n\nShould just fit the MeanShifts and move on\n#### Actual Results\n\nTraceback (most recent call last):\n  File \"tda_profile.py\", line 34, in <module>\n    _tda.fit(train_features, train_targets)\n  File \"/home/ben/tda/tda_parallel_test.py\", line 652, in fit\n    fits = fits.compute()\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/base.py\", line 86, in compute\n    return compute(self, *_kwargs)[0]\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/base.py\", line 179, in compute\n    results = get(dsk, keys, *_kwargs)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/threaded.py\", line 57, in get\n    **kwargs)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/async.py\", line 484, in get_async\n    raise(remote_exception(res, tb))\ndask.async.IndexError: pop from empty list\n## Traceback\n\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/async.py\", line 267, in execute_task\n    result = _execute_task(task, data)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/async.py\", line 249, in _execute_task\n    
return func(*args2)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/cluster/mean_shift_.py\", line 391, in fit\n    cluster_all=self.cluster_all, n_jobs=self.n_jobs)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/cluster/mean_shift_.py\", line 191, in mean_shift\n    (seed, X, nbrs, max_iter) for seed in seeds)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 800, in **call**\n    while self.dispatch_one_batch(iterator):\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 658, in dispatch_one_batch\n    self._dispatch(tasks)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 566, in _dispatch\n    job = ImmediateComputeBatch(batch)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 180, in __init__\n    self.results = batch()\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 72, in **call**\n    return [func(_args, *_kwargs) for func, args, kwargs in self.items]\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 72, in <listcomp>\n    return [func(_args, *_kwargs) for func, args, kwargs in self.items]\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/cluster/mean_shift_.py\", line 75, in _mean_shift_single_seed\n    bandwidth = nbrs.get_params()['radius']\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/base.py\", line 227, in get_params\n    warnings.filters.pop(0)\n#### Versions\n\n> > > import platform; print(platform.platform())\n> > > Linux-3.10.0-327.el7.x86_64-x86_64-with-centos-7.2.1511-Core\n> > > import sys; print(\"Python\", sys.version)\n> > > Python 3.5.2 |Anaconda 4.1.1 (64-bit)| (default, Jul  2 2016, 17:53:06) \n> > > [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]\n> > > import 
numpy; print(\"NumPy\", numpy.**version**)\n> > > NumPy 1.11.1\n> > > import scipy; print(\"SciPy\", scipy.**version**)\n> > > SciPy 0.17.1\n> > > import sklearn; print(\"Scikit-Learn\", sklearn.**version**)\n> > > Scikit-Learn 0.17.1\n\n<!-- Thanks for contributing! -->\n",
        "issue_id": 7346,
        "pr_number": 9569,
        "pr_title": "[MRG+2] remove modification of warning registry for no reason",
        "pr_body": "Fixes #9560. Fixes #2755. Fixes #7346.",
        "issue_closed_at": "2017-09-08T15:29:37Z",
        "base_commit": "e1fb03c86d2a2c47ef008ead958e1bc10fb06e77"
      },
      "summary": "### Summary:\n\nThis issue is a software bug related to the improper handling of a Python list operation within the scikit-learn library's `get_params()` function. The problem manifests as an \"IndexError: pop from empty list\" exception, which occurs when the code attempts to remove an element from an empty list. This error is triggered specifically within the `warnings.filters.pop(0)` call in the `get_params()` method of the `BaseEstimator` class. The problem is encountered when using Dask to parallelize computations for fitting multiple `MeanShift` clustering objects. Notably, the error is machine-specific and occurs on a remote Linux machine, but not on a local Ubuntu system, indicating potential environment-specific issues or differences in configurations.\n\n1. **Problem Description in General Terms**:\n   - The error arises from an attempt to pop an element from an empty list during the execution of a parallelized computation.\n\n2. **Key Symptoms and Behaviors Observed**:\n   - The primary symptom is an \"IndexError: pop from empty list\" exception.\n   - This error interrupts the execution of parallel computations involving the fitting of `MeanShift` clustering objects.\n\n3. **Affected Components or Systems**:\n   - The issue affects the `get_params()` function within the `BaseEstimator` class of the scikit-learn library.\n   - The problem appears to be linked to the use of Dask for parallel processing.\n\n4. **Potential Impact or Severity**:\n   - The severity is moderate, as it prevents successful execution of parallel fitting operations, which may impact users relying on scikit-learn's clustering capabilities in a parallelized environment.\n   - The localized occurrence suggests it may not affect all users, but those experiencing it are unable to perform necessary computations.\n\n5. 
**Relevant Technical Details Abstracted for Broader Understanding**:\n   - The error is environment-specific, possibly influenced by differences in Python or library versions, or system configurations.\n   - The traceback indicates the error is deeply nested within the Dask and scikit-learn processing stack, pointing to potential interactions between these systems.\n   - The issue was addressed by modifying code elements in `sklearn/base.py`, specifically within the `BaseEstimator.get_params` and `BaseEstimator.__setstate__` functions. The linked PR is titled \"remove modification of warning registry for no reason\", which suggests the fix removed the manipulation of `warnings.filters` rather than adding a guard around it.\n\nThe fix therefore likely eliminates the `warnings.filters.pop(0)` call itself, so the \"pop from empty list\" error can no longer occur.",
      "prompt_used": "You are an expert in software issue reasoning analysis.\nGiven the following problem report and its fixed code elements, generate a comprehensive summary based on the entire document. Your goal is to abstract the information in the problem description into a more general description.\n\n## Original Issue Report:\nTitle: Pop from empty list coming from get_params()\n\nBody:\n<!--\nIf your issue is a usage question, submit it here instead:\n- StackOverflow with the scikit-learn tag: http://stackoverflow.com/questions/tagged/scikit-learn\n- Mailing List: https://mail.python.org/mailman/listinfo/scikit-learn\nFor more information, see User Questions: http://scikit-learn.org/stable/support.html#user-questions\n-->\n\n<!-- Instructions For Filing a Bug: https://github.com/scikit-learn/scikit-learn/blob/master/CONTRIBUTING.md#filing-bugs -->\n#### Description\n\n I am getting a pop from empty list error from the warnings.filers.pop(0) call in get_params(). I am using Dask to parallelize the computation of fitting a bunch of MeanShift objects. 
I only get this error on one machine (a remote linux machine), but it works fine on my home compute (running ubuntu 14) \n#### Steps/Code to Reproduce\n\n<!--\n\n-->\n#### Expected Results\n\nShould just fit the MeanShifts and move on\n#### Actual Results\n\nTraceback (most recent call last):\n  File \"tda_profile.py\", line 34, in <module>\n    _tda.fit(train_features, train_targets)\n  File \"/home/ben/tda/tda_parallel_test.py\", line 652, in fit\n    fits = fits.compute()\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/base.py\", line 86, in compute\n    return compute(self, *_kwargs)[0]\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/base.py\", line 179, in compute\n    results = get(dsk, keys, *_kwargs)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/threaded.py\", line 57, in get\n    **kwargs)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/async.py\", line 484, in get_async\n    raise(remote_exception(res, tb))\ndask.async.IndexError: pop from empty list\n## Traceback\n\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/async.py\", line 267, in execute_task\n    result = _execute_task(task, data)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/async.py\", line 249, in _execute_task\n    return func(*args2)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/cluster/mean_shift_.py\", line 391, in fit\n    cluster_all=self.cluster_all, n_jobs=self.n_jobs)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/cluster/mean_shift_.py\", line 191, in mean_shift\n    (seed, X, nbrs, max_iter) for seed in seeds)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 800, in **call**\n    while self.dispatch_one_batch(iterator):\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 658, in dispatch_one_batch\n    self._dispatch(tasks)\n  File 
\"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 566, in _dispatch\n    job = ImmediateComputeBatch(batch)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 180, in __init__\n    self.results = batch()\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 72, in **call**\n    return [func(_args, *_kwargs) for func, args, kwargs in self.items]\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 72, in <listcomp>\n    return [func(_args, *_kwargs) for func, args, kwargs in self.items]\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/cluster/mean_shift_.py\", line 75, in _mean_shift_single_seed\n    bandwidth = nbrs.get_params()['radius']\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/base.py\", line 227, in get_params\n    warnings.filters.pop(0)\n#### Versions\n\n> > > import platform; print(platform.platform())\n> > > Linux-3.10.0-327.el7.x86_64-x86_64-with-centos-7.2.1511-Core\n> > > import sys; print(\"Python\", sys.version)\n> > > Python 3.5.2 |Anaconda 4.1.1 (64-bit)| (default, Jul  2 2016, 17:53:06) \n> > > [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]\n> > > import numpy; print(\"NumPy\", numpy.**version**)\n> > > NumPy 1.11.1\n> > > import scipy; print(\"SciPy\", scipy.**version**)\n> > > SciPy 0.17.1\n> > > import sklearn; print(\"Scikit-Learn\", sklearn.**version**)\n> > > Scikit-Learn 0.17.1\n\n<!-- Thanks for contributing! -->\n\n\n## Code elements fixed by the patch:\n{FIXED_CODE_ELEMENTS}\n\nPlease analyze the above issue report and provide a structured summary that includes:\n1. Problem description in general terms\n2. Key symptoms and behaviors observed\n3. Affected components or systems\n4. Potential impact or severity\n5. 
Any relevant technical details abstracted for broader understanding\n\nPlease return the summary with “### Summary:\", For example:\n### Summary: This issue is ...\n\nChanges Summary:\nsklearn/base.py\n  function: BaseEstimator.get_params\n  function: BaseEstimator.__setstate__\n"
    }
  ]
}