{
  "Selected_candidate": {
    "pr_number": 9569,
    "pr_title": "[MRG+2] remove modification of warning registry for no reason",
    "pr_body": "Fixes #9560. Fixes #2755. Fixes #7346.",
    "issue_id": 7346,
    "issue_title": "Pop from empty list coming from get_params()",
    "issue_body": "<!--\nIf your issue is a usage question, submit it here instead:\n- StackOverflow with the scikit-learn tag: http://stackoverflow.com/questions/tagged/scikit-learn\n- Mailing List: https://mail.python.org/mailman/listinfo/scikit-learn\nFor more information, see User Questions: http://scikit-learn.org/stable/support.html#user-questions\n-->\n\n<!-- Instructions For Filing a Bug: https://github.com/scikit-learn/scikit-learn/blob/master/CONTRIBUTING.md#filing-bugs -->\n#### Description\n\n I am getting a pop from empty list error from the warnings.filers.pop(0) call in get_params(). I am using Dask to parallelize the computation of fitting a bunch of MeanShift objects. I only get this error on one machine (a remote linux machine), but it works fine on my home compute (running ubuntu 14) \n#### Steps/Code to Reproduce\n\n<!--\n\n-->\n#### Expected Results\n\nShould just fit the MeanShifts and move on\n#### Actual Results\n\nTraceback (most recent call last):\n  File \"tda_profile.py\", line 34, in <module>\n    _tda.fit(train_features, train_targets)\n  File \"/home/ben/tda/tda_parallel_test.py\", line 652, in fit\n    fits = fits.compute()\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/base.py\", line 86, in compute\n    return compute(self, *_kwargs)[0]\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/base.py\", line 179, in compute\n    results = get(dsk, keys, *_kwargs)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/threaded.py\", line 57, in get\n    **kwargs)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/async.py\", line 484, in get_async\n    raise(remote_exception(res, tb))\ndask.async.IndexError: pop from empty list\n## Traceback\n\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/async.py\", line 267, in execute_task\n    result = _execute_task(task, data)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/async.py\", line 249, in _execute_task\n    
return func(*args2)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/cluster/mean_shift_.py\", line 391, in fit\n    cluster_all=self.cluster_all, n_jobs=self.n_jobs)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/cluster/mean_shift_.py\", line 191, in mean_shift\n    (seed, X, nbrs, max_iter) for seed in seeds)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 800, in __call__\n    while self.dispatch_one_batch(iterator):\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 658, in dispatch_one_batch\n    self._dispatch(tasks)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 566, in _dispatch\n    job = ImmediateComputeBatch(batch)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 180, in __init__\n    self.results = batch()\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 72, in __call__\n    return [func(*args, **kwargs) for func, args, kwargs in self.items]\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 72, in <listcomp>\n    return [func(*args, **kwargs) for func, args, kwargs in self.items]\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/cluster/mean_shift_.py\", line 75, in _mean_shift_single_seed\n    bandwidth = nbrs.get_params()['radius']\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/base.py\", line 227, in get_params\n    warnings.filters.pop(0)\n#### Versions\n\n> > > import platform; print(platform.platform())\n> > > Linux-3.10.0-327.el7.x86_64-x86_64-with-centos-7.2.1511-Core\n> > > import sys; print(\"Python\", sys.version)\n> > > Python 3.5.2 |Anaconda 4.1.1 (64-bit)| (default, Jul  2 2016, 17:53:06) \n> > > [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]\n> > > import 
numpy; print(\"NumPy\", numpy.__version__)\n> > > NumPy 1.11.1\n> > > import scipy; print(\"SciPy\", scipy.__version__)\n> > > SciPy 0.17.1\n> > > import sklearn; print(\"Scikit-Learn\", sklearn.__version__)\n> > > Scikit-Learn 0.17.1\n\n<!-- Thanks for contributing! -->\n",
    "issue_closed_at": "2017-09-08T15:29:37Z",
    "base_commit": "e1fb03c86d2a2c47ef008ead958e1bc10fb06e77",
    "changes": [
      {
        "file": "sklearn/base.py",
        "type": "function",
        "name": "get_params",
        "class_name": "BaseEstimator",
        "code": "def get_params(self, deep=True):\n        \"\"\"Get parameters for this estimator.\n\n        Parameters\n        ----------\n        deep : boolean, optional\n            If True, will return the parameters for this estimator and\n            contained subobjects that are estimators.\n\n        Returns\n        -------\n        params : mapping of string to any\n            Parameter names mapped to their values.\n        \"\"\"\n        out = dict()\n        for key in self._get_param_names():\n            # We need deprecation warnings to always be on in order to\n            # catch deprecated param values.\n            # This is set in utils/__init__.py but it gets overwritten\n            # when running under python3 somehow.\n            warnings.simplefilter(\"always\", DeprecationWarning)\n            try:\n                with warnings.catch_warnings(record=True) as w:\n                    value = getattr(self, key, None)\n                if len(w) and w[0].category == DeprecationWarning:\n                    # if the parameter is deprecated, don't show it\n                    continue\n            finally:\n                warnings.filters.pop(0)\n\n            # XXX: should we rather test if instance of estimator?\n            if deep and hasattr(value, 'get_params'):\n                deep_items = value.get_params().items()\n                out.update((key + '__' + k, val) for k, val in deep_items)\n            out[key] = value\n        return out"
      },
      {
        "file": "sklearn/base.py",
        "type": "function",
        "name": "__setstate__",
        "class_name": "BaseEstimator",
        "code": "def __setstate__(self, state):\n        if type(self).__module__.startswith('sklearn.'):\n            pickle_version = state.pop(\"_sklearn_version\", \"pre-0.18\")\n            if pickle_version != __version__:\n                warnings.warn(\n                    \"Trying to unpickle estimator {0} from version {1} when \"\n                    \"using version {2}. This might lead to breaking code or \"\n                    \"invalid results. Use at your own risk.\".format(\n                        self.__class__.__name__, pickle_version, __version__),\n                    UserWarning)\n        try:\n            super(BaseEstimator, self).__setstate__(state)\n        except AttributeError:\n            self.__dict__.update(state)"
      }
    ]
  },
  "Justification": "Candidate E is the most helpful because it involves the get_params() method in the sklearn.base module, which is directly related to how model objects in sklearn, such as RepeatedKFold and RepeatedStratifiedKFold, are constructed and their internal representations. The issue with pop from an empty list could have a conceptual overlap with the issues in the __repr__ method, as it suggests a problem in how instances manage their attributes and methods. Given that both current and candidate bugs concern how sklearn classes handle their internal state, this candidate offers potentially useful insights and guidance for debugging the __repr__ problem effectively.",
  "instance_id": "scikit-learn__scikit-learn-14983",
  "repo": "scikit-learn/scikit-learn",
  "created_at": "2019-09-14T15:31:18Z",
  "problem_statement": "RepeatedKFold and RepeatedStratifiedKFold do not show correct __repr__ string\n#### Description\r\n\r\n`RepeatedKFold` and `RepeatedStratifiedKFold` do not show correct \\_\\_repr\\_\\_ string.\r\n\r\n#### Steps/Code to Reproduce\r\n\r\n```python\r\n>>> from sklearn.model_selection import RepeatedKFold, RepeatedStratifiedKFold\r\n>>> repr(RepeatedKFold())\r\n>>> repr(RepeatedStratifiedKFold())\r\n```\r\n\r\n#### Expected Results\r\n\r\n```python\r\n>>> repr(RepeatedKFold())\r\nRepeatedKFold(n_splits=5, n_repeats=10, random_state=None)\r\n>>> repr(RepeatedStratifiedKFold())\r\nRepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=None)\r\n```\r\n\r\n#### Actual Results\r\n\r\n```python\r\n>>> repr(RepeatedKFold())\r\n'<sklearn.model_selection._split.RepeatedKFold object at 0x0000016421AA4288>'\r\n>>> repr(RepeatedStratifiedKFold())\r\n'<sklearn.model_selection._split.RepeatedStratifiedKFold object at 0x0000016420E115C8>'\r\n```\r\n\r\n#### Versions\r\n```\r\nSystem:\r\n    python: 3.7.4 (default, Aug  9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]\r\nexecutable: D:\\anaconda3\\envs\\xyz\\python.exe\r\n   machine: Windows-10-10.0.16299-SP0\r\n\r\nBLAS:\r\n    macros:\r\n  lib_dirs:\r\ncblas_libs: cblas\r\n\r\nPython deps:\r\n       pip: 19.2.2\r\nsetuptools: 41.0.1\r\n   sklearn: 0.21.2\r\n     numpy: 1.16.4\r\n     scipy: 1.3.1\r\n    Cython: None\r\n    pandas: 0.24.2\r\n```\n",
  "patch": "diff --git a/sklearn/model_selection/_split.py b/sklearn/model_selection/_split.py\n--- a/sklearn/model_selection/_split.py\n+++ b/sklearn/model_selection/_split.py\n@@ -1163,6 +1163,9 @@ def get_n_splits(self, X=None, y=None, groups=None):\n                      **self.cvargs)\n         return cv.get_n_splits(X, y, groups) * self.n_repeats\n \n+    def __repr__(self):\n+        return _build_repr(self)\n+\n \n class RepeatedKFold(_RepeatedSplits):\n     \"\"\"Repeated K-Fold cross validator.\n@@ -2158,6 +2161,8 @@ def _build_repr(self):\n         try:\n             with warnings.catch_warnings(record=True) as w:\n                 value = getattr(self, key, None)\n+                if value is None and hasattr(self, 'cvargs'):\n+                    value = self.cvargs.get(key, None)\n             if len(w) and w[0].category == DeprecationWarning:\n                 # if the parameter is deprecated, don't show it\n                 continue\n"
}