{
  "instance_id": "scikit-learn__scikit-learn-11040",
  "repo": "scikit-learn/scikit-learn",
  "created_at": "2018-04-28T07:18:33Z",
  "problem_statement": "Missing parameter validation in Neighbors estimator for float n_neighbors\n```python\r\nfrom sklearn.neighbors import NearestNeighbors\r\nfrom sklearn.datasets import make_blobs\r\nX, y = make_blobs()\r\nneighbors = NearestNeighbors(n_neighbors=3.)\r\nneighbors.fit(X)\r\nneighbors.kneighbors(X)\r\n```\r\n```\r\n~/checkout/scikit-learn/sklearn/neighbors/binary_tree.pxi in sklearn.neighbors.kd_tree.NeighborsHeap.__init__()\r\n\r\nTypeError: 'float' object cannot be interpreted as an integer\r\n```\r\nThis should be caught earlier and a more helpful error message should be raised (or we could be lenient and cast to integer, but I think a better error might be better).\r\n\r\nWe need to make sure that \r\n```python\r\nneighbors.kneighbors(X, n_neighbors=3.)\r\n```\r\nalso works.\n",
  "patch": "diff --git a/sklearn/neighbors/base.py b/sklearn/neighbors/base.py\n--- a/sklearn/neighbors/base.py\n+++ b/sklearn/neighbors/base.py\n@@ -258,6 +258,12 @@ def _fit(self, X):\n                     \"Expected n_neighbors > 0. Got %d\" %\n                     self.n_neighbors\n                 )\n+            else:\n+                if not np.issubdtype(type(self.n_neighbors), np.integer):\n+                    raise TypeError(\n+                        \"n_neighbors does not take %s value, \"\n+                        \"enter integer value\" %\n+                        type(self.n_neighbors))\n \n         return self\n \n@@ -327,6 +333,17 @@ class from an array representing our data set and ask who's\n \n         if n_neighbors is None:\n             n_neighbors = self.n_neighbors\n+        elif n_neighbors <= 0:\n+            raise ValueError(\n+                \"Expected n_neighbors > 0. Got %d\" %\n+                n_neighbors\n+            )\n+        else:\n+            if not np.issubdtype(type(n_neighbors), np.integer):\n+                raise TypeError(\n+                    \"n_neighbors does not take %s value, \"\n+                    \"enter integer value\" %\n+                    type(n_neighbors))\n \n         if X is not None:\n             query_is_train = False\n",
  "similar_bug_items": [
    {
      "pr_number": 7632,
      "pr_title": "[MRG+1] Correcting length of explained_variance_ratio_, eigen solver, final PR",
      "pr_body": "<!--\nThanks for contributing a pull request! Please ensure you have taken a look at\nthe contribution guidelines: https://github.com/scikit-learn/scikit-learn/blob/master/CONTRIBUTING.md#Contributing-Pull-Requests\n-->\n#### Reference Issue\n\n<!-- Example: Fixes #1234 -->\n\nFix #6032 \n#### What does this implement/fix? Explain your changes.\n\nAttribute explained_variance_ratio_ from LinearDiscriminantAnalysis class will be of length n_components (eigen solver).\n#### Any other comments?\n\nThis PR follows PR 7616. I mixed up my git history, so it was easier to open a new PR.\n\n<!--\nPlease be aware that we are a loose team of volunteers so patience is\nnecessary; assistance handling other issues is very welcome. We value\nall user contributions, no matter how minor they are. If we are slow to\nreview, either the pull request needs some benchmarking, tinkering,\nconvincing, etc. or more likely the reviewers are simply busy. In either\ncase, we ask for your understanding during the review process.\nFor more information, see our FAQ on this topic:\nhttp://scikit-learn.org/dev/faq.html#why-is-my-pull-request-not-getting-any-attention.\n\nThanks for contributing!\n-->\n",
      "issue_id": 6032,
      "issue_title": "LDA.explained_variance_ratio_ is of the wrong size",
      "issue_body": "The docs say that <a href=\"http://scikit-learn.org/stable/modules/generated/sklearn.discriminant_analysis.LinearDiscriminantAnalysis.html#sklearn.discriminant_analysis.LinearDiscriminantAnalysis\">LDA.explained_variance_ratio_</a> should have only `n_components_`. But it doesn't.\n\nIt looks like this bug only exists when we use the `eigen` solver, not the `svd` solver.\n\n```\n>>> import numpy as np\n>>> from sklearn.lda import LDA\n>>> from sklearn.utils.testing import assert_equal\n>>>\n>>> state = np.random.RandomState(0)\n>>> X = state.normal(loc=0, scale=100, size=(40, 20))\n>>> y = state.randint(0, 3, size=(40, 1))\n>>>\n>>> # Train the LDA classifier. Use the eigen solver\n>>> lda_eigen = LDA(solver='eigen', n_components=5)\n>>> lda_eigen.fit(X, y)\n>>> assert_equal(lda_eigen.explained_variance_ratio_.shape, (5,))\nAssertionError: Tuples differ: (20,) != (5,)\n\nFirst differing element 0:\n20\n5\n\n- (20,)\n+ (5,)\n```\n\nLooks like we fix either the docs or the code. Which one?\n\nPinging @JPFrancoia.\n\nAddresses an issue in #6031.\n",
      "issue_closed_at": "2016-10-25T12:52:13Z",
      "base_commit": "ee3e61754bd4bb10cea8065993e462fc7b112cb3",
      "changes": [
        {
          "file": "sklearn/discriminant_analysis.py",
          "type": "function",
          "name": "_solve_lsqr",
          "class_name": "LinearDiscriminantAnalysis",
          "code": "def _solve_lsqr(self, X, y, shrinkage):\n        \"\"\"Least squares solver.\n\n        The least squares solver computes a straightforward solution of the\n        optimal decision rule based directly on the discriminant functions. It\n        can only be used for classification (with optional shrinkage), because\n        estimation of eigenvectors is not performed. Therefore, dimensionality\n        reduction with the transform is not supported.\n\n        Parameters\n        ----------\n        X : array-like, shape (n_samples, n_features)\n            Training data.\n\n        y : array-like, shape (n_samples,) or (n_samples, n_classes)\n            Target values.\n\n        shrinkage : string or float, optional\n            Shrinkage parameter, possible values:\n              - None: no shrinkage (default).\n              - 'auto': automatic shrinkage using the Ledoit-Wolf lemma.\n              - float between 0 and 1: fixed shrinkage parameter.\n\n        Notes\n        -----\n        This solver is based on [1]_, section 2.6.2, pp. 39-41.\n\n        References\n        ----------\n        .. [1] R. O. Duda, P. E. Hart, D. G. Stork. Pattern Classification\n           (Second Edition). John Wiley & Sons, Inc., New York, 2001. ISBN\n           0-471-05669-3.\n        \"\"\"\n        self.means_ = _class_means(X, y)\n        self.covariance_ = _class_cov(X, y, self.priors_, shrinkage)\n        self.coef_ = linalg.lstsq(self.covariance_, self.means_.T)[0].T\n        self.intercept_ = (-0.5 * np.diag(np.dot(self.means_, self.coef_.T))\n                           + np.log(self.priors_))"
        },
        {
          "file": "sklearn/discriminant_analysis.py",
          "type": "function",
          "name": "_solve_svd",
          "class_name": "LinearDiscriminantAnalysis",
          "code": "def _solve_svd(self, X, y):\n        \"\"\"SVD solver.\n\n        Parameters\n        ----------\n        X : array-like, shape (n_samples, n_features)\n            Training data.\n\n        y : array-like, shape (n_samples,) or (n_samples, n_targets)\n            Target values.\n        \"\"\"\n        n_samples, n_features = X.shape\n        n_classes = len(self.classes_)\n\n        self.means_ = _class_means(X, y)\n        if self.store_covariance:\n            self.covariance_ = _class_cov(X, y, self.priors_)\n\n        Xc = []\n        for idx, group in enumerate(self.classes_):\n            Xg = X[y == group, :]\n            Xc.append(Xg - self.means_[idx])\n\n        self.xbar_ = np.dot(self.priors_, self.means_)\n\n        Xc = np.concatenate(Xc, axis=0)\n\n        # 1) within (univariate) scaling by with classes std-dev\n        std = Xc.std(axis=0)\n        # avoid division by zero in normalization\n        std[std == 0] = 1.\n        fac = 1. / (n_samples - n_classes)\n\n        # 2) Within variance scaling\n        X = np.sqrt(fac) * (Xc / std)\n        # SVD of centered (within)scaled data\n        U, S, V = linalg.svd(X, full_matrices=False)\n\n        rank = np.sum(S > self.tol)\n        if rank < n_features:\n            warnings.warn(\"Variables are collinear.\")\n        # Scaling of within covariance is: V' 1/S\n        scalings = (V[:rank] / std).T / S[:rank]\n\n        # 3) Between variance scaling\n        # Scale weighted centers\n        X = np.dot(((np.sqrt((n_samples * self.priors_) * fac)) *\n                    (self.means_ - self.xbar_).T).T, scalings)\n        # Centers are living in a space with n_classes-1 dim (maximum)\n        # Use SVD to find projection in the space spanned by the\n        # (n_classes) centers\n        _, S, V = linalg.svd(X, full_matrices=0)\n\n        self.explained_variance_ratio_ = (S**2 / np.sum(\n                S**2))[:self.n_components]\n        rank = np.sum(S > self.tol * S[0])\n        self.scalings_ = np.dot(scalings, V.T[:, :rank])\n        coef = np.dot(self.means_ - self.xbar_, self.scalings_)\n        self.intercept_ = (-0.5 * np.sum(coef ** 2, axis=1)\n                           + np.log(self.priors_))\n        self.coef_ = np.dot(coef, self.scalings_.T)\n        self.intercept_ -= np.dot(self.xbar_, self.coef_.T)"
        },
        {
          "file": "sklearn/discriminant_analysis.py",
          "type": "function",
          "name": "fit",
          "class_name": "QuadraticDiscriminantAnalysis",
          "code": "def fit(self, X, y, store_covariances=None, tol=None):\n        \"\"\"Fit the model according to the given training data and parameters.\n\n            .. versionchanged:: 0.17\n               Deprecated *store_covariance* have been moved to main constructor.\n\n            .. versionchanged:: 0.17\n               Deprecated *tol* have been moved to main constructor.\n\n        Parameters\n        ----------\n        X : array-like, shape = [n_samples, n_features]\n            Training vector, where n_samples in the number of samples and\n            n_features is the number of features.\n\n        y : array, shape = [n_samples]\n            Target values (integers)\n        \"\"\"\n        if store_covariances:\n            warnings.warn(\"The parameter 'store_covariances' is deprecated as \"\n                          \"of version 0.17 and will be removed in 0.19. The \"\n                          \"parameter is no longer necessary because the value \"\n                          \"is set via the estimator initialisation or \"\n                          \"set_params method.\", DeprecationWarning)\n            self.store_covariances = store_covariances\n        if tol:\n            warnings.warn(\"The parameter 'tol' is deprecated as of version \"\n                          \"0.17 and will be removed in 0.19. The parameter is \"\n                          \"no longer necessary because the value is set via \"\n                          \"the estimator initialisation or set_params method.\",\n                          DeprecationWarning)\n            self.tol = tol\n        X, y = check_X_y(X, y)\n        check_classification_targets(y)\n        self.classes_, y = np.unique(y, return_inverse=True)\n        n_samples, n_features = X.shape\n        n_classes = len(self.classes_)\n        if n_classes < 2:\n            raise ValueError('y has less than 2 classes')\n        if self.priors is None:\n            self.priors_ = bincount(y) / float(n_samples)\n        else:\n            self.priors_ = self.priors\n\n        cov = None\n        if self.store_covariances:\n            cov = []\n        means = []\n        scalings = []\n        rotations = []\n        for ind in xrange(n_classes):\n            Xg = X[y == ind, :]\n            meang = Xg.mean(0)\n            means.append(meang)\n            if len(Xg) == 1:\n                raise ValueError('y has only 1 sample in class %s, covariance '\n                                 'is ill defined.' % str(self.classes_[ind]))\n            Xgc = Xg - meang\n            # Xgc = U * S * V.T\n            U, S, Vt = np.linalg.svd(Xgc, full_matrices=False)\n            rank = np.sum(S > self.tol)\n            if rank < n_features:\n                warnings.warn(\"Variables are collinear\")\n            S2 = (S ** 2) / (len(Xg) - 1)\n            S2 = ((1 - self.reg_param) * S2) + self.reg_param\n            if self.store_covariances:\n                # cov = V * (S^2 / (n-1)) * V.T\n                cov.append(np.dot(S2 * Vt.T, Vt))\n            scalings.append(S2)\n            rotations.append(Vt.T)\n        if self.store_covariances:\n            self.covariances_ = cov\n        self.means_ = np.asarray(means)\n        self.scalings_ = scalings\n        self.rotations_ = rotations\n        return self"
        },
        {
          "file": "sklearn/discriminant_analysis.py",
          "type": "function",
          "name": "transform",
          "class_name": "LinearDiscriminantAnalysis",
          "code": "def transform(self, X):\n        \"\"\"Project data to maximize class separation.\n\n        Parameters\n        ----------\n        X : array-like, shape (n_samples, n_features)\n            Input data.\n\n        Returns\n        -------\n        X_new : array, shape (n_samples, n_components)\n            Transformed data.\n        \"\"\"\n        if self.solver == 'lsqr':\n            raise NotImplementedError(\"transform not implemented for 'lsqr' \"\n                                      \"solver (use 'svd' or 'eigen').\")\n        check_is_fitted(self, ['xbar_', 'scalings_'], all_or_any=any)\n\n        X = check_array(X)\n        if self.solver == 'svd':\n            X_new = np.dot(X - self.xbar_, self.scalings_)\n        elif self.solver == 'eigen':\n            X_new = np.dot(X, self.scalings_)\n        n_components = X.shape[1] if self.n_components is None \\\n            else self.n_components\n        return X_new[:, :n_components]"
        }
      ]
    },
    {
      "pr_number": 8936,
      "pr_title": "[MRG+1] fixed OOB_Score bug for bagging classifiers.",
      "pr_body": "Fixes #8933\r\n\r\n<!--\r\nThanks for contributing a pull request! Please ensure you have taken a look at\r\nthe contribution guidelines: https://github.com/scikit-learn/scikit-learn/blob/master/CONTRIBUTING.md#Contributing-Pull-Requests\r\n-->\r\n#### Reference Issue\r\n<!-- Example: Fixes #1234 -->\r\n\r\n\r\n#### What does this implement/fix? Explain your changes.\r\n\r\n\r\n#### Any other comments?\r\n\r\n\r\n<!--\r\nPlease be aware that we are a loose team of volunteers so patience is\r\nnecessary; assistance handling other issues is very welcome. We value\r\nall user contributions, no matter how minor they are. If we are slow to\r\nreview, either the pull request needs some benchmarking, tinkering,\r\nconvincing, etc. or more likely the reviewers are simply busy. In either\r\ncase, we ask for your understanding during the review process.\r\nFor more information, see our FAQ on this topic:\r\nhttp://scikit-learn.org/dev/faq.html#why-is-my-pull-request-not-getting-any-attention.\r\n\r\nThanks for contributing!\r\n-->\r\n",
      "issue_id": 8933,
      "issue_title": "BUG: BaggingClassifier.oob_score_ should not change with class label",
      "issue_body": "Let us compute the oob score of a bagged classifier.\r\n\r\n```python\r\nimport numpy as np\r\nimport pandas as pd\r\nfrom sklearn.ensemble import BaggingClassifier\r\nfrom sklearn.neighbors import KNeighborsClassifier\r\n\r\nN = 50\r\nrandState = 5\r\nlabel = 'Label'\r\nfeatures = ['A','B','C']\r\n\r\nlabels = np.random.randint(3, size = N) - 1\r\ndf = pd.DataFrame( labels , index=range(N), columns=[label] )\r\nfor col in features:\r\n    df[col] = df[label] + 0.01 * np.random.rand( N )\r\n\r\nclf = BaggingClassifier(base_estimator = KNeighborsClassifier(), n_estimators = 10, oob_score = True, random_state = randState )\r\nclf.fit(df[features], df[label])\r\nprint clf.oob_score_\r\n```\r\n\r\nHere, clf.oob_score_=0.0.\r\n\r\nNow, you would not expect that the OOB accuracy is a function of the class labels...\r\n\r\n```python\r\ndf.loc[ df[label] == -1 , label ] = 2\r\nclf = BaggingClassifier(base_estimator = KNeighborsClassifier(), n_estimators = 10, oob_score = True, random_state = randState )\r\nclf.fit(df[features], df[label])\r\nprint clf.oob_score_\r\n```\r\n\r\nNow, clf.oob_score_=1.0.\r\n\r\nClearly, OOB score should not be a function of the labels arbitrarily chosen for the classes.\r\n\r\nsklearn.__version__: '0.18.1'\r\nnumpy.__version__: '1.11.3'",
      "issue_closed_at": "2017-06-08T09:35:49Z",
      "base_commit": "9131f89e6c165fb27dadd37d3168c1ee5ea84f5a",
      "changes": [
        {
          "file": "sklearn/ensemble/bagging.py",
          "type": "function",
          "name": "_set_oob_score",
          "class_name": "BaggingRegressor",
          "code": "def _set_oob_score(self, X, y):\n        n_samples = y.shape[0]\n\n        predictions = np.zeros((n_samples,))\n        n_predictions = np.zeros((n_samples,))\n\n        for estimator, samples, features in zip(self.estimators_,\n                                                self.estimators_samples_,\n                                                self.estimators_features_):\n            # Create mask for OOB samples\n            mask = ~samples\n\n            predictions[mask] += estimator.predict((X[mask, :])[:, features])\n            n_predictions[mask] += 1\n\n        if (n_predictions == 0).any():\n            warn(\"Some inputs do not have OOB scores. \"\n                 \"This probably means too few estimators were used \"\n                 \"to compute any reliable oob estimates.\")\n            n_predictions[n_predictions == 0] = 1\n\n        predictions /= n_predictions\n\n        self.oob_prediction_ = predictions\n        self.oob_score_ = r2_score(y, predictions)"
        }
      ]
    },
    {
      "pr_number": 4146,
      "pr_title": "[MRG + 1] Fdr treshold bug",
      "pr_body": "Continues #2932. Fixes #2771.\nThese are some minor fixes on top of #2932, where @bthirion already gave his +1.\nMaybe @arjoly wants to have a look as he commented there.\nThis is a good bug fix that I think we should include asap.\n\nFYI tests take .5s.\n",
      "issue_id": 2771,
      "issue_title": "SelectFdr has serious thresholding bug",
      "issue_body": "The current code reads like:\n\n```\ndef _get_support_mask(self):\n    alpha = self.alpha\n    sv = np.sort(self.pvalues_)\n    threshold = sv[sv < alpha * np.arange(len(self.pvalues_))].max()\n    return self.pvalues_ <= threshold\n```\n\nBut this doesn't actually control FDR at all, the correct implementation should have:\n\n```\n    bf_alpha = alpha / len(self.pvalues_)\n    threshold = sv[sv < bf_alpha * np.arange(len(self.pvalues_))].max()\n```\n\nNote the k / m term in the equation at:\nhttp://en.wikipedia.org/wiki/False_discovery_rate#Benjamini.E2.80.93Hochberg_procedure\n",
      "issue_closed_at": "2015-02-24T22:15:19Z",
      "base_commit": "f6af4881a5a66fb21688379b39f9304898a11bc0",
      "changes": [
        {
          "file": "sklearn/feature_selection/univariate_selection.py",
          "type": "function",
          "name": "_get_support_mask",
          "class_name": "GenericUnivariateSelect",
          "code": "def _get_support_mask(self):\n        check_is_fitted(self, 'scores_')\n\n        selector = self._make_selector()\n        selector.pvalues_ = self.pvalues_\n        selector.scores_ = self.scores_\n        return selector._get_support_mask()"
        },
        {
          "file": "sklearn/feature_selection/univariate_selection.py",
          "type": "class",
          "name": "SelectFdr",
          "code": "class SelectFdr(_BaseFilter):\n    \"\"\"Filter: Select the p-values for an estimated false discovery rate\n\n    This uses the Benjamini-Hochberg procedure. ``alpha`` is the target false\n    discovery rate.\n\n    Parameters\n    ----------\n    score_func : callable\n        Function taking two arrays X and y, and returning a pair of arrays\n        (scores, pvalues).\n\n    alpha : float, optional\n        The highest uncorrected p-value for features to keep.\n\n\n    Attributes\n    ----------\n    scores_ : array-like, shape=(n_features,)\n        Scores of features.\n\n    pvalues_ : array-like, shape=(n_features,)\n        p-values of feature scores.\n    \"\"\"\n\n    def __init__(self, score_func=f_classif, alpha=5e-2):\n        super(SelectFdr, self).__init__(score_func)\n        self.alpha = alpha\n\n    def _get_support_mask(self):\n        check_is_fitted(self, 'scores_')\n\n        alpha = self.alpha\n        sv = np.sort(self.pvalues_)\n        selected = sv[sv < alpha * np.arange(len(self.pvalues_))]\n        if selected.size == 0:\n            return np.zeros_like(self.pvalues_, dtype=bool)\n        return self.pvalues_ <= selected.max()"
        },
        {
          "file": "sklearn/feature_selection/univariate_selection.py",
          "type": "function",
          "name": "__init__",
          "class_name": "GenericUnivariateSelect",
          "code": "def __init__(self, score_func=f_classif, mode='percentile', param=1e-5):\n        super(GenericUnivariateSelect, self).__init__(score_func)\n        self.mode = mode\n        self.param = param"
        }
      ]
    },
    {
      "pr_number": 9569,
      "pr_title": "[MRG+2] remove modification of warning registry for no reason",
      "pr_body": "Fixes #9560. Fixes #2755. Fixes #7346.",
      "issue_id": 7346,
      "issue_title": "Pop from empty list coming from get_params()",
      "issue_body": "<!--\nIf your issue is a usage question, submit it here instead:\n- StackOverflow with the scikit-learn tag: http://stackoverflow.com/questions/tagged/scikit-learn\n- Mailing List: https://mail.python.org/mailman/listinfo/scikit-learn\nFor more information, see User Questions: http://scikit-learn.org/stable/support.html#user-questions\n-->\n\n<!-- Instructions For Filing a Bug: https://github.com/scikit-learn/scikit-learn/blob/master/CONTRIBUTING.md#filing-bugs -->\n#### Description\n\n I am getting a pop from empty list error from the warnings.filers.pop(0) call in get_params(). I am using Dask to parallelize the computation of fitting a bunch of MeanShift objects. I only get this error on one machine (a remote linux machine), but it works fine on my home compute (running ubuntu 14) \n#### Steps/Code to Reproduce\n\n<!--\n\n-->\n#### Expected Results\n\nShould just fit the MeanShifts and move on\n#### Actual Results\n\nTraceback (most recent call last):\n  File \"tda_profile.py\", line 34, in <module>\n    _tda.fit(train_features, train_targets)\n  File \"/home/ben/tda/tda_parallel_test.py\", line 652, in fit\n    fits = fits.compute()\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/base.py\", line 86, in compute\n    return compute(self, *_kwargs)[0]\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/base.py\", line 179, in compute\n    results = get(dsk, keys, *_kwargs)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/threaded.py\", line 57, in get\n    **kwargs)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/async.py\", line 484, in get_async\n    raise(remote_exception(res, tb))\ndask.async.IndexError: pop from empty list\n## Traceback\n\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/async.py\", line 267, in execute_task\n    result = _execute_task(task, data)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/dask/async.py\", line 249, in _execute_task\n    return func(*args2)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/cluster/mean_shift_.py\", line 391, in fit\n    cluster_all=self.cluster_all, n_jobs=self.n_jobs)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/cluster/mean_shift_.py\", line 191, in mean_shift\n    (seed, X, nbrs, max_iter) for seed in seeds)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 800, in **call**\n    while self.dispatch_one_batch(iterator):\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 658, in dispatch_one_batch\n    self._dispatch(tasks)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 566, in _dispatch\n    job = ImmediateComputeBatch(batch)\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 180, in __init__\n    self.results = batch()\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 72, in **call**\n    return [func(_args, *_kwargs) for func, args, kwargs in self.items]\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py\", line 72, in <listcomp>\n    return [func(_args, *_kwargs) for func, args, kwargs in self.items]\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/cluster/mean_shift_.py\", line 75, in _mean_shift_single_seed\n    bandwidth = nbrs.get_params()['radius']\n  File \"/home/ben/anaconda3/lib/python3.5/site-packages/sklearn/base.py\", line 227, in get_params\n    warnings.filters.pop(0)\n#### Versions\n\n> > > import platform; print(platform.platform())\n> > > Linux-3.10.0-327.el7.x86_64-x86_64-with-centos-7.2.1511-Core\n> > > import sys; print(\"Python\", sys.version)\n> > > Python 3.5.2 |Anaconda 4.1.1 (64-bit)| (default, Jul  2 2016, 17:53:06) \n> > > [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]\n> > > import numpy; print(\"NumPy\", numpy.**version**)\n> > > NumPy 1.11.1\n> > > import scipy; print(\"SciPy\", scipy.**version**)\n> > > SciPy 0.17.1\n> > > import sklearn; print(\"Scikit-Learn\", sklearn.**version**)\n> > > Scikit-Learn 0.17.1\n\n<!-- Thanks for contributing! -->\n",
      "issue_closed_at": "2017-09-08T15:29:37Z",
      "base_commit": "e1fb03c86d2a2c47ef008ead958e1bc10fb06e77",
      "changes": [
        {
          "file": "sklearn/base.py",
          "type": "function",
          "name": "get_params",
          "class_name": "BaseEstimator",
          "code": "def get_params(self, deep=True):\n        \"\"\"Get parameters for this estimator.\n\n        Parameters\n        ----------\n        deep : boolean, optional\n            If True, will return the parameters for this estimator and\n            contained subobjects that are estimators.\n\n        Returns\n        -------\n        params : mapping of string to any\n            Parameter names mapped to their values.\n        \"\"\"\n        out = dict()\n        for key in self._get_param_names():\n            # We need deprecation warnings to always be on in order to\n            # catch deprecated param values.\n            # This is set in utils/__init__.py but it gets overwritten\n            # when running under python3 somehow.\n            warnings.simplefilter(\"always\", DeprecationWarning)\n            try:\n                with warnings.catch_warnings(record=True) as w:\n                    value = getattr(self, key, None)\n                if len(w) and w[0].category == DeprecationWarning:\n                    # if the parameter is deprecated, don't show it\n                    continue\n            finally:\n                warnings.filters.pop(0)\n\n            # XXX: should we rather test if instance of estimator?\n            if deep and hasattr(value, 'get_params'):\n                deep_items = value.get_params().items()\n                out.update((key + '__' + k, val) for k, val in deep_items)\n            out[key] = value\n        return out"
        },
        {
          "file": "sklearn/base.py",
          "type": "function",
          "name": "__setstate__",
          "class_name": "BaseEstimator",
          "code": "def __setstate__(self, state):\n        if type(self).__module__.startswith('sklearn.'):\n            pickle_version = state.pop(\"_sklearn_version\", \"pre-0.18\")\n            if pickle_version != __version__:\n                warnings.warn(\n                    \"Trying to unpickle estimator {0} from version {1} when \"\n                    \"using version {2}. This might lead to breaking code or \"\n                    \"invalid results. Use at your own risk.\".format(\n                        self.__class__.__name__, pickle_version, __version__),\n                    UserWarning)\n        try:\n            super(BaseEstimator, self).__setstate__(state)\n        except AttributeError:\n            self.__dict__.update(state)"
        }
      ]
    },
    {
      "pr_number": 8348,
      "pr_title": "[MRG+1] Bug in BaseSearchCV.inverse_transform",
      "pr_body": "<!--\r\nThanks for contributing a pull request! Please ensure you have taken a look at\r\nthe contribution guidelines: https://github.com/scikit-learn/scikit-learn/blob/master/CONTRIBUTING.md#Contributing-Pull-Requests\r\n-->\r\n#### Reference Issue\r\n<!-- Example: Fixes #1234 -->\r\nFixes #8344 \r\n\r\n#### What does this implement/fix? Explain your changes.\r\nCode for inverse transform function in BaseSearchCV was written incorrect, I have changed it from `.transform(Xt)` to `.inverse_transform(Xt)` \r\n\r\n#### Any other comments?\r\n\r\n\r\n<!--\r\nPlease be aware that we are a loose team of volunteers so patience is\r\nnecessary; assistance handling other issues is very welcome. We value\r\nall user contributions, no matter how minor they are. If we are slow to\r\nreview, either the pull request needs some benchmarking, tinkering,\r\nconvincing, etc. or more likely the reviewers are simply busy. In either\r\ncase, we ask for your understanding during the review process.\r\nFor more information, see our FAQ on this topic:\r\nhttp://scikit-learn.org/dev/faq.html#why-is-my-pull-request-not-getting-any-attention.\r\n\r\nThanks for contributing!\r\n-->\r\n",
      "issue_id": 8344,
      "issue_title": "Bug in BaseSearchCV.inverse_transform",
      "issue_body": "The [delegating code](https://github.com/scikit-learn/scikit-learn/blob/e5ceda88f2a24b3dd4f9a94404828f982cdf52ad/sklearn/utils/validation.py#L650) for `inverse_transform` is\r\n\r\n```python\r\n    def inverse_transform(self, Xt):\r\n        self._check_is_fitted('inverse_transform')\r\n        return self.best_estimator_.transform(Xt)\r\n```\r\n\r\nUnless I'm mistaken, this should be `.inverse_transform(Xt)`",
      "issue_closed_at": "2017-02-17T16:12:52Z",
      "base_commit": "8694278c027d1017670e67cd3298fc5fd627d4c9",
      "changes": [
        {
          "file": "sklearn/model_selection/_search.py",
          "type": "function",
          "name": "inverse_transform",
          "class_name": "BaseSearchCV",
          "code": "def inverse_transform(self, Xt):\n        \"\"\"Call inverse_transform on the estimator with the best found params.\n\n        Only available if the underlying estimator implements\n        ``inverse_transform`` and ``refit=True``.\n\n        Parameters\n        -----------\n        Xt : indexable, length n_samples\n            Must fulfill the input assumptions of the\n            underlying estimator.\n\n        \"\"\"\n        self._check_is_fitted('inverse_transform')\n        return self.best_estimator_.transform(Xt)"
        }
      ]
    }
  ]
}