{
  "Selected_candidate": {
    "pr_number": 3028,
    "pr_title": "Add \"errors\" keyword argument to drop() and drop_dims() (#2994)",
    "pr_body": "<!-- Feel free to remove check-list items aren't relevant to your change -->\r\n\r\n - [x] Closes #2994 \r\n - [x] Tests added\r\n - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API\r\n\r\nThis addresses #2994 by adding an \"errors\" keyword argument to `Dataset.drop()`, `Dataset.drop_dims()`, and `DataArray.drop()`. \r\n\r\nI stuck with pandas' convention of using either `errors='raise'`, now the default that maintains previous behavior by raising an error if any passed label is not found in the dataset/array, or `errors='ignore'` in which case any missing labels are silently ignored. \r\n\r\nThis seems like a pretty straightforward change; mainly it is just skipping checks for missing labels when `errors == 'ignore'` and passing the errors keyword over to the pandas method when using `index.drop()`. Hopefully there are no subtleties that I've missed. \r\n\r\nI added documentation to the appropriate methods, although I have been struggling to build the docs locally and am unsure if they look right.\r\n\r\nAlso this is my first attempt to contribute to any project, so suggestions and feedback are welcome. ",
    "issue_id": 2994,
    "issue_title": "xr.Dataset.drop",
    "issue_body": "Currently, `drop` throws an error if one of the labels doesn't exist. It would be nice to have a parameter in the drop method for optionally ignoring errors like in the pandas.DataFrame.\r\nFrom the pandas [documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop.html):\r\n\r\n> errors : {‘ignore’, ‘raise’}, default ‘raise’\r\n>     If ‘ignore’, suppress error and only existing labels are dropped.\r\n",
    "issue_closed_at": "2019-06-20T15:48:00Z",
    "base_commit": "c2a2a6efcaf2d279c78da4ba3a87ea96afe78be0",
    "changes": [
      {
        "file": "xarray/core/dataarray.py",
        "type": "function",
        "name": "transpose",
        "class_name": "DataArray",
        "code": "def transpose(self, *dims, transpose_coords=None) -> 'DataArray':\n        \"\"\"Return a new DataArray object with transposed dimensions.\n\n        Parameters\n        ----------\n        *dims : str, optional\n            By default, reverse the dimensions. Otherwise, reorder the\n            dimensions to this order.\n        transpose_coords : boolean, optional\n            If True, also transpose the coordinates of this DataArray.\n\n        Returns\n        -------\n        transposed : DataArray\n            The returned DataArray's array is transposed.\n\n        Notes\n        -----\n        This operation returns a view of this array's data. It is\n        lazy for dask-backed DataArrays but not for numpy-backed DataArrays\n        -- the data will be fully loaded.\n\n        See Also\n        --------\n        numpy.transpose\n        Dataset.transpose\n        \"\"\"\n        if dims:\n            if set(dims) ^ set(self.dims):\n                raise ValueError('arguments to transpose (%s) must be '\n                                 'permuted array dimensions (%s)'\n                                 % (dims, tuple(self.dims)))\n\n        variable = self.variable.transpose(*dims)\n        if transpose_coords:\n            coords = {}\n            for name, coord in self.coords.items():\n                coord_dims = tuple(dim for dim in dims if dim in coord.dims)\n                coords[name] = coord.variable.transpose(*coord_dims)\n            return self._replace(variable, coords)\n        else:\n            if transpose_coords is None \\\n                    and any(self[c].ndim > 1 for c in self.coords):\n                warnings.warn('This DataArray contains multi-dimensional '\n                              'coordinates. In the future, these coordinates '\n                              'will be transposed as well unless you specify '\n                              'transpose_coords=False.',\n                              FutureWarning, stacklevel=2)\n            return self._replace(variable)"
      },
      {
        "file": "xarray/core/dataarray.py",
        "type": "function",
        "name": "drop",
        "class_name": "DataArray",
        "code": "def drop(self, labels, dim=None):\n        \"\"\"Drop coordinates or index labels from this DataArray.\n\n        Parameters\n        ----------\n        labels : scalar or list of scalars\n            Name(s) of coordinate variables or index labels to drop.\n        dim : str, optional\n            Dimension along which to drop index labels. By default (if\n            ``dim is None``), drops coordinates rather than index labels.\n\n        Returns\n        -------\n        dropped : DataArray\n        \"\"\"\n        if utils.is_scalar(labels):\n            labels = [labels]\n        ds = self._to_temp_dataset().drop(labels, dim)\n        return self._from_temp_dataset(ds)"
      },
      {
        "file": "xarray/core/dataset.py",
        "type": "function",
        "name": "_assert_all_in_dataset",
        "class_name": "Dataset",
        "code": "def _assert_all_in_dataset(self, names, virtual_okay=False):\n        bad_names = set(names) - set(self._variables)\n        if virtual_okay:\n            bad_names -= self.virtual_variables\n        if bad_names:\n            raise ValueError('One or more of the specified variables '\n                             'cannot be found in this dataset')"
      },
      {
        "file": "xarray/core/dataset.py",
        "type": "function",
        "name": "drop",
        "class_name": "Dataset",
        "code": "def drop(self, labels, dim=None):\n        \"\"\"Drop variables or index labels from this dataset.\n\n        Parameters\n        ----------\n        labels : scalar or list of scalars\n            Name(s) of variables or index labels to drop.\n        dim : None or str, optional\n            Dimension along which to drop index labels. By default (if\n            ``dim is None``), drops variables rather than index labels.\n\n        Returns\n        -------\n        dropped : Dataset\n        \"\"\"\n        if utils.is_scalar(labels):\n            labels = [labels]\n        if dim is None:\n            return self._drop_vars(labels)\n        else:\n            try:\n                index = self.indexes[dim]\n            except KeyError:\n                raise ValueError(\n                    'dimension %r does not have coordinate labels' % dim)\n            new_index = index.drop(labels)\n            return self.loc[{dim: new_index}]"
      },
      {
        "file": "xarray/core/dataset.py",
        "type": "function",
        "name": "drop_dims",
        "class_name": "Dataset",
        "code": "def drop_dims(self, drop_dims):\n        \"\"\"Drop dimensions and associated variables from this dataset.\n\n        Parameters\n        ----------\n        drop_dims : str or list\n            Dimension or dimensions to drop.\n\n        Returns\n        -------\n        obj : Dataset\n            The dataset without the given dimensions (or any variables\n            containing those dimensions)\n        \"\"\"\n        if utils.is_scalar(drop_dims):\n            drop_dims = [drop_dims]\n\n        missing_dimensions = [d for d in drop_dims if d not in self.dims]\n        if missing_dimensions:\n            raise ValueError('Dataset does not contain the dimensions: %s'\n                             % missing_dimensions)\n\n        drop_vars = set(k for k, v in self._variables.items()\n                        for d in v.dims if d in drop_dims)\n\n        variables = OrderedDict((k, v) for k, v in self._variables.items()\n                                if k not in drop_vars)\n        coord_names = set(k for k in self._coord_names if k in variables)\n\n        return self._replace_with_new_dims(variables, coord_names)"
      }
    ]
  },
  "Justification": "Candidate E is most helpful because it directly involves modifications to the `Dataset` class, which is central to the CURRENT bug report regarding enhancements to displaying units in datasets. The structural and module/component similarity is high, as both involve the `xarray.Dataset` and methods closely related to its functionality. While the primary symptom of the error differs, the underlying operations within the dataset representations share commonalities, especially regarding the handling of attributes. Moreover, the fix described in Candidate E, which introduces an option to handle errors when dropping dimensions, suggests thoughtful modifications in the dataset's management and error handling that could inspire similar approaches in handling and displaying units in the CURRENT issue.",
  "instance_id": "pydata__xarray-4248",
  "repo": "pydata/xarray",
  "created_at": "2020-07-22T14:54:03Z",
  "problem_statement": "Feature request: show units in dataset overview\nHere's a hypothetical dataset:\r\n\r\n```\r\n<xarray.Dataset>\r\nDimensions:  (time: 3, x: 988, y: 822)\r\nCoordinates:\r\n  * x         (x) float64 ...\r\n  * y         (y) float64 ...\r\n  * time      (time) datetime64[ns] ...\r\nData variables:\r\n    rainfall  (time, y, x) float32 ...\r\n    max_temp  (time, y, x) float32 ...\r\n```\r\n\r\nIt would be really nice if the units of the coordinates and of the data variables were shown in the `Dataset` repr, for example as:\r\n\r\n```\r\n<xarray.Dataset>\r\nDimensions:  (time: 3, x: 988, y: 822)\r\nCoordinates:\r\n  * x, in metres         (x)            float64 ...\r\n  * y, in metres         (y)            float64 ...\r\n  * time                 (time)         datetime64[ns] ...\r\nData variables:\r\n    rainfall, in mm      (time, y, x)   float32 ...\r\n    max_temp, in deg C   (time, y, x)   float32 ...\r\n```\n",
  "patch": "diff --git a/xarray/core/formatting.py b/xarray/core/formatting.py\n--- a/xarray/core/formatting.py\n+++ b/xarray/core/formatting.py\n@@ -261,6 +261,8 @@ def inline_variable_array_repr(var, max_width):\n         return inline_dask_repr(var.data)\n     elif isinstance(var._data, sparse_array_type):\n         return inline_sparse_repr(var.data)\n+    elif hasattr(var._data, \"_repr_inline_\"):\n+        return var._data._repr_inline_(max_width)\n     elif hasattr(var._data, \"__array_function__\"):\n         return maybe_truncate(repr(var._data).replace(\"\\n\", \" \"), max_width)\n     else:\n"
}