{
  "Selected_candidate": {
    "pr_number": 16315,
    "pr_title": "Fixed #34171 -- Fixed QuerySet.bulk_create() on fields with db_column in unique_fields/update_fields.",
    "pr_body": "[Ticket 34171](https://code.djangoproject.com/ticket/34171)\r\nI think we don't need separate test for this, just adding a ```db_column``` to one of the models in ```bulk_create``` should do the work.",
    "issue_id": 34171,
    "issue_title": "QuerySet.bulk_create() crashes on mixed case columns in unique_fields/update_fields.",
    "issue_body": "Not sure exactly how to phrase this, but when I I'm calling\nbulk_update\non the manager for a class with\ndb_column\nset on fields the SQL is invalid. Ellipses indicate other fields excluded for clarity.\nclass ActivityBlackListed(models.Model):\n    \"\"\"\n    Originally sourced from Activity_BlackListed in /home/josh/PNDS_Interim_MIS-Data.accdb (13 records)\n    \"\"\"\n\n    class Meta:\n        db_table = \"Activity_BlackListed\"\n\n    blacklistid = models.IntegerField(primary_key=True, db_column=\"BlacklistID\")\n    sectorid = models.IntegerField(null=True, blank=True, db_column=\"SectorID\")\n    ...\nqs.bulk_create(instances, update_conflicts=True, update_fields=[\"sectorid\", ...], unique_fields=[\"blacklistid\"])\nThe \"INSERT\" code does take into account the db_columns\nINSERT INTO \"Activity_BlackListed\" (\"BlacklistID\",...) VALUES (%s,  ...),\nThe code which is generated for \"ON CONFLICT\" uses the field name and not the db_column which leads to a syntax error\n'ON CONFLICT(\"blacklistid\") DO UPDATE SET \"sectorid\" = EXCLUDED.\"sectorid\", ...\nPostgreSQL returns\nERROR:  column \"blacklistid\" does not exist at character 1508\nWhat should be generated is I think:\n'ON CONFLICT(\"BlacklistID\") DO UPDATE SET \"SectorID\" = EXCLUDED.\"SectorID\", ...",
    "issue_closed_at": "2022-11-22T13:05:04",
    "base_commit": "7d5329852f19c6ae78c6f6f3d3e41835377bf295",
    "changes": [
      {
        "file": "django/db/models/query.py",
        "type": "function",
        "name": "_check_bulk_create_options",
        "class_name": "QuerySet",
        "code": "def _check_bulk_create_options(\n        self, ignore_conflicts, update_conflicts, update_fields, unique_fields\n    ):\n        if ignore_conflicts and update_conflicts:\n            raise ValueError(\n                \"ignore_conflicts and update_conflicts are mutually exclusive.\"\n            )\n        db_features = connections[self.db].features\n        if ignore_conflicts:\n            if not db_features.supports_ignore_conflicts:\n                raise NotSupportedError(\n                    \"This database backend does not support ignoring conflicts.\"\n                )\n            return OnConflict.IGNORE\n        elif update_conflicts:\n            if not db_features.supports_update_conflicts:\n                raise NotSupportedError(\n                    \"This database backend does not support updating conflicts.\"\n                )\n            if not update_fields:\n                raise ValueError(\n                    \"Fields that will be updated when a row insertion fails \"\n                    \"on conflicts must be provided.\"\n                )\n            if unique_fields and not db_features.supports_update_conflicts_with_target:\n                raise NotSupportedError(\n                    \"This database backend does not support updating \"\n                    \"conflicts with specifying unique fields that can trigger \"\n                    \"the upsert.\"\n                )\n            if not unique_fields and db_features.supports_update_conflicts_with_target:\n                raise ValueError(\n                    \"Unique fields that can trigger the upsert must be provided.\"\n                )\n            # Updating primary keys and non-concrete fields is forbidden.\n            update_fields = [self.model._meta.get_field(name) for name in update_fields]\n            if any(not f.concrete or f.many_to_many for f in update_fields):\n                raise ValueError(\n                    \"bulk_create() can only be used with concrete fields in \"\n                    \"update_fields.\"\n                )\n            if any(f.primary_key for f in update_fields):\n                raise ValueError(\n                    \"bulk_create() cannot be used with primary keys in \"\n                    \"update_fields.\"\n                )\n            if unique_fields:\n                unique_fields = [\n                    self.model._meta.get_field(name) for name in unique_fields\n                ]\n                if any(not f.concrete or f.many_to_many for f in unique_fields):\n                    raise ValueError(\n                        \"bulk_create() can only be used with concrete fields \"\n                        \"in unique_fields.\"\n                    )\n            return OnConflict.UPDATE\n        return None"
      },
      {
        "file": "django/db/models/query.py",
        "type": "function",
        "name": "_check_bulk_create_options",
        "class_name": "QuerySet",
        "code": "def _check_bulk_create_options(\n        self, ignore_conflicts, update_conflicts, update_fields, unique_fields\n    ):\n        if ignore_conflicts and update_conflicts:\n            raise ValueError(\n                \"ignore_conflicts and update_conflicts are mutually exclusive.\"\n            )\n        db_features = connections[self.db].features\n        if ignore_conflicts:\n            if not db_features.supports_ignore_conflicts:\n                raise NotSupportedError(\n                    \"This database backend does not support ignoring conflicts.\"\n                )\n            return OnConflict.IGNORE\n        elif update_conflicts:\n            if not db_features.supports_update_conflicts:\n                raise NotSupportedError(\n                    \"This database backend does not support updating conflicts.\"\n                )\n            if not update_fields:\n                raise ValueError(\n                    \"Fields that will be updated when a row insertion fails \"\n                    \"on conflicts must be provided.\"\n                )\n            if unique_fields and not db_features.supports_update_conflicts_with_target:\n                raise NotSupportedError(\n                    \"This database backend does not support updating \"\n                    \"conflicts with specifying unique fields that can trigger \"\n                    \"the upsert.\"\n                )\n            if not unique_fields and db_features.supports_update_conflicts_with_target:\n                raise ValueError(\n                    \"Unique fields that can trigger the upsert must be provided.\"\n                )\n            # Updating primary keys and non-concrete fields is forbidden.\n            update_fields = [self.model._meta.get_field(name) for name in update_fields]\n            if any(not f.concrete or f.many_to_many for f in update_fields):\n                raise ValueError(\n                    \"bulk_create() can only be used with concrete fields in \"\n                    \"update_fields.\"\n                )\n            if any(f.primary_key for f in update_fields):\n                raise ValueError(\n                    \"bulk_create() cannot be used with primary keys in \"\n                    \"update_fields.\"\n                )\n            if unique_fields:\n                unique_fields = [\n                    self.model._meta.get_field(name) for name in unique_fields\n                ]\n                if any(not f.concrete or f.many_to_many for f in unique_fields):\n                    raise ValueError(\n                        \"bulk_create() can only be used with concrete fields \"\n                        \"in unique_fields.\"\n                    )\n            return OnConflict.UPDATE\n        return None"
      },
      {
        "file": "django/db/models/query.py",
        "type": "function",
        "name": "bulk_create",
        "class_name": "QuerySet",
        "code": "def bulk_create(\n        self,\n        objs,\n        batch_size=None,\n        ignore_conflicts=False,\n        update_conflicts=False,\n        update_fields=None,\n        unique_fields=None,\n    ):\n        \"\"\"\n        Insert each of the instances into the database. Do *not* call\n        save() on each of the instances, do not send any pre/post_save\n        signals, and do not set the primary key attribute if it is an\n        autoincrement field (except if features.can_return_rows_from_bulk_insert=True).\n        Multi-table models are not supported.\n        \"\"\"\n        # When you bulk insert you don't get the primary keys back (if it's an\n        # autoincrement, except if can_return_rows_from_bulk_insert=True), so\n        # you can't insert into the child tables which references this. There\n        # are two workarounds:\n        # 1) This could be implemented if you didn't have an autoincrement pk\n        # 2) You could do it by doing O(n) normal inserts into the parent\n        #    tables to get the primary keys back and then doing a single bulk\n        #    insert into the childmost table.\n        # We currently set the primary keys on the objects when using\n        # PostgreSQL via the RETURNING ID clause. It should be possible for\n        # Oracle as well, but the semantics for extracting the primary keys is\n        # trickier so it's not done yet.\n        if batch_size is not None and batch_size <= 0:\n            raise ValueError(\"Batch size must be a positive integer.\")\n        # Check that the parents share the same concrete model with the our\n        # model to detect the inheritance pattern ConcreteGrandParent ->\n        # MultiTableParent -> ProxyChild. Simply checking self.model._meta.proxy\n        # would not identify that case as involving multiple tables.\n        for parent in self.model._meta.get_parent_list():\n            if parent._meta.concrete_model is not self.model._meta.concrete_model:\n                raise ValueError(\"Can't bulk create a multi-table inherited model\")\n        if not objs:\n            return objs\n        opts = self.model._meta\n        if unique_fields:\n            # Primary key is allowed in unique_fields.\n            unique_fields = [\n                opts.pk.name if name == \"pk\" else name for name in unique_fields\n            ]\n        on_conflict = self._check_bulk_create_options(\n            ignore_conflicts,\n            update_conflicts,\n            update_fields,\n            unique_fields,\n        )\n        self._for_write = True\n        fields = opts.concrete_fields\n        objs = list(objs)\n        self._prepare_for_bulk_create(objs)\n        with transaction.atomic(using=self.db, savepoint=False):\n            objs_with_pk, objs_without_pk = partition(lambda o: o.pk is None, objs)\n            if objs_with_pk:\n                returned_columns = self._batched_insert(\n                    objs_with_pk,\n                    fields,\n                    batch_size,\n                    on_conflict=on_conflict,\n                    update_fields=update_fields,\n                    unique_fields=unique_fields,\n                )\n                for obj_with_pk, results in zip(objs_with_pk, returned_columns):\n                    for result, field in zip(results, opts.db_returning_fields):\n                        if field != opts.pk:\n                            setattr(obj_with_pk, field.attname, result)\n                for obj_with_pk in objs_with_pk:\n                    obj_with_pk._state.adding = False\n                    obj_with_pk._state.db = self.db\n            if objs_without_pk:\n                fields = [f for f in fields if not isinstance(f, AutoField)]\n                returned_columns = self._batched_insert(\n                    objs_without_pk,\n                    fields,\n                    batch_size,\n                    on_conflict=on_conflict,\n                    update_fields=update_fields,\n                    unique_fields=unique_fields,\n                )\n                connection = connections[self.db]\n                if (\n                    connection.features.can_return_rows_from_bulk_insert\n                    and on_conflict is None\n                ):\n                    assert len(returned_columns) == len(objs_without_pk)\n                for obj_without_pk, results in zip(objs_without_pk, returned_columns):\n                    for result, field in zip(results, opts.db_returning_fields):\n                        setattr(obj_without_pk, field.attname, result)\n                    obj_without_pk._state.adding = False\n                    obj_without_pk._state.db = self.db\n\n        return objs"
      },
      {
        "file": "django/db/models/sql/compiler.py",
        "type": "function",
        "name": "as_sql",
        "class_name": "SQLAggregateCompiler",
        "code": "def as_sql(self):\n        \"\"\"\n        Create the SQL for this query. Return the SQL string and list of\n        parameters.\n        \"\"\"\n        sql, params = [], []\n        for annotation in self.query.annotation_select.values():\n            ann_sql, ann_params = self.compile(annotation)\n            ann_sql, ann_params = annotation.select_format(self, ann_sql, ann_params)\n            sql.append(ann_sql)\n            params.extend(ann_params)\n        self.col_count = len(self.query.annotation_select)\n        sql = \", \".join(sql)\n        params = tuple(params)\n\n        inner_query_sql, inner_query_params = self.query.inner_query.get_compiler(\n            self.using,\n            elide_empty=self.elide_empty,\n        ).as_sql(with_col_aliases=True)\n        sql = \"SELECT %s FROM (%s) subquery\" % (sql, inner_query_sql)\n        params += inner_query_params\n        return sql, params"
      }
    ]
  },
  "Justification": "Candidate C is the most relevant as it addresses a similar functionality, specifically `bulk_create()`, and concerns the unique fields which directly relates to the CURRENT bug report. It highlights issues with conflicting field names and how they affect SQL generation, similar to the CURRENT issue of handling unique fields in the context of conflict updates. The shared focus on `bulk_create()` and its intricacies in PostgreSQL gives valuable insights into possible corrections needed in the CURRENT bug, making it the best choice for debugging.",
  "instance_id": "django__django-17051",
  "repo": "django/django",
  "created_at": "2023-07-07T11:01:09Z",
  "problem_statement": "Allow returning IDs in QuerySet.bulk_create() when updating conflicts.\nDescription\n\t\nCurrently, when using bulk_create with a conflict handling flag turned on (e.g. ignore_conflicts or update_conflicts), the primary keys are not set in the returned queryset, as documented in bulk_create.\nWhile I understand using ignore_conflicts can lead to PostgreSQL not returning the IDs when a row is ignored (see ​this SO thread), I don't understand why we don't return the IDs in the case of update_conflicts.\nFor instance:\nMyModel.objects.bulk_create([MyModel(...)], update_conflicts=True, update_fields=[...], unique_fields=[...])\ngenerates a query without a RETURNING my_model.id part:\nINSERT INTO \"my_model\" (...)\nVALUES (...)\n\tON CONFLICT(...) DO UPDATE ...\nIf I append the RETURNING my_model.id clause, the query is indeed valid and the ID is returned (checked with PostgreSQL).\nI investigated a bit and ​this in Django source is where the returning_fields gets removed.\nI believe we could discriminate the cases differently so as to keep those returning_fields in the case of update_conflicts.\nThis would be highly helpful when using bulk_create as a bulk upsert feature.\n",
  "patch": "diff --git a/django/db/models/query.py b/django/db/models/query.py\n--- a/django/db/models/query.py\n+++ b/django/db/models/query.py\n@@ -1837,12 +1837,17 @@ def _batched_insert(\n         inserted_rows = []\n         bulk_return = connection.features.can_return_rows_from_bulk_insert\n         for item in [objs[i : i + batch_size] for i in range(0, len(objs), batch_size)]:\n-            if bulk_return and on_conflict is None:\n+            if bulk_return and (\n+                on_conflict is None or on_conflict == OnConflict.UPDATE\n+            ):\n                 inserted_rows.extend(\n                     self._insert(\n                         item,\n                         fields=fields,\n                         using=self.db,\n+                        on_conflict=on_conflict,\n+                        update_fields=update_fields,\n+                        unique_fields=unique_fields,\n                         returning_fields=self.model._meta.db_returning_fields,\n                     )\n                 )\n"
}