#34698 closed New feature (fixed)

Allow returning IDs in QuerySet.bulk_create() when updating conflicts.

Reported by: Thomas C Owned by: Thomas C
Component: Database layer (models, ORM) Version: 4.2
Severity: Normal Keywords:
Cc: Chih Sean Hsu Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Currently, when using bulk_create with a conflict handling flag turned on (e.g. ignore_conflicts or update_conflicts), the primary keys are not set in the returned queryset, as documented in bulk_create.

While I understand using ignore_conflicts can lead to PostgreSQL not returning the IDs when a row is ignored (see this SO thread), I don't understand why we don't return the IDs in the case of update_conflicts.

For instance:

MyModel.objects.bulk_create([MyModel(...)], update_conflicts=True, update_fields=[...], unique_fields=[...])

generates a query without a RETURNING my_model.id part:

INSERT INTO "my_model" (...)
VALUES (...)
    ON CONFLICT(...) DO UPDATE ...

If I append the RETURNING my_model.id clause, the query is indeed valid and the ID is returned (checked with PostgreSQL).

I investigated a bit and this in Django source is where the returning_fields gets removed.

I believe we could discriminate the cases differently so as to keep those returning_fields in the case of update_conflicts.

This would be highly helpful when using bulk_create as a bulk upsert feature.

Change History (6)

comment:1 by Mariusz Felisiak, 10 months ago

Cc: Chih Sean Hsu added
Summary: Allow returning IDs in bulk_create with conflicts handlingAllow returning IDs in QuerySet.bulk_create() when updating conflicts.
Triage Stage: UnreviewedAccepted

Thanks for the ticket. I've checked and it works on PostgreSQL, MariaDB 10.5+, and SQLite 3.35+:

  • django/db/models/query.py

    diff --git a/django/db/models/query.py b/django/db/models/query.py
    index a5b0f464a9..f1e052cb36 100644
    a b class QuerySet(AltersData):  
    18371837        inserted_rows = []
    18381838        bulk_return = connection.features.can_return_rows_from_bulk_insert
    18391839        for item in [objs[i : i + batch_size] for i in range(0, len(objs), batch_size)]:
    1840             if bulk_return and on_conflict is None:
     1840            if bulk_return and (on_conflict is None or on_conflict == OnConflict.UPDATE):
    18411841                inserted_rows.extend(
    18421842                    self._insert(
    18431843                        item,
    18441844                        fields=fields,
    18451845                        using=self.db,
     1846                        on_conflict=on_conflict,
     1847                        update_fields=update_fields,
     1848                        unique_fields=unique_fields,
    18461849                        returning_fields=self.model._meta.db_returning_fields,
    18471850                    )
    18481851                )

Would you like to prepare a patch via GitHub PR? (docs changes and tests are required)

comment:2 by Thomas C, 10 months ago

Owner: changed from nobody to Thomas C
Status: newassigned

Sure I will.

in reply to:  2 comment:3 by Mariusz Felisiak, 10 months ago

Replying to Thomas C:

Sure I will.

Thanks. About tests, it should be enough to add some assertions to existing tests: _test_update_conflicts_two_fields(), test_update_conflicts_unique_fields_pk(), _test_update_conflicts_unique_two_fields(), _test_update_conflicts(), and test_update_conflicts_unique_fields_update_fields_db_column() when connection.features.can_return_rows_from_bulk_insert is True.

comment:5 by Mariusz Felisiak, 10 months ago

Triage Stage: AcceptedReady for checkin

comment:6 by Mariusz Felisiak <felisiak.mariusz@…>, 10 months ago

Resolution: fixed
Status: assignedclosed

In 89c7454:

Fixed #34698 -- Made QuerySet.bulk_create() retrieve primary keys when updating conflicts.

Note: See TracTickets for help on using tickets.
Back to Top