💾Django ORM Queryset

Django ORM querysets

Working With ORM

Django ORM is a powerful way to query the database without writing SQL. QuerySets doesn’t actually involve any activity with the database. Querysets are lazy, Django will only read data from the database if the QuerySet is evaluated.

Basic Filters

# Filters
Race.objects.filter(year=2021)

#  Filter on related fields (foreign keys)
Race.objects.filter(year=2021, circuit__country="Spain")

# Select fields using .values() or .values_list()
# .values() will return a QuerySet of dictionaries and 
# .values_list() will return a QuerySet of tuples.

# .values()
Race.objects.filter(year=2021).values("circuit__country", "date")
<QuerySet [{'circuit__country': 'Italy', 'date': datetime.date(2021, 4, 18)}, {'circuit__country': 'Bahrain', 'date': datetime.date(2021, 3, 28)}]>

# .values_list()
Race.objects.filter(year=2021).values_list("circuit__country", "date")
<QuerySet [('Italy', datetime.date(2021, 4, 18)), ('Bahrain', datetime.date(2021, 3, 28))]>

# Convert QuerySets into a List
races = Race.objects.filter(year=2021).values_list("name", flat=True)
<QuerySet ['Emilia Romagna Grand Prix', 'Bahrain Grand Prix']>


# Sort Querysets: using .order_by() to sort the results.

Driver.objects.all().order_by("-date_of_birth").values_list("first_name", "last_name", "date_of_birth")

# Limiting (slicing) results
Driver.objects.all().order_by("date_of_birth").values_list("first_name", "last_name", "date_of_birth")[:5]

# Count results
Race.objects.filter(year=2021).count()

# Check if results exist
Race.objects.filter(year=2022).exists()

# Greater than / less than
# We can use __gt, __gte, __lt & __lte to perform ‘greater than’ and ‘less than’ queries.

drivers = (
    Driver.objects.filter(date_of_birth__gte="1998-01-01")
    .order_by("date_of_birth")
    .values_list("first_name", "last_name", "date_of_birth")
)

# To add OR conditions to your QuerySets
# We can use Q objects to create OR conditions in your QuerySets.

from django.db.models import Q

i = Race.objects.filter(
    Q(circuit__country="France") | Q(circuit__country="Spain"), year=2021
)

# Filter with lists
countries = ["France", "Spain", "Portugal", "Belgium", "Netherlands"]

races = Race.objects.filter(year=2021, circuit__country__in=countries)

# Get distinct values
# The .distinct() method can be used to make sure no value of the supplied field appears in the QuerySet more than once.

countries = (
    Race.objects.filter(year=2020)
    .distinct("circuit__country")
    .order_by("circuit__country")
    .values_list("circuit__country", flat=True)
)
<QuerySet ['Austria', 'Bahrain', 'Belgium', 'Germany']>

# Exclude data
# The .exclude() method can be used to remove results from a QuerySet.

nationalities = (
    Driver.objects.all()
    .exclude(nationality__in=["British", "American"])
    .distinct("nationality")
    .order_by("nationality")
    .values_list("nationality", flat=True)
)

# Earliest and Latest
# We can use .earliest and .latest on date fields. Instead of returning a QuerySet, they return a single result.

oldest = Driver.objects.earliest("date_of_birth")
youngest = Driver.objects.latest("date_of_birth")

# First and Last
# .first() and .last() can be used to return the first and last results of a QuerySet.

first = Race.objects.all().order_by("date").first()
last = Race.objects.all().order_by("date").last()

Annotations

Annotations are useful for generating statistics from your data.

circle-info

What do annotations do?

  • Add additional fields to each result of the queryset, calculated from other fields

  • Group results, similar to SQLs GROUP BY expression.

Applying .annotate() on an unevaluated queryset will apply the annotation to all results. If you evaluate the queryset using .values() or values_list() first and apply .annotate(), then this will have a grouping effect.

circle-info

Annotations will apply the expression to all results in the queryset, whereas aggregates will condense the data into one result.

Group Querysets with Django ORM

With .annotate() it is possible to group data, end up with fewer results than the original queryset. However, we will have to evaluate it using .values() first to get the grouping effect.

ExpressionWrapper

Aggregation with Filter

Prior to Django 2.0 if we wanted to get something like the total number of users and the total number of active users we had to resort to conditional expressions:

In Django 2.0 a filter argument_to_aggregate_functions was added to make it a lot easier:

Queryset results as namedtuples

Statement Timeout

We are all humans and we make mistakes, We can't possibly handle each and every edge case so we mush set boundaries.

Unlike other not-blocking app servers such as Torando, asyncio or even Node, Django usually uses synchronous worker processes. This means that When a user executes a long running operations, the worker process is blocked and no onse else can use it untill it is done.

None one should run worker process but we will still want to make sure a single query is not hogging too much resources for too long.

In most Django apps the majority of time is spent waiting for datbase queries. So, setting a timeout on SQL queries is a good place to start.

We like setting a global timeout in our wsgi.py file like this:

Side Note: The other common place we spent a lot of time at is networking. So make sure when you call a remote service to always set a timeout.

Last updated