The definitive guide to modeling polymorphism in Django

Polymorphism allows you to use one type of object to work with multiple kinds of data.

Django makes it super easy to model your database, including polymorphism.

However, it also makes it too easy to design a poorly performing database.

In this post, I will show you:

  1. Different ways to model polymorphism in Django.
  2. The differences between them in terms of complexity and efficiency.
  3. Which one you should use depending on your problem domain.

The situation

We’re building a social media application where users will be able to create and view posts.

Each post can contain either:

  • Text.
  • A YouTube video embed.
  • An image.
  • A URL.

We need to model this using the Django ORM. The user will need to be able to see a feed consisting of a mix of types of posts on the site.

The full source code for the examples in this post is in this repository. I encourage you to clone it and run it locally to get a more concrete view.

Modeling without Polymorphism

First, let’s attempt to solve this problem without using polymorphism in order to better illustrate why we would want polymorphism in the first place.

We will create a model for each of the post types:

from django.db import models
from django.conf import settings

class Post(models.Model):
    user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    title = models.CharField(max_length=155)
    date_added = models.DateTimeField(auto_now_add=True)

    class Meta:
        ordering = ["-date_added"]
        abstract = True

    def __str__(self):
        return f"{self.title}"

class TextPost(Post):
    body = models.TextField(max_length=10000)

    def __str__(self):
        return f'Text post "{self.title[:30]}" by {self.user.username}'

class YoutubeEmbedPost(Post):
    video_id = models.CharField(max_length=24)

def user_directory_path(instance, filename):
    return "user_{0}/{1}".format(instance.user.id, filename)

class ImagePost(Post):
    image = models.ImageField(upload_to=user_directory_path)

class URLPost(Post):
    url = models.URLField()

This looks simple and intuitive enough.

How would you retrieve these objects to display in a feed?

Let’s try:

from django.core.paginator import Paginator
from django.template.response import TemplateResponse

from . import models

def post_list(request):
    ctx = {}
    posts = []
    for text_post in models.TextPost.objects.all():
        posts.append({"post": text_post, "post_type": "text"})

    for youtube_post in models.YoutubeEmbedPost.objects.all():
        posts.append({"post": youtube_post, "post_type": "youtube"})

    for image_post in models.ImagePost.objects.all():
        posts.append({"post": image_post, "post_type": "image"})

    for url_post in models.URLPost.objects.all():
        posts.append({"post": url_post, "post_type": "url"})

    page = request.GET.get("page", 1)
    paginator = Paginator(posts, 10)
    page_obj = paginator.page(page)
    ctx.update({"page_obj": page_obj, "posts": page_obj.object_list})
    return TemplateResponse(request, "socialnotpolymorphic/list.html", ctx)

This is our list view.

Hmm, pagination looks quite complicated. How would you paginate in a way that allows you to get all of the post types on the page?

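You’d have to implement some clever logic in your view, such as merging and sorting all four result sets in Python on every request. That work grows with the number of posts, so the view gets slower as the site grows.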

Here’s the template:

<h1>All posts</h1>
{% for post_data in posts %}
    {% with post=post_data.post post_type=post_data.post_type %}
        <h2>{{ post.title }}</h2>
        {% if post_type == "text" %}
            <p>{{ post.body }}</p>
        {% elif post_type == "image" %}
            <img src="{{ post.image.url }}" alt="{{ post.title }}">
        {% elif post_type == "youtube" %}
            {% include "_youtube-embed.html" with video_id=post.video_id %}
        {% elif post_type == "url" %}
            <a href="{{ post.url }}">{{ post.url }}</a>
        {% endif %}
    {% endwith %}

{% endfor %}
{% include '_pagination.html' %}

With the accompanying view, most pages will contain a single type of post (text, YouTube, image or URL) because we appended the QuerySets to the list one after the other. Not good at all.

While this implementation has a simple data model, it’s not practical because you can’t query and present the data efficiently.
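To make the "clever logic" concrete, here is a minimal sketch of what the view would have to do: gather the posts from every model, sort the combined list by date, and only then paginate. FakePost is a hypothetical stand-in for the four models so the snippet runs without a database; with real models you would pass in lists built from each objects.all(). Note that it still loads every row into memory on each request, which is why it doesn't scale.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FakePost:
    # Hypothetical stand-in for TextPost, ImagePost, etc.
    title: str
    date_added: datetime


def build_feed(posts_by_type):
    """Merge per-type post lists into a single newest-first feed."""
    feed = [
        {"post": post, "post_type": post_type}
        for post_type, posts in posts_by_type.items()
        for post in posts
    ]
    # Sort across all types so that each page mixes post types by date.
    feed.sort(key=lambda item: item["post"].date_added, reverse=True)
    return feed


feed = build_feed(
    {
        "text": [FakePost("Hello", datetime(2021, 1, 1))],
        "url": [FakePost("A link", datetime(2021, 1, 3))],
        "image": [FakePost("A photo", datetime(2021, 1, 2))],
    }
)
print([item["post_type"] for item in feed])
```

The resulting list can be handed to Paginator exactly like the posts list in the view above.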

We could have different list views for each type of post, but it’s not what our users want.

Object creation isn’t simple either because you need a different ModelForm for each model.

In short, our problem can’t be solved without polymorphism.

It would be best if we had a single model that we could query and display.

Using concrete inheritance

Instead of querying different models, we can use a concrete base model that we inherit from:

from django.db import models
from django.conf import settings

class Post(models.Model):
    user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    title = models.CharField(max_length=155)
    date_added = models.DateTimeField(auto_now_add=True)

    class Meta:
        ordering = ["-date_added"]

    def __str__(self):
        return f"{self.title}"

class TextPost(Post):
    body = models.TextField(max_length=10000)

class YoutubeEmbedPost(Post):
    video_id = models.CharField(max_length=24)

def user_directory_path(instance, filename):
    return "user_{0}/{1}".format(instance.user.id, filename)

class ImagePost(Post):
    image = models.ImageField(upload_to=user_directory_path)

class URLPost(Post):
    url = models.URLField()

It’s the same as above, except this time we’re using a concrete base class instead of an abstract one. This allows us to query the Post model directly instead of having to query each model individually.

The view is now simpler:

from django.core.paginator import Paginator
from django.template.response import TemplateResponse

from .models import Post

def post_list(request):
    ctx = {}
    paginator = Paginator(Post.objects.all(), 10)
    page_obj = paginator.page(request.GET.get("page", 1))
    ctx.update({"posts": page_obj.object_list, "page_obj": page_obj})
    return TemplateResponse(request, "socialconcrete/list.html", ctx)

However, since we’re querying the Post model, we need a way to determine what type of post it is, in order to render it in the template.

For that, we can use a template tag:

from django import template

register = template.Library()

@register.simple_tag()
def get_actual_post(post):
    data = {}

    try:
        data["obj"] = post.textpost
        data["post_type"] = "text"
    except post.DoesNotExist:
        try:
            data["obj"] = post.youtubeembedpost
            data["post_type"] = "youtube"
        except post.DoesNotExist:
            try:
                data["obj"] = post.urlpost
                data["post_type"] = "url"
            except post.DoesNotExist:
                try:
                    data["obj"] = post.imagepost
                    data["post_type"] = "image"
                except post.DoesNotExist as e:
                    raise post.DoesNotExist from e

    return data

Given the post, we try to retrieve the actual post that we need.
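The nested try/except blocks can also be flattened into a loop over the reverse accessors. The sketch below is one possible shape, not the code from the repository. It uses a stand-in DoesNotExist exception and a hypothetical FakePost so it runs without Django; in a real template tag you would catch django.core.exceptions.ObjectDoesNotExist instead.

```python
class DoesNotExist(Exception):
    """Stand-in for django.core.exceptions.ObjectDoesNotExist."""


# (reverse accessor on Post, label used by the template)
CHILD_ACCESSORS = (
    ("textpost", "text"),
    ("youtubeembedpost", "youtube"),
    ("urlpost", "url"),
    ("imagepost", "image"),
)


def get_actual_post(post):
    """Return the child object and its type label for a base post."""
    for accessor, post_type in CHILD_ACCESSORS:
        try:
            return {"obj": getattr(post, accessor), "post_type": post_type}
        except DoesNotExist:
            continue  # No row in this child table; try the next one.
    raise DoesNotExist("Post has no matching child row")


class FakePost:
    """Hypothetical post whose only child row is a URL post."""

    urlpost = "url-child"

    def __getattr__(self, name):
        # Mimic Django raising DoesNotExist for a missing child row.
        raise DoesNotExist(name)


print(get_actual_post(FakePost())["post_type"])
```

Adding a fifth post type then means adding one tuple rather than another level of nesting. The number of queries stays the same, though.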

Then we can render the list as follows:

{% extends "base.html" %}
{% load socialconcrete_tags %}

{% block content %}
    {% for post in posts %}
        <h2>{{ post.title }}</h2>
        {% get_actual_post post as post_data %}
        {% with post_type=post_data.post_type actual_post=post_data.obj %}
            {% if post_type == "text" %}
                <p>{{ actual_post.body }}</p>
            {% elif post_type == "image" %}
                <img src="{{ actual_post.image.url }}" alt="{{ actual_post.title }}">
            {% elif post_type == "youtube" %}
                {% include "_youtube-embed.html" with video_id=actual_post.video_id %}
            {% elif post_type == "url" %}
                <a href="{{ actual_post.url }}">{{ actual_post.url }}</a>
            {% endif %}
        {% endwith %}

    {% endfor %}
{% include '_pagination.html' %}
{% endblock %}

While this solves the problem we had with pagination and showing all types of posts on the page, it’s much more computationally intensive.

According to Django Debug Toolbar, the page takes about 511ms to render and requires 29 SQL queries.

If you look at the SQL panel in debug toolbar, you’ll see a ton of SQL JOINS being made.

With this approach, your app spends a lot of database and CPU time on every request, even when serving only a few users, so it will struggle as traffic grows.

We can use template caching as a way to put a band-aid on this issue but we can do better.

Creating objects

Creating objects requires a ModelForm for each model:

from django import forms

from .models import ImagePost, TextPost, URLPost, YoutubeEmbedPost

class ModelFormWithUser(forms.ModelForm):
    def __init__(self, *args, **kwargs):
        user = kwargs.pop("user")
        super(ModelFormWithUser, self).__init__(*args, **kwargs)
        self.instance.user = user

class TextPostForm(ModelFormWithUser):
    class Meta:
        model = TextPost
        fields = ["title", "body"]

class ImagePostForm(ModelFormWithUser):
    class Meta:
        model = ImagePost
        fields = ["title", "image"]

class YoutubePostForm(ModelFormWithUser):
    class Meta:
        model = YoutubeEmbedPost
        fields = ["title", "video_id"]

class UrlPostForm(ModelFormWithUser):
    class Meta:
        model = URLPost
        fields = ["title", "url"]

And a corresponding view:

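from django.contrib.auth.decorators import login_required
from django.shortcuts import redirect
from django.template.response import TemplateResponse

# Assumes the forms above live in forms.py next to this module.
from .forms import ImagePostForm, TextPostForm, UrlPostForm, YoutubePostForm

@login_required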
def create_post(request, post_type):
    ctx = {}

    if post_type == "text":
        form_class = TextPostForm
    elif post_type == "image":
        form_class = ImagePostForm
    elif post_type == "youtube":
        form_class = YoutubePostForm
    else:
        form_class = UrlPostForm

    form = None

    if request.method == "POST":

        form = form_class(request.POST, request.FILES, user=request.user)

        if form.is_valid():
            form.save()
            return redirect("socialconcrete:list")

    if form is None:
        form = form_class(user=request.user)

    ctx["form"] = form
    return TemplateResponse(request, "socialconcrete/create.html", ctx)

We instantiate the form class corresponding to the type of post the user is trying to work with.
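Note that the if/elif chain silently treats any unknown post_type as a URL post. One alternative, sketched below under my own naming, is a dictionary lookup that fails loudly. UnknownPostType and pick_form_class are hypothetical helpers, and the two form classes are empty stand-ins so the snippet runs without Django; in a real view you would probably raise Http404 instead.

```python
class UnknownPostType(LookupError):
    """Hypothetical error; a Django view would likely raise Http404."""


def pick_form_class(post_type, form_classes):
    """Return the form class registered for post_type."""
    try:
        return form_classes[post_type]
    except KeyError:
        raise UnknownPostType(post_type) from None


# Empty stand-ins for the ModelForm subclasses defined earlier.
class TextPostForm:
    pass


class UrlPostForm:
    pass


FORM_CLASSES = {"text": TextPostForm, "url": UrlPostForm}

print(pick_form_class("text", FORM_CLASSES).__name__)
```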

Using OneToOneField

Let’s try to reduce CPU usage by using a OneToOneField instead of concrete inheritance.

Under the hood, concrete inheritance will automatically create OneToOne relationships with the child models. Doing it explicitly gives us more control and avoids SQL JOINS, as you shall soon see.

from django.db import models
from django.conf import settings

class TextPost(models.Model):
    body = models.TextField(max_length=10000)

class YoutubeEmbedPost(models.Model):
    video_id = models.CharField(max_length=24)

class ImagePost(models.Model):
    image = models.ImageField()

class URLPost(models.Model):
    url = models.URLField()

class Post(models.Model):
    user = models.ForeignKey(
        settings.AUTH_USER_MODEL,
        on_delete=models.CASCADE,
        related_name="socialonetoone_posts",
    )
    title = models.CharField(max_length=155)
    date_added = models.DateTimeField(auto_now_add=True)

    text_post = models.OneToOneField(TextPost, on_delete=models.CASCADE, null=True)
    youtube_post = models.OneToOneField(
        YoutubeEmbedPost, on_delete=models.CASCADE, null=True
    )
    image_post = models.OneToOneField(ImagePost, on_delete=models.CASCADE, null=True)
    url_post = models.OneToOneField(URLPost, on_delete=models.CASCADE, null=True)

    class Meta:
        ordering = ["-date_added"]

    def __str__(self):
        return f"{self.title}"

And our view is the same as for concrete inheritance because we have a single Post model to query:

from django.core.paginator import Paginator
from django.template.response import TemplateResponse

from .models import Post

def post_list(request):
    ctx = {}
    paginator = Paginator(Post.objects.all(), 10)
    page_obj = paginator.page(request.GET.get("page", 1))
    ctx.update({"posts": page_obj.object_list, "page_obj": page_obj})
    return TemplateResponse(request, "socialonetoone/list.html", ctx)

We don’t need a template tag because we can simply check whether the fields are truthy:

{% extends "base.html" %}

{% block content %}
    {% for post in posts %}
        <h2>{{ post.title }}</h2>

        {% if post.text_post %}
            <p>{{ post.text_post.body }}</p>
        {% elif post.image_post %}
            <img src="{{ post.image_post.image.url }}" alt="{{ post.title }}">
        {% elif post.youtube_post %}
            {% include "_youtube-embed.html" with video_id=post.youtube_post.video_id %}
        {% elif post.url_post %}
            <a href="{{ post.url_post.url }}">{{ post.url_post.url }}</a>
        {% endif %}
    {% endfor %}
{% include '_pagination.html' %}
{% endblock %}

The results:

  • ~220ms vs the previous ~511ms CPU time
  • 12 SQL queries compared to 29 from before.

The SQL panel in debug toolbar doesn’t show any SQL JOINS at all, which is excellent.

Sometimes, SQL JOINS are useful because they actually reduce the number of queries by allowing you to fetch data from multiple tables in a single query.

When designing your database, you need to use JOINS strategically because they cost CPU time.
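If you want to try that trade-off here, select_related is the ORM's way of asking for those JOINS. The 12 queries most likely come from one lookup per post for its related row, and the sketch below would fold those lookups into the main query. I haven't measured this variant, so treat it as an experiment rather than a recommendation. feed_queryset is a hypothetical helper that takes the Post model as an argument.

```python
def feed_queryset(post_model):
    """Build a feed queryset that fetches the related rows with JOINs.

    post_model is expected to be the Post model defined above.
    """
    return post_model.objects.select_related(
        "text_post", "youtube_post", "image_post", "url_post"
    )
```

In the list view, you would pass feed_queryset(Post) to the Paginator instead of Post.objects.all() and compare the numbers in Debug Toolbar.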

Since SQL queries are so expensive, maybe we can try to do even better.

Creating objects

Since we have to save 2 models to create a post, creating objects is slightly more complicated than using a ModelForm as is.

class PostForm(forms.ModelForm):
    def __init__(self, *args, **kwargs):
        user = kwargs.pop("user")
        post_type = kwargs.pop("post_type")
        super(PostForm, self).__init__(*args, **kwargs)
        self.instance.user = user
        self.post_type = post_type

        if post_type == "text":
            self.fields["body"] = forms.CharField(
                widget=forms.Textarea(), max_length=10_000
            )
        elif post_type == "image":
            self.fields["image"] = forms.ImageField()
        elif post_type == "youtube":
            # Match the max_length of YoutubeEmbedPost.video_id.
            self.fields["video_id"] = forms.CharField(max_length=24)
        elif post_type == "url":
            self.fields["url"] = forms.URLField()
        else:
            raise ValueError("Unknown post type.")

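    def clean(self):
        cleaned_data = super(PostForm, self).clean()
        # Fields that failed validation are missing from cleaned_data,
        # so don't try to build the related object in that case.
        if self.errors:
            return cleaned_data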
        if self.post_type == "text":
            text_post = TextPost(body=cleaned_data["body"])
            text_post.full_clean()
            text_post.save()
            self.instance.text_post = text_post
        elif self.post_type == "image":
            image_post = ImagePost(image=cleaned_data["image"])
            image_post.full_clean()
            image_post.save()
            self.instance.image_post = image_post
        elif self.post_type == "youtube":
            youtube_post = YoutubeEmbedPost(video_id=cleaned_data["video_id"])
            youtube_post.full_clean()
            youtube_post.save()
            self.instance.youtube_post = youtube_post
        else:
            url_post = URLPost(url=cleaned_data["url"])
            url_post.full_clean()
            url_post.save()
            self.instance.url_post = url_post
        return cleaned_data

    class Meta:
        model = Post
        fields = ["title"]

And the view:

@login_required
def create_post(request, post_type):
    ctx = {}
    form = None
    kwargs = {"user": request.user, "post_type": post_type}
    if request.method == "POST":
        form = PostForm(request.POST, request.FILES, **kwargs)

        if form.is_valid():
            form.save()
            return redirect("socialonetoone:list")
    if not form:
        form = PostForm(**kwargs)
    ctx["form"] = form
    return TemplateResponse(request, "socialonetoone/create.html", ctx)

We have to create the corresponding post type before saving the post. I decided to do this in the clean method to avoid changing core functionality of the ModelForm. If the form is valid, we can simply save it and every corresponding model will be created.

Another option would be to use different views to save post types and posts, with some sort of finite state machine to make sure that post types are always linked to posts (to prevent orphans) and vice versa. However, this is more complicated to implement and harder to maintain as well. Using basic Forms and ModelForms is preferable 99% of the time.

One model to rule them all

Instead of using relationships, we can use a single model:

from django.core.exceptions import ValidationError
from django.db import models
from django.conf import settings

class Post(models.Model):
    TEXT_POST_TYPE = 0
    YOUTUBE_POST_TYPE = 1
    IMAGE_POST_TYPE = 2
    URL_POST_TYPE = 3

    POST_TYPE_CHOICES = (
        (TEXT_POST_TYPE, "Text"),
        (YOUTUBE_POST_TYPE, "Youtube"),
        (IMAGE_POST_TYPE, "Image"),
        (URL_POST_TYPE, "URL"),
    )
    user = models.ForeignKey(
        settings.AUTH_USER_MODEL,
        on_delete=models.CASCADE,
        related_name="socialonlyone_posts",
    )
    title = models.CharField(max_length=155)
    post_type = models.PositiveSmallIntegerField(choices=POST_TYPE_CHOICES)
    date_added = models.DateTimeField(auto_now_add=True)

    text_body = models.TextField(max_length=10000, null=True, blank=True)
    youtube_video_id = models.CharField(max_length=22, null=True, blank=True)
    image = models.ImageField(null=True, blank=True)
    url = models.URLField(null=True, blank=True)

    class Meta:
        ordering = ["-date_added"]

    def __str__(self):
        return f"{self.title}"

    def clean(self):
        if self.post_type == self.TEXT_POST_TYPE:
            if self.text_body is None:
                raise ValidationError("Body can't be empty")

            if self.youtube_video_id or self.image or self.url:
                raise ValidationError("Text posts can only have body.")

        elif self.post_type == self.YOUTUBE_POST_TYPE:
            if self.youtube_video_id is None:
                raise ValidationError("You need to provide a video ID.")

            if self.text_body or self.image or self.url:
                raise ValidationError("Youtube posts can only have video id.")

        elif self.post_type == self.IMAGE_POST_TYPE:
            # An ImageField attribute is never None, so test its truthiness.
            if not self.image:
                raise ValidationError("Image required.")

            if self.text_body or self.youtube_video_id or self.url:
                raise ValidationError("Image posts can only have image")
        elif self.post_type == self.URL_POST_TYPE:
            if self.url is None:
                raise ValidationError("Url required.")
            if self.text_body or self.youtube_video_id or self.image:
                raise ValidationError("Url posts can only have url")
        return super(Post, self).clean()

The issue with this solution is that you need complex validation and more storage space (because of null columns) as your database grows. In this example, we have simple fields like text_body and image, but sometimes you might have additional metadata for each post type. In these cases, the single model will become quite bloated and inefficient to store.
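One way to keep that validation from growing with every new post type is to drive it from a table. The sketch below is a plain-Python illustration of the idea, not the code from the repository. CONTENT_FIELD_BY_TYPE and validate_post_fields are hypothetical names, and the integer keys mirror the *_POST_TYPE constants on the model.

```python
# Maps each post type to the only content field it may use
# (keys mirror the *_POST_TYPE constants on the model above).
CONTENT_FIELD_BY_TYPE = {
    0: "text_body",
    1: "youtube_video_id",
    2: "image",
    3: "url",
}


def validate_post_fields(post_type, values):
    """Return a list of error messages; an empty list means valid."""
    required = CONTENT_FIELD_BY_TYPE.get(post_type)
    if required is None:
        return [f"Unknown post type: {post_type!r}"]

    errors = []
    if not values.get(required):
        errors.append(f"{required} is required.")

    # Any other populated content field does not belong on this post.
    extras = [
        name
        for name in CONTENT_FIELD_BY_TYPE.values()
        if name != required and values.get(name)
    ]
    if extras:
        errors.append(f"Unexpected fields: {', '.join(extras)}")
    return errors


print(validate_post_fields(2, {"url": "https://example.com"}))
```

Inside Post.clean() you could call a helper like this with the current field values and raise ValidationError if the list is not empty.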

We can use the same view as above since we’re still querying a single model.

And the template is simpler:

{% extends "base.html" %}

{% block content %}
    {% for post in posts %}
        <h2>{{ post.title }}</h2>
        {% if post.text_body %}
            <p>{{ post.text_body }}</p>
        {% elif post.image %}
            <img src="{{ post.image.url }}" alt="{{ post.title }}">
        {% elif post.youtube_video_id %}
            {% include "_youtube-embed.html" with video_id=post.youtube_video_id %}
        {% elif post.url %}
            <a href="{{ post.url }}">{{ post.url }}</a>
        {% endif %}
    {% endfor %}
{% include "_pagination.html" %}
{% endblock %}

The results?

  • Only 46ms required to render the list.
  • With 2 SQL queries!

While the data storage is a little inefficient, your app will perform much better if you use a single model. After all, you’re working with a single database table.

Storage is usually much cheaper than CPU usage, so, you might decide to go this route when polymorphism becomes necessary.

Creating objects

Since this is a single model, we can simply use a dynamic ModelForm to save posts.

class PostForm(forms.ModelForm):
    def __init__(self, *args, **kwargs):
        user = kwargs.pop("user")
        post_type = kwargs.pop("post_type")
        super(PostForm, self).__init__(*args, **kwargs)
        self.instance.user = user
        self.instance.post_type = post_type
        if post_type == Post.TEXT_POST_TYPE:
            self.fields["text_body"].required = True
            del self.fields["youtube_video_id"]
            del self.fields["image"]
            del self.fields["url"]
        elif post_type == Post.IMAGE_POST_TYPE:
            self.fields["image"].required = True
            del self.fields["youtube_video_id"]
            del self.fields["text_body"]
            del self.fields["url"]
        elif post_type == Post.YOUTUBE_POST_TYPE:
            self.fields["youtube_video_id"].required = True
            del self.fields["text_body"]
            del self.fields["image"]
            del self.fields["url"]
        elif post_type == Post.URL_POST_TYPE:
            self.fields["url"].required = True
            del self.fields["youtube_video_id"]
            del self.fields["image"]
            del self.fields["text_body"]
        else:
            raise ValueError("Unknown post type")

    class Meta:
        model = Post
        fields = ["title", "text_body", "youtube_video_id", "image", "url"]

With this view:

@login_required
def create_post(request, post_type):
    ctx = {}

    form = None
    kwargs = {"user": request.user, "post_type": post_type}

    if request.method == "POST":
        form = PostForm(request.POST, request.FILES, **kwargs)

        if form.is_valid():
            form.save()
            return redirect("socialonlyone:list")

    if form is None:
        form = PostForm(**kwargs)
    ctx["form"] = form
    return TemplateResponse(request, "socialonlyone/create.html", ctx)

We instantiate a ModelForm with fields corresponding to the type of post the user wants to work with. Piece of cake.

Using Content Types

Django has a built-in method for handling complex cases of polymorphism in the contenttypes contrib module.

Let’s try it out:

from django.contrib.contenttypes.fields import GenericForeignKey
from django.contrib.contenttypes.models import ContentType
from django.db import models
from django.conf import settings

class Post(models.Model):
    user = models.ForeignKey(
        settings.AUTH_USER_MODEL,
        on_delete=models.CASCADE,
        related_name="socialcontenttypes_posts",
    )
    title = models.CharField(max_length=155)
    date_added = models.DateTimeField(auto_now_add=True)

    post_content_type = models.ForeignKey(ContentType, on_delete=models.CASCADE)
    post_object_id = models.PositiveBigIntegerField()
    content_object = GenericForeignKey("post_content_type", "post_object_id")

    class Meta:
        ordering = ["-date_added"]

    def __str__(self):
        return f"{self.title}"

class TextPost(models.Model):
    POST_TYPE = "text"
    body = models.TextField(max_length=10000)

class YoutubeEmbedPost(models.Model):
    POST_TYPE = "youtube"
    video_id = models.CharField(max_length=24)

class ImagePost(models.Model):
    POST_TYPE = "image"
    image = models.ImageField()

class URLPost(models.Model):
    POST_TYPE = "url"
    url = models.URLField()

And a simple template:

{% extends "base.html" %}

{% block content %}
    {% for post in posts %}
        <h2>{{ post.title }}</h2>
        {% with post.content_object.POST_TYPE as post_type %}
            {% if post_type == "text" %}
                <p>{{ post.content_object.body }}</p>
            {% elif post_type == "image" %}
                <img src="{{ post.content_object.image.url }}" alt="{{ post.title }}">
            {% elif post_type == "youtube" %}
                {% include "_youtube-embed.html" with video_id=post.content_object.video_id %}
            {% elif post_type == "url" %}
                <a href="{{ post.content_object.url }}">{{ post.content_object.url }}</a>
            {% endif %}
        {% endwith %}

    {% endfor %}
    {% include "_pagination.html" %}
{% endblock %}

Results:

  • 215ms to render
  • 12 SQL queries

Not bad at all but still worse than our single model solution and not much better than OneToOneField.

However, this method is very powerful and can handle any kind of complex relationship you need. If you add more types of post later on, you won’t have to change anything in the base model because the relationship to the post types is generic.

Many people don’t like the GenericForeignKey method because it results in a complicated database schema. For example, someone unfamiliar with Django might not be able to make sense of the database just by looking at the tables, because columns like post_content_type and post_object_id are ambiguous (content_object itself is not a column). It can be difficult to track down which rows relate to which tables.

When choosing this method, you need to determine whether the added complexity is worth it.

In this case, I would say that contenttypes isn’t necessary because there will always be a finite number of post types we can relate to. If I add 1 or 2 post types later on, modifying the base model to support these will not be an issue.
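If you do go with contenttypes, the query count can probably be reduced. Most of the 12 queries are likely one lookup per post for its content_object, and Django allows a GenericForeignKey to be passed to prefetch_related, which batches those lookups per content type. The helper below is a hypothetical sketch that takes the Post model as an argument. I haven't benchmarked it against the numbers above.

```python
def feed_queryset(post_model):
    """Build a feed queryset that prefetches each post's content_object.

    post_model is expected to be the Post model with the GenericForeignKey.
    """
    return post_model.objects.prefetch_related("content_object")
```

With four post types, this should cost roughly one extra query per type that appears on the page instead of one per post.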

Creating objects

This method requires the same technique as OneToOneField because we need to create a corresponding post type instance before creating a post.

class PostForm(forms.ModelForm):
    def __init__(self, *args, **kwargs):
        user = kwargs.pop("user")
        post_type = kwargs.pop("post_type")
        super(PostForm, self).__init__(*args, **kwargs)
        self.instance.user = user
        self.post_type = post_type

        if post_type == "text":
            self.fields["body"] = forms.CharField(
                widget=forms.Textarea(), max_length=10_000
            )
        elif post_type == "image":
            self.fields["image"] = forms.ImageField()
        elif post_type == "youtube":
            # Match the max_length of YoutubeEmbedPost.video_id.
            self.fields["video_id"] = forms.CharField(max_length=24)
        elif post_type == "url":
            self.fields["url"] = forms.URLField()
        else:
            raise ValueError("Unknown post type.")

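    def clean(self):
        cleaned_data = super(PostForm, self).clean()
        # Fields that failed validation are missing from cleaned_data,
        # so don't try to build the related object in that case.
        if self.errors:
            return cleaned_data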
        if self.post_type == "text":
            text_post = TextPost(body=cleaned_data["body"])
            text_post.full_clean()
            text_post.save()
            self.instance.content_object = text_post
        elif self.post_type == "image":
            image_post = ImagePost(image=cleaned_data["image"])
            image_post.full_clean()
            image_post.save()
            self.instance.content_object = image_post
        elif self.post_type == "youtube":
            youtube_post = YoutubeEmbedPost(video_id=cleaned_data["video_id"])
            youtube_post.full_clean()
            youtube_post.save()
            self.instance.content_object = youtube_post
        else:
            url_post = URLPost(url=cleaned_data["url"])
            url_post.full_clean()
            url_post.save()
            self.instance.content_object = url_post
        return cleaned_data

    class Meta:
        model = Post
        fields = ["title"]

With the same view as the OneToOneField technique, we instantiate a form with fields corresponding to what the user is looking to work with.

Then, we simply need to populate the content_object field before saving the form.

Using NoSQL

Instead of using relational features of the Django ORM, we can dump all of our data in a JSONField.

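from django.conf import settings
from django.core.exceptions import ValidationError
from django.db import models

class Post(models.Model):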
    TEXT_POST_TYPE = 0
    YOUTUBE_POST_TYPE = 1
    IMAGE_POST_TYPE = 2
    URL_POST_TYPE = 3

    POST_TYPE_CHOICES = (
        (TEXT_POST_TYPE, "Text"),
        (YOUTUBE_POST_TYPE, "Youtube"),
        (IMAGE_POST_TYPE, "Image"),
        (URL_POST_TYPE, "URL"),
    )
    user = models.ForeignKey(
        settings.AUTH_USER_MODEL,
        on_delete=models.CASCADE,
        related_name="socialjson_posts",
    )
    title = models.CharField(max_length=155)
    post_type = models.PositiveSmallIntegerField(choices=POST_TYPE_CHOICES)
    date_added = models.DateTimeField(auto_now_add=True)

    data = models.JSONField()

    class Meta:
        ordering = ["-date_added"]

    def __str__(self):
        return f"{self.title}"

    def clean(self):

        if self.post_type == self.TEXT_POST_TYPE:
            try:
                self.data["text_body"]
            except KeyError:
                raise ValidationError('Text posts must contain "text_body"')
        elif self.post_type == self.IMAGE_POST_TYPE:
            try:
                self.data["image"]
            except KeyError:
                raise ValidationError('Image posts must contain "image"')
        elif self.post_type == self.YOUTUBE_POST_TYPE:
            try:
                self.data["video_id"]
            except KeyError:
                raise ValidationError('Youtube posts must contain "video_id"')
        elif self.post_type == self.URL_POST_TYPE:
            try:
                self.data["url"]
            except KeyError:
                raise ValidationError('Url posts must contain "url"')
        return super(Post, self).clean()

And the template:

{% extends "base.html" %}

{% block content %}
    {% for post in posts %}
        <h2>{{ post.title }}</h2>
        {% if post.data.text_body %}
            <p>{{ post.data.text_body }}</p>
        {% elif post.data.image %}
            <img src="{{ post.data.image }}" alt="{{ post.title }}">
        {% elif post.data.video_id %}
            {% include "_youtube-embed.html" with video_id=post.data.video_id %}
        {% elif post.data.url %}
            <a href="{{ post.data.url }}">{{ post.data.url }}</a>
        {% endif %}
    {% endfor %}
    {% include "_pagination.html" %}
{% endblock %}

The great thing about JSONField is that we get the same performance as when using a single model but with the same flexibility as contenttypes because we can add whatever we want in the JSONField.

However, just like our single model solution, this one also requires manual validation, and that validation has to grow with your data. I would argue that JSONField stores data more efficiently than a single model because we’re no longer storing null columns. That doesn’t make it better overall, though, because we also lose built-in model features like automatic file saving.

Creating objects

The issue with JSONField is that all file uploads will have to be handled manually. For projects that use external cloud storage, you will have to use the low level storage backend APIs to write files to the object storage.
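As a rough idea of what that looks like, the sketch below writes an upload through a storage object and returns its public URL. It relies only on the save() and url() methods that Django storage backends provide. store_upload is a hypothetical helper, and in a Django project you would typically pass django.core.files.storage.default_storage as the storage argument.

```python
import posixpath


def store_upload(storage, username, upload):
    """Save an uploaded file through a storage backend and return its URL."""
    relative_path = posixpath.join(username, upload.name.replace(" ", "_"))
    # save() may alter the name to avoid collisions, so keep its return value.
    saved_name = storage.save(relative_path, upload)
    return storage.url(saved_name)
```

The returned URL is what you would put under the "image" key of the JSONField, and the same code works whether the backend writes to local disk or to object storage.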

Another issue will be with Forms. You can’t simply use a ModelForm unless you write a custom widget for the JSONField that will render fields based on what type of post we’re trying to save.

Instead of a custom widget, we can use a dynamic Form:

class PostForm(forms.Form):
    title = forms.CharField(max_length=155)

    def __init__(self, *args, **kwargs):
        self.user = kwargs.pop("user")
        post_type = kwargs.pop("post_type")
        self.post_type = post_type

        super(PostForm, self).__init__(*args, **kwargs)

        if post_type == Post.TEXT_POST_TYPE:
            self.fields["body"] = forms.CharField(max_length=10000)
        elif post_type == Post.IMAGE_POST_TYPE:
            self.fields["image"] = forms.ImageField()
        elif post_type == Post.YOUTUBE_POST_TYPE:
            self.fields["video_id"] = forms.CharField(max_length=25)
        elif post_type == Post.URL_POST_TYPE:
            self.fields["url"] = forms.URLField()
        else:
            raise ValueError("Unknown post type.")

    def save(self):
        data = {}
        if self.post_type == Post.TEXT_POST_TYPE:
            data["text_body"] = self.cleaned_data["body"]
        elif self.post_type == Post.IMAGE_POST_TYPE:
            image_field = self.cleaned_data["image"]
            filename = image_field.name.replace(" ", "_")
            destination_path = settings.MEDIA_ROOT / self.user.username
            destination_path.mkdir(parents=True, exist_ok=True)
            with (destination_path / filename).open("wb+") as destination:
                for chunk in image_field.chunks():
                    destination.write(chunk)
            data["image"] = f"{settings.MEDIA_URL}{self.user.username}/{filename}"
        elif self.post_type == Post.YOUTUBE_POST_TYPE:
            data["video_id"] = self.cleaned_data["video_id"]
        else:
            data["url"] = self.cleaned_data["url"]

        post = Post(
            user=self.user,
            post_type=self.post_type,
            title=self.cleaned_data["title"],
            data=data,
        )
        post.full_clean()
        post.save()
        return post

With the following view:

@login_required
def create_post(request, post_type):
    ctx = {}
    form = None
    kwargs = {"user": request.user, "post_type": post_type}
    if request.method == "POST":
        form = PostForm(request.POST, request.FILES, **kwargs)

        if form.is_valid():
            form.save()
            return redirect("socialjson:list")

    if not form:
        form = PostForm(**kwargs)
    ctx["form"] = form
    return TemplateResponse(request, "socialjson/create.html", ctx)

We’ll dynamically generate a form with fields that correspond to the post type the user is trying to create or update.

It’s not as convenient as using a ModelForm but it works fine. Of course, as your data grows and if you’re dealing with complex post types, you’ll have to do more manual plumbing to get everything working.

Using django-polymorphic

Edit: No longer the best way. See next technique.

So far, we looked at ways to achieve polymorphism using tools built into the base Django package.

Let’s take a look at a popular 3rd party library designed specifically for polymorphism.

django-polymorphic allows you to structure your model the same way you did with concrete inheritance above.

The difference is that querying the base model returns instances of the matching subclass instead of instances of the base class, which is more useful.

Recall that for concrete inheritance, rendering took 512ms with 29 SQL queries and numerous JOINS.

Take a look at the model:

from django.db import models
from django.conf import settings
from polymorphic.models import PolymorphicModel

class Post(PolymorphicModel):
    user = models.ForeignKey(
        settings.AUTH_USER_MODEL,
        on_delete=models.CASCADE,
        related_name="socialpolymorphic_posts",
    )
    title = models.CharField(max_length=155)
    date_added = models.DateTimeField(auto_now_add=True)

    class Meta:
        ordering = ["-date_added"]

    def __str__(self):
        return f"{self.title}"

class TextPost(Post):
    body = models.TextField(max_length=10000)

class YoutubeEmbedPost(Post):
    video_id = models.CharField(max_length=24)

def user_directory_path(instance, filename):
    return "user_{0}/{1}".format(instance.user.id, filename)

class ImagePost(Post):
    image = models.ImageField(upload_to=user_directory_path)

class URLPost(Post):
    url = models.URLField()

Like I said, it looks similar to concrete inheritance, except that the base Post class extends PolymorphicModel.

We’ll use the same view but use a different template tag to identify post types:

from django import template

from socialpolymorphic import models

register = template.Library()

@register.simple_tag()
def get_post_type(post):
    post_type = None

    if isinstance(post, models.TextPost):
        post_type = "text"
    elif isinstance(post, models.YoutubeEmbedPost):
        post_type = "youtube"
    elif isinstance(post, models.ImagePost):
        post_type = "image"
    elif isinstance(post, models.URLPost):
        post_type = "url"

    return post_type

As you can see, django-polymorphic makes it easy to identify what kind of instance a particular post is, because the QuerySet returns subclass instances rather than base Post instances.
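Once the objects are subclass instances, the isinstance chain can also be written as a lookup table. This is a minimal sketch that uses plain classes as stand-ins for the models, so it runs without Django; the POST_TYPE_LABELS name is an assumption for illustration.

```python
# Plain classes standing in for the Django models in this sketch.
class Post:
    pass


class TextPost(Post):
    pass


class YoutubeEmbedPost(Post):
    pass


class ImagePost(Post):
    pass


class URLPost(Post):
    pass


# Hypothetical table mapping each concrete class to its template label.
POST_TYPE_LABELS = {
    TextPost: "text",
    YoutubeEmbedPost: "youtube",
    ImagePost: "image",
    URLPost: "url",
}


def get_post_type(post):
    """Return the label for the post's exact class, or None if it is unknown."""
    return POST_TYPE_LABELS.get(type(post))
```

One trade-off to keep in mind: looking up type(post) matches the exact class only, whereas isinstance also matches further subclasses. If you subclass TextPost later, the table needs a new entry.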

And the template:

{% extends "base.html" %}
{% load socialpolymorphic_tags %}
{% block content %}
    {% for post in posts %}
        <h2>{{ post.title }}</h2>
        {% get_post_type post as post_type %}
        {% if post_type == "text" %}
            <p>{{ post.body }}</p>
        {% elif post_type == "image" %}
            <img src="{{ post.image.url }}" alt="{{ post.title }}">
        {% elif post_type == "youtube" %}
            {% include "_youtube-embed.html" with video_id=post.video_id %}
        {% elif post_type == "url" %}
            <a href="{{ post.url }}">{{ post.url }}</a>
        {% endif %}
    {% endfor %}
{% include "_pagination.html" %}
{% endblock %}

Results:

  • 118ms to render
  • 6 SQL queries

This is better than the concrete inheritance, OneToOneField, and contenttypes methods.

Not only does it perform better, but it uses the same intuitive data model as concrete inheritance. All you have to do is inherit from a base class and perform any queries and filtering on this single base class.

Creating objects

This technique is exactly the same as concrete inheritance from above and you can use the same straightforward way of creating objects you used there.

Using django-model-utils

django-model-utils has a nice manager called InheritanceManager (thanks to /u/pancakeses for mentioning this). It does the same thing as django-polymorphic above but it’s more performant.

from django.conf import settings
from django.db import models
from model_utils.managers import InheritanceManager


class Post(models.Model):
    user = models.ForeignKey(
        settings.AUTH_USER_MODEL,
        on_delete=models.CASCADE,
        related_name="socialmodelutils_posts",
    )
    title = models.CharField(max_length=155)
    date_added = models.DateTimeField(auto_now_add=True)

    objects = InheritanceManager()

    class Meta:
        ordering = ["-date_added"]

    def __str__(self):
        return f"{self.title}"


class TextPost(Post):
    body = models.TextField(max_length=10000)


class YoutubeEmbedPost(Post):
    video_id = models.CharField(max_length=24)


def user_directory_path(instance, filename):
    return "user_{0}/{1}".format(instance.user.id, filename)


class ImagePost(Post):
    image = models.ImageField(upload_to=user_directory_path)


class URLPost(Post):
    url = models.URLField()

And the view:

from django.core.paginator import Paginator
from django.template.response import TemplateResponse

from .models import Post


def post_list(request):
    ctx = {}
    paginator = Paginator(Post.objects.all().select_subclasses(), 10)
    page_obj = paginator.page(request.GET.get("page", 1))
    ctx.update({"posts": page_obj.object_list, "page_obj": page_obj})
    return TemplateResponse(request, "socialmodelutils/list.html", ctx)

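The code is pretty much the same except that we’re no longer inheriting from PolymorphicModel. Instead, we attach InheritanceManager to the base model and call select_subclasses() in the view so that the QuerySet returns subclass instances.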

Results:

  • 52ms to render
  • 2 SQL queries

It looks like this manager is much better optimised than django-polymorphic. It fetches all the related subclasses in a single query with multiple JOINs, which is more efficient than multiple queries with one JOIN each.
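To make the "single query with multiple JOINs" idea concrete, here is a small sketch using the standard library's sqlite3 module. The table layout is a simplified assumption (two child tables, no app-label prefixes), and the SQL is only roughly the shape of query involved, not the exact statement the manager generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE textpost (
        post_ptr_id INTEGER PRIMARY KEY REFERENCES post(id),
        body TEXT NOT NULL
    );
    CREATE TABLE urlpost (
        post_ptr_id INTEGER PRIMARY KEY REFERENCES post(id),
        url TEXT NOT NULL
    );
    INSERT INTO post (id, title) VALUES (1, 'Hello'), (2, 'A link');
    INSERT INTO textpost (post_ptr_id, body) VALUES (1, 'First post');
    INSERT INTO urlpost (post_ptr_id, url) VALUES (2, 'https://example.com');
    """
)


def fetch_feed(connection):
    """Fetch every post and its subclass columns with one query."""
    rows = connection.execute(
        """
        SELECT post.id, post.title, textpost.body, urlpost.url
        FROM post
        LEFT JOIN textpost ON textpost.post_ptr_id = post.id
        LEFT JOIN urlpost ON urlpost.post_ptr_id = post.id
        ORDER BY post.id
        """
    ).fetchall()

    feed = []
    for _post_id, title, body, url in rows:
        # Whichever child table matched tells us the concrete type.
        if body is not None:
            feed.append(("text", title, body))
        elif url is not None:
            feed.append(("url", title, url))
        else:
            feed.append(("post", title, None))
    return feed


feed = fetch_feed(conn)
```

Each row already carries the columns of whichever child table matched, so no follow-up query per post or per type is needed.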

Creating objects

You can create objects the same way as concrete inheritance.

Conclusion

In this post, we looked at:

  1. Why polymorphism is necessary to solve a problem where the user wants to work with multiple types of data using a single interface.
  2. How different polymorphism techniques provide different maintainability and performance.
  3. How to use built-in tools Django provides for polymorphism.
  4. How to use django-polymorphic or django-model-utils to achieve fast and easy polymorphism.

A few suggestions:

  • If your data isn’t too complicated, just use a single model with custom validation.
  • If your data is complex and you can’t afford to add 3rd party dependencies, use contenttypes or JSONField.
  • If your data is complex and you can install 3rd party tools, use django-model-utils unless it can’t handle your use case.
  • If high performance and minimum resource usage are a must, use a single model.

In many cases, you might prefer JSONField instead of GenericForeignKey because the former provides better performance with the same flexibility.

When it comes to choosing the best solution for you, consider both complexity and performance.

Sometimes, a few extra milliseconds of CPU time aren’t a big deal if the developer productivity gains are large.

Of all the solutions mentioned, concrete inheritance is the most straightforward but also the worst performing one. I would go as far as saying that it should never be used, no matter how easy it makes modeling. If you really want concrete inheritance, strongly consider using InheritanceManager from django-model-utils instead.

Anyway, you now know everything about model polymorphism in Django. Go and build some awesome apps.