Multilingual Support In Django

Mohammad Alsakhawy on 2024-02-17

Table of Contents

Introduction

Visiting the root page: URL internationalization

Translating strings: Behind the scenes

Translating template files

Translating JS files

Conclusion

Introduction

My recent contributions to django CMS got me deep into the world of internationalization. Supporting multiple languages is not an easy task. So, in this post, I’d like to give a brief overview of how Django supports internationalization (or i18n for short).

Before we begin, let’s get some jargon out of the way:

Django is a very mature project, so it has a pretty robust system for multi-lingual support throughout its various components. With just a few hooks (which are already enabled by default), your app can support multiple languages.

I like to learn stuff from a top-down approach, or at least what I think is a “top-down” approach. So, let’s start from the perspective of a user making a request to our Django-powered web app that lives at 127.0.0.1:8000 and see how Django reacts to that request.

NOTE: You can find the code used in this article at https://github.com/sakhawy/django-i18n-demo if you want to get your hands dirty!

Visiting the root page: URL internationalization

We’ll start with a GET http://127.0.0.1:8000/en/. Notice that /en/? It’s critical to what’s going to happen.

That /en/ is “injected” by Django to let its internals know that English is the currently active language.

i18n_patterns()

By using this function in the root URLconf — wrapping other URLs — Django will prefix your normal URLs with the currently active language:

# conf/urls.py
from django.urls import include, path
from django.conf.urls.i18n import i18n_patterns

urlpatterns = i18n_patterns(
    path('', include('demo.urls')),
)

By default, the currently active language is the installation-wide language you set in the LANGUAGE_CODE setting (in settings.py).

# conf/settings.py
[...]
LANGUAGE_CODE = 'en'
[...]

That raises the question:

How does Django detect the active language?

We already mentioned the LANGUAGE_CODE setting, but that’s only used when all the other discovery options fail.

LocaleMiddleware is responsible for dynamically detecting the user’s active language. It tries the following steps in order, and if a step fails, it moves straight to the next one:

  1. Checks the URL prefix (the one injected by i18n_patterns).
  2. Checks the cookie named by LANGUAGE_COOKIE_NAME (django_language by default).
  3. Checks the Accept-Language HTTP header.
  4. Defaults to using the language in LANGUAGE_CODE.
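
The discovery order above can be sketched in plain Python. This is a simplified illustration, not Django’s actual implementation: detect_language and its parameters are hypothetical names, and the Accept-Language handling ignores quality values and only tries the exact tag and its base language.

```python
def detect_language(path, cookies, accept_language, supported, default="en"):
    """Return the first supported language found, checking sources in order."""
    # 1. URL prefix, e.g. "/ar/some/page/" -> "ar"
    prefix = path.lstrip("/").split("/", 1)[0]
    if prefix in supported:
        return prefix

    # 2. Language cookie (Django's default cookie name is "django_language")
    cookie_lang = cookies.get("django_language")
    if cookie_lang in supported:
        return cookie_lang

    # 3. Accept-Language header, e.g. "de-DE,de;q=0.9,en;q=0.8"
    #    (simplification: entries are tried in the order given)
    for item in accept_language.split(","):
        lang = item.split(";", 1)[0].strip().lower()
        base = lang.split("-", 1)[0]
        for candidate in (lang, base):
            if candidate in supported:
                return candidate

    # 4. Fall back to the site-wide default (the LANGUAGE_CODE setting)
    return default


SUPPORTED = {"en", "ar", "de"}
from_url = detect_language("/ar/", {}, "", SUPPORTED)
from_cookie = detect_language("/", {"django_language": "de"}, "en", SUPPORTED)
from_header = detect_language("/", {}, "de-DE,de;q=0.9", SUPPORTED)
from_default = detect_language("/", {}, "fr", SUPPORTED)
```

Each call exercises one step: the URL prefix wins when present, then the cookie, then the header, and finally the default.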

Translating strings: Behind the scenes

Thus far, we’ve made a GET request to /en/ (that is, the root page / wrapped in i18n_patterns).

Now, if we take a look at the simple view that serves the response for that URL:

# demo/views.py
[...]
def hello_world(request):
    return HttpResponse(_('Hello, World!'))

Everything seems normal, except for the _() function that’s wrapping the response text! That’s an alias for the gettext function. You can think of it as the entry point of the translation process.
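
To make the role of _() a bit more concrete, here is a minimal sketch that uses Python’s standard-library gettext module as a stand-in for Django’s own translation machinery. DictTranslations is a hypothetical class that keeps its catalog in a plain dict; real catalogs are loaded from compiled message files.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Toy catalog: looks up msgids in a dict instead of a compiled file."""

    def __init__(self, catalog):
        super().__init__()
        self._catalog = catalog

    def gettext(self, message):
        # Fall back to the original string when no translation exists.
        return self._catalog.get(message, message)


arabic = DictTranslations({"Hello, World!": "أهلا يا عالم!"})
_ = arabic.gettext  # the conventional alias

translated = _("Hello, World!")
untranslated = _("Goodbye!")  # not in the catalog, so it comes back unchanged
```

The important behavior is the fallback: a string with no translation is returned as-is, so marking a string never breaks the page.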

GNU gettext

Django uses the GNU gettext framework to facilitate the process of translating strings.

[…] the GNU gettext utilities are a set of tools that provides a framework to help other GNU packages produce multi-lingual messages. These tools include a set of conventions about how programs should be written to support message catalogs, a directory and file naming organization for the message catalogs themselves, a runtime library supporting the retrieval of translated messages, and a few stand-alone programs to massage in various ways the sets of translatable strings, or already translated strings.

Illustration: How Django translates string literals

It all boils down to the following:

# demo/views.py
[...]
def hello_world(request):
    return HttpResponse(_('Hello, World!'))  # "Hello, World!" is 
                                             # marked for translation

The string literals and their translations will live in .po files (message files). Those files live in a well-structured directory (message catalog) organized by language.

.
├── conf
│   ├── asgi.py
│   ├── __init__.py
│   ├── locale
│   │   └── ar
│   │       └── LC_MESSAGES
│   │           ├── django.mo  # the compiled message file
│   │           └── django.po  # the message file

For each language you generate message files for, there will be a sub-directory in locale/ named after that language’s code. In our case, the ar directory (Arabic) contains all the message files and compiled message files for Arabic. If we decided to support another language, German (de) for example, this is what the locale directory would look like:

├── locale
│   ├── ar
│   │   └── LC_MESSAGES
│   │       ├── django.mo  # the compiled message file
│   │       └── django.po  # the message file
│   └── de
│       └── LC_MESSAGES
│           ├── django.mo  # the compiled message file
│           └── django.po  # the message file
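
The locale/<language>/LC_MESSAGES/<domain>.mo layout is a gettext convention, and Python’s standard-library gettext module resolves files the same way. The sketch below builds a throwaway directory tree (the temporary paths are purely illustrative) and asks gettext.find() to locate the compiled file for a language.

```python
import gettext
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    locale_dir = Path(tmp) / "locale"
    mo_file = locale_dir / "ar" / "LC_MESSAGES" / "django.mo"
    mo_file.parent.mkdir(parents=True)
    mo_file.touch()  # empty placeholder; find() only checks that the file exists

    # "django" is the gettext domain, which doubles as the file name.
    found_ar = gettext.find("django", localedir=str(locale_dir), languages=["ar"])
    found_de = gettext.find("django", localedir=str(locale_dir), languages=["de"])

    found_ar_matches = found_ar is not None and Path(found_ar) == mo_file
```

Arabic resolves to the file we created, while German yields None because no de/ sub-directory exists yet.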

To generate message files for a language (e.g. Arabic) we use makemessages:

$ django-admin makemessages -l ar  # command to create message files for Arabic

By default, Django “selects” a large list of languages. But since it’s highly unlikely you’ll translate your site to all languages, you’ll probably want to override the LANGUAGES setting:

# [YOUR_PROJECT]/settings.py
from django.utils.translation import gettext_lazy as _  # lazy variant is safe in settings

LANGUAGES = [
    ("en", _("English")),
    ("ar", _("Arabic")),
    ("de", _("German")),
]

Working on .po message files is the job of the translators. This is what the .po file for our hello_world view looks like:

# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2024-02-24 20:24+0200\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

#: demo/views.py:5
msgid "Hello, World!"
msgstr "أهلا يا عالم!"
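
To show how simple the msgid/msgstr pairing is, here is a rough sketch of reading such entries in Python. parse_po is a hypothetical helper that only handles single-line strings; real .po files also have multi-line strings, plural forms, and escape sequences, which this ignores.

```python
import re

PO_TEXT = '''
#: demo/views.py:5
msgid "Hello, World!"
msgstr "أهلا يا عالم!"
'''

ENTRY_LINE = re.compile(r'^(msgid|msgstr) "(.*)"$')


def parse_po(text):
    """Collect {msgid: msgstr} pairs from single-line .po entries."""
    entries = {}
    current_id = None
    for line in text.splitlines():
        match = ENTRY_LINE.match(line)
        if not match:
            continue  # comments, references, and blank lines
        kind, value = match.groups()
        if kind == "msgid":
            current_id = value
        elif current_id is not None:
            entries[current_id] = value
            current_id = None
    return entries


catalog = parse_po(PO_TEXT)
```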

Once compiled, the message file becomes a binary .mo file. This is the hexdump of that compiled .mo file:

00000000: de12 0495 0000 0000 0200 0000 1c00 0000  ................
00000010: 2c00 0000 0500 0000 3c00 0000 0000 0000  ,.......<.......
00000020: 5000 0000 0d00 0000 5100 0000 1701 0000  P.......Q.......
00000030: 5f00 0000 1700 0000 7701 0000 0100 0000  _.......w.......
00000040: 0200 0000 0000 0000 0000 0000 0000 0000  ................
00000050: 0048 656c 6c6f 2c20 576f 726c 6421 0050  .Hello, World!.P
00000060: 726f 6a65 6374 2d49 642d 5665 7273 696f  roject-Id-Versio
00000070: 6e3a 2050 4143 4b41 4745 2056 4552 5349  n: PACKAGE VERSI
00000080: 4f4e 0a52 6570 6f72 742d 4d73 6769 642d  ON.Report-Msgid-
00000090: 4275 6773 2d54 6f3a 200a 504f 2d52 6576  Bugs-To: .PO-Rev
000000a0: 6973 696f 6e2d 4461 7465 3a20 5945 4152  ision-Date: YEAR
000000b0: 2d4d 4f2d 4441 2048 4f3a 4d49 2b5a 4f4e  -MO-DA HO:MI+ZON
000000c0: 450a 4c61 7374 2d54 7261 6e73 6c61 746f  E.Last-Translato
000000d0: 723a 2046 554c 4c20 4e41 4d45 203c 454d  r: FULL NAME <EM
000000e0: 4149 4c40 4144 4452 4553 533e 0a4c 616e  AIL@ADDRESS>.Lan
000000f0: 6775 6167 652d 5465 616d 3a20 4c41 4e47  guage-Team: LANG
00000100: 5541 4745 203c 4c4c 406c 692e 6f72 673e  UAGE <LL@li.org>
00000110: 0a4c 616e 6775 6167 653a 200a 4d49 4d45  .Language: .MIME
00000120: 2d56 6572 7369 6f6e 3a20 312e 300a 436f  -Version: 1.0.Co
00000130: 6e74 656e 742d 5479 7065 3a20 7465 7874  ntent-Type: text
00000140: 2f70 6c61 696e 3b20 6368 6172 7365 743d  /plain; charset=
00000150: 5554 462d 380a 436f 6e74 656e 742d 5472  UTF-8.Content-Tr
00000160: 616e 7366 6572 2d45 6e63 6f64 696e 673a  ansfer-Encoding:
00000170: 2038 6269 740a 00d8 a3d9 87d9 84d8 a720   8bit.......... 
00000180: d98a d8a7 20d8 b9d8 a7d9 84d9 8521 00    .... ........!.
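
The first 28 bytes of the hexdump above follow the header layout of the GNU .mo format: seven little-endian 32-bit integers. As a small sketch, we can decode them with struct; the variable names are my own, and the bytes are copied from the dump.

```python
import struct

# First 28 bytes of the hexdump (offsets 0x00 to 0x1b).
header = bytes.fromhex(
    "de120495"  # magic number
    "00000000"  # file format revision
    "02000000"  # number of strings
    "1c000000"  # offset of the original-strings table
    "2c000000"  # offset of the translated-strings table
    "05000000"  # size of the hash table
    "3c000000"  # offset of the hash table
)

(magic, revision, count,
 orig_table, trans_table,
 hash_size, hash_offset) = struct.unpack("<7I", header)
```

The count is 2 because the file holds the metadata entry (the empty msgid) plus our single "Hello, World!" string.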

To compile message files, we use compilemessages:

$ django-admin compilemessages  # command to compile all the message files

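So, the steps for adding localization are, in short:

  1. Mark the strings for translation with gettext (or its _() alias).
  2. Generate the .po message files with makemessages.
  3. Translate the msgstr entries in the .po files.
  4. Compile the .po files into .mo files with compilemessages.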

Translating template files

We’ve seen how to tag string literals and how to use their translations in .py files. What about the Django templates? How can they access those translations? Well, we have two template tags for that: translate and blocktranslate.

NOTE: To use those tags, you have to load i18n tags (as seen in the provided examples).

<!-- demo/templates/demo/hello_world.html -->
{% load i18n %}
{% translate 'Hello, World!' as hello_world %}
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>{{ hello_world }}</title>
</head>
<body>
    {{ hello_world }}
</body>
</html>

Here translate is used to mark the string literal ‘Hello, World!’ for translation.

The blocktranslate tag is similar to translate but is used for more complicated sentences with many literals and variables. Here’s an example from the Django docs:

{% blocktranslate with book_t=book|title author_t=author|title %}
This is {{ book_t }} by {{ author_t }}
{% endblocktranslate %}
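
When such a block is extracted, the template variables end up as named placeholders in the msgid, so translators can reorder them freely. The sketch below imitates that conversion in plain Python. to_msgid is a hypothetical helper, the sample values are made up, and stripping the surrounding whitespace is a simplification of this sketch.

```python
import re

VARIABLE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def to_msgid(block):
    """Turn '{{ name }}' template variables into '%(name)s' placeholders."""
    return VARIABLE.sub(r"%(\1)s", block.strip())


msgid = to_msgid("\nThis is {{ book_t }} by {{ author_t }}\n")

# Named placeholders are filled in after the translation is looked up.
rendered = msgid % {"book_t": "Some Book", "author_t": "Some Author"}
```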

NOTE: We’ve marked ‘Hello, World!’ twice in two files. How do you think gettext will act when you extract the tagged strings into message files? Will it have a separate entry for each occurrence? Well, that’s inefficient, isn’t it? It’s actually smart enough not to repeat itself:

# conf/locale/ar/LC_MESSAGES/django.po

#: demo/templates/demo/hello_world.html:2 demo/views.py:7
msgid "Hello, World!"
msgstr "أهلا يا عالم!"

Notice it extracted the same literal from two files: hello_world.html and views.py.
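
Conceptually, the extraction step groups every occurrence by its msgid and lists all locations on one reference line. A toy version of that grouping might look like this; group_by_msgid and format_entry are hypothetical helpers, not part of gettext or Django.

```python
from collections import defaultdict

# (file, line, msgid) tuples, as an extractor might report them.
found = [
    ("demo/templates/demo/hello_world.html", 2, "Hello, World!"),
    ("demo/views.py", 7, "Hello, World!"),
]


def group_by_msgid(occurrences):
    """Map each msgid to the list of 'file:line' locations where it appears."""
    grouped = defaultdict(list)
    for filename, lineno, msgid in occurrences:
        grouped[msgid].append(f"{filename}:{lineno}")
    return dict(grouped)


def format_entry(msgid, locations):
    """Render one untranslated .po entry with a combined reference line."""
    reference = "#: " + " ".join(locations)
    return "\n".join([reference, f'msgid "{msgid}"', 'msgstr ""'])


grouped = group_by_msgid(found)
entry = format_entry("Hello, World!", grouped["Hello, World!"])
```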

Translating JS files

For both .py files and Django Templates, we had access to the gettext framework (directly in the .py files and indirectly through template tags for Django templates).

The story is a bit different for JS files. The literal string translation thus far happened before serving the response, inside Django. That means those JS files, after being sent to the browser, won’t have access to either gettext or the .po and .mo files!

What to do?

JavaScriptCatalog to the rescue!

This view generates a JavaScript library with functions that mimic the gettext interface, plus the translated strings themselves, which it reads from the compiled .mo files. That’s called the JavaScript catalog.

The way this works is:

  1. Add the JavaScriptCatalog view to your URLconf.
  2. Load that view’s URL with a script tag on any page whose JavaScript is marked for translation; the browser then requests the catalog whenever such a page loads.

Here’s what conf/urls.py looks like:

# conf/urls.py
[...]
from django.views.i18n import JavaScriptCatalog

urlpatterns = i18n_patterns(
    path("jsi18n/", JavaScriptCatalog.as_view(), name="javascript-catalog"),
    [...]
)

To use that view, we load it in our demo/hello_world.html file:

<!-- demo/templates/demo/hello_world.html -->
[...]
    <script src="{% url 'javascript-catalog' %}"></script> <!-- This line -->

    <script src="{% static 'demo/js/hello_world.js' %}" type="text/javascript"></script>
[...]

Now we can use gettext in our demo/hello_world.js file:

// demo/static/demo/js/hello_world.js
alert(gettext("Hello, World!"));
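
To get a feel for what the catalog response contains, here is a heavily simplified sketch of a function that renders such a script. render_js_catalog is a hypothetical name, and the real view emits considerably more (pluralization and interpolation helpers, for instance); this only shows the idea of shipping the translations together with a gettext() function.

```python
import json


def render_js_catalog(catalog):
    """Return JavaScript source that embeds the catalog and defines gettext()."""
    return (
        "const catalog = " + json.dumps(catalog, sort_keys=True) + ";\n"
        "function gettext(msgid) {\n"
        "  const value = catalog[msgid];\n"
        "  return typeof value === 'undefined' ? msgid : value;\n"
        "}\n"
    )


js_source = render_js_catalog({"Hello, World!": "أهلا يا عالم!"})

# Recover the embedded catalog from the first line to check the round trip.
first_line = js_source.split("\n")[0]
embedded = json.loads(first_line[len("const catalog = "):-1])
```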

NOTE: When we generate the message files for JS, it’s a bit different; we need to specify the “gettext domain.” That can be done by adding an extra parameter (-d djangojs) to our command:

$ django-admin makemessages -d djangojs -l ar  # generates message file for js code
$ django-admin compilemessages               # compiles the generated file

The first command will generate a new djangojs.po file:

# conf/locale/ar/LC_MESSAGES/djangojs.po
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2024-02-25 20:31+0200\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

#: demo/static/demo/hello_world.js:1
msgid "Hello, World!"
msgstr "أهلا يا عالم!"

The second command compiles it into a new djangojs.mo file. This is its hexdump:

# conf/locale/ar/LC_MESSAGES/djangojs.mo
00000000: de12 0495 0000 0000 0200 0000 1c00 0000  ................
00000010: 2c00 0000 0500 0000 3c00 0000 0000 0000  ,.......<.......
00000020: 5000 0000 0d00 0000 5100 0000 1701 0000  P.......Q.......
00000030: 5f00 0000 1700 0000 7701 0000 0100 0000  _.......w.......
00000040: 0200 0000 0000 0000 0000 0000 0000 0000  ................
00000050: 0048 656c 6c6f 2c20 576f 726c 6421 0050  .Hello, World!.P
00000060: 726f 6a65 6374 2d49 642d 5665 7273 696f  roject-Id-Versio
00000070: 6e3a 2050 4143 4b41 4745 2056 4552 5349  n: PACKAGE VERSI
00000080: 4f4e 0a52 6570 6f72 742d 4d73 6769 642d  ON.Report-Msgid-
00000090: 4275 6773 2d54 6f3a 200a 504f 2d52 6576  Bugs-To: .PO-Rev
000000a0: 6973 696f 6e2d 4461 7465 3a20 5945 4152  ision-Date: YEAR
000000b0: 2d4d 4f2d 4441 2048 4f3a 4d49 2b5a 4f4e  -MO-DA HO:MI+ZON
000000c0: 450a 4c61 7374 2d54 7261 6e73 6c61 746f  E.Last-Translato
000000d0: 723a 2046 554c 4c20 4e41 4d45 203c 454d  r: FULL NAME <EM
000000e0: 4149 4c40 4144 4452 4553 533e 0a4c 616e  AIL@ADDRESS>.Lan
000000f0: 6775 6167 652d 5465 616d 3a20 4c41 4e47  guage-Team: LANG
00000100: 5541 4745 203c 4c4c 406c 692e 6f72 673e  UAGE <LL@li.org>
00000110: 0a4c 616e 6775 6167 653a 200a 4d49 4d45  .Language: .MIME
00000120: 2d56 6572 7369 6f6e 3a20 312e 300a 436f  -Version: 1.0.Co
00000130: 6e74 656e 742d 5479 7065 3a20 7465 7874  ntent-Type: text
00000140: 2f70 6c61 696e 3b20 6368 6172 7365 743d  /plain; charset=
00000150: 5554 462d 380a 436f 6e74 656e 742d 5472  UTF-8.Content-Tr
00000160: 616e 7366 6572 2d45 6e63 6f64 696e 673a  ansfer-Encoding:
00000170: 2038 6269 740a 00d8 a3d9 87d9 84d8 a720   8bit.......... 
00000180: d98a d8a7 20d8 b9d8 a7d9 84d9 8521 00    .... ........!.

And this is what our complete locale directory tree looks like so far:

├── conf
│   ├── asgi.py
│   ├── __init__.py
│   ├── locale
│   │   └── ar
│   │       └── LC_MESSAGES
│   │           ├── djangojs.mo
│   │           ├── djangojs.po
│   │           ├── django.mo
│   │           └── django.po

NOTE: There’s a performance problem with the JavaScriptCatalog view: it’s dynamically generated from the .mo files on each request. That’s bad.

There are multiple ways to mitigate this:

  1. Server-side caching to reduce CPU load:
# https://docs.djangoproject.com/en/5.0/topics/i18n/translation/#note-on-performance
from django.urls import path
from django.views.decorators.cache import cache_page
from django.views.i18n import JavaScriptCatalog

# get_version() is a placeholder for a helper you define yourself.
# The value it returns must change when translations change.
urlpatterns = [
    path(
        "jsi18n/",
        cache_page(86400, key_prefix="jsi18n-%s" % get_version())(
            JavaScriptCatalog.as_view()
        ),
        name="javascript-catalog",
    ),
]

  2. Client-side caching to reduce bandwidth:

Either use ConditionalGetMiddleware or last_modified as in this example:

# https://docs.djangoproject.com/en/5.0/topics/i18n/translation/#note-on-performance
from django.urls import path
from django.utils import timezone
from django.views.decorators.http import last_modified
from django.views.i18n import JavaScriptCatalog

last_modified_date = timezone.now()

urlpatterns = [
    path(
        "jsi18n/",
        last_modified(lambda req, **kw: last_modified_date)(
            JavaScriptCatalog.as_view()
        ),
        name="javascript-catalog",
    ),
]

  3. Serving the JS catalogs as static files using django-statici18n:

django-statici18n pre-generates the JavaScript catalogs from all of your apps and puts them in a single location. By default, that location is in your STATIC_ROOT under the name statici18n. The catalogs are then served as static files, just like your JS/CSS files.
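
Going back to the server-side caching option: the get_version() call there is left for you to define. One possible sketch, under the assumption that hashing the compiled .mo files is an acceptable cache key, is shown below. Unlike the placeholder in the docs, this hypothetical version takes the locale directory as an argument.

```python
import hashlib
import tempfile
from pathlib import Path


def get_version(locale_dir):
    """Hash every compiled .mo file so the value changes with the translations."""
    digest = hashlib.sha256()
    for mo_file in sorted(Path(locale_dir).rglob("*.mo")):
        # Include the relative path so renames and moves also change the hash.
        digest.update(mo_file.relative_to(locale_dir).as_posix().encode("utf-8"))
        digest.update(mo_file.read_bytes())
    return digest.hexdigest()[:12]


# Quick demonstration with a throwaway directory and fake file contents.
with tempfile.TemporaryDirectory() as tmp:
    mo = Path(tmp) / "ar" / "LC_MESSAGES" / "djangojs.mo"
    mo.parent.mkdir(parents=True)
    mo.write_bytes(b"first build")
    before = get_version(tmp)
    mo.write_bytes(b"second build")
    after = get_version(tmp)
```

Because the key prefix changes whenever a compiled file changes, stale cached catalogs are simply never looked up again.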

Conclusion

Given that the developer has marked the strings for translation, the translators have translated the .po message files, and we have the compiled .mo message files, here’s how a response gets translated:

  1. We make a GET request to 127.0.0.1:8000/en/. That /en/ is injected by i18n_patterns so that the LocaleMiddleware knows the user’s active language later on.
  2. LocaleMiddleware goes through its discovery algorithm. It first checks the URL prefix. It finds that the language is English.
  3. Django translates the string literals using the already-compiled .mo message files loaded in memory.
  4. If needed, the browser will make a request to JavaScriptCatalog to get the translations for the JavaScript files, along with a small code library that gives the client gettext-like functions.

Well, that’s it! Have a nice day!

If this article piqued your interest, you can have a look at the References.

References