How to read and parse JSON data with Python

14 February 2023 | 10 min read

JSON, or JavaScript Object Notation, is a popular data interchange format that has become a staple in modern web development. If you're a programmer, chances are you've come across JSON in one form or another. It's widely used in REST APIs, single-page applications, and other modern web technologies to transmit data between a server and a client, or between different parts of a client-side application. JSON is lightweight, easy to read, and simple to use, making it an ideal choice for developers looking to transmit data quickly and efficiently.

In this article, you will learn how to work with JSON in Python:

What is JSON

JSON to Python

For those who are new to JSON, it's a text-based format that uses key-value pairs to represent data. Keys are strings, and values can be strings, numbers, booleans, null, arrays, or other JSON objects. This structure makes it simple to transmit complex data structures in a human-readable format, and it's easy for machines to parse and generate as well. Because of its popularity, languages such as Python and JavaScript ship with built-in support for reading and writing JSON data, and most other languages have well-established libraries for it.

Here is what typical JSON data might look like:

{
    "name": "John Doe",
    "age": 32,
    "address": {
        "street": "123 Main St",
        "city": "Anytown",
        "state": "CA"
    },
    "tags": ["Finance", "Admin"]
}

JSON is a completely language-independent format but uses conventions that are familiar to programmers of the C family of languages, including C, C++, C#, Java, JavaScript, Perl, Python, and many others. This makes it really easy to parse JSON data into corresponding data structures in the target language.

For instance, here is how the same data as above will be represented in Python:

{
    "name": "John Doe",
    "age": 32,
    "address": {
        "street": "123 Main St", 
        "city": "Anytown", 
        "state": "CA"
    },
    "tags": ["Finance", "Admin"],
}

And likewise, here is how it will be represented in JavaScript:

{
    name: "John Doe",
    age: 32,
    address: {
        street: "123 Main St", 
        city: "Anytown", 
        state: "CA"
    },
    tags: ["Finance", "Admin"]
}

JSON has become the de facto standard on the web for API responses and many other uses, for several reasons:

  1. Lightweight: JSON carries little syntactic overhead compared to more verbose data interchange formats such as XML, making it quick to transmit and parse.
  2. Human-readable: JSON data is represented as key-value pairs, making it easy for humans to read and understand. This is particularly useful when debugging and troubleshooting.
  3. Language-independent: JSON is a plain-text format with a small set of data types that is not tied to any one language. This means that JSON data can be exchanged between different programming languages and platforms, although language-specific types (such as Python tuples) have to be mapped onto the types JSON offers.
  4. Easy to use: Languages such as Python and JavaScript have built-in support for reading and writing JSON data, and most others have mature libraries for it, making it easy for developers to integrate into their applications.
  5. Wide adoption: JSON has been widely adopted in modern web development, particularly in REST APIs and single-page applications, which has led to a large ecosystem of libraries and tools that support it.
  6. Flexibility: JSON can be used to represent a wide range of data structures, including simple key-value pairs, arrays, and complex nested objects. This flexibility makes it an ideal choice for representing complex data structures in a simple and intuitive format.

Now that you know what JSON is, what it looks like, and why it is popular, let's take a look at how to work with JSON in Python.

Convert JSON to Python

Python has extensive built-in support for JSON via its json package. It has been part of the standard library since Python 2.6 and can help you convert JSON to Python and vice versa. As you saw earlier, JSON objects and Python dictionaries look very similar, and if you do not pay attention, you might even confuse one for the other.

First, go ahead and create a new directory named json_tutorial. This will contain all the code you will write in this tutorial. Then create a new app.py file in this directory. This file will store all of the Python code.

To make use of the json package, you need to import it into your Python projects. Go to the very top of the app.py file and add this import:

import json

Now you are ready to use any of the functions and classes that are exposed by the json package.

How to convert a JSON string to a Python object

Suppose you have a JSON string:

json_string = """
{
    "name": "John Doe",
    "age": 32,
    "address": {
        "street": "123 Main St",
        "city": "Anytown",
        "state": "CA"
    },
    "tags": ["Finance", "Admin"]
}
"""

You can use the json.loads method to easily convert this JSON string to a Python object:

import json

json_string = """
{
    "name": "John Doe",
    "age": 32,
    "address": {
        "street": "123 Main St",
        "city": "Anytown",
        "state": "CA"
    },
    "tags": ["Finance", "Admin"]
}
"""

parsed_json = json.loads(json_string)
print(parsed_json["address"]["street"])
# Output: 123 Main St

Now you can access any key/value the same way as you would in a Python dictionary. This is the conversion table used by Python while decoding JSON to a Python object:

JSON             Python
object           dict
array            list
string           str
number (int)     int
number (real)    float
true             True
false            False
null             None
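
As a quick check of the table above, the following sketch decodes one small JSON document that contains every JSON value type and records the Python type each value ends up as. The document contents and variable names are made up for illustration.

```python
import json

# A hypothetical JSON document that contains every JSON value type.
sample = (
    '{"obj": {"k": "v"}, "arr": [1, 2], "text": "hi", "whole": 7, '
    '"real": 7.5, "yes": true, "no": false, "nothing": null}'
)

decoded = json.loads(sample)

# Map each key to the name of the Python type its value was decoded into.
decoded_types = {key: type(value).__name__ for key, value in decoded.items()}

for key, type_name in decoded_types.items():
    print(f"{key}: {type_name}")
```

Note that true and false both come back as Python's bool, and null comes back as None, whose type is named NoneType.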

You can read more about the json.loads method in the official docs.

How to convert a JSON file to a Python object

What if the same JSON string is stored in a file? Python makes it super easy to parse that into a Python object as well. Suppose you have a JSON file named profile.json in the json_tutorial folder and it contains the same JSON that you saw earlier:

{
    "name": "John Doe",
    "age": 32,
    "address": {
        "street": "123 Main St",
        "city": "Anytown",
        "state": "CA"
    },
    "tags": ["Finance", "Admin"]
}

You can use the json.load method to parse this into a Python object. Notice that this time it is load instead of loads. This slight distinction has a big impact on how each of these methods is used. json.loads, which you used previously, takes a str, bytes, or bytearray instance that contains a JSON document and converts it into a Python object, whereas json.load takes a text or binary file-like object that contains JSON and has a read() method.

This is how you can use json.load to parse the profile.json file:

import json

with open("profile.json", "r") as f:
    parsed_json = json.load(f)
    
print(parsed_json["name"])
# Output: John Doe
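
To make the load/loads distinction concrete, here is a small sketch that feeds the same JSON to both. It uses io.StringIO as a stand-in for a real file so that it runs without profile.json on disk; the variable names are illustrative only.

```python
import io
import json

raw = '{"name": "John Doe", "age": 32}'

# json.loads accepts str, bytes, or bytearray that contain JSON text.
from_str = json.loads(raw)
from_bytes = json.loads(raw.encode("utf-8"))

# json.load accepts any object with a read() method; StringIO stands in
# here for a file opened with open(...).
from_file_like = json.load(io.StringIO(raw))

print(from_str == from_bytes == from_file_like)
# Output: True
```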

Now let's take a look at how you can do it the other way around and encode a Python object to JSON.

Convert Python to JSON

Just like how the json package provides json.load and json.loads, it provides similar methods to "dump" a Python object to a string or a file. In this section, you will learn what these methods are and how to use them.

How to convert a Python object to a JSON string

The json package contains the json.dumps method to convert a Python object to a JSON string. This is how it works:

import json

profile = {
    "name": "John Doe",
    "age": 32,
    "address": {
        "street": "123 Main St",
        "city": "Anytown",
        "state": "CA"
    },
    "tags": ["Finance", "Admin"]
}

json_string = json.dumps(profile)

print(type(json_string))
# Output: <class 'str'>

print(json_string)
# Output: {"name": "John Doe", "age": 32, "address": {"street": "123 Main St", "city": "Anytown", "state": "CA"}, "tags": ["Finance", "Admin"]}

The default conversion table used by json looks like this:

Python                                     JSON
dict                                       object
list, tuple                                array
str                                        string
int, float, int- & float-derived Enums     number
True                                       true
False                                      false
None                                       null
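
The encoding direction can be checked the same way. This sketch passes one made-up value per table row through json.dumps and prints the JSON text that comes out; the labels are just for display.

```python
import json

# Hypothetical values, one per row of the encoding table.
values = {
    "dict": {"k": "v"},
    "list": [1, 2],
    "tuple": (1, 2),
    "str": "hi",
    "int": 7,
    "float": 7.5,
    "true": True,
    "false": False,
    "none": None,
}

# Encode each value on its own and keep the resulting JSON text.
encoded = {label: json.dumps(value) for label, value in values.items()}

for label, text in encoded.items():
    print(f"{label}: {text}")
```

The list and the tuple produce identical JSON text, which is worth remembering when you decode the data again.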

One thing to keep in mind is that when you dump and load a JSON string, the resulting object might use different types from the one you dumped. For instance, take a look at this example:

import json

name = ("Yasoob", "Khalid")
encoded_json = json.dumps(name)
decoded_json = json.loads(encoded_json)

print(name == decoded_json)
# Output: False

print(type(name))
# Output: <class 'tuple'>

print(type(decoded_json))
# Output: <class 'list'>

This makes sense when you look at the conversion tables for both operations. Python converts both lists and tuples to a JSON array during encoding and then converts a JSON array into a Python list while decoding. So the data is the same but the data types are different. This is an important caveat to keep in mind while working with JSON in Python.
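
Tuples are not the only case. JSON object keys are always strings, so non-string dictionary keys such as integers are written out as strings during encoding and stay strings after decoding. A minimal sketch with a made-up dictionary:

```python
import json

# Hypothetical lookup table keyed by integers.
scores = {1: "low", 2: "high"}

round_tripped = json.loads(json.dumps(scores))

print(round_tripped)
# Output: {'1': 'low', '2': 'high'}

print(scores == round_tripped)
# Output: False
```

If your code relies on the original key types, you will need to convert them back yourself after decoding.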

How to write a Python object to a JSON file

Writing a Python object to a JSON file is as simple as calling json.dump. This method expects a file-like object that has a write() method defined. You can use it to write a Python object to a JSON file or a StringIO object (as that also provides a file-like interface with a write() method!). This is how it works:

import json

profile = {
    "name": "John Doe",
    "age": 32,
    "address": {
        "street": "123 Main St",
        "city": "Anytown",
        "state": "CA"
    },
    "tags": ["Finance", "Admin"]
}

with open("new_profile.json", "w") as f:
    json.dump(profile, f)

If you open up new_profile.json in a text editor, you will see the profile dictionary dumped as a JSON string.
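
Since json.dump only needs a write() method, you can try it without touching the disk. The sketch below writes a shortened, made-up profile into an io.StringIO buffer and then reads the text back.

```python
import io
import json

profile = {"name": "John Doe", "tags": ["Finance", "Admin"]}

# StringIO has a write() method, so json.dump can write into it
# just as it would into a file on disk.
buffer = io.StringIO()
json.dump(profile, buffer)

written = buffer.getvalue()
print(written)
# Output: {"name": "John Doe", "tags": ["Finance", "Admin"]}

# Decoding the text again yields an equal dictionary.
restored = json.loads(written)
print(restored == profile)
# Output: True
```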

How to convert custom Python objects to JSON objects

This is all well and good but what happens if you want to encode a custom Python data structure/class and dump it as JSON into a file? The conversion tables that you have seen so far don't list such a conversion. Let's still try it out and see what happens:

import json

class CustomProfile:
    def __init__(self, name, age, tags):
        self.name = name
        self.age = age
        self.tags = tags
        

profile = CustomProfile("Yasoob Khalid", 33, ["Finance"])
with open("new_profile.json", "w") as f:
    json.dump(profile, f)

When you run this code, Python will scream at you. It is not happy and throws an error similar to this:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/Users/yasoob/.pyenv/versions/3.10.0/lib/python3.10/json/__init__.py", line 179, in dump
    for chunk in iterable:
  File "/Users/yasoob/.pyenv/versions/3.10.0/lib/python3.10/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/Users/yasoob/.pyenv/versions/3.10.0/lib/python3.10/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type CustomProfile is not JSON serializable

Python has no idea how to serialize our custom Python object. Luckily, it provides us with a way to instruct it on how to do so.

By default, the json module uses the json.JSONEncoder class to perform the serialization. And as you saw, this class is aware of only a limited number of mappings between JSON and Python. However, you can subclass it and inform it about additional mappings and methods of performing serialization. Let's create a subclass and tell it how to serialize the CustomProfile object:

class ProfileEncoder(json.JSONEncoder):
    def default(self, custom_obj):
        if isinstance(custom_obj, CustomProfile):
            return custom_obj.__dict__
        else:
            return super().default(custom_obj)

with open("new_profile.json", "w") as f:
    json.dump(profile, f, cls=ProfileEncoder)

This time it works without a hitch! This is because the dump method knows how to encode the profile thanks to the ProfileEncoder subclass. The ProfileEncoder subclass simply checks whether the object being encoded is of type CustomProfile and, if it is, returns the attributes of that instance as a dict. For any other type it defers to the parent class, which raises the usual TypeError. You will need to add additional logic to this encoder if the CustomProfile contains attributes of additional custom types. It works for now because you are only using default types in CustomProfile.
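
Here is a sketch of what that additional logic could look like. It is a variation on the code above, not a replacement for it: the Address class is hypothetical, CustomProfile is redefined to hold one, and an io.StringIO buffer is used instead of a file so the result can be printed directly.

```python
import io
import json


class Address:
    # Hypothetical second custom type, introduced only for this example.
    def __init__(self, city, state):
        self.city = city
        self.state = state


class CustomProfile:
    def __init__(self, name, age, address):
        self.name = name
        self.age = age
        self.address = address  # holds another custom object


class ProfileEncoder(json.JSONEncoder):
    def default(self, custom_obj):
        # default() is called for every value the encoder cannot handle
        # on its own, so the nested Address ends up here as well.
        if isinstance(custom_obj, (CustomProfile, Address)):
            return custom_obj.__dict__
        return super().default(custom_obj)


profile = CustomProfile("Yasoob Khalid", 33, Address("Anytown", "CA"))

buffer = io.StringIO()
json.dump(profile, buffer, cls=ProfileEncoder)
encoded_profile = buffer.getvalue()
print(encoded_profile)
# Output: {"name": "Yasoob Khalid", "age": 33, "address": {"city": "Anytown", "state": "CA"}}
```

Without Address in the isinstance check, the encoder would fall through to super().default and raise the same TypeError as before, this time for Address.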

This custom encoding magic works for json.dumps too!
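
As a rough illustration, the sketch below repeats the class definitions so that it runs on its own and then encodes the profile to a string in two ways: with the cls argument, and with the default argument, which accepts a plain function instead of an encoder subclass. The encode_profile helper is a hypothetical name chosen for this example.

```python
import json


class CustomProfile:
    def __init__(self, name, age, tags):
        self.name = name
        self.age = age
        self.tags = tags


class ProfileEncoder(json.JSONEncoder):
    def default(self, custom_obj):
        if isinstance(custom_obj, CustomProfile):
            return custom_obj.__dict__
        return super().default(custom_obj)


profile = CustomProfile("Yasoob Khalid", 33, ["Finance"])

# Same encoder class as before, but returning a string instead of
# writing to a file.
via_cls = json.dumps(profile, cls=ProfileEncoder)


def encode_profile(obj):
    # Hypothetical helper: a function passed as default= plays the same
    # role as overriding JSONEncoder.default.
    if isinstance(obj, CustomProfile):
        return obj.__dict__
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


via_default = json.dumps(profile, default=encode_profile)

print(via_cls)
# Output: {"name": "Yasoob Khalid", "age": 33, "tags": ["Finance"]}

print(via_cls == via_default)
# Output: True
```

The function form is handy for one-off scripts, while a subclass is easier to reuse across a larger codebase.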

Conclusion

In this article, you learned how to decode JSON strings and files into Python objects and how to encode Python objects back into JSON. Moreover, you saw how a custom encoder can be used to encode custom types to JSON. The json package is very versatile and provides a ton of additional features that are worth exploring. If you are working with JSON in Python, there is a pretty good chance that you will have to use this library. You can read more about it in the official Python documentation.

Python also provides the pickle and marshal modules to serialize data. They follow a similar API to the json package and will be even easier for you to learn now that you know about json. Do explore them as well in case they suit your needs better. Best of luck!

Yasoob Khalid

Yasoob is a renowned author, blogger, and tech speaker. He has authored the Intermediate Python and Practical Python Projects books and writes regularly. He is currently working on Azure at Microsoft.
