Data / ML, Engineering

Turning Metadata Into Insights with Databook

November 10, 2020 / Global
Figure 1. Databook, at launch in 2017
Figure 2. Data entities have many connections. Each entity has at least one relationship with another entity.
Figure 3. The Databook architecture ingests metadata from various sources, then emits the changes to an event log on which derived storage and services are built.
Figure 4. Dragon schemas describe the relationship between HiveTable and Column metadata. They also contain standard and reusable types.
Figure 5. We represent HiveTable’s relationship to Column by connecting edges from one HiveTable node to multiple Column nodes.
Figure 6. Databook ingests metadata sources from different data systems.
Figure 7. Metadata ingestion and other services use Databook APIs to store metadata on data entities.
Figure 8. We push an event to the log when Databook APIs successfully process a change.
Figure 10. The Search Engine is built from the Metadata Event Log. This supports a real-time search experience.
Figure 11. The homepage displays popular and recommended data entities.
Figure 13. Users can get details on the dataset quality signal. They can see how we measured individual components of quality and how they performed over the last seven days.
Figure 14. Data owners can edit metadata and review open issues for all of their data entities in a single interface.
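
The captions for Figures 5 and 8 describe two ideas that a short sketch may make more concrete: a HiveTable node connected by edges to several Column nodes, and an event pushed to a log once a change has been processed successfully. The Python below is only an illustration under stated assumptions. The `MetadataGraph` class, its method names, the `has_column` relation, and the sample table and column names are all hypothetical; none of them come from Databook's actual APIs or the Dragon schema language.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataGraph:
    """Hypothetical in-memory stand-in for entity storage plus an event log."""

    nodes: dict = field(default_factory=dict)      # entity id -> attributes
    edges: list = field(default_factory=list)      # (from_id, relation, to_id)
    event_log: list = field(default_factory=list)  # append-only change events

    def upsert(self, entity_id, entity_type, **attrs):
        """Store an entity, then record the change in the event log."""
        self.nodes[entity_id] = {"type": entity_type, **attrs}
        # The event is appended only after the write above has succeeded.
        self.event_log.append({"op": "upsert", "id": entity_id})

    def connect(self, from_id, relation, to_id):
        """Add a directed edge between two entities that already exist."""
        if from_id not in self.nodes or to_id not in self.nodes:
            raise KeyError("both endpoints must exist before connecting them")
        self.edges.append((from_id, relation, to_id))
        self.event_log.append({"op": "connect", "from": from_id, "to": to_id})

    def neighbors(self, from_id, relation):
        """Return the ids reachable from from_id over the given relation."""
        return [dst for src, rel, dst in self.edges
                if src == from_id and rel == relation]


# One HiveTable node connected to multiple Column nodes (made-up names).
graph = MetadataGraph()
graph.upsert("hive.trips", "HiveTable", owner="data-team")
for name in ("trip_id", "city_id"):
    column_id = f"hive.trips.{name}"
    graph.upsert(column_id, "Column", data_type="string")
    graph.connect("hive.trips", "has_column", column_id)
```

In this sketch, a derived consumer such as a search index would read `graph.event_log` in order rather than querying the node storage directly, which is the general shape the Figure 10 caption suggests.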
Sunheng Taing

Sunheng Taing is a senior software engineer on Uber’s Metadata Platform team. He is passionate about building scalable, reliable, and high-quality data systems. Outside of Uber, he enjoys biking around San Francisco and checking out different food spots.

Atul Gupte

Atul Gupte is a former product manager on Uber's Product Platform team. At Uber, he drove product decisions to ensure that the company's data science teams could achieve their full potential by providing access to foundational infrastructure and advanced software to power Uber’s global business.

Posted by Sunheng Taing, Atul Gupte