The Microsoft Infer.NET machine learning framework goes open source

By , Principal Research Engineering Manager

It isn’t every day that one gets to announce that one of the top-tier cross-platform frameworks for model-based machine learning is open to one and all worldwide. We’re extremely excited today to open source Infer.NET on GitHub under the permissive MIT license for free use in commercial applications.

Open sourcing Infer.NET represents the culmination of a long and ambitious journey. Our team at Microsoft Research in Cambridge, UK embarked on developing the framework back in 2004. We’ve learned a lot along the way about making machine learning solutions that are scalable and interpretable. Infer.NET was initially envisioned as a research tool, and we released it for academic use in 2008. As a result, hundreds of papers have been published using the framework across a variety of fields, from information retrieval to healthcare. In 2012 Infer.NET even won a Patents for Humanity award for aiding research in epidemiology, genetic causes of disease, deforestation and asthma.

Over time, the framework has evolved from a research tool to being the machine learning engine in a number of Microsoft products in Office, Xbox and Azure. A recent example is TrueSkill 2 – a system that matches players in online video games. Implemented in Infer.NET, it is running live in the bestselling titles Halo 5 and Gears of War 4, processing millions of matches.

But in an age of abundant machine learning libraries, what sets Infer.NET apart from the competition? Infer.NET enables a model-based approach to machine learning. This lets you incorporate domain knowledge into your model. The framework can then build a bespoke machine learning algorithm directly from that model. This means that instead of having to map your problem onto a pre-existing learning algorithm, Infer.NET actually constructs a learning algorithm for you, based on the model you’ve provided.

An added advantage of model-based machine learning is interpretability. If you have designed the model yourself and the learning algorithm follows that model, then you can understand why the system behaves in a particular way or makes certain predictions. As machine learning applications gradually enter our lives, understanding and explaining their behavior becomes increasingly important.

Model-based machine learning also naturally applies to problems with certain data traits, such as real-time data, heterogeneous data, insufficient data, unlabelled data, data with missing parts and data collected with known biases. Indeed, if you’ve read this far then it’s a good bet you’re interested in learning more about model-based machine learning. It just so happens that the Infer.NET team has written an awesome online book on the subject and it’s absolutely free.

In Infer.NET, models are described using a probabilistic program. This may seem like an oxymoron but is actually a powerful concept used to describe real-world processes in a language that machines understand. Infer.NET compiles the probabilistic program into high-performance code for implementing something cryptically called deterministic approximate Bayesian inference. This approach allows substantial scalability – for example, we use it in a system that automatically extracts knowledge from billions of web pages, comprising petabytes of data.
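To make the idea of a probabilistic program a little more concrete, here is a minimal sketch in plain Python. It is not Infer.NET code and does not use the Infer.NET API. The toy model (two fair coins and a derived "both heads" variable) and all function names are assumptions chosen for illustration. The sketch also answers queries by exhaustively enumerating outcomes, which is exact and only feasible for tiny models. Infer.NET instead compiles the model into approximate inference code, which is what lets it scale.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy model: two fair coins and a derived variable "both heads".
PRIOR_HEADS = Fraction(1, 2)


def joint():
    """Yield (first, second, both_heads, probability) for every outcome."""
    for first, second in product([True, False], repeat=2):
        p_first = PRIOR_HEADS if first else 1 - PRIOR_HEADS
        p_second = PRIOR_HEADS if second else 1 - PRIOR_HEADS
        # The model: both_heads is a deterministic function of the two coins.
        yield first, second, first and second, p_first * p_second


def prob_both_heads():
    """Forward query: how likely is it that both coins show heads?"""
    return sum(p for _, _, both, p in joint() if both)


def prob_first_heads_given_not_both():
    """Backward query: observe both_heads == False, then infer the first coin."""
    evidence = sum(p for _, _, both, p in joint() if not both)
    match = sum(p for first, _, both, p in joint() if first and not both)
    return match / evidence


if __name__ == "__main__":
    print(prob_both_heads())                  # 1/4
    print(prob_first_heads_given_not_both())  # 1/3
```

The same model description serves both queries: running it forwards gives a prediction, and conditioning on an observation lets you reason backwards about the hidden variables.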

The use of deterministic inference algorithms is complementary to the predominantly sampling-based methods of most other probabilistic programming frameworks. A key capability of our approach is support for online Bayesian inference – the ability of the system to learn as new data arrives. We have observed that this is essential in business and consumer products that interact with users in real time. For example, in the aforementioned TrueSkill 2 system, in order to provide competitive matches, we need to update the skills of the players immediately following each round. And we do so in just a millisecond.
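The core idea of online Bayesian inference, where the posterior after one observation becomes the prior for the next, can be sketched in a few lines of plain Python. This is not how TrueSkill 2 or Infer.NET work internally. It swaps in the simplest textbook case, a Beta belief over a win probability updated from win/loss outcomes, where the update is exact. The `Beta` class and `learn_online` function are hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Beta:
    """Beta(alpha, beta) belief over an unknown success probability."""
    alpha: float
    beta: float

    def update(self, success: bool) -> "Beta":
        # Conjugate update: the posterior is again a Beta distribution,
        # so it can serve directly as the prior for the next observation.
        if success:
            return Beta(self.alpha + 1, self.beta)
        return Beta(self.alpha, self.beta + 1)

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


def learn_online(prior: Beta, observations) -> Beta:
    """Fold observations into the belief one at a time, as they arrive."""
    belief = prior
    for outcome in observations:
        belief = belief.update(outcome)
    return belief


if __name__ == "__main__":
    belief = learn_online(Beta(1, 1), [True, True, False])
    print(belief)       # Beta(alpha=3, beta=2)
    print(belief.mean)  # 0.6
```

Each update touches only the current belief and the newest observation, so no past data needs to be stored or reprocessed. That property is what makes this style of learning attractive for systems that must react immediately.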

To sum up, you’d want to use Infer.NET when you have extensive knowledge of the problem domain, when interpreting the behaviour of the system is important to you, or when you have a production system that needs to learn as new data arrives.

The Infer.NET team is looking forward to engaging with the open-source community in developing and growing the framework further. Infer.NET will become a part of ML.NET – the machine learning framework for .NET developers. We have already taken several steps towards integration with ML.NET, like setting up the repository under the .NET Foundation and moving the package and namespaces to Microsoft.ML.Probabilistic. Infer.NET will extend ML.NET for statistical modelling and online learning.

Interested in Infer.NET? Download the framework here. Support for Windows, Linux and macOS is provided through .NET Core. Our Tutorials and Examples page gives a taste of what models can be implemented using Infer.NET. And the documentation also contains a detailed User Guide. You are warmly invited to join us on GitHub if you want to contribute!

The Infer.NET Team. Top row, left to right: Martin Kukla, John Guiver, Tom Minka, John Winn, Sam Webster, Dany Fabian. Bottom row, left to right: Pavel Myshkov, Yordan Zaykov, Alex Spengler.

 
