Inside Facebook’s AI Machine

The Applied Machine Learning group helps Facebook see, talk, and understand. It may even root out fake news.

When asked to head Facebook’s Applied Machine Learning group — to supercharge the world’s biggest social network with an AI makeover — Joaquin Quiñonero Candela hesitated. It was not that the Spanish-born scientist, a self-described “machine learning (ML) person,” hadn’t already witnessed how AI could help Facebook. Since joining the company in 2012, he had overseen a transformation of the company’s ad operation, using an ML approach to make sponsored posts more relevant and effective. Significantly, he did this in a way that empowered engineers in his group to use AI even if they weren’t trained to do so, making the ad division richer overall in machine learning skills. But he wasn’t sure the same magic would take hold in the larger arena of Facebook, where billions of people-to-people connections depend on fuzzier values than the hard data that measures ads. “I wanted to be convinced that there was going to be value in it,” he says of the promotion.

Despite his doubts, Candela took the post. And now, after barely two years, his hesitation seems almost absurd.

How absurd? Last month, Candela addressed an audience of engineers at a New York City conference. “I’m going to make a strong statement,” he warned them. “Facebook today cannot exist without AI. Every time you use Facebook or Instagram or Messenger, you may not realize it, but your experiences are being powered by AI.”

Joaquin Candela, Director of Engineering for Applied Machine Learning at Facebook.

Stephen Lam

Last November I went to Facebook’s mammoth headquarters in Menlo Park to interview Candela and some of his team, so that I could see how AI suddenly became Facebook’s oxygen. To date, much of the attention around Facebook’s presence in the field has been focused on its world-class Facebook Artificial Intelligence Research group (FAIR), led by renowned neural net expert Yann LeCun. FAIR, along with competitors at Google, Microsoft, Baidu, Amazon, and Apple (now that the secretive company is allowing its scientists to publish), is one of the preferred destinations for coveted grads of elite AI programs. It’s one of the top producers of breakthroughs in the brain-inspired digital neural networks behind recent improvements in the way computers see, hear, and even converse. But Candela’s Applied Machine Learning group (AML) is charged with integrating the research of FAIR and other outposts into Facebook’s actual products—and, perhaps more importantly, empowering all of the company’s engineers to integrate machine learning into their work.

Because Facebook can’t exist without AI, it needs all its engineers to build with it.

My visit occurs two days after the presidential election and one day after CEO Mark Zuckerberg blithely remarked that “it’s crazy” to think that Facebook’s circulation of fake news helped elect Donald Trump. The comment would turn out to be the equivalent of driving a fuel tanker into a growing fire of outrage over Facebook’s alleged complicity in the orgy of misinformation that plagued its News Feed in the last year. Though much of the controversy is beyond Candela’s pay grade, he knows that ultimately Facebook’s response to the fake news crisis will rely on machine learning efforts in which his own team will have a part.

But to the relief of the PR person sitting in on our interview, Candela wants to show me something else—a demo that embodies the work of his group. To my surprise, it’s something that performs a relatively frivolous trick: It redraws a photo or streams a video in the style of an art masterpiece by a distinctive painter. In fact, it’s reminiscent of the kind of digital stunt you’d see on Snapchat, and the idea of transmogrifying photos into Picasso’s cubism has already been accomplished.

“The technology behind this is called neural style transfer,” he explains. “It’s a big neural net that gets trained to repaint an original photograph using a particular style.” He pulls out his phone and snaps a photo. A tap and a swipe later, it turns into a recognizable offshoot of Van Gogh’s “The Starry Night.” More impressively, it can render a video in a given style as it streams. But what’s really different, he says, is something I can’t see: Facebook has built its neural net so it will work on the phone itself.

That isn’t novel, either — Apple has previously bragged that it does some neural computation on the iPhone. But the task was much harder for Facebook because, well, it doesn’t control the hardware. Candela says his team could execute this trick because the group’s work is cumulative — each project makes it easier to build another, and every project is constructed so that future engineers can build similar products with less training required —so stuff like this can be built quickly. “It took eight weeks from us to start working on this to the moment we had a public test, which is pretty crazy,” he says.

(L-R) Joaquin Candela, Director of Engineering for Applied Machine Learning; Manohar Paluri, Applied Computer Vision Team Lead; Rita Aquino, Technical Product Manager; and Rajen Subba, Engineering Manager.

The other secret in pulling off a task like this, he says, is collaboration—a mainstay of Facebook culture. In this case, easy access to other groups in Facebook — specifically the mobile team intimately familiar with iPhone hardware — led to the jump from rendering images in Facebook’s data centers to performing the work on the phone itself. The benefits won’t only come from making movies of your friends and relatives looking like the woman in “The Scream.” It’s a step toward making all of Facebook more powerful. In the short term, this allows for quicker responses in interpreting languages and understanding text. Longer term, it could enable real-time analysis of what you see and say. “We’re talking about seconds, less than seconds — this has to be real time,” he says. “We’re the social network. If I’m going to make predictions about people’s feedback on a piece of content, [my system] needs to react immediately, right?”

Candela takes another look at the Van Gogh-ified version of the selfie he’s just shot, not bothering to mask his pride. “By running complex neural nets on the phone, you’re putting AI in the hands of everybody,” he says. “That does not happen by chance. It’s part of how we’ve actually democratized AI inside the company.

“It’s been a long journey,” he adds.

Candela was born in Spain. His family moved to Morocco when he was three, and he attended French language schools there. Though his grades were equally high in science and humanities, he decided to attend college in Madrid, ideally studying the hardest subject he could think of: telecommunications engineering, which not only required a mastery of physical stuff like antennas and amplifiers, but also an understanding of data, which was “really cool.” He fell under the spell of a professor who proselytized adaptive systems. Candela built a system that used intelligent filters to improve the signal of roaming phones; he describes it now as “a baby neural net.” His fascination with training algorithms, rather than simply churning out code, was further fueled by a semester he spent in Denmark in 2000, where he met Carl Rasmussen, a machine learning professor who had studied with the legendary Geoff Hinton in Toronto—the ultimate cool kid credential in machine learning. Ready for graduation, Candela was about to enter a leadership program at Procter & Gamble when Rasmussen invited him to study for a PhD. He chose machine learning.

In 2007, he went to work at Microsoft Research’s lab in Cambridge, England. Soon after he arrived, he learned that Microsoft was about to launch Bing but needed to improve a key component of search ads: accurately predicting when a user would click on an ad. The company decided to open an internal competition. The winning team’s solution would be tested to see if it was launch-worthy, and the team members would get a free trip to Hawaii. Nineteen teams competed, and Candela’s tied for first place. He got the free trip, but felt cheated when Microsoft stalled on the larger prize: the test that would determine if his work could be shipped.

What happened next shows Candela’s resolve. He embarked on a “crazy crusade” to make the company give him a chance. He gave over 50 internal talks. He built a simulator to show his algorithm’s superiority. He stalked the VP who could make the decision, positioning himself next to the guy in buffet lines and synching his bathroom trips to hype his system from an adjoining urinal; he moved into an unused space near the executive, and popped into the man’s office unannounced, arguing that a promise was a promise, and his algorithm was better.

Candela’s algorithm shipped with Bing in 2009.

In early 2012, Candela visited a friend who worked at Facebook and spent a Friday on its Menlo Park campus. He was blown away to discover that at this company, people didn’t have to beg for permission to get their work tested. They just did it. He interviewed at Facebook that next Monday. By the end of the week he had an offer.

Joining Facebook’s ad team, Candela’s task was to lead a group that would show more relevant ads. Though the system at the time did use machine learning, “the models we were using were not very advanced. They were pretty simple,” says Candela.

An interior view of Facebook Building 20.

Another engineer who had joined Facebook at the same time as Candela (they attended the new employee “code boot camp” together) was Hussein Mehanna, who was similarly surprised at the company’s lack of progress in building AI into its system. “When I was outside of Facebook and saw the quality of the product, I thought all of this was already in shape, but apparently it wasn’t,” Mehanna says. “Within a couple of weeks I told Joaquin that what’s really missing at Facebook is a proper, world-class machine learning platform. We had machines but we didn’t have the right software that could help the machines learn as much as possible from the data.” (Mehanna, who is now Facebook’s director of core machine learning, is also a Microsoft veteran, as are several other engineers interviewed for this story. Coincidence?)

By “machine learning platform,” Mehanna was referring to the adoption of the paradigm that has taken AI from its barren “winter” of the last century (when early promises of “thinking machines” fell flat) to its more recent blossoming after the adoption of models roughly based on the way the brain behaves. In the case of ads, Facebook needs its system to do something that no human is capable of: Make an instant (and accurate!) prediction of how many people will click on a given ad. Candela and his team set out to create a new system based on the procedures of machine learning. And because the team wanted to build the system as a platform, accessible to all the engineers working in the division, they did it in a way where the modeling and training could be generalized and replicable.
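To make the click-prediction task concrete, here is a minimal sketch of the general idea, not Facebook’s actual system: a logistic regression model trained on logged impressions to estimate the probability of a click. The two features, the toy log, and every function name below are assumptions made up for illustration.

```python
import math
import random


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_click_probability(weights, bias, features):
    """Estimate the chance that a user clicks, given numeric features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train_click_model(examples, epochs=200, learning_rate=0.1, seed=0):
    """Fit logistic regression with plain stochastic gradient descent."""
    rng = random.Random(seed)
    examples = list(examples)
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        rng.shuffle(examples)
        for features, clicked in examples:
            # Gradient of the log loss is (prediction - label) * feature.
            error = predict_click_probability(weights, bias, features) - clicked
            for i, x in enumerate(features):
                weights[i] -= learning_rate * error * x
            bias -= learning_rate * error
    return weights, bias


# Hypothetical impression log: ([ad matches interest, clicked similar ad before], clicked)
TOY_LOG = [
    ([1.0, 1.0], 1), ([1.0, 1.0], 1),
    ([1.0, 0.0], 1), ([1.0, 0.0], 0),
    ([0.0, 1.0], 1), ([0.0, 1.0], 0),
    ([0.0, 0.0], 0), ([0.0, 0.0], 0),
]

weights, bias = train_click_model(TOY_LOG)
p_relevant = predict_click_probability(weights, bias, [1.0, 1.0])
p_irrelevant = predict_click_probability(weights, bias, [0.0, 0.0])
print(f"relevant ad: {p_relevant:.2f}, irrelevant ad: {p_irrelevant:.2f}")
```

A production system would presumably use far richer features and models; the point of the sketch is only that the prediction is learned from logged behavior rather than hand-coded.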

One huge factor in building machine learning systems is getting quality data—the more the better. Fortunately, this is one of Facebook’s biggest assets: When you have over a billion people interacting with your product every day, you collect a lot of data for your training sets, and you get endless examples of user behavior once you start testing. This allowed the ads team to go from shipping a new model every few weeks to shipping several models every week. And because this was going to be a platform — something that others would use internally to build their own products — Candela made sure to do his work in a way where multiple teams were involved. It’s a neat, three-step process. “You focus on performance, then focus on utility, and then build a community,” he says.

Candela’s ad team has proven how transformative machine learning could be at Facebook. “We became incredibly successful at predicting clicks, likes, conversions, and so on,” he says. The idea of extending that approach to the larger service was natural. In fact, FAIR leader LeCun had already been arguing for a companion group devoted to applying AI to products — specifically in a way that would spread the ML methodology more widely within the company. “I really pushed for it to exist, because you need organizations with highly talented engineers who are not directly focused on products, but on basic technology that can be used by a lot of product groups,” LeCun says.

Candela became director of the new AML team in October 2015 (for a while, because of his wariness, he kept his post in the ads division and shuttled between the two). He maintains a close relationship with FAIR, which is based in New York City, Paris, and Menlo Park, where its researchers literally sit next to AML engineers.

The way the collaboration works can be illustrated by a product in progress that provides spoken descriptions of photos people post to Facebook. In the past few years, it has become a fairly standard AI practice to train a system to identify objects in a scene or make a general conclusion, like whether the photo was taken indoors or outdoors. But recently, FAIR’s scientists have found ways to train neural nets to outline virtually every interesting object in the image and then figure out from its position and relation to the other objects what the photo is all about—actually analyzing poses to discern that in a given picture people are hugging, or someone is riding a horse. “We showed this to the people at AML,” says LeCun, “and they thought about it for a few moments and said, ‘You know, there’s this situation where that would be really useful.’” What emerged was a prototype for a feature that could let blind or visually impaired people put their fingers over an image and have their phones read them a description of what’s happening.

“We talk all the time,” says Candela of his sister team. “The bigger context is that to go from science to project, you need the glue, right? We are the glue.”

Candela breaks down the applications of AI in four areas: vision, language, speech, and camera effects. All of those, he says, will lead to a “content understanding engine.” By figuring out how to actually know what content means, Facebook intends to detect subtle intent from comments, extract nuance from the spoken word, identify faces of your friends that fleetingly appear in videos, and interpret your expressions and map them onto avatars in virtual reality sessions.

“We are working on the generalization of AI,” says Candela. “With the explosion of content we need to understand and analyze, our ability to generate labels that tell what things are can’t keep up.” The solution lies in building generalized systems where work on one project can accrue to the benefit of other teams working on related projects. Says Candela, “If I can build algorithms where I can transfer knowledge from one task to another, that’s awesome, right?”

That transfer can make a huge difference in how quickly Facebook ships products. Take Instagram. Since its beginning, the photo service displayed user photos in reverse chronological order. But early in 2016, it decided to use algorithms to rank photos by relevance. The good news was that because AML had already implemented machine learning in products like the News Feed, “they didn’t have to start from scratch,” says Candela. “They had one or two ML-savvy engineers contact some of the several dozen teams that are running ranking applications of one kind or another. Then you can clone that workflow and talk to the person if you have questions.” As a result, Instagram was able to implement this epochal shift in only a few months.

The AML team is always on the prowl for use cases where its neural net prowess can be combined with a collection of different teams to produce a unique feature that works at “Facebook scale.” “We’re using machine learning techniques to build our core capabilities and delight our users,” says Tommer Leyvand, a lead engineer of AML’s perception team. (He came from…wait for it…Microsoft.)

Rita Aquino, Technical Product Manager at Facebook.

An example is a recent feature called Social Recommendations. About a year ago, an AML engineer and a product manager for Facebook’s sharing team were talking about the high engagement that occurs when people ask their friends for recommendations about local restaurants or services. “The issue is, how do you surface that to a user?” says Rita Aquino, a product manager on AML’s natural language team. (She used to be a PM at…oh, forget it.) The sharing team had been trying to do that by word matching certain phrases associated with recommendation requests. “That’s not necessarily very precise and scalable, when you have a billion posts per day,” Aquino says. By training neural nets and then testing the models with live behavior, the team was able to detect very subtle linguistic differences so it could accurately detect when someone was asking where to eat or buy shoes in a given area. That triggers a request that appears on the News Feed of appropriate contacts. The next step, also powered by machine learning, figures out when someone supplies a plausible recommendation, and actually shows the location of the business or restaurant on a map in the user’s News Feed.
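A small sketch shows why the earlier word-matching approach falls short. The phrase list and the sample posts are invented for illustration; the article does not say which phrases the sharing team actually matched.

```python
# Hypothetical trigger phrases; the real list is not given in the article.
REQUEST_PHRASES = (
    "can anyone recommend",
    "looking for a good",
    "any recommendations",
)


def looks_like_recommendation_request(post):
    """Flag a post if it contains any of the hard-coded trigger phrases."""
    text = post.lower()
    return any(phrase in text for phrase in REQUEST_PHRASES)


# Invented sample posts.
caught = looks_like_recommendation_request(
    "Can anyone recommend a dentist in Oakland?"
)
missed = looks_like_recommendation_request(
    "Where should we eat near the Mission tonight?"
)
false_alarm = looks_like_recommendation_request(
    "I was looking for a good excuse to stay home."
)

print(caught, missed, false_alarm)
```

The first post is caught, the second is a genuine request that no phrase covers, and the third is flagged even though nobody is asking for anything. A trained classifier can, in principle, learn such distinctions from labeled examples instead of relying on an ever-growing phrase list.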

Aquino says that in the year and a half she has been at Facebook, AI has gone from being a fairly rare component in products to something now baked in from conception. “People expect the product they interact with to be smarter,” she says. “Teams see products like social recommendations, see our code, and go, ‘How do we do that?’ You don’t have to be a machine learning expert to try it out for your group’s experience.” In the case of natural language processing, the team built a system that other teams can easily access, called Deep Text. It helps power the ML technology behind Facebook’s translation feature, which is used for over four billion posts a day.

For images and video, the AML team has built a machine learning vision platform called Lumos. It originated with Manohar Paluri, then an intern at FAIR who was working on a grand machine learning vision he calls the visual cortex of Facebook — a means of processing and understanding all the images and videos posted on Facebook. At a 2014 hackathon, Paluri and colleague Nikhil Johri cooked up a prototype in a day and a half and showed the results to an enthusiastic Zuckerberg and Facebook COO Sheryl Sandberg. When Candela began AML, Paluri joined him to lead the computer vision team and to build out Lumos to help all of Facebook’s engineers (including those at Instagram, Messenger, WhatsApp, and Oculus) make use of the visual cortex.

With Lumos, “anybody in the company can use features from these various neural networks and build models for their specific scenario and see how it works,” says Paluri, who holds joint positions in AML and FAIR. “And then they can have a human in the loop correct the system, and retrain it, and push it, without anybody in the [AML] team being involved.”

Paluri gives me a quick demo. He fires up Lumos on his laptop and we undertake a sample task: refining the neural net’s ability to identify helicopters. A page packed with images — if we keep scrolling, there would be 5,000 — appears on the screen, full of pictures of helicopters and things that aren’t quite helicopters. (One is a toy helicopter; others are objects in the sky at helicopter-ish angles.) For these datasets, Facebook uses publicly posted images from its properties—those limited to friends or other groups are off limits. Even though I’m totally not an engineer, let alone an AI-adept, it’s easy to click on negative examples to “train an image classifier for helicopters,” as the jargon would have it.

Eventually, this “classifying” step—known as supervised learning—may become automated, as the company pursues an ML holy grail known as “unsupervised learning,” where the neural nets are able to figure out for themselves what stuff is in all those images. Paluri says the company is making progress. “Our goal is to reduce the number of (human) annotations by 100 times in the next year,” he says.

In the long term, Facebook sees the visual cortex merging with the natural language platform for the generalized content understanding engine that Candela spoke about. “No doubt we will end up combining them together,” says Paluri. “Then we’ll just make it…cortex.”

Ultimately, Facebook hopes that the core principles it uses for its advances will spread even outside the company, through published papers and such, so that its democratizing methodology will spread machine learning more widely. “Instead of spending ages and ages trying to build an intelligent application, you can build applications far faster,” says Mehanna. “Imagine the impact of this on medicine, safety, and transportation. I think building applications in those domains is going to be faster by a hundred-x magnitude.”

Manohar Paluri, Applied Computer Vision Team Lead at Facebook, at Building 20 in Menlo Park, Calif. on Monday, Feb. 6, 2017.

Though AML is deeply involved in the epic process of helping Facebook’s products see, interpret, and even speak, CEO Zuckerberg also sees it as critical to his vision of Facebook as a company working for social good. In Zuckerberg’s 5,700-word manifesto about building communities, the CEO invoked the words “artificial intelligence” or “AI” seven times, all in the context of how machine learning and other techniques will help keep communities safe and well informed.

Fulfilling those goals won’t be easy, for the same reasons that Candela first worried about taking the AML job. Even machine learning can’t resolve all those people problems that come when you are trying to be the main source of information and personal connections for a couple billion users. That’s why Facebook is constantly fiddling with the algorithms that determine what users see in their News Feeds: how do you train a system to deliver the optimal mix when you’re not really sure what that is? “I think this is almost an unsolvable problem,” says Candela. “Us showing news stories at random means you’re wasting most of your time, right? Us only showing news stories from one friend, winner takes all. You could end up in this round-and-round discussion forever where neither of the two extremes is optimal. We try to bake in some explorations.” Facebook will keep trying to solve this with AI, which has become the company’s inevitable hammer to drive in every nail. “There’s a bunch of active research in machine learning and in AI in optimizing the right level of exploration,” Candela says, sounding hopeful.
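The trade-off Candela describes resembles what machine learning researchers call exploration versus exploitation. One textbook way to sketch it is an epsilon-greedy rule: mostly show the story that has performed best so far, but occasionally show a random one. This is an illustration of the general idea only, not a description of how News Feed works; the story names and engagement rates are made up.

```python
import random


def choose_story(estimates, exploration_rate, rng):
    """Usually pick the best-known story; sometimes pick one at random."""
    if rng.random() < exploration_rate:
        return rng.choice(sorted(estimates))
    return max(estimates, key=estimates.get)


def simulate_feed(true_engagement, exploration_rate, rounds=5000, seed=0):
    """Return the average engagement per story shown over a simulated session."""
    rng = random.Random(seed)
    shown = {story: 0 for story in true_engagement}
    engaged = {story: 0 for story in true_engagement}
    estimates = {story: 0.0 for story in true_engagement}
    total = 0
    for _ in range(rounds):
        story = choose_story(estimates, exploration_rate, rng)
        hit = 1 if rng.random() < true_engagement[story] else 0
        shown[story] += 1
        engaged[story] += hit
        # Running estimate: observed engagement rate for this story so far.
        estimates[story] = engaged[story] / shown[story]
        total += hit
    return total / rounds


# Hypothetical per-story engagement probabilities.
TRUE_ENGAGEMENT = {"friend_a": 0.05, "friend_b": 0.10, "friend_c": 0.30}

random_rate = simulate_feed(TRUE_ENGAGEMENT, exploration_rate=1.0)  # all random
greedy_rate = simulate_feed(TRUE_ENGAGEMENT, exploration_rate=0.0)  # winner takes all
mixed_rate = simulate_feed(TRUE_ENGAGEMENT, exploration_rate=0.1)   # some exploration

print(f"random: {random_rate:.3f}  greedy: {greedy_rate:.3f}  mixed: {mixed_rate:.3f}")
```

In this toy setup the purely greedy feed locks onto the first story it tries and never learns that another is better, while the purely random feed wastes most impressions. A small amount of exploration beats both, which is the intuition behind “neither of the two extremes is optimal.”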

Naturally, when Facebook found itself named a culprit in the fake news blame-athon, it called on its AI teams to quickly purge journalistic hoaxes from the service. It was an unusual all-hands effort, including even the long-horizon FAIR team, which was tapped almost “as consultants,” says LeCun. As it turns out, FAIR’s efforts had already produced a tool to help with the problem: a model called World2Vec (“vec” being a shorthand for the technical term, vectors). World2Vec adds a sort of memory capability to neural nets, and helps Facebook tag every piece of content with information, like its origin and who has shared it. (This is not to be confused, though I originally was, with a Google innovation called Word2Vec.) With that information, Facebook can understand the sharing patterns that characterize fake news, and potentially use its machine learning tactics to root out the hoaxes. “It turns out that identifying fake news isn’t so different than finding the best pages people want to see,” says LeCun.

The preexisting platforms that Candela’s team built made it possible for Facebook to launch those vetting products sooner than they could have done otherwise. How well they actually perform remains to be seen; Candela says it’s too soon to share metrics on how well the company has managed to reduce fake news by its algorithmic referees. But whether or not those new measures work, the quandary itself raises the question of whether an algorithmic approach to solving problems — even one enhanced by machine learning — might inevitably have unintended and even harmful consequences. Certainly some people contend that this happened in 2016.

Candela rejects that argument. “I think that we’ve made the world a much better place,” he says, and offers to tell a story. The day before our interview, Candela made a call to a Facebook connection he had met only once—a father of one of his friends. He had seen that person posting pro-Trump stories, and was perplexed by their thinking. Then Candela realized that his job is to make decisions based on data, and he was missing important information. So he messaged the person and asked for a conversation. The contact agreed, and they spoke by phone. “It didn’t change reality for me, but made me look at things in a very, very different way,” says Candela. “In a non-Facebook world I never would have had that connection.”

In other words, though AI is essential — even existential — for Facebook, it’s not the only answer. “The challenge is that AI is really in its infancy still,” says Candela. “We’re only getting started.”

Creative Art Direction: Redindhi Studio
Photography by: Stephen Lam
