September 15, 2016

How Hillary's Campaign Is (Almost Certainly) Using Big Data

The evidence suggests her campaign is using a highly targeted technique that worked for Obama—but which Trump may not be taking advantage of

By Eric Siegel

This article was published in Scientific American’s former blog network and reflects the views of the author, not necessarily those of Scientific American

Analytics will win votes this year. Science, as it did in 2012, is playing an important role in mass voter persuasion in the U.S. presidential race. It’s a numbers game: Predictive analytics targets campaign activities, strengthening a campaign's army of volunteers by directing its efforts more effectively.

Of which presidential candidate do I speak? We have every reason to believe that Hillary Clinton's campaign is leveraging predictive analytics—as Obama’s did in 2012. Donald Trump's campaign appears to lag in such efforts.

Hillary for America is leveraging data science in a very particular way. The undertaking predicts each individual voter's response to campaign contact in order to drive millions of decisions as to which voter receives a knock on the door or a phone call. It’s an innovative, data-driven process that has changed the game for political campaigns.
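
To make the idea concrete, here is a minimal sketch of the underlying calculation, not the campaign's actual method. It assumes a randomized test in which some voters are contacted and others are held out as a control group, with support measured afterward by survey. The segment names, the counts, and the helper functions are all hypothetical, and real persuasion models score individual voters rather than coarse segments.

```python
# Hypothetical test results per voter segment:
# (contacted voters, supporters among them, control voters, supporters among them)
COUNTS = {
    "segment_a": (200, 120, 200, 100),
    "segment_b": (200, 90, 200, 90),
    "segment_c": (200, 70, 200, 80),
}


def estimate_uplift(counts):
    """Return, per segment, support rate when contacted minus support rate in control."""
    uplift = {}
    for segment, (n_contacted, s_contacted, n_control, s_control) in counts.items():
        uplift[segment] = s_contacted / n_contacted - s_control / n_control
    return uplift


def pick_targets(uplift, min_uplift=0.0):
    """Return segments whose estimated uplift exceeds min_uplift, best first."""
    chosen = [seg for seg, value in uplift.items() if value > min_uplift]
    return sorted(chosen, key=lambda seg: uplift[seg], reverse=True)


if __name__ == "__main__":
    uplift = estimate_uplift(COUNTS)
    for segment in sorted(uplift, key=uplift.get, reverse=True):
        print(f"{segment}: {uplift[segment]:+.2f}")
    print("contact:", pick_targets(uplift))
```

In this made-up data, contact helps one segment, makes no difference to another, and backfires in the third, so only the first is worth a volunteer's time. The point of the approach is that it targets the voters whose behavior contact actually changes, not the ones who are most likely to support the candidate anyway.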

Nate Silver Forecasts the Future—but Doesn't Change It

A campaign's number crunching is an undercover enterprise—but another form of quantitative prognostication is right out in the open: campaign forecasting via poll aggregation. Here, the heavyweight champion is Nate Silver, the most celebrated statistician in the US, who forecasted the election outcome correctly for each and every state in 2012. See his current 2016 forecasts here.

But an election poll does not constitute prognostic technology—it does not endeavor to calculate insights that foresee human behavior. Rather, a poll is plainly the act of voters explicitly telling you what they’re going to do. It’s a mini-election dry run. There’s a craft to aggregating polls, as Silver has mastered so adeptly, but even he admits it’s no miracle of clairvoyance. “It’s not really that complicated,” he told late night talk show host Stephen Colbert the day before the 2012 election. “There are many things that are much more complicated than looking at the polls and taking an average... right?”

Instead, true power comes in influencing the future rather than only speculating on it. Nate Silver publicly competes to win election forecasting, while Obama’s analytics team discreetly competed by way of predictive analytics to win the election itself, as Hillary for America is now doing. This is a form of quantitative prediction that transcends forecasting the outcome to actually exert an effect on it.

The value proven in 2012 is too good to pass up for 2016. Obama for America showed that its analytics convinced more voters than traditional campaign targeting. The method also made the campaign’s TV ad buying 18 percent more effective: the campaign could persuade 18 percent more voters with the same level of investment, a meaningful effect given its TV budget of $400 million.

The Evidence: Hillary for America is Using Analytics for Voter Persuasion

The specifics are well-guarded secrets, but the evidence clearly indicates that Hillary for America is deploying predictive analytics—more specifically, an advanced flavor thereof called persuasion modeling (aka uplift modeling)—as Obama for America did. Here’s the data that supports this presumption:

1) TRACTION. Daniel Porter, one of three hands-on practitioners who executed the persuasion modeling for Obama for America, and who has since co-founded the analytics firm BlueLabs (see the Q-and-A below), stands by this technical approach. “It remains clear that persuasion modeling is extraordinarily valuable for political campaigns. In fact, after the experience accrued last time around, it’s sure to be done by 2016 campaigns even more effectively than in 2012,” he told me. He says there’s also going to be better data for this work, at least on the Democratic side. “The DNC is building out further its data infrastructure about voters in battleground states.”

2) HIRES. As early as July 2015, the Hillary for America campaign posted that their “analytics team is looking for data nerds.” Shown as one of 11 campaign job categories on the campaign's website, analytics included five types of open roles. More specifically, analytics job postings enlisted staff for persuasion modeling: “helping the campaign determine which voters to target for persuasion.” The campaign's analytics director is Elan Kriegel, another co-founder of BlueLabs, who grew the campaign’s data team by pulling people from BlueLabs.

3) CONTRACTS. Hillary for America has engaged BlueLabs for analytics services worth at least $50,000. And Civis Analytics, another analytics company, which employs at least 27 "data whiz kids" from Obama’s 2012 campaign (Eric Schmidt is the sole investor), has received more than $3.5 million in payments from Democratic campaigns in the last two cycles.

In anticipation of his keynote presentation at Predictive Analytics World (October 23-27 in New York), "Persuasion Modeling in Presidential Campaigns and How It Applies to Business," I had the opportunity to ask Dan Porter a few questions about his work for Obama and what may currently be in play for the 2016 election.

Q: What was the most surprising discovery or insight you unearthed when applying uplift modeling for Obama for America 2012?

We discovered in 2012 that self-reported independents and non-partisans are not especially likely to be persuadable, and many voters that were affiliated with a political party actually were persuadable. However, in uplift modeling work we've undertaken at BlueLabs since 2013, this actually isn't always the case. The lesson we've learned is that constant experimentation and uplift modeling is a worthwhile endeavor, since the types of people who are persuadable can vary widely based on the particular campaign, message, mode, and timing.

Q: What are the biggest differences between applying uplift modeling for a commercial marketing campaign versus for a political campaign?

On the political side we are relying on survey as a proxy for a voter's candidate preference. It's the best proxy we have, but it still relies on self-reported intent, and requires innovative sampling design to ensure that the survey is unbiased and reaches a representative sample of the population. However, on the commercial side, in many cases, we can build and validate our uplift models off actual purchase data. This makes the problem more straightforward.

Q: What public evidence is there that Trump's campaign is or is not using predictive analytics or even uplift modeling in particular?

We really have no idea what Trump is or isn't doing. We are confident on where campaigns and organizations on the Democratic side of the spectrum are in terms of its analytics capabilities, but it's important that we continue to innovate and that we can't worry about what groups on the right are/aren't doing.

Indeed, the Trump campaign is "spurning the kind of sophisticated data operation that was a centerpiece of Barack Obama's winning White House runs." There's speculation this could not only hurt his chances in the election, but also deny the RNC—generally thought to already be behind the DNC in data and analytics—valuable data collection for future campaigns.

Advanced and analytical yet not arcane, predictive modeling for voter persuasion has launched a whole new chapter for politics.