How to turn Facebook likes into Votes? - Link3 Technologies Ltd

How to turn Facebook likes into Votes?

How do 87m records scraped from Facebook become an advertising campaign that could help swing an election? What does gathering that much data actually involve? And what does that data tell us about ourselves?

The Cambridge Analytica scandal has raised question after question, but for many, the technological USP of the company, which announced last week that it was closing its operations, remains a mystery.

For those 87 million people probably wondering what was actually done with their data, I went back to Christopher Wylie, the ex-Cambridge Analytica employee who blew the whistle in the Observer on the company’s problematic operations. According to Wylie, all you need to know is a little bit about data science, a little bit about bored rich women, and a little bit about human psychology…

Step one, he says, over the phone as he scrambles to catch a train: “When you’re building an algorithm, you first need to create a training set.” That is: no matter what you want to use fancy data science to discover, you first need to gather data the old-fashioned way. Before you can use Facebook likes to predict a person’s psychological profile, you need to get a few hundred thousand people to do a 120-question personality quiz.

The “training set” refers, then, to that data in its entirety: the Facebook likes, the personality tests, and everything else you want to learn from. Most important, it needs to contain your “feature set”: “The underlying data that you want to make predictions on,” Wylie says. “In this case, it’s Facebook data, but it could be, for example, text, like natural language, or it could be clickstream data” – the complete record of your browsing activity on the web. “Those are all the features that you want to [use to] predict.”

At the other end, you need your “target variables” – in Wylie’s words, “the things that you’re trying to predict for. So in this case, personality traits or political orientation, or what have you.”
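To make the vocabulary concrete, here is a minimal Python sketch of a training set that pairs a feature set (likes) with target variables (quiz scores). Everything in it is an assumption for illustration: the `TrainingExample` class, the page names, the `openness` trait and the scores are invented, and nothing here reflects the actual data format Wylie describes.

```python
from dataclasses import dataclass


@dataclass
class TrainingExample:
    """One surveyed person (hypothetical structure, for illustration only)."""
    likes: set    # feature set: the pages this person liked
    traits: dict  # target variables: scores derived from the personality quiz


# Invented toy data: two people who completed the quiz.
training_set = [
    TrainingExample({"page_a", "page_b"}, {"openness": 0.8}),
    TrainingExample({"page_b", "page_c"}, {"openness": 0.3}),
]


def split_features_targets(examples, trait):
    """Return (X, y): the like sets and the scores for one chosen trait."""
    X = [ex.likes for ex in examples]
    y = [ex.traits[trait] for ex in examples]
    return X, y


X, y = split_features_targets(training_set, "openness")
```

Here `X` holds what the model is allowed to see and `y` holds what it is asked to predict; a model is trained on people for whom both are known.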

To a survey user, the process was quick: “You click the app, you go on, and then it gives you the payment code.” But two very important things happened in those few seconds. First, the app harvested as much data as it could about the user who had just logged on. Where the psychological profile is the target variable, the Facebook data is the “feature set”: the information a data scientist has on everyone else, which they need to use in order to accurately predict the traits they really want to know.

It also provided personally identifiable information such as real name, location and contact details – something that wasn’t discoverable through the survey sites themselves. “That meant you could take the inventory and relate it to a natural person [who is] matchable to the electoral register.”

Second, the app did the same thing for all the friends of the user who installed it. Suddenly the hundreds of thousands of people who you’ve paid a couple of dollars to fill out a survey, whose personalities are a mystery, become millions of people whose Facebook profiles are an open book.

That’s where the final transformation comes in. How do you turn a few hundred thousand personality profiles into a few million? With a lot of computing power, and a massive matrix of possibilities. “Even though your sample size is 300,000 people, give or take, your feature set is like 100m across,” says Wylie. Every single Facebook “like” found in the data set becomes its own column in this enormous matrix. “Even if there is only one instance in the entire set, it’s still a feature.”
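Taken at face value, 300,000 rows by 100m columns would be roughly 30 trillion cells if every cell were stored, almost all of them empty. A common way to handle a table like this is a sparse representation that records only the likes each person actually has. The Python sketch below shows the idea under stated assumptions: `build_like_matrix` and the page names are hypothetical, and the source does not say how the real matrix was stored.

```python
def build_like_matrix(users_likes):
    """Give every distinct like its own column and store each row sparsely.

    Returns (columns, rows): `columns` maps a like to its column index,
    and `rows` holds, for each user, the sorted indices of the columns
    that would contain a 1. All other cells are implicitly 0.
    """
    columns = {}
    rows = []
    for likes in users_likes:
        row = []
        for like in sorted(likes):  # sorted for a deterministic column order
            if like not in columns:
                columns[like] = len(columns)  # first sighting: new column
            row.append(columns[like])
        rows.append(sorted(row))
    return columns, rows


# Invented toy data for illustration.
users_likes = [
    {"page_a", "page_b"},
    {"page_b"},
    {"page_c"},  # appears only once in the set, but still gets a column
]
columns, rows = build_like_matrix(users_likes)
```

In this toy case the matrix has three columns, and the third user's single like occupies a column nobody else touches, which mirrors Wylie's point that one instance is still a feature.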

“All that data was then put into an ensemble model,” Wylie says. “This is when you use different families or approaches of machine learning, because each of them will have their own strengths and weaknesses… and then they sort of vote, and then you amalgamate the results and come up with a conclusion.” This is where data science becomes more of a data art: the exact input of each approach to the overall model isn’t set in stone, and there’s no right way to do it. In the academic world, it’s sometimes called “training by grad student” – the point where the only thing to do is move forward through laborious trial and error.

Still, it worked well enough, and in the end, Wylie says, “we built 253 algorithms, which meant there were 253 predictions per profiled record”. The goal was achieved: a model that could effectively take the Facebook likes of its subjects and work backwards, filling in the rest of the columns in the spreadsheet to arrive at guesses as to their personalities, political affiliations and more.
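As a rough illustration of the ensemble idea, the Python sketch below trains three deliberately simple predictors of different kinds on like sets and combines them with a weighted average, which is the usual counterpart of voting when the target is a numeric score. This is not the model Wylie describes: the three predictors, the function names, the equal default weights and the toy data are all assumptions chosen to keep the example short.

```python
from statistics import mean


def fit_mean_baseline(X, y):
    """Approach 1: ignore the likes and always predict the average score."""
    avg = mean(y)
    return lambda likes: avg


def fit_nearest_neighbour(X, y):
    """Approach 2: copy the score of the training user with the most similar likes."""
    def jaccard(a, b):
        union = a | b
        return len(a & b) / len(union) if union else 0.0

    def predict(likes):
        best = max(range(len(X)), key=lambda i: jaccard(likes, X[i]))
        return y[best]
    return predict


def fit_per_like_average(X, y):
    """Approach 3: average the typical score attached to each known like."""
    scores_by_like = {}
    for likes, score in zip(X, y):
        for like in likes:
            scores_by_like.setdefault(like, []).append(score)
    like_score = {like: mean(s) for like, s in scores_by_like.items()}
    fallback = mean(y)

    def predict(likes):
        known = [like_score[like] for like in likes if like in like_score]
        return mean(known) if known else fallback
    return predict


def fit_ensemble(X, y, weights=(1.0, 1.0, 1.0)):
    """Train all three approaches and combine them by weighted average.

    The weights are the part that is tuned by trial and error.
    """
    fitters = (fit_mean_baseline, fit_nearest_neighbour, fit_per_like_average)
    models = [fit(X, y) for fit in fitters]
    total = sum(weights)

    def predict(likes):
        return sum(w * m(likes) for w, m in zip(weights, models)) / total
    return predict


# Invented toy data: three surveyed users and one trait score each.
X = [{"page_a", "page_b"}, {"page_b", "page_c"}, {"page_c"}]
y = [0.9, 0.5, 0.1]

predict = fit_ensemble(X, y)
score = predict({"page_a"})  # an unsurveyed user known only by their likes
```

Producing many predictions per record, as in the quote above, would under this sketch mean fitting one such ensemble per target variable; whether the real system was organised that way is not stated in the source.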
