After Hours Marketing podcast: “The Subtle Differences Between Reporting and Analytics”

I was recently interviewed by entrepreneurial B2B digital marketing strategist Greg Allbright for the third episode of his newest podcast, After Hours Marketing. The episode, titled “The Subtle Differences Between Reporting and Analytics,” follows this recent post on the subject. We also touched on the CRAPOLA design principles for quantitative data. If you’d like to listen, the recording is available here.

 

Seven ways to avoid the seven month data science itch

Image courtesy of Relationships Unscripted

It is a story as old as data science: company has valuable data asset, company hires data scientist, data scientist finds insights, company is thrilled, company can’t figure out what to do next. The months after a company’s data first goes under the microscope can be an exciting time, but as initial goodwill fades the thrill of young love can turn into a seven month itch.

The itch often starts as a mere tickle, but indicates very serious underlying issues. If you have heard the question “how can we use data science here?” around the office more than once, it is likely that your organization is about to enter the trough of disillusionment. Seemingly small problems have a way of revealing themselves to be much larger underneath; the seven month itch can be deadly to a company’s data science efforts.

The seven month itch is felt by both the data science team and its collaborators. The result is a lack of effectiveness, though the perceived root causes differ across the aisle.

Data scientists typically see the seven month itch as a lack of business prioritization and engineering resources to bring their vision to fruition. They often have a seemingly endless stream of ideas and are able to find promising insights in the data, but see the business as unsupportive. This is probably the most common issue expressed by young data science teams.

Collaborators of the data science team, on the other hand, typically see the seven month itch as a lack of practicality and business savvy on the part of the data scientists. Data scientists are frequently derided for being too academic, for privileging theoretical interest over practical relevance. This is often the case.

Both data scientists and their collaborators feel they are not getting what they need from the other party.

For product-facing data science teams, for example, product design is a common area where each side feels the other should have most of the answers. Data scientists assume product managers should be able to spearhead the translation of their predictive models into functioning systems, and product managers assume data scientists should guide the product realization process. In reality, building data products is best done as a team effort that incorporates a variety of skillsets (see, for example, DJ Patil’s Data Jujitsu: The Art of Turning Data into Product).

Communication disconnects – the primary source of the seven month itch – are widespread and detrimental. Generally scrappy folks, data scientists often find ways to scratch the itch. But treating the underlying cause is the only effective long term solution.

As a start, we propose seven things that data science teams and their managers can do to avoid the itch altogether. While these suggestions are targeted toward fairly young teams that are product facing, they should be generally applicable to more experienced and decision support oriented teams as well. These recommendations can help align the business and data science stakeholders, justify dedicating company-wide resources to data science projects, and generally innovate more effectively.

#1: Tie efforts to revenue

Tie data science efforts directly to revenue. Find ways to impact client acquisition, retention, upselling, or cross-selling, or figure out ways to reduce costs. There are invariably countless ways to do this, and the method makes less difference than the critical pivot of allowing data science to be a revenue generator rather than a cost center. Executive sponsorship is critical for the success of data science efforts, especially due to their interdisciplinary and collaborative nature. Some organizations may highly value making progress toward other critical business goals, and making a dent in such goals can be a perfectly valid substitute, but focusing on projects that will impact the company’s bottom line is the surest universal way to get the attention of senior management.

#2: Generate revenue without help

Generate revenue streams prior to requesting any engineering effort. Charging for one time or periodic reports or analyses delivered via Excel or PowerPoint, for example, can be useful in getting attention and buy-in from the rest of the organization.

#3: Strengthen the core

Focus first on improving the organization’s core products, the products that generate most of the revenue. The numbers are bigger there, so having a bigger revenue impact is easier.

#4: Strengthen client relationships

Generate a more consultative relationship with clients through analytics, which will make data science indispensable to account managers and have the potential for many revenue related benefits. For example, do one off analyses for clients to help them improve their use of the product or alleviate other pain points, and track progress closely. Sometimes these partnerships will generate big wins for clients, and sometimes they will be generalizable. As a result, they can be an excellent source of battle-tested business cases, which stand a better chance of getting funded.

#5: Kickstart big projects with quick wins

Don’t forget about the low hanging fruit. Data scientists are often tempted by the hard problems, but quick wins can help justify a longer leash. Instead of going straight to creating robust decision recommendations for clients, for example, surface some basic data visibility to support their decisions, or give them A/B testing capability if applicable. This first step may not be interesting work from a data science perspective, but it can establish a new client need (e.g. “I don’t know what to do with this information or even what to A/B test; help me optimize this decision”) that data science can then step in and solve. An iterative approach works very well for developing data (and many other types of) products.

#6: Measure customer value

View your products in the light of customer value. You can often justify high prices for data products because they add demonstrable value for clients. Customer value can be data science’s best friend; giving the business a baseline level of visibility into customer value can open new worlds, and should probably be an even higher priority than building any specific data product. Create customer value metrics for the business, separately tracking core products and products that add measurable value. Ideally, give an executive responsibility for goals based on these metrics.

#7: Measure innovation

Create innovation metrics for the business, such as the percentage of revenue from products that are at most one year old, or the percentage of revenue from products that draw their value from analytics. Ideally, give an executive responsibility for goals based on these metrics.

Many organizations have most of the elements necessary to benefit greatly from data science. These recommendations should help provide a common language between the data science team and its collaborators, which is all that is necessary to avoid the seven month itch. No scratching needed.

The law of small numbers

Otto von Bismarck

Big Mark, a new manager at Statistical Misconception Corporation, recently hired his first ever direct report, Little Mark. Little Mark turned out to be a star contributor. The company’s CEO, Biggest Mark, noticed one of Little Mark’s landmark accomplishments, and quickly earmarked another hire for Big Mark. However, he remarked, “You’ve really lucked out with Little Mark, who set a high benchmark. Less than half of new hires make such a marked impact, so according to the law of averages your next hire will miss the mark.” Big Mark raised a question mark: is Biggest Mark’s reasoning on the mark?

The law of averages is an imprecise term, and it is used in many different ways. It is one of the most popularly misused applications of statistical principles. Biggest Mark’s interpretation of the law of averages is fairly common – that outcomes of a random event will even out within a small sample. Let’s discuss two of the main problems with Biggest Mark’s reasoning:

First, the sample size is too small. In probability theory, the law of large numbers (which is likely the seed that is misinterpreted into the law of averages) states that the average of the results obtained from a large number of independent trials should come close to the expected value, and will tend to become closer as more trials are performed. For example, the likelihood that a flipped penny lands on heads is 50%, so if you flip it a large number of times then roughly half of the results should be heads. This law doesn’t say anything meaningful about small sample sizes – it’s the law of large numbers, not the law of small numbers. In fact, if you flip a penny twice, there is only a 50% chance that you will end up with one head and one tail; the other half of the time you’ll end up with either two heads or two tails. For 10 flips, about a third of the time you’ll end up with at least 7 heads or 7 tails.
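The two small-sample figures above are easy to check directly. Here is a minimal Python sketch, assuming a fair coin and independent flips; the function name `prob_at_least_k_same` is made up for illustration.

```python
from math import comb

def prob_at_least_k_same(n: int, k: int) -> float:
    """Probability that n fair, independent flips give at least k heads or at least k tails.

    Requires k > n / 2 so that the two cases cannot both happen.
    """
    if not n / 2 < k <= n:
        raise ValueError("k must satisfy n/2 < k <= n")
    # Count the sequences with at least k heads, then double for tails by symmetry.
    heads_heavy = sum(comb(n, h) for h in range(k, n + 1))
    return 2 * heads_heavy / 2 ** n

# Two flips: chance of two heads or two tails (no even split).
print(prob_at_least_k_same(2, 2))    # 0.5
# Ten flips: chance of at least 7 heads or at least 7 tails.
print(prob_at_least_k_same(10, 7))   # 0.34375
```

The second result, 352 out of 1,024 equally likely sequences, is the “about a third” quoted above.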

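Second, the law of large numbers is only applicable if each event is independent of all the others. In probability theory, two events are independent if the occurrence of one does not affect the probability of the other. In our case, invoking the law of large numbers would necessitate that the likelihood of Big Mark’s second hire being excellent is unaffected by the performance of his first hire. That is a hard case to make when both hires come out of the same manager’s hiring process. And even if the hires were independent, Little Mark’s success would not make a weak second hire any more likely; the odds for the next hire would be exactly what they were before.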

Back at Statistical Misconception Corporation (trademark), Biggest Mark has a business sense second to none. While attending college in Denmark, where his maximum marks were in marketing, he became known as Business Mark, or Biz Markie for short. (As a side note, his relationship with this girl from the U.S. nation, Beauty Mark, ended tragically when he found out that Other Mark wasn’t just a friend.)

In the end, it turned out that Big Mark’s next hire, Accent Mark, defied the CEO’s curse. Instead of making a skid mark, he set a new watermark for what a Mark should be – he was one of the markiest Marks the company had ever marked.

#HashMark

The red and green rule

Image courtesy of auroragov.org

Early in my career, I produced many data visualizations for a senior executive. Let’s call him Gordon. Gordon is unabashedly a man of strong convictions. One of his most strongly and repeatedly voiced was that, in any data visualization, the color green had to represent good and the color red had to represent bad. And there always had to be good and bad. No exceptions.

I quickly caught on, and for the work I did for him I began abiding by his iron red and green rule. Nevertheless, he reminded me of it several times, asking whether green meant good and red meant bad in the visualizations I produced, despite my answers being uniformly in the affirmative.

At first, I attributed his frequent questions to a lack of trust; in fact, it was one of the only factors contributing to my (slight) perception of a lack of trust in our relationship.

Then, one day, I was sitting in Gordon’s office, while a dashboard brimming with my red and green charts filled his computer screen. He was one of the primary users of this particular dashboard, and was quite familiar with its content. He asked me a question that, for the first time, took me by surprise: “are these charts red and green?”

That’s the moment I realized that the frequent questions were not about trust. Gordon is colorblind.

Why would someone with red-green colorblindness want reports of which he is a primary user in red and green?

It turns out that Gordon picked up this conviction while working for a (colorseeing) CEO who insisted on red and green in his charts. Once acquired, it became a hard and fast rule, part of Gordon’s data visualization grammar.

We all have strong convictions. In the world of data visualization, convictions are often influenced by poor conventions set over decades, such as by lay users of Microsoft Office. The proliferation of 3D pie charts is more about convictions and conventions than about good data visualization.

The next time you’re creating a data visualization, apply your own critical thinking rather than relying on conventions. Consider the best way to surface and communicate the information. Critical thinking, not following conventions, is the path to creating the best data visualizations.

CRAPOLA design principles for quantitative data

Robin Williams (from her website: “writer. teacher. mom. not the actor”) has developed the widely cited CRAP design principles: Contrast, Repetition, Alignment, and Proximity. These are powerful general purpose design principles, but a few additional principles are helpful with respect to the display of quantitative information.

Inspired by the work of Edward Tufte, I propose three additional rules be added to complete CRAP: Obviousness, Lightness, and Accuracy. The resulting CRAPOLA design principles for quantitative data are:

Contrast: avoid similar elements (type, color, size, shape, etc.); if they’re not the same, make them very different.

Repetition: repeat visual elements throughout to organize and unify.

Alignment: every element should have some visual connection with another element.

Proximity: group related items close together to facilitate comparison.

Obviousness (clarity): clearly communicate the data.

Lightness (efficiency): show the data and nothing else.

Accuracy (precision): do not omit or distort data.

Reporting versus analytics

“We are drowning in information but starved for knowledge.”

- John Naisbitt

For the love of statistics, please stop using the words “reporting” and “analytics” interchangeably. Reporting is the display of information. Analytics is the interpretation of information. Analytics is the process that turns information into insight, reporting into understanding.

Every time someone uses the word “analytics” to mean “reporting,” a statistician loses its wings. If you ever need evidence that the misuse of these terms is widespread, just think about how many statisticians you’ve met who have wings. I’ve never met one. All the wings are gone.

Here are a few of my favorite illustrations of the difference between reporting and analytics:

xkcd extrapolating

Source: xkcd.com

New Cuyama

Since analytics solves problems, which is a lot sexier than reporting’s goal of presenting information, companies with reporting solutions have a vested interest in conflating the terms. Phrases like “analytics solution,” “analytic database,” and “advanced analytics” are used with the intent to confuse, to convince users to invest in technology over thought. A typical exchange goes as follows:

Reporting tool salesperson: “I have an analytic database.”

Business person: “Thank God! All my problems are solved.”

Here are a few of my favorite examples of oversells in the reporting world:

And don’t get me started on the phrase “business intelligence,” which has been used so many different ways with so many different connotations that it has been rendered virtually meaningless. If you ask ten people in the BI industry what “business intelligence” means, you’ll get twenty opinions. In case you’re looking for something to do this weekend.

Uber’s newest data visualization is uber misleading

The description of surge pricing on Uber’s website begins with the carefully crafted sentence: “Uber rates increase to ensure reliability when demand cannot be met by the number of drivers on the road.” What that really means is that fares are higher at the worst possible time.

If it weren’t for surge pricing, Uber would be America’s most loved corporate darling. Okay, that’s not true at all. What is true is that Uber’s practice of surge pricing is widely disliked. This past New Year’s Eve surge ushered in some particularly bad press (e.g. this and this), so Uber’s on a mission to get back its users’ love.

I woke up this morning to an e-mail from Uber justifying and explaining the practice:

Notice anything funny about the graphic toward the bottom of the e-mail? If not, take a moment to look at it again.

See it? There’s a huge mismatch between the prices displayed and the widths of the corresponding bars.

Here are the prices of the illustrative rides and bar widths in Uber’s e-mail (as displayed in Gmail in the latest version of Google Chrome). Also included for each value is percentage change compared to an uberX ride with no surge.

The graphic contains three clear visual lies, each of which skews perception in Uber’s favor:

  1. The uberX ride with 1.5x surge is 41% more expensive than the uberX ride with no surge, but the corresponding bar is only 13% wider. The effect is that surge pricing is portrayed as less of a jump than it actually is.
  2. The uberX ride with 2.0x surge is 83% more expensive than the uberX ride with no surge, but the corresponding bar is only 47% wider. The effect is the same.
  3. The uberPOOL (Uber’s ride-sharing service) ride is 47% less expensive than the uberX ride, but the corresponding bar is 58% narrower. While uberPOOL is significantly cheaper than uberX, the effect portrays uberPOOL as an even better deal than it is. (It is no secret that Uber wants more riders to pool.)
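The three mismatches listed above can be restated as a comparison between the width each bar was drawn at and the width a price-proportional bar would have had. The sketch below uses only the percentage changes quoted in the list; the underlying prices and pixel widths are not reproduced here, and the names `RIDES` and `width_mismatch` are made up for illustration.

```python
# Changes relative to the uberX ride with no surge, as quoted in the list above:
# (label, change in price, change in bar width)
RIDES = [
    ("uberX, 1.5x surge", +0.41, +0.13),
    ("uberX, 2.0x surge", +0.83, +0.47),
    ("uberPOOL",          -0.47, -0.58),
]

def width_mismatch(price_change: float, width_change: float) -> float:
    """Ratio of the drawn bar width to the width a price-proportional bar would have.

    1.0 means the bar matches the price; below 1.0 means the bar is drawn too short.
    """
    return (1 + width_change) / (1 + price_change)

for label, price_change, width_change in RIDES:
    ratio = width_mismatch(price_change, width_change)
    print(f"{label}: bar is {ratio:.0%} of its proportional width")
```

Each ratio comes out below 1.0, at roughly 0.8. A bar that is too short understates the surge fares and overstates the uberPOOL discount, which is the direction of each distortion described above.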

I repeat: Uber’s poor visualization makes it look like surge pricing isn’t as much of a jump as it actually is, and that uberPOOL is a better deal than it actually is. That is not good.

Uber’s bar chart doesn’t just use tricks of the eye as many misleading visualizations do. The widths of the bars purporting to represent price simply don’t correspond to the prices in any way. From a data visualization perspective, this is a clear misrepresentation of the facts. Regardless of whether this misrepresentation was intentional or an oversight (to be clear, this article takes no position on this issue), this will almost definitely have an impact on consumer behavior.

Instead of graphing as an afterthought, recognize that visualizations can have real consequences. Which, if you are seeking to spread the truth, is a challenge worth some serious consideration. And which, if you happen to be a propagandist, is pretty darn convenient.

How to hire data scientists – free white paper

Check out Bitten Labs’ new white paper on how to repeatably identify top data science talent. The process has been informed by hiring processes at many organizations with effective data science groups, refined through several iterations, and tested with hundreds of applicants. The white paper is a significantly expanded version of our popular blog post.

The white paper can be accessed here: How to Hire Data Scientists.

Happy hiring!

 

Hiring data scientists: the data problem

Image courtesy of misterbisson

Data scientists: many companies want them, but few know how to identify them. A widely cited 2011 McKinsey report highlights the steep shortage of analytical talent capable of extracting value from data. With demand outpacing supply and companies hiring faster than they can find good people, the data scientist title is rapidly becoming diluted as it spreads. As a result, the title’s proliferation in resume experience sections over the past few years isn’t making it any easier to find qualified candidates.

I’ll be sharing some tricks of the trade that can help identify the elusive data scientist, a mythical beast with the power to turn data into not just insight but dollar signs. The current post will focus on what I’ve found to be a key component to interviewing data scientists: the data problem. Many prospective data scientists can talk before they can walk so, unless you are otherwise certain of their skills, requiring job candidates to prove their mettle is a must.

The process we’ve developed at Litle, where I founded and continue to run the data science group, has been refined through several iterations and proven invaluable with dozens of job applicants over the past few years. The basic framework is simple: once candidates have passed an initial screening, we provide some data and a problem statement and ask them to prepare slides and present the results to a diverse group of interviewers.

The rest of this post will delve deeper into (1) finding a good data problem, (2) engaging candidates while they’re working on the problem, (3) hosting the presentation, and (4) evaluating the candidate.

Phase 1: Finding a problem

At Litle, we give prospective data scientists a table containing around a million rows and ten columns of data directly (modulo anonymization) from our database. This dataset is accompanied by a dense and polished one page problem description – divided roughly equally into background information, data definitions, and problem statement – that describes a research problem to which we’ve dedicated many scientist-months and from whose results we’ve built a successful analytics-based product.

Designing a good problem is critical to success. The particulars will (and should) vary, but here are some general pointers:

Make it relevant. Gear the problem statement to identify skills relevant to your business; there is no one size fits all data problem. Ideally, use a real dataset and pose a problem with which the interviewers are highly familiar, to facilitate conversation and evaluation.

Make it fuzzy. Refrain from providing clear success metrics. Well-defined problems are rare in the real world and not typically the purview of the data scientist. To mirror what makes a data scientist successful on the job, identifying the problem – discovering the right questions to ask of the data given limited background knowledge – and defining success metrics for a solution should be part of the challenge. However, make sure to keep definitions clear so you’re all on the same page. In addition to asking for a solution to the problem at hand, explicitly encourage the candidate to present other insights found in the data; a good candidate won’t let this distract from the main point.

Make it hard. This goes hand in hand with “make it relevant” and “make it fuzzy”: you probably aren’t looking for a data scientist to solve easy, well-defined problems for your business. Make sure a substantial amount of preprocessing is necessary before a standard model is applied.

Ask for action. Require actionable insights from the data. Someone who finds cool things in a data set all day long may be a great help in generating marketing content, but recommending actions is what will help your business make money and ship product.

Use complex, real world data. Assuming the hire will be working with real world (as opposed to simulated) data, which is usually the case, make sure the data has some messiness. Missing values are a must (don’t try to make them go missing yourself, as this process will invariably leave undesirable artifacts behind). Unlabeled factors (e.g. “Factor A,” “Factor B,” etc.) are nice, as candidates won’t have the easy out of relying on logic alone to dictate models and better candidates will attempt to discern the meanings of these factors and of individual values.

Phase 2: Guiding the solution

There are many opportunities during the work phase to have a big impact on the outcome.

Set expectations. Be open and honest from the start on your expectations around desired output, project timeline, interactions, and audience. Aim for a quick turnaround from beginning to end; one to two weeks usually works well. Most candidates who make it all the way to the presentation claim they spent two to three full days between doing research and preparing slides.

Communicate early and often. Encourage candidates to send questions, plans for vetting, and even presentation drafts. These interactions make the process go more smoothly and allow early appearance of green and red flags. We’ve learned from work plans that certain candidates are exceptionally organized. We’ve also learned from early presentation drafts that others are not going to work out. In return, provide plentiful feedback and advice, as you (presumably) would on a new employee’s first project.

Start evaluating early. Evaluation starts the moment you introduce the notion of a data problem to a candidate. A great deal of useful information can be gathered before the candidate even shows up for the presentation.

Pull the plug early if it won’t work out. Be clear in advance that the invitation to present is conditional on strong evidence of potential success. Don’t invite anyone to give a presentation until you’ve seen this evidence. If there is reasonable evidence to the contrary then thank them for their time and pull the plug. Time is precious.

Be nice. Be polite and appreciative as candidates will be putting a lot of their own time into something that may very well not result in a job offer. Also, remember that recruiting is a great way to build your network.

Leave the tools up to the candidate. A good data scientist should be resourceful enough to find freely available tools to work on the problem, and most should already have some favorites on hand. You don’t need to, and shouldn’t, provide software. Advice is of course fine.

Phase 3: Hosting the presentation

An interactive presentation is key to successful evaluation.

Keep it short and follow up with breakouts. An hour including questions works well. Starting with a presentation to all the day’s interviewers has the added bonus of making subsequent meetings (one on one or in small groups) more efficient.

Have the right people in the room. At least one or two strong data scientists absolutely need to be present. If you don’t have any data scientists in house, make sure to bring in a trusted consultant (who should also craft the problem statement). This isn’t something you should try without the proper expertise. Beyond this requirement, tailor the audience to the hire’s role. A mix of data scientists, product owners, engineers, and solutions experts may work well for a product research and development role, for example.

Ask the easy questions. Get a sense for whether the candidate has an overall understanding of the data and the business problem at hand. A good data scientist will have a basic sense for the data after spending several hours with it, and practically minded candidates will have given the business some thought.

Ask the hard questions. Don’t be afraid to ask a question that’s too difficult to answer completely. Soliciting some speculation is a good thing. The good candidates will be able to say something meaningful and, just as importantly, will understand and state their limits.

Make it interactive. Challenge what the candidate is saying. Ask for an alternate explanation when something isn’t perfectly clear. If a machine learning method is applied, ask the candidate to explain it in terms everyone in the room can understand. Data scientists worth their salt generally excel at common sense explanations. A quick round of introductions can help break the ice and allow the presenter to better contextualize questions.

Add information. Provide information during the presentation that the candidate didn’t have before. A strong applicant should be able to roll with the punches, incorporating and synthesizing new information on the spot. Changing information and assumptions are part of the job.

Ask for next steps. Science is never done. What would the candidate do if given more time?

Phase 4: Evaluation

While evaluation should start before the presentation, this section will focus on the presentation. Look for:

Big picture thinking. The candidate should spend most of the presentation on understanding the data and problem: visualization, data cleaning, asking fundamental questions to assess bias, introducing frameworks, diagramming, discussing assumptions, etc. Back of the envelope calculations are a good sign. Better machine learning people will usually spend the bulk of the time talking about feature selection and cover the modeling in a few minutes. Clarity on the big picture should be evident, and your own understanding of the data should improve from attending. Simplicity is king.

Storytelling. Distilling a complex dataset into a simple and meaningful story is a key skill for data scientists. As an apocryphal quote often attributed to Einstein goes, “Everything should be made as simple as possible, but not simpler.” Approximation is one simple example of this: a candidate should verbalize numbers to the precision at which they best support the story without distraction (“twelve million” or “around ten million” instead of “eleven million, seven hundred fifty four thousand, eight hundred twenty one”).

Communication. Since subtle communication is common for data scientist roles, expect strong communication skills. Superb scientific communication is important for every hire, and excellent business communication is key for those interacting outside the data science group. Even if some interviewers don’t get all the technical intricacies of the presentation, everyone should get something significant out of it. The slides should be well structured and look passably decent, with a significantly higher bar for business-facing roles. The candidate should display openness to new ideas and be able to address constructive feedback without defensiveness. Humility is critical to success in the data scientist role.

Productivity. Look for evidence of productivity while still demanding simplicity in communication of results. Prepare to be amazed here; some stronger candidates do a pretty incredible amount of work in a short time. Depending on the tool used, high productivity (taken together with logical reasoning, organization, a clear framework and consistent use of terminology, etc.) is one clue that the candidate is or has potential to become a strong coder.

Creativity. Expect to learn something new (and non-canonical) from the presentation. The candidate should look at things at least a little bit differently than anyone else in your organization and go beyond textbook methods.

Connoisseurship of good science. You’re hiring a scientist, so evaluate the candidate on rationality and skepticism. The candidate should pay explicit attention to issues of experimental design and validation, causality, bias, normalization, significance, etc. If a model is created, the data should be split into modeling and testing sets, time trends should be interrogated to make sure results are meaningful, and so forth. Talking in terms of black boxes and magic is not a good sign. Measured confidence is a positive trait, but humility in the face of reason is equally important for a data scientist role.

Curiosity, passion, and resourcefulness. These are traits that help employees grow; if you find a high potential individual who possesses them in spades but is missing some experience then the hire may be well worth the training effort. Candidates with these skills will be able to figure out more about the data set, the problem, and your business than you expected given the information provided. The Internet and scientific literature are their friends. Pay attention to candidates who say they enjoyed the exercise, ask for ways to improve their solution, or express interest in continuing the investigation.

Practicality. The candidate should define success metrics that make sense. Conclusions should be actionable: they should solve meaningful business problems. The candidate should possess the mentality to get initial actionable results quickly and iterate, particularly if you are hiring for a product development role.

Attention to detail. The work should be thorough and the presentation should be relatively free from logical and typographical errors. The candidate should notice irregularities in the data and detect and correct for systemic biases. For example, in the case of time series the candidate should account for series cut off by the end of the sample period. Pay particular attention to how the candidate handles missing data.
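
To make the time series point concrete, here is a small Python sketch of flagging records that are cut off by the end of the sample period (right-censored). The record layout (a start date plus an end date that is None when the series is still running) and the function name observed_durations are assumptions made for this example.

```python
from datetime import date

def observed_durations(records, sample_end):
    """Return (days_observed, censored) for each (start, end) record.

    A record is treated as censored when its end is missing or falls
    after sample_end; its observed length is then only a lower bound.
    """
    results = []
    for start, end in records:
        censored = end is None or end > sample_end
        stop = sample_end if censored else end
        results.append(((stop - start).days, censored))
    return results
```

Averaging only the completed records, or treating censored lengths as final, both bias the estimate; a careful candidate will at least report how many records were censored.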

Performance expected from a new employee. What you see is what you get. The presentation is a very good indication of what the output of a candidate’s first few projects will look like. You can blame work that falls short of what you’d expect from a new employee on any number of factors, but in practice job applicants put their best foot forward. Make sure the output works for you before proceeding.

Wow factor. Finally, expect to be wowed. Taken together, these criteria may seem like a tall order, but keep your expectations sky high. Data scientists who can make a real impact on your business are not junior employees and, to succeed, must excel in many of these areas. Despite the talent shortage, there are still good people on the market. If you have rich data sets and impactful problems, you should be able to attract talent. Just be ready to pay for it.

Nate Silver, Probabilistic Celebrity

Image courtesy of Randy Stewart via Wikimedia Commons

Tuesday night was quite spectacular for Nate Silver. I’m referring not to my suspicion that the oft-bespectacled, apparently liberal-leaning statistician was pleased with the outcome of the presidential race – though there was that – but rather to the accompanying mighty boost to his career as his highest likelihood prediction of the election results proved spot on.

Silver’s predictions were based on what might be termed a “sophisticated form of poll aggregation.” His calculations incorporated the results of many polls, which were scrubbed and weighted based on factors such as historical accuracy of each polling firm, and his probability forecasts were derived from repeated election simulations (biostatistician Bob O’Hara speculated on some of the details). Note that Silver wasn’t the only one who applied statistics to polling data with impressive results; for example, check out Princeton Election Consortium’s forecast.
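
For readers who want a feel for what weighted poll aggregation plus repeated simulation might look like, here is a toy Python sketch. To be clear, this is not Silver’s model: the weights, the normal error with standard deviation sigma, and the assumption that state errors are independent are all simplifications I am assuming for illustration.

```python
import random

def weighted_margin(polls):
    """Weighted mean of poll margins; polls is a list of (margin, weight)."""
    total_weight = sum(weight for _, weight in polls)
    return sum(margin * weight for margin, weight in polls) / total_weight

def win_probability(state_polls, electoral_votes, sigma=3.0,
                    n_sims=10000, seed=0):
    """Estimate a candidate's chance of an electoral majority by simulation.

    Each simulated election draws a margin per state around its weighted
    poll average (hypothetical normal error, independent across states).
    """
    rng = random.Random(seed)
    means = {state: weighted_margin(polls)
             for state, polls in state_polls.items()}
    needed = sum(electoral_votes.values()) // 2 + 1
    wins = 0
    for _ in range(n_sims):
        votes = sum(ev for state, ev in electoral_votes.items()
                    if rng.gauss(means[state], sigma) > 0)
        if votes >= needed:
            wins += 1
    return wins / n_sims
```

The output is a probability rather than a call, which is exactly the feature that confused so many commentators.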

In the end, while the presidential race was declared by many prominent pundits as a dead heat or favoring Romney, the non-pundit’s New York Times blog served as a Silver lining for Democrats. Designer Michael Cosentino of gdgt.com published impressive-looking side-by-side maps of Silver’s predictions and actual results once most contests had been called by the major networks:

For correctly predicting all fifty states, Nate Silver was proclaimed the winner of the election (or at least a Silver medalist), a “poster child” of data in politics, “king of the quants,” the “patron saint” of big data, and even “a near-god for nervous Democrats.” Sales of his book, The Signal and the Noise, experienced a healthy spike. Even before the election, Twitter had been exploding with love for Nate, resulting in, among many other things, a surge of activity in the Chuck Norris-esque hashtag #NateSilverFacts and the parody accounts @fivethirtygreat, @fivethirtynate, and @drunknatesilver. Multiple domains have been registered in his honor, including NateSilverFacts.com and IsNateSilverAWitch.com, whose homepage reads:

Blind hate (for Nate)

Not all of the press has been as positive as labeling him a magical being. But most of the vitriol was unsubstantiated and suffered from a poor grasp of statistics and reason. New York Times columnist David Brooks said that “pollsters tell us what’s happening now. When they start projecting, they’re getting into silly land” and wrote that “even experts with fancy computer models are terrible at predicting human behavior” (to which technosociologist Zeynep Tufekci had a nice retort: “But experts with fancy computer models are good at predicting many things in the aggregate”). MSNBC’s Joe Scarborough referred to Nate Silver as an “ideologue” and a “joke.” On National Review Online, Josh Jordan claimed without substantiation that Silver’s partisanship “shows in the way he forecasts the election” (note that the article Jordan cites to demonstrate that Silver was “openly rooting” for Obama in 2012 talks only of the 2008 election). Slate’s Daniel Engber completely missed the point when he wrote that “Nate Silver didn’t nail it; the pollsters did.”

One of the worst offenders was Politico’s Dylan Byers, who wrote:

Prediction is the name of Silver’s game, the basis for his celebrity. So should Mitt Romney win on Nov. 6, it’s difficult to see how people can continue to put faith in the predictions of someone who has never given that candidate anything higher than a 41 percent chance of winning (way back on June 2) and – one week from the election – gives him a one-in-four chance, even as the polls have him almost neck-and-neck with the incumbent.

Ezra Klein has a nice explanation of one of the many flaws in Byers’s logic:

If Mitt Romney wins on election day, it doesn’t mean Silver’s model was wrong. After all, the model has been fluctuating between giving Romney a 25 percent and 40 percent chance of winning the election. That’s a pretty good chance! If you told me I had a 35 percent chance of winning a million dollars tomorrow, I’d be excited. And if I won the money, I wouldn’t turn around and tell you your information was wrong. I’d still have no evidence I’d ever had anything more than a 35 percent chance.

MIT Knight Science Journalism Tracker’s media critic Paul Raeburn had a similar reaction:

Let’s compare Silver’s work to a weather forecast. As of Nov. 4, Silver gives Obama an 86.3 percent chance of winning the election. If a meteorologist said there was an 86 percent chance of rain – and it didn’t rain – Byers would presumably “not continue to put faith in the predictions” of the weather forecaster. But we know that’s not right. Forecasts are generally correct – but not always. That does not make them worthless. When there is an 86 percent chance of rain, most of us grab an umbrella. And we should.

Liberal satire blog Wonkette also posted a direct reaction to Byers’s troubling piece:

we would like to urge everyone to go read this Politico piece again about how dumb and wrong Nate is and how math and numbers are ruining political punditry forever, and then laugh and laugh at how upset people were by the concept that you could tell how an election might turn out by asking people in advance how they’ll vote and then figuring a way to accurately assess the answers they give.

Silver simply critically interpreted people’s answers to the question of how they plan to vote in a scientific manner. Sounds like a prime example of turning information into knowledge to me. Data science for the win.

This shouldn’t have been upsetting to people. Tufekci eloquently sums up why:

We rely on statistical models for many decisions every single day, including, crucially: weather, medicine, and pretty much any complex system in which there’s an element of uncertainty to the outcome. In fact, these are the same methods by which scientists could tell Hurricane Sandy was about to hit the United States many days in advance. Dismissing predictive methods is not only incorrect; in the case of electoral politics, it’s politically harmful.

She also has a message for the haters:

Refusing to run statistical models simply because they produce probability distributions rather than absolute certainty is irresponsible…. We should defend statistical models because confusing uncertainty and variance with “oh, we don’t know anything, it could go any which way” does disservice to important discussions we should be having on many topics – not just on politics.

Blind love

All of this is not to say that articles that came out in favor of Silver were flawless. Quite the opposite was true; quantitative exceptionalism abounded on both sides of the debate. His predictions were perceived as exceptional (in some cases better, in others worse) because they came from science, which has a habit of making people’s brains turn off. As Paul Raeburn wrote, “Nate Silver’s rational approach to politics seems to provoke highly irrational responses.” Many voices on both sides of the debate conflated likelihood of winning and victory margin, for example. Silver predicted a high likelihood of Obama winning by a fairly slim margin, not a landslide victory.

Science making people's brains turn off (image courtesy of fancylad)

One critical fact that is largely absent from the conversation is that it was partially luck that Silver’s predictions were spot on for all fifty states. If we assume that the state-by-state results as predicted in Silver’s November 6 presidential forecast were independent events, he had a 12.5% chance of getting all fifty correct (the product of all of his state win likelihoods). Granted, this represents much better odds than the approximately one in a quadrillion (1 in 1,000,000,000,000,000) chance of guessing all fifty correct by flipping a coin (50% raised to the 50th power), but Silver’s perfect sweep can’t be attributed to his skill alone.
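
The arithmetic behind that comparison is easy to reproduce. The Python sketch below computes the chance of a perfect sweep under the same independence assumption. The coin-flip figure is exact, but the state probabilities in the hypothetical list are made up for illustration and are not Silver’s actual forecast values.

```python
import math

def prob_all_correct(win_probs):
    """Chance the favorite wins every contest, assuming independent contests.

    win_probs holds one candidate's win probability per state; the
    favorite in each state wins with probability max(p, 1 - p).
    """
    return math.prod(max(p, 1.0 - p) for p in win_probs)

# Coin flipping: 50% raised to the 50th power
coin_flip = prob_all_correct([0.5] * 50)
one_in = round(1 / coin_flip)  # 2 ** 50 = 1,125,899,906,842,624

# HYPOTHETICAL forecast (not Silver's numbers): mostly safe states
# plus a handful of closer ones
hypothetical = [0.99] * 40 + [0.90] * 8 + [0.60] * 2
sweep_chance = prob_all_correct(hypothetical)
```

Even a forecast that is very confident in most states ends up with a modest chance of a perfect sweep, because a few uncertain states dominate the product.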

In effect, being celebrated for this accomplishment makes Silver a probabilistic celebrity. He seems like a pretty rational guy, so I’m guessing he’s more than mildly amused by how much credit he’s gotten for his probabilistic predictions ending up correct. Bowing down to him for this reason is at its core a quantitative exceptionalist act.

Silver’s biggest victory might have been making a case by example for moving more toward science in politics, away from gut-instinct-based punditry. Pundits are only human, and humans are known for being biased and otherwise poorly calibrated estimators (e.g. see Douglas Hubbard’s How to Measure Anything); Silver himself characterizes subjective pundit predictions as “literally no better than flipping a coin.” Slate decried the poor accuracy of pundits’ predictions in this election, and several media outlets have questioned whether punditry is dead as a result of Silver’s superior methods. But keep in mind that prediction is hard; for some people, even predicting the past can be tricky. At the very least, some good humor has come out of the discussion, such as the election drinking game proposed by Brian Fadden of The New York Times: “Drink until you’re as dumb as a pundit.”

Statisticians would most likely have beaten pundits at predicting a domestic feline’s surprising third place finish in Virginia’s senate race had they been looking for it, but this is not to say that qualitative analysis lacks value. To the contrary, designing and interpreting polls requires it. It is the marriage of qualitative and quantitative analysis that makes statisticians like Silver so successful. Critical inquiry is the key ingredient. Blindly following anyone – either pundits or statisticians – can get you into trouble. In fact, many pundits get themselves into trouble for their own lack of skepticism. A Forbes.com article titled “Nate Silver’s Prediction Was Awesome – But Don’t Build Statues To ‘Algorithmic Overlords’ Just Yet” includes the fair and balanced passage:

Discrediting expert predictions [pundits] seems like real progress – but not if we believe the enduring lesson is to replace one group of fortune tellers with another.  We should certainly strive to use big data and rigorous math whenever we can – but let’s be careful not to fall into the trap of letting down our guard, and trusting all experts who come bearing algorithms.

This isn’t about picking the right side. It’s about being rational.

