Earlier this month, Michael Scherer published an article called "Fact Checking and the False Equivalence Dilemma" on Time's Swampland blog. Scherer wrote the article in response to criticism of a cover story he wrote about the "factual deceptions" of Barry Obama and Willard Romney. Some readers accused him of false centrism.

Scherer's defense is that we cannot reliably compare the deceptiveness of individuals or groups, especially not based on fact checker rulings. He based his defense on comments by the leaders of the fact checking industry during a press conference that Scherer attended. (In fact, the comments responded to a question that Scherer himself asked.)

As evidenced by my previous post on estimating partisan and centrist bias from fact checker report cards, I sympathize with Scherer's defense against frothy-mouthed partisans who are convinced that the other side tells nothing but a bunch of stuff. Yet I disagree with him and the leaders of the fact checking industry that we cannot reliably compare fact checker rulings (notice I don't say deceptiveness) across politicians and political groups.

To make my point, I'll condense into a list what the fact checking industry leaders and Michael Scherer have said about what Scherer calls the "false equivalence dilemma" (but which should be called the "false comparison dilemma"). For each item in the list, I'll describe the issue, then explain why it's not that big of a deal.

1. "...it's self selective process," says Glen Kessler from The Fact Checker at The Washington Post.

Kessler argues that fact checkers cherry-pick the statements that they fact check. No, not out of centrist or partisan bias. In this case, Kessler's talking about a bias toward the timeliness and relevance of the statement. Kessler says that he decides what to fact check based on how much he thinks the fact check will educate the public about something important, like Medicare or health insurance reform. He shies away from mere slips of the tongue.

Wait a minute. If the only bias fact checkers had were to fact check timely and relevant remarks about policy, that would make Malark-O-Meter's comparisons more valid, not less. Far more concerning is the possibility that some fact checkers have a fairness bias. Which brings me to...

2. "...it would look like we were endorsing the other candidate," says Brooks Jackson of FactCheck.org.

This comment raises one non-issue against comparisons while implying another. Jackson argues that by demonstrating that one politician is more deceptive than another, FactCheck.org would open itself up to accusations of partisanship. From a publishing standpoint, this makes some sense, especially if your organization wants to maintain a nonpartisan reputation. Yet the ensuing controversy might cause the buzz about your organization to get louder. Just look what's happened with Nate Silver's political calculus this week. Or better yet, look what's happened to Internet searches for PolitiFact compared to FactCheck.org over the last year. (Among frothy-mouthed right-wing partisans, PolitiFact is the poster child of the liberal fact checking establishment.)
Yet from the standpoint of informing the public (which is what we're trying to do, right?), who cares if you gain a false reputation of partisan bias? Many people already believe that the fact checking industry is biased, but at least as many people find it highly readable and refreshing. Perhaps that same demographic will find lucid, academically respectable factuality comparisons similarly refreshing.

Interestingly, Jackson's comment hints at the separate issue of centrist bias among today's top fact checkers. In the quest to avoid a partisan reputation, frothy-mouthed liberals allege, the fact checking industry is too fair-minded and falsely balanced (the same criticism leveled against Scherer's cover story in Time).

I've already shown that we can use Malark-O-Meter's statistical methods to estimate the likely level of centrist bias (assuming that one exists). In the same article, I made suggestions for how to estimate the actual level of centrist (and partisan) bias among professional fact checkers.

Furthermore, if what we're aiming at is a more informed public, why must we always shy away from ambiguity? Yes, Malark-O-Meter's measurements are a complex mix of true difference, bias, sampling error, and perceptual error. No, we don't know the relative weights of those influences. But that doesn't make the estimates useless. In fact, it makes them something for people to discuss in light of other evidence about the comparative factuality of political groups.

3. “Politicians in both parties will stretch the truth if it is in their political interest,” says Glenn Kessler.

Kessler argues that comparing politicians is fruitless because all politicians lie. Well, I statistically compared the factuality of Obama, Biden, Romney, and Ryan. While all of them appear about half factual, there are some statistically significant differences. I estimate that Rymney's statements are collectively nearly 20% more false than Obiden's statements (I also estimated our uncertainty in that judgment). So yes, both parties' candidates appear to stretch (or maybe just not know) the facts about half the time. But one of them most likely does it more than the other, and maybe that matters.
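To make that kind of comparison concrete, here is a minimal Python sketch. It is not Malark-O-Meter's actual code: the two report cards, the six ruling categories, and the falsity weights attached to them are all invented for illustration, and Dirichlet resampling is just one reasonable way to turn a report card into a distribution of plausible malarkey scores.

```python
import numpy as np

rng = np.random.default_rng(42)

# Invented report cards: counts of rulings in six categories ordered from
# "True" to "Pants on Fire". Not real fact checker data.
card_a = np.array([20, 30, 25, 20, 15, 5])
card_b = np.array([10, 20, 25, 25, 20, 10])

# Assumed falsity weight for each category (0 = fully true, 1 = fully false).
falsity = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])

def simulated_scores(counts, n_sims=100_000):
    """Draw plausible ruling proportions and return malarkey scores on a 0-100 scale."""
    props = rng.dirichlet(counts + 1, size=n_sims)  # +1 acts as a flat prior
    return 100 * props @ falsity

diff = simulated_scores(card_b) - simulated_scores(card_a)
lo, hi = np.percentile(diff, [2.5, 97.5])

print(f"mean difference (B minus A): {diff.mean():.1f} points, 95% interval ({lo:.1f}, {hi:.1f})")
print(f"probability B spews more malarkey than A: {(diff > 0).mean():.2f}")
```

That last number is the kind of statement I make above: not "B lies," but "given these report cards, B's rulings are probably worse than A's, and here is how sure we can be."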

4. "...not all deceptions are equally deceiving, and different people will reach different judgements about which is worse," says Michael Scherer.

Scherer goes on to ask:
Do you think it was worse for President Obama to claim that Romney supports outlawing abortion even in cases of rape and incest, when Romney does not? Or for Romney to claim that Obama plans to give welfare recipients a check without any work requirement, when he does not?
He then says he doesn't know the answer to those questions. Neither do I, but I don't think the answers matter. What matters is the extent to which an individual's or group's policy recommendations and rhetoric adhere to the facts. That is why the fact checking industry exists. If the questions above bother you, then the fact checking industry writ large should bother you, not just the comparison niche that Malark-O-Meter is carving out. Furthermore, since Kessler has already established that fact checkers tend to examine statements that would lead to instructive journalism, we can be confident that most rulings that we would compare are, roughly speaking, equally cogent.

Which brings me to the straw man of the false equivalence dilemma:

5. We can't read someone's mind.

Much of the fact checking industry leaders' commentary, and Michael Scherer's subsequent blog entry, assumed that what we're comparing is the deceptiveness (or conversely the truthfulness) of individuals or groups. This opened up the criticism that we can't read people's minds to determine if they are being deceptive. All we can do is rate the factuality of what they say. I agree with this statement so much that I discuss this issue in the section of my website about the caveats to the malarkey score and its analysis.

I contend, however, that when words come out of someone's mouth that we want to fact check, that person is probably trying to influence someone else's opinion. The degree to which people influence our opinion should be highly positively correlated with the degree to which their statements are true. No, not true in the value-laden sense. True in the sense that matters to people like scientists and court judges. So I don't think it matters whether or not we can tell if someone is trying to be deceptive. What matters should be the soundness and validity of someone's arguments. The fact checking industry exists to facilitate such evaluations. Malark-O-Meter's comparisons facilitate similar evaluations at a higher level.

Lastly, I want to address one of Michael Scherer's remarks about a suggestion by political deceptiveness research pioneer, Kathleen Hall Jamieson, who works with Brooks Jackson at the Annenberg Public Policy Center, which runs FactCheck.org.
[Jamieson] said what you really wanted to measure was consequential deceptions, meaning the level of deception that moved voters. One way of doing this would be to score every campaign ad that runs in a cycle for deception, and then weight the ads by the number of people who see them. It’s a fine idea, but difficult to do in real time, when the reputational cost is the highest for the campaigns.
Three things. First, this is definitely a fine idea...if you want to measure the level of deception that moved voters. But what if you simply want to measure the average factuality of the statements that an individual or group makes? In that case, there is no need to weight fact check rulings by the size of their audience. In fact, if you treated that audience-weighted measure as a measure of individual or group factuality (rather than of the effects of an individual's or group's statements), you would overstate the factuality or falsehood of highly influential people relative to less influential people.
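Here is a toy illustration of the difference, with rulings and audience sizes invented out of whole cloth. The point is only that weighting by audience size measures the reach of a campaign's malarkey, not the campaign's average adherence to the facts.

```python
import numpy as np

# Invented rulings for five campaign ads (0 = true, 1 = false) and invented
# audience sizes in millions of viewers.
falsity = np.array([0.2, 0.8, 0.6, 1.0, 0.4])
audience = np.array([1.0, 25.0, 3.0, 0.5, 2.0])

# Average factuality of what the campaign said: every ruling counts once.
unweighted = 100 * falsity.mean()

# Jamieson-style "consequential deception": rulings weighted by how many people saw them.
weighted = 100 * np.average(falsity, weights=audience)

print(f"unweighted malarkey score: {unweighted:.0f}")        # 60
print(f"audience-weighted malarkey score: {weighted:.0f}")   # 74
```

The two numbers diverge whenever the widely seen ads are unusually true or false, which is exactly why the weighted figure answers a different question than the one Malark-O-Meter asks.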

Second, most fact check rulings are of timely and relevant statements, and they are often a campaign's main talking points. So I would be interested to see what information all that extra work would add to a factuality score. 

Third, while it is difficult to do in real time, it isn't impossible, especially not in pseudo real time. (Why do we have to do it in real time, anyway? Can't people wait a day? They already wait that long or more for most fact checker rulings! Moreover, didn't we once believe real time fact checking was too difficult? Yet that's what PolitiFact did during the debates.)

Anyway, for any given campaign ad or speech or debate, there's usually a transcript. We often know the target audience. We can also estimate the size of the audience. Come up with a systematic way to put those pieces of information together, and it will become as straightforward as...well...fact checking!

In sum, so long as fact checkers are doing their job fairly well (and I think they are) people like me can do our job (oh, but I wish it actually were my job!) fairly well. That said, there is much room for improvement and innovation. Stay tuned to Malark-O-Meter, where I hope some of that will happen.
 

    about

    Malark-O-blog published news and commentary about the statistical analysis of the comparative truthfulness of the 2012 presidential and vice presidential candidates. It has since closed down while its author makes bigger plans.

    author

    Brash Equilibrium is an evolutionary anthropologist and writer. His real name is Benjamin Chabot-Hanowell. His wife calls him Babe. His daughter calls him Papa.

    what is malarkey?

    It's a polite word for bullshit. Here, it's a measure of falsehood. 0 means you're truthful on average. 100 means you're 100% full of malarkey. Details.

    what is simulated malarkey?

    Fact checkers only rate a small sample of the statements that politicians make. How uncertain are we about the real truthfulness of politicians? To find out, treat fact checker report cards like an experiment, and use random number generators to repeat that experiment a lot of times to see all the possible outcomes. Details.
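    A toy version of that idea, with an invented report card and assumed falsity weights for each ruling category (the real scoring rule lives behind the Details link):

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented report card: 50 rated statements spread over six ruling
# categories ordered from "True" to "Pants on Fire".
counts = np.array([8, 12, 10, 9, 7, 4])
falsity = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])  # assumed category weights

n = counts.sum()
observed = 100 * counts @ falsity / n  # observed malarkey score

# Rerun the "experiment" many times: draw 50 fresh rulings from the observed
# proportions and rescore each simulated report card.
sims = rng.multinomial(n, counts / n, size=50_000)
scores = 100 * sims @ falsity / n

lo, hi = np.percentile(scores, [2.5, 97.5])
print(f"observed malarkey score: {observed:.0f}")
print(f"95% of simulated scores fall between {lo:.0f} and {hi:.0f}")
```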

    malark-O-glimpse

    Can you tell the difference between the 2012 presidential election tickets from just a glimpse at their simulated malarkey score distributions?

    (Figure: dark = pres, light = vp)

    fuzzy portraits of malarkey

    Simulated distributions of malarkey for each 2012 presidential candidate with 95% confidence interval on either side of the simulated average malarkey score. White line at half truthful. (Rounded to nearest whole number.)

    • 87% certain Obama is less than half full of malarkey.
    • 100% certain Romney is more than half full of malarkey.
    • 66% certain Biden is more than half full of malarkey.
    • 70% certain Ryan is more than half full of malarkey.
    (Probabilities rounded to nearest percent.)

    fuzzy portraits of ticket malarkey

    Simulated distributions of collated and average malarkey for each 2012 presidential election ticket, with 95% confidence interval labeled on either side of the simulated malarkey score. White line at half truthful. (Rounded to nearest whole number.)

    • 81% certain Obama/Biden's collective statements are less than half full of malarkey.
    • 100% certain Romney/Ryan's collective statements are more than half full of malarkey.
    • 51% certain the Democratic candidates are less than half full of malarkey.
    • 97% certain the Republican candidates are on average more than half full of malarkey.
    • 95% certain the candidates' statements are on average more than half full of malarkey.
    • 93% certain the candidates themselves are on average more than half full of malarkey.
    (Probabilities rounded to nearest percent.)

    Comparisons

    Simulated probability distributions of the difference between the malarkey scores of one 2012 presidential candidate or party and another, with 95% confidence interval labeled on either side of simulated mean malarkey. Blue bars are when Democrats spew more malarkey, red when Republicans do. White line and purple bar at equal malarkey. (Rounded to nearest hundredth.)

    • 100% certain Romney spews more malarkey than Obama.
    • 55% certain Ryan spews more malarkey than Biden.
    • 100% certain Romney/Ryan collectively spew more malarkey than Obama/Biden.
    • 94% certain the Republican candidates spew more malarkey on average than the Democratic candidates.
    (Probabilities rounded to nearest percent.)

    2012 prez debates

    presidential debates

    Simulated probability distributions of the malarkey spewed by individual 2012 presidential candidates during debates, with 95% confidence interval labeled on either side of simulated mean malarkey. White line at half truthful. (Rounded to nearest whole number.)

    • 66% certain Obama was more than half full of malarkey during the 1st debate.
    • 81% certain Obama was less than half full of malarkey during the 2nd debate.
    • 60% certain Obama was less than half full of malarkey during the 3rd debate.
    (Probabilities rounded to nearest percent.)

    • 78% certain Romney was more than half full of malarkey during the 1st debate.
    • 80% certain Romney was less than half full of malarkey during the 2nd debate.
    • 66% certain Romney was more than half full of malarkey during the 3rd debate.
    (Probabilities rounded to nearest percent.)

    aggregate 2012 prez debate

    Distributions of malarkey for collated 2012 presidential debate report cards and the average presidential debate malarkey score.
    • 68% certain Obama's collective debate statements were less than half full of malarkey.
    • 68% certain Obama was less than half full of malarkey during the average debate.
    • 67% certain Romney's collective debate statements were more than half full of malarkey.
    • 57% certain Romney was more than half full of malarkey during the average debate.
     (Probabilities rounded to nearest percent.)

    2012 vice presidential debate

    • 60% certain Biden was less than half full of malarkey during the vice presidential debate.
    • 89% certain Ryan was more than half full of malarkey during the vice presidential debate.
    (Probabilities rounded to nearest percent.)

    overall 2012 debate performance

    Malarkey score from collated report card comprising all debates, and malarkey score averaged over candidates on each party's ticket.
    • 72% certain Obama/Biden's collective statements during the debates were less than half full of malarkey.
    • 67% certain the average Democratic ticket member was less than half full of malarkey during the debates.
    • 87% certain Romney/Ryan's collective statements during the debates were more than half full of malarkey.
    • 88% certain the average Republican ticket member was more than half full of malarkey during the debates.

    (Probabilities rounded to nearest percent.)

    2012 debate self comparisons

    Simulated probability distributions of the difference in malarkey that a 2012 presidential candidate spews normally compared to how much they spewed during a debate (or aggregate debate), with 95% confidence interval labeled on either side of the simulated mean difference. Light bars mean less malarkey was spewed during the debate than usual, dark bars more. White bar at equal malarkey. (Rounded to nearest hundredth.)

    individual 2012 presidential debates

    • 80% certain Obama spewed more malarkey during the 1st debate than he usually does.
    • 84% certain Obama spewed less malarkey during the 2nd debate than he usually does.
    • 52% certain Obama spewed more malarkey during the 3rd debate than he usually does.
    • 51% certain Romney spewed more malarkey during the 1st debate than he usually does.
    • 98% certain Romney spewed less malarkey during the 2nd debate than he usually does.
    • 68% certain Romney spewed less malarkey during the 3rd debate than he usually does.

    (Probabilities rounded to nearest percent.)

    aggregate 2012 presidential debate

    • 58% certain Obama's statements during the debates were more full of malarkey than they usually are.
    • 56% certain Obama spewed more malarkey than he usually does during the average debate.
    • 73% certain Romney's statements during the debates were less full of malarkey than they usually are.
    • 86% certain Romney spewed less malarkey than he usually does during the average debate.

    (Probabilities rounded to nearest percent.)

    vice presidential debate

    • 70% certain Biden spewed less malarkey during the vice presidential debate than he usually does.
    • 86% certain Ryan spewed more malarkey during the vice presidential debate than he usually does.

    (Probabilities rounded to nearest percent.)

    2012 opponent comparisons

    Simulated probability distributions of the difference in malarkey between the Republican candidate and the Democratic candidate during a debate, with 95% confidence interval labeled on either side of simulated mean comparison. Blue bars are when Democrats spew more malarkey, red when Republicans do. White bar at equal malarkey. (Rounded to nearest hundredth.)

    individual 2012 presidential debates

    • 60% certain Romney spewed more malarkey during the 1st debate than Obama.
    • 49% certain Romney spewed more malarkey during the 2nd debate than Obama.
    • 72% certain Romney spewed more malarkey during the 3rd debate than Obama.

    (Probabilities rounded to nearest percent.)

    aggregate 2012 presidential debate

    • 74% certain Romney's statements during the debates were more full of malarkey than Obama's.
    • 67% certain Romney was more full of malarkey than Obama during the average debate.

    (Probabilities rounded to nearest percent.)

    vice presidential debate

    • 92% certain Ryan spewed more malarkey than Biden during the vice presidential debate.

    (Probabilities rounded to nearest percent.)

    overall 2012 debate comparison

    Party comparison of 2012 presidential ticket members' collective and individual average malarkey scores during debates.
    • 88% certain that Republican ticket members' collective statements were more full of malarkey than Democratic ticket members'.
    • 86% certain that the average Republican candidate spewed more malarkey during the average debate than the average Democratic candidate.

    (Probabilities rounded to nearest percent.)

    observe & report

    Below are the observed malarkey scores of the 2012 presidential candidates, along with comparisons computed from those scores.

    2012 prez candidates

    Truth-O-Meter only (observed)

    candidate malarkey
    Obama 44
    Biden 48
    Romney 55
    Ryan 58

    The Fact Checker only (observed)

    candidate malarkey
    Obama 53
    Biden 58
    Romney 60
    Ryan 47

    Averaged over fact checkers

    candidate malarkey
    Obama 48
    Biden 53
    Romney 58
    Ryan 52

    2012 Red prez vs. Blue prez

    Collated bullpucky

    ticket malarkey
    Obama/Biden 46
    Romney/Ryan 56

    Average bullpucky

    ticket malarkey
    Obama/Biden 48
    Romney/Ryan 58

    2012 prez debates

    1st presidential debate

    opponent malarkey
    Romney 61
    Obama 56

    2nd presidential debate (town hall)

    opponent malarkey
    Romney 31
    Obama 33

    3rd presidential debate

    opponent malarkey
    Romney 57
    Obama 46

    collated presidential debates

    opponent malarkey
    Romney 54
    Obama 46

    average presidential debate

    opponent malarkey
    Romney 61
    Obama 56

    vice presidential debate

    opponent malarkey
    Ryan 68
    Biden 44

    collated debates overall

    ticket malarkey
    Romney/Ryan 57
    Obama/Biden 46

    average debate overall

    ticket malarkey
    Romney/Ryan 61
    Obama/Biden 56

    the raw deal

    You've come this far. Why not just check out the raw data Malark-O-Meter is using? I promise you: it is as riveting as a phone book.
