Election Day fast approaches. A few remain undecided. The majority have decided, and look for any last bits of manure to fling at the other side. In the era of fact checking websites, a candidate's factuality has become important to a lot of people, including me. 

That's why, earlier this week, I introduced Malark-O-Meter to the world (well...more like only 500 people in the world). I statistically analyzed fact checker report cards from Truth-O-Meter and The Fact Checker to compare the factuality of the 2012 presidential and vice presidential candidates overall, and during the debates that had happened so far. I promised I'd get back to you on the third debate, and with a summary of how the two parties did in the debates overall, compared to one another, and to their usual selves.

Unless Bidama and Rymney (or is it Obiden and Romyan?) blast each other in the next few weeks as much as they have in the last year, this is probably my final 2012 election malarkey analysis until Election Day. This is one of the most comprehensive, sophisticated, and detailed analyses of the 2012 presidential candidates' factuality. 

Share it with your friends. Discuss its results. Debate its merits. Tell me precisely why you think I'm full of shit. Supersize the histograms and paste them to brick walls like you're Shepard Fairey. Because this stuff matters. It matters because the facts matter. It matters because we should understand how confident we can be in our judgments about people.

Enough histrionics. Let's get to the science. If you've never been here before, quickly skim how I calculate the malarkey score and how I do my statistical comparisons before continuing. If you read my last 2012 presidential campaign update, not much has changed. So you might want to scroll down to my analysis of the third debate, and the debates as a whole.

Full report cards


I collected the full Truth-O-Meter and Fact Checker report cards for Obama, Biden, Romney, and Ryan this morning. Let's start with what we observe. That is, what can we say about the factuality of the two sides if we take the report cards at face value? Here are the revised overall malarkey scores for each individual candidate and each ticket.

Individuals

candidate malarkey
Obama 48
Biden 53
Romney 57
Ryan 52

Tickets

collated
(average malarkey in statements)

ticket malarkey
Obama/Biden 46
Romney/Ryan 56

average
(average malarkey of individuals)

ticket malarkey
Obama/Biden 48
Romney/Ryan 57
Okay. Not much has changed since last time. We observe that Obama spews less malarkey than Romney, and Biden less than Ryan. We observe that the blue team's statements are less full of malarkey than the red team's, and the Democratic candidates themselves are less full of malarkey than the Republican candidates. But not by much. No candidate or ticket appears much better or worse than half full of malarkey.

The trouble is, for each candidate (and party), we only have a small sample of the statements they've made. That introduces sampling error. We must calculate the certainty with which we can make judgments about the candidates and parties given the data we have.

To the right are the probability distributions of malarkey scores for the four candidates, labeled with the 95% confidence intervals on either side of the expected malarkey score. The white line lies at a malarkey score of 50.

How certain can we be that the candidates are much better or worse than half full of malarkey? From the probability distributions shown at right, I calculated the probabilities.

Odds are 9 to 1 that Obama's less than half full of malarkey, but not by much. It's almost 100% likely that Romney is more than half full of malarkey. The difference is greater than for Obama, but still not much different from a half bucket of malarkey. 

The odds are only around 2 to 1 that Biden is more than half full of malarkey. Again, not by much. The same is true for Ryan.


We can only be pretty certain about how the presidential candidates compare to a half bucket of malarkey. What about the party tickets as a whole? 

To the left are the probability distributions of the collated and average malarkey scores. Based on these distributions, the odds are nearly 9 to 1 that Obiden's collective statements are on average less than half full of malarkey (but not by much), while it is almost certain that Rymney's are more than half full of malarkey (by a wider margin, but still not by much). 

It's a statistical toss-up whether Obama and Biden are on average less than half full of malarkey, whereas the odds are over 19 to 1 that Romney and Ryan are on average more than half full of malarkey (but still not by much).


But how do the candidates and tickets compare, and how certain can we be about those comparisons?

To the right are probability distributions of the ratio between a Republican malarkey score and a Democrat malarkey score. Red bars occur when the Republican malarkey score is greater than the Democrat score. The comparisons run from very murky (for the v.p. candidates) to pretty clear (for the collated ticket report cards and the presidential candidates).

I am essentially 100% certain that Romney spews more malarkey than Obama...but not much more. Not even twice as much. Not even one and a half times as much. I'm also nearly 100% certain that Obiden collectively spew less malarkey than Rymney. But again, they're not that different. Basically, I can't tell a difference between Biden and Ryan. If there is a difference between them, it is tiny, but in favor of Biden. I can, however, give 9 to 1 odds that Obama and Biden are on average more factual than Romney and Ryan. But not that much more factual!

We draw two lessons. First, and I repeat from last time, the differences in factuality between the two parties aren't as large as either side would have you believe. That said, there is a clear difference. The differences we can be certain about favor Democrats. And all of the differences we've measured, regardless of our certainty in them, suggest that the Democratic candidates are more factual than the Republicans.

On to the debates.


Debates

This morning, I collected the Truth-O-Meter rulings of claims made during the final debate. To the right are the observed malarkey scores for that debate.

opponent malarkey
Romney 57
Obama 46

Once again, Romney spews more malarkey than Obama. At least, that's what the data says. But how strong is the evidence? Let's bust out the simulator.
To the left are the probability distributions of Obama's malarkey score in all three debates. The odds are only about 2 to 1 that Obama was more than half full of malarkey in the first debate. But they're better than 4 to 1 that he was less than half full of malarkey in the second. The odds are only 3 to 2 that he was less than half full of malarkey during the last debate.

As for Romney, the odds are better than 3 to 1 that he was more than half full of malarkey during the 1st debate. His report card from the 2nd debate was surprisingly truthful, with odds of 4 to 1 that he was less than half full of malarkey. As for the third debate, we're back to 2 to 1 odds that he spewed more than half a bucket of malarkey.

How did the presidential candidates perform in the debates overall? To answer this question, I calculated two summary malarkey scores for the debates. 

First, I collated the candidates' report cards from each debate, then calculated a malarkey score from the result. This measures the average falsehood of the statements a candidate made across all the debates. The odds are about 2 to 1 that Obama's statements during the debates were less than half full of malarkey. The odds are about the same that Romney's statements were more than half full of malarkey.

Second, I averaged the candidate's malarkey score across the three debates. The odds are again about 2 to 1 that Obama was on average less than half full of malarkey during a debate. Statistically, we can't tell whether Romney was on average more or less than half full of malarkey during a debate.


People seemed very let down by Obama's performance in the first debate. Mostly, it had to do with his demeanor. But could people have been subconsciously disappointed with his truthfulness during that debate as well, perhaps cued by subtle facial expressions and body language suggesting that he was being more false than he usually is? 

Maybe. In any case, the odds are 4 to 1 that he spewed a few more falsehoods during the first debate than he usually does. The odds are again 4 to 1 that his performance improved during the second debate, when he appeared to be more factual than usual. Statistically, we can't tell a difference between the usual Obama and the Obama in the final debate.


In contrast, we can't tell a difference between the usual Romney and the Romney in the first debate. The odds are 49 to 1 that Romney was more factual during the 2nd debate than he usually is. Yet Romney lost that debate. 

As for the third debate, the odds are only 2 to 1 that Romney was more factual in the final debate than he usually is. Whatever the case, he seems to have lost that one, too.

So much for the persuasive power of facts!


We've seen how the presidential candidates compare to themselves, but how did their debate performances compare to one another? To the right are the probability distributions of the ratio of Romney's malarkey score to Obama's in each of the debates. Red means Romney spewed more malarkey, blue means Obama did. The odds are only 2 to 1 that Romney spewed more malarkey than Obama during the first debate. It's a toss-up who was more factual in the second debate. And the odds are better than 3 to 1 that Obama was more factual than Romney in the final debate. So the evidence is pretty damn weak, but favors Obama for two of the debates. But even if Obama was more factual than Romney, he probably wasn't that much more factual. What about overall performance during the presidential debates?


To the right are the probability distributions of the ratio of Romney's collated debate report card to Obama's (top), and the ratio of Romney's average debate malarkey score to Obama's (bottom). The odds are nearly 3 to 1 that Obama's statements during the debates were more factual than Romney's. The odds are only about 2 to 1 that he was more factual on average than Romney was in a given debate. Again, the evidence is fairly weak, but it favors Obama. Even if it favors Obama, the differences in factuality aren't that great.


To prepare for my analysis of the collective debate performance of each party's ticket, I review my analysis of the vice presidential debate. To the right are the probability distributions of Biden's and Ryan's malarkey scores during the vice presidential debate.

The odds are about 3 to 2 that Biden was less than half full of malarkey during the debate. Contrastingly, the odds are nearly 9 to 1 that Ryan was more than half full of malarkey during the debate. So Biden was probably right. It was all just a bunch of stuff! Well, not all of it. Actually, not much more than half of it was malarkey. 

Still, given the small amount of evidence we have from the debate (which introduces a lot of sampling error), it's quite interesting that the odds remain so high that Ryan spewed so much malarkey. Perhaps it was Biden's mastering of the facts after all that dampened the Republican's Romentum! Well, at least it was his ability to point out Ryan's factual missteps. But remember, Biden was about half full of malarkey during the debate, too.


That said, the plot at right suggests that Biden was probably about as factual as he usually is during the debate, whereas the odds are better than 6 to 1 that Ryan was less factual than usual.

I'd like to think that Ryan's subtle cues of his own falsehood were a letdown to some undecided voters who had expected more from him, but the polls were pretty split about who won the debate.


Regardless of whether people think Ryan or Biden "won" the debate, the probability distribution of the ratio of Ryan's to Biden's debate malarkey score suggests better than 9 to 1 odds that Ryan spewed more malarkey than Biden. In this case, the mean difference between the two scores is actually fairly large. Or at least larger than we've come to expect from these candidates, who are all basically half charlatans. So here's a shout-out to Biden, whose spirited use of the term "malarkey" inspired the name of my factuality score. Good work, Mr. Vice President. Or at least, I'm over 90% sure it's good work.


Is there a way to analyze the collective malarkey scores across all four debates, and across the presidential and vice presidential candidates from each party? I'm Brash Equilibrium, baby! Of course there is.

I came up with two measures of malarkey over all four debates. First, I simply collated the statements from the presidential and vice presidential candidates into two summary report cards for each party. 

Second, I took the average of a presidential candidate's average malarkey across all three presidential debates, and the vice presidential candidate's debate malarkey score. Let's unpack that a bit. Step one was to average the presidential candidate's malarkey across all three debates. Step two was to take the average of that average and the vice presidential candidate's malarkey score from the vice presidential debate. 

Why did I take an average of averages? Because I wanted to measure the average malarkey score of an individual on a party's ticket, not the party's average score across the four debates. If I'd done the latter, I would have weighted presidential candidates more heavily, which I already do in the collated measure since presidential candidates were more heavily fact checked, and had more debates, than vice presidential candidates.
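Here is a minimal sketch of the difference between the two ways of averaging. The function names and the debate scores are made up for illustration; they are not the scores from these debates.

```python
def mean(values):
    """Plain arithmetic mean of a sequence of numbers."""
    values = list(values)
    return sum(values) / len(values)


def ticket_debate_average(presidential_scores, vp_score):
    """Average of averages: each person on the ticket counts once."""
    return mean([mean(presidential_scores), vp_score])


def flat_debate_average(presidential_scores, vp_score):
    """Mean over all four debates: the presidential candidate counts three times."""
    return mean(list(presidential_scores) + [vp_score])


# Hypothetical debate malarkey scores, for illustration only.
pres = [60, 40, 50]
vp = 70

print(ticket_debate_average(pres, vp))
print(flat_debate_average(pres, vp))
```

With these made-up numbers the average of averages sits halfway between the two individuals, while the flat mean is pulled toward the presidential candidate, who has three debates to the running mate's one.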

Okay, let's look at some graphs and calculate some odds.

At left are the probability distributions for the collated and candidate average debate malarkey for each party. 

The odds are almost 3 to 1 that Obiden's collective statements during the debates were less than half full of malarkey. The odds are about 2 to 1 that the average Democratic candidate's average debate performance was more than half factual.

Contrastingly, the odds are better than 6 to 1 that the collective statements of Rymney were more than half full of malarkey. The case is similar for the average Republican candidate's average debate performance.

Note the similarities between the confidence intervals and means of the debate summaries and those of each party's malarkey scores calculated from their full report cards. These similarities make me confident that malarkey scores taken from full report cards are a pretty good predictor of malarkey scores accrued during events like televised debates. Remember also that the candidates' overall malarkey scores were calculated from two fact checkers, whereas the debate data comes from just one.

Maybe there is something to this Malark-O-Meter thing after all. Which brings me to our final plot.


The odds are better than 7 to 1 that the Republican candidates' collective statements during the debates were more full of malarkey than the Democrats'. The difference isn't that big, but it's not trivial.

The odds are better than 6 to 1 that the average Republican candidate's average debate performance was more full of malarkey than the average Democratic candidate's. Again, the difference isn't big, but it's not trivial.


If factuality were all you cared about in a candidate, there is pretty strong evidence that you should prefer the Democrats over the Republicans. That said, there is also pretty strong evidence that the differences that likely exist between the two tickets aren't massive. Still, they aren't trivial.

Of course, you don't only care about factuality. You care about policy. You care about issues. But therein lies the rub. When politicians design and advocate for policies, they ideally do so with some grounding in the facts. Evidence matters, or at least it should, just as much to policymaking as it does in a courtroom or a chemistry lab.

What about values? You should care about your candidates' values too, right? But how are your candidates' values informed by the facts?

You see where I'm going here. I understand that factuality isn't the only characteristic we should consider when deciding who gets our vote.

But it sure seems to be at the root of all the others!

VOTE!
 
 
Those who've read my description of the malarkey score for a group (such as the members of a presidential campaign ticket) know that I have two group malarkey measures: the collated malarkey score, and the average malarkey score.

Collated malarkey combines the statements of separate individuals into a single report card, then calculates the malarkey score from the combined report card. Average malarkey calculates a malarkey score for each member of the group, then averages them. Collated malarkey measures the average falseness of the statements a group makes. Average malarkey measures the average falsehood of a group's members.
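To make the distinction concrete, here is a small sketch of both group measures. It assumes category values spaced evenly from 0 (most true) to 100 (most false), which is an illustrative choice rather than the documented weights, and the two report cards are invented.

```python
def malarkey(counts):
    """Score a report card given as counts per category, most true first.

    Assumption: category values are evenly spaced from 0 to 100.
    """
    total = sum(counts)
    step = 100 / (len(counts) - 1)
    return sum(rank * step * n for rank, n in enumerate(counts)) / total


def collated_malarkey(cards):
    """Pool every member's statements into one report card, then score it."""
    pooled = [sum(column) for column in zip(*cards)]
    return malarkey(pooled)


def average_malarkey(cards):
    """Score each member separately, then take the plain mean of the scores."""
    scores = [malarkey(card) for card in cards]
    return sum(scores) / len(scores)


# Hypothetical report cards (counts per category), not real data.
heavily_checked = [40, 30, 20, 10, 0, 0]  # 100 rated statements
lightly_checked = [0, 0, 2, 2, 4, 2]      # 10 rated statements

print(round(collated_malarkey([heavily_checked, lightly_checked]), 1))
print(round(average_malarkey([heavily_checked, lightly_checked]), 1))
```

Because the first card has ten times as many statements, the collated score stays close to that member's own score, while the average score treats both members equally.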

You can also calculate collated and average malarkey scores for report cards grouped by a type of event. For example, there were three presidential debates. I will have a report card for each presidential candidate and each debate. I can collate them and calculate a malarkey score or calculate a malarkey score from each and average a candidate's malarkey scores across the three debates.

Right now, it looks like the most rulings will be for the first and last debates. That means the collated bullpucky score will be influenced more by the statements in these debates than in the second debate. Yet if I average across the debates, I treat each debate equally.

Hrm. Well, I think there's value in both strategies. So I'll just do both.
 
 
The era of rapid fact checking is upon us. On an almost daily basis, websites like PolitiFact.com, FactCheck.org, and The Fact Checker give us in-depth analysis of the facts, and how they compare to what politicians say. PolitiFact and The Fact Checker go two steps further by using categories to rank factuality on an intuitive scale while maintaining up-to-date report cards on individuals who have been fact checked.

Categorical ranking systems make it easier to internalize and remember the results of a detailed fact check. Together with individual report cards, the categories give us a sense of someone's overall factuality. Yet these report cards are only a small sample of the statements that individuals (or groups) make. Furthermore, a list of counts in different fact checking categories provides no simple, singular measure of factuality that most people can easily interpret.

I created Malark-O-Meter to solve these two problems by using sophisticated statistical and computational methods, whereby I can make inferences about an individual's factuality from a small sample of statements. Beyond the measurement of factuality, I hope to convince people that they must consider the certainty with which they can make statements about the relative truthfulness of different people, especially political opponents. As you'll see, my analyses also belie the hyperbole spoken by one side against the other.

Malark-O-Meter starts with a simple scoring system. I assign numeric values to each category in a ranking system, with more false statements receiving higher values. Then I multiply those values by the percentage of statements made in that category. Finally, I sum the results. I call the end product "bullpucky". A bullpucky of zero means you are always factual. A bullpucky of 100 means you are 100% full of bullpucky. The bullpucky scale is continuous between those two values.
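The arithmetic can be sketched in a few lines. This is only an illustration: the even spacing of category values between 0 and 100, the function name `bullpucky`, and the example report card are all assumptions, not the site's actual weights or data.

```python
# Truth-O-Meter labels, ordered from most true to most false.
CATEGORIES = ["True", "Mostly True", "Half True",
              "Mostly False", "False", "Pants on Fire"]


def bullpucky(report_card):
    """Score a report card (category -> number of statements) from 0 to 100.

    Assumption: category values are evenly spaced between 0 and 100.
    """
    total = sum(report_card.get(c, 0) for c in CATEGORIES)
    if total == 0:
        raise ValueError("report card has no rated statements")
    step = 100 / (len(CATEGORIES) - 1)  # value added per step toward falsehood
    score = 0.0
    for rank, category in enumerate(CATEGORIES):
        share = report_card.get(category, 0) / total  # fraction of statements
        score += rank * step * share
    return score


# Made-up report card, for illustration only.
example = {"True": 10, "Mostly True": 10, "Half True": 10,
           "Mostly False": 10, "False": 10, "Pants on Fire": 0}
print(round(bullpucky(example), 1))
```

A card made up entirely of the most true category scores 0, and one made up entirely of the most false category scores 100, matching the two ends of the scale described above.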

Sound familiar? That's because Jeremy Kalgreen did something similar with his hilarious and beautifully laid out website, whosmorefullofshit.com. But I go two steps further than Kalgreen, who personally approved of my decision to duplicate his scoring system.

First, Kalgreen only used PolitiFact report cards. In an attempt to account for variation among fact checkers, Malark-O-Meter averages scores based on both PolitiFact and The Fact Checker, and can easily be extended to incorporate any number of fact checkers. Second, Malark-O-Meter doesn't just calculate bullpucky scores. I measure our uncertainty in bullpucky scores due to the small sample sizes, and in our comparisons of one individual or party to another.

I encourage you to navigate this website if you want to learn more details about my scoring and statistical methods, especially if you aren't well-versed in statistics. Then, I urge you to come back here and read my first official analysis.

Tonight, I analyze the comparative bullpucky of the 2012 presidential candidates and their running mates overall, and specific to their performance in the 2012 debates so far. This analysis is especially salient now given that there is one more debate tomorrow, and just 15 days until election day, November 6th.

Let's start by measuring the overall bullpucky of each candidate, and establishing a range of values for the bullpucky score that we can be reasonably certain they have, given the available data. Before we use fancy statistical methods to estimate probability distributions of bullpucky, let's look at the bullpucky scores we observe directly from the report cards. Below are bullpucky scores calculated from the candidates' report cards from PolitiFact and The Fact Checker, respectively.

PolitiFact bullpucky

candidate % bull
Obama 44
Biden 48
Romney 56
Ryan 58

The Fact Checker bullpucky

candidate % bull
Obama 54
Biden 58
Romney 60
Ryan 47
It looks like there are some differences between the fact checkers. Overall, The Fact Checker detects more bullpucky than PolitiFact for everyone except Paul Ryan, for whom The Fact Checker detects less bullpucky. It's differences like this that motivate an average of bullpucky scores across fact checkers. Here are the observed average bullpucky scores for each candidate.

Average bullpucky

candidate % bull
Obama 49
Biden 53
Romney 58
Ryan 52
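As a quick check of the averaging step, the sketch below takes the plain mean of the two observed scores for each candidate, using the rounded values from the PolitiFact and Fact Checker tables above. The dictionary names are illustrative.

```python
# Observed scores from the two tables above (already rounded to whole numbers).
politifact = {"Obama": 44, "Biden": 48, "Romney": 56, "Ryan": 58}
fact_checker = {"Obama": 54, "Biden": 58, "Romney": 60, "Ryan": 47}

# Plain mean of the two fact checkers' scores for each candidate.
average = {name: (politifact[name] + fact_checker[name]) / 2
           for name in politifact}

for name, score in average.items():
    print(name, score)
```

Ryan's mean comes out at 52.5 here rather than the 52 in the table, presumably because the table was computed from unrounded scores.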
I hope I've convinced you that averaging bullpucky over multiple fact checkers is better than trusting only one fact checker. Whether or not I have, I'm going to focus on the "average" fact checker in all the analyses that follow. Anyway, the observed scores suggest that Obama spews less bullpucky than Romney, and Biden less than Ryan, but that none of the candidates are much better (or worse) than half truthful (bullpucky score of 50). From this, we might conclude that the Republican ticket spews more bullpucky than the Democratic ticket. 

Logically, then, we want to compare the two campaign tickets against one another by calculating one bullpucky score for each ticket. I do this in two ways. First, I calculate the collated bullpucky of the ticket. This method adds together the number of statements in each category from each ticket member before calculating a bullpucky score. Collated bullpucky measures the average factuality of the statements made by the members of a party's ticket. Second, I average the bullpucky scores of the politicians on each ticket. This measures the average factuality of the party members on a ticket. Here are the observed collated and average ticket bullpucky for the Republicans and Democrats.

collated ticket bullpucky

ticket % bull
Obama/Biden 46
Romney/Ryan 57

average ticket bullpucky

ticket % bull
Obama/Biden 49
Romney/Ryan 58
Sure enough, the Democratic ticket appears to spew less bullpucky than the Republican ticket, although neither party is much better (or worse) than half truthful (bullpucky score of 50).

The trouble is that making comparisons like this based only on observations from small samples is...well...bullpucky. We also need to measure our statistical confidence in those statements. That is, we must treat each report card like an experiment in which we sample a few among the many statements that politicians make during their political career, or even in a political debate. Then we use a random number generator to virtually repeat that experiment many, many times. This process results in a whole universe of possible bullpucky scores (or comparisons between them). We can calculate the percentage of virtual experiments in this universe that would take on a particular value or range of values. We can also calculate the average bullpucky score (or score comparison) that we would expect. Finally, we can calculate an interval of values that we can be, say, 95% certain would result from such experiments (this is called the 95% confidence interval).
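One simple way to "virtually repeat the experiment" is a bootstrap: redraw the rated statements with replacement and rescore each redraw. The sketch below takes that approach as an assumption; the sampling model actually used for these analyses may differ. The category values and the report card are also invented.

```python
import random

# Assumed evenly spaced category values, most true to most false.
VALUES = [0, 20, 40, 60, 80, 100]


def observed_score(counts):
    """Score computed directly from the report card counts."""
    return sum(v * n for v, n in zip(VALUES, counts)) / sum(counts)


def simulate_scores(counts, n_sims=5000, seed=0):
    """Bootstrap: redraw the statements with replacement and rescore each draw."""
    rng = random.Random(seed)
    statements = [v for v, n in zip(VALUES, counts) for _ in range(n)]
    sims = []
    for _ in range(n_sims):
        resample = rng.choices(statements, k=len(statements))
        sims.append(sum(resample) / len(resample))
    return sims


def summarize(sims):
    """Return (mean, lower, upper); lower..upper holds the middle 95% of scores."""
    ordered = sorted(sims)
    lower = ordered[int(0.025 * len(ordered))]
    upper = ordered[int(0.975 * len(ordered)) - 1]
    return sum(ordered) / len(ordered), lower, upper


# Made-up report card with 50 rated statements.
card = [10, 12, 10, 8, 7, 3]
mean_score, lower, upper = summarize(simulate_scores(card))
print(f"observed {observed_score(card):.1f}")
print(f"simulated mean {mean_score:.1f}, 95% interval {lower:.1f} to {upper:.1f}")
```

Rerunning with a smaller report card widens the interval, which is the sampling-error effect the paragraph describes.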

Let's compare the bullpucky scores of the candidates and tickets from this more sophisticated perspective. We'll start by calculating for each candidate the mean bullpucky and its 95% confidence interval, then plotting it on a histogram.

Below are those histograms, labeled with 95% confidence intervals on either side of the candidate's mean bullpucky score. The thick white line marks a half truthful bullpucky score of 50.
Already, we're getting somewhere. See how Obama and Romney's distributions barely overlap? The lack of overlap suggests we can be reasonably confident that the difference in their observed bullpucky is real. 

The same is not the case for Biden and Ryan. First, Biden and Ryan now appear to have equal average bullpucky scores. Second, their distributions are very wide compared to the presidential candidates. That's because there are far fewer statements rated for each of the individuals by either of the fact checkers. Third, Biden and Ryan's bullpucky distributions overlap considerably. Together, these findings suggest we shouldn't place much confidence in the observed differences between Biden and Ryan. We just don't have enough evidence to draw a clear distinction.

But how much certainty do we have that Romney spews more bullpucky than Obama, or Ryan more than Biden? Just like we can build a universe of possible bullpucky scores, we can build a universe of possible ratios between bullpucky scores.

Below, I plot comparisons between presidential candidates, vice presidential candidates, collated ticket bullpucky scores, and average ticket bullpucky scores. The red area of the histogram represents the portion of the virtual universe in which the Republican(s) spew(s) more bullpucky than the Democrat(s). The blue area is the opposite. The white line marks the point where the two have equal bullpucky. The scale on the horizontal axis is the ratio of the Republican bullpucky score to the Democrat score.
Indeed, it looks like we can be quite confident that Obama spews less bullpucky than Romney, but not so confident that Biden spews less than Ryan. Moreover, we can be quite confident that the average bullpucky of the statements made by the Democratic ticket is less than that of the Republicans. It also looks like we can be somewhat confident that the members of the Democratic ticket spew less bullpucky on average than the members of the Republican ticket.

We say, "It looks like we can be certain," but how certain can we be? From the virtual universe of comparisons, we can calculate the total percentage of experiments in which, for example, Obama spews less bullpucky than Romney. Doing so results in the following statements associated with the histograms above.
  • We can be 99.95% certain Romney spews more bullpucky than Obama. So, very certain. Like, almost completely certain.
  • We can be 55.24% certain Ryan spews more bullpucky than Biden. We're not doing much better than flipping a coin to make our decision about who spews more bullpucky.
  • We can be 99.93% certain Romney/Ryan spew more (collated) bullpucky than Obama/Biden. Again, almost completely certain.
  • We can be 91.79% certain Romney/Ryan spew more (average) bullpucky than Obama/Biden. Not completely certain, but pretty certain.
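A certainty figure like those above can be read as the share of simulated experiments in which one side's score exceeds the other's, that is, in which the ratio is above 1. Here is a rough sketch under the same assumptions as a simple bootstrap (resampling statements with replacement, evenly spaced category values); the report cards are invented and the function names are hypothetical.

```python
import random

# Assumed evenly spaced category values, most true to most false.
VALUES = [0, 20, 40, 60, 80, 100]


def simulate_scores(counts, n_sims, rng):
    """Bootstrap scores: resample the statements with replacement, then rescore."""
    statements = [v for v, n in zip(VALUES, counts) for _ in range(n)]
    return [sum(rng.choices(statements, k=len(statements))) / len(statements)
            for _ in range(n_sims)]


def prob_first_spews_more(counts_a, counts_b, n_sims=5000, seed=0):
    """Share of simulated experiments in which A's score exceeds B's."""
    rng = random.Random(seed)
    sims_a = simulate_scores(counts_a, n_sims, rng)
    sims_b = simulate_scores(counts_b, n_sims, rng)
    # Skip any draw where B scores exactly 0, so the ratio is defined.
    ratios = [a / b for a, b in zip(sims_a, sims_b) if b > 0]
    return sum(r > 1 for r in ratios) / len(ratios)


# Made-up report cards with 50 rated statements each.
card_a = [5, 8, 10, 12, 10, 5]
card_b = [12, 12, 10, 8, 6, 2]
print(f"{100 * prob_first_spews_more(card_a, card_b):.2f}% certain A spews more")
```

Comparing a card against itself gives a figure near 50%, the coin-flip case described for Biden and Ryan.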

Not only can we examine the comparative bullpucky spewed over all of an individual's statements that have been fact checked. We can do the same for a subset of statements that occurred during a particular event, such as a presidential or vice presidential debate. We can compare the bullpucky scores not only between different candidates, but between a candidate's debate performance and their overall factuality.

Here are the histograms of simulated bullpucky scores for each of the presidential candidates during each of their debates so far, labeled with the 95% confidence interval on either side of the simulated mean bullpucky. The white line represents half truthfulness (bullpucky score of 50).
Notice that the confidence intervals are wider now because the sample size of statements is smaller. At first glance, there are some clear departures in the candidates' debate performance from their usual amount of bullpucky. But are these departures "real"? Let's plot the comparisons, much as we did with the comparisons between candidates. The horizontal scale is the ratio of a candidate's usual bullpucky to the bullpucky spewed during a particular debate. The lighter portion of the plot represents the portion of the virtual universe in which the overall bullpucky is greater than the debate bullpucky. The white line lies at the point where both overall and debate bullpucky are equal.
It looks like we can be somewhat confident that Obama spews less bullpucky normally than he did during the 1st debate, and more bullpucky normally than he did during the 2nd debate. It doesn't look like we can be so certain that Romney spews more bullpucky normally than he did during the 1st debate, but it does look like we can be quite certain he spews more bullpucky normally than he did during the 2nd debate.

Again, we say, "It looks like...," but what is the probability of a given comparison? I calculated those probabilities from the virtual universe of comparisons.
  • We can be 76.19% certain Obama spews less bullpucky normally than he did during the 1st debate. Not certain, but better than three to one odds.
  • We can be 50.92% certain Romney spews more bullpucky normally than he did during the 1st debate. Less than 1% better than the toss of a coin.
  • We can be 86.29% certain Obama spews more bullpucky normally than he did during the 2nd debate. Not certain, but about six to one odds.
  • We can be 98.25% certain Romney spews more bullpucky normally than he did during the 2nd debate. Not 100% certain, but pretty certain.
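Turning a certainty percentage into odds is a one-line calculation: divide the probability by its complement. A small sketch, with a hypothetical helper name:

```python
def odds_in_favor(percent_certain):
    """Convert a percentage such as 76.19 into odds of 'x to 1' in favor."""
    p = percent_certain / 100
    if not 0 < p < 1:
        raise ValueError("percentage must be strictly between 0 and 100")
    return p / (1 - p)


# The four certainty figures from the list above.
for pct in (76.19, 50.92, 86.29, 98.25):
    print(f"{pct}% certain is about {odds_in_favor(pct):.2f} to 1")
```

For example, 76.19% works out to roughly 3.2 to 1, which matches "better than three to one", and 86.29% to roughly 6.3 to 1, or "about six to one".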

The vice presidential debate between Biden and Ryan was particularly heated. We can do a similar analysis for this debate. Here are the debate bullpucky scores of the two vice presidential candidates in their one and only debate.
And here is the plot of the comparison between their overall bullpucky and their performance during the debate, as we saw before with the presidential candidates.
It looks like we can't be too confident that Vice President Joe Biden spewed less bullpucky during the debate than he does normally, but we can be somewhat confident that Ryan spewed more bullpucky during the debate than he does normally. Here are the probabilities that describe our level of certainty in such statements:
  • We can be 70.37% certain that Biden spews more bullpucky normally than he did during the vice presidential debate. Not completely certain, but better than two to one odds. 
  • We can be 86.35% certain that Ryan spews less bullpucky normally than he did during the vice presidential debate. Not completely certain, but about six to one odds.
What about comparisons between the presidential and vice presidential candidates during the debates? Here are comparison plots analogous to the ones we've constructed before, drawing the histograms of the simulated ratio between the Republican and Democrat bullpucky during a particular debate.
It looks like we can't make heads or tails of which presidential candidate spewed more bullpucky during either of their first two debates. It does look promising that Ryan spewed more bullpucky than Biden during their debate. And here are the probability statements that confirm these graphical hunches:
  • We can be 64.14% certain Romney spewed more bullpucky than Obama during the 1st debate. That's nearly 2 to 1 odds.
  • We can be 92.38% certain Ryan spewed more bullpucky than Biden during the vice presidential debate. Not completely certain, but pretty certain.
  • We can be 53.11% certain Romney spewed more bullpucky than Obama during the 2nd debate. Not much better than a toss up.
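
For readers who want to see how a probability like these can come out of a simulated ratio, here is a rough sketch in Python. Everything in it is an assumption made for illustration: the five rating categories, the weights in `WEIGHTS`, the made-up report cards, and the choice to redraw each report card from its observed proportions. It is not Malark-O-Meter's actual code or data.

```python
import random

# Hypothetical weights: how much bullpucky each rating carries (0 to 100).
# Illustrative assumptions only, not the site's real scoring.
WEIGHTS = {"true": 0, "mostly true": 25, "half true": 50,
           "mostly false": 75, "false": 100}

def score(card):
    """Weighted average bullpucky of a report card (category -> count)."""
    total = sum(card.values())
    return sum(WEIGHTS[c] * n for c, n in card.items()) / total

def resample(card, rng):
    """Redraw a report card of the same size from its observed proportions."""
    cats = list(card)
    draws = rng.choices(cats, weights=[card[c] for c in cats],
                        k=sum(card.values()))
    return {c: draws.count(c) for c in cats}

def prob_ratio_above_one(rep_card, dem_card, n_sims=10000, seed=1):
    """Share of simulations where Republican/Democrat bullpucky exceeds 1."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        rep = score(resample(rep_card, rng))
        dem = score(resample(dem_card, rng))
        # rep / dem > 1 is the same event as rep > dem for positive scores;
        # comparing directly avoids dividing by zero.
        if rep > dem:
            hits += 1
    return hits / n_sims

# Made-up debate report cards, for illustration only
rep_debate = {"true": 2, "mostly true": 3, "half true": 4,
              "mostly false": 4, "false": 3}
dem_debate = {"true": 3, "mostly true": 5, "half true": 4,
              "mostly false": 2, "false": 1}

print(prob_ratio_above_one(rep_debate, dem_debate))
```

The printed number plays the same role as the percentages in the bullets above: the share of simulated debates in which the Republican's score came out higher than the Democrat's.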

Monday is the third and final presidential debate. Can we predict from these analyses who will spew more bullpucky than whom? Unfortunately, not with much precision. But I hope the candidates make a trend out of the second debate's pattern, wherein both candidates appeared to spew less bullpucky than normal. On Tuesday, I will analyze that debate, Obama and Romney's overall debate performance, and how the two tickets compare overall.

What do we learn from these analyses? First, we learn that there is a lot of uncertainty in the amount of bullpucky that politicians spew compared to one another. Second, despite this uncertainty, we can be reasonably confident that Obama spews less bullpucky than Romney, and the Democratic ticket spews less bullpucky than the Republican, but not as much less as left-leaning pundits would have you believe. Third, all of the candidates have mean bullpucky scores that are within ten points of half truthful, and all of their distributions overlap the half truthful mark. Fourth, we can do similar measurements and comparisons of individual performance during key debates.

Factuality isn't the only factor you should consider when you vote. But I hope that you'll use Malark-O-Meter to inform your decisions in this year's presidential election and beyond.

Welcome to Malark-O-Meter.
 

    about

    Malark-O-blog published news and commentary about the statistical analysis of the comparative truthfulness of the 2012 presidential and vice presidential candidates. It has since closed down while its author makes bigger plans.

    author

    Brash Equilibrium is an evolutionary anthropologist and writer. His real name is Benjamin Chabot-Hanowell. His wife calls him Babe. His daughter calls him Papa.

    what is malarkey?

    It's a polite word for bullshit. Here, it's a measure of falsehood. 0 means you're truthful on average. 100 means you're 100% full of malarkey.

    what is simulated malarkey?

    Fact checkers only rate a small sample of the statements that politicians make. How uncertain are we about the real truthfulness of politicians? To find out, treat fact checker report cards like an experiment, and use random number generators to repeat that experiment a lot of times to see all the possible outcomes.
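
To make the idea of repeating the experiment concrete, here is a rough sketch in Python. It rests on assumptions of my own for illustration: five rating categories with the weights in `WEIGHTS`, a made-up report card, and redrawing the card from its observed proportions. The real method may differ.

```python
import random

# Hypothetical malarkey per category, ordered from most to least truthful.
WEIGHTS = [0, 25, 50, 75, 100]

def malarkey(counts):
    """Weighted average malarkey (0 to 100) of a report card."""
    return sum(w * n for w, n in zip(WEIGHTS, counts)) / sum(counts)

def simulate(counts, n_sims=10000, seed=42):
    """Repeat the 'experiment': redraw the report card many times."""
    rng = random.Random(seed)
    n = sum(counts)
    idx = range(len(counts))
    sims = []
    for _ in range(n_sims):
        draws = rng.choices(idx, weights=counts, k=n)
        redrawn = [draws.count(i) for i in idx]
        sims.append(malarkey(redrawn))
    return sorted(sims)

card = [4, 6, 5, 3, 2]  # made-up counts of rated statements per category
sims = simulate(card)

# Middle 95% of the simulated scores
low = sims[int(0.025 * len(sims))]
high = sims[int(0.975 * len(sims)) - 1]

# Share of simulations that land above the half truthful mark
p_more_than_half = sum(s > 50 for s in sims) / len(sims)

print(f"observed {malarkey(card):.0f}, 95% interval {low:.0f} to {high:.0f}")
print(f"{p_more_than_half:.0%} certain more than half full of malarkey")
```

The sorted list of simulated scores is what a histogram would be drawn from, and the share above 50 is the kind of number reported as "certain more than half full of malarkey."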

    malark-O-glimpse

    Can you tell the difference between the 2012 presidential election tickets from just a glimpse at their simulated malarkey score distributions?

    dark = pres, light = vp

    fuzzy portraits of malarkey

    Simulated distributions of malarkey for each 2012 presidential candidate with 95% confidence interval on either side of the simulated average malarkey score. White line at half truthful. (Rounded to nearest whole number.)

    • 87% certain Obama is less than half full of malarkey.
    • 100% certain Romney is more than half full of malarkey.
    • 66% certain Biden is more than half full of malarkey.
    • 70% certain Ryan is more than half full of malarkey.
    (Probabilities rounded to nearest percent.)

    fuzzy portraits of ticket malarkey

    Simulated distributions of collated and average malarkey for each 2012 presidential election ticket, with 95% confidence interval labeled on either side of the simulated malarkey score. White line at half truthful. (Rounded to nearest whole number.)

    • 81% certain Obama/Biden's collective statements are less than half full of malarkey.
    • 100% certain Romney/Ryan's collective statements are more than half full of malarkey.
    • 51% certain the Democratic candidates are less than half full of malarkey.
    • 97% certain the Republican candidates are on average more than half full of malarkey.
    • 95% certain the candidates' statements are on average more than half full of malarkey.
    • 93% certain the candidates themselves are on average more than half full of malarkey.
    (Probabilities rounded to nearest percent.)

    Comparisons

    Simulated probability distributions of the difference between the malarkey scores of one 2012 presidential candidate or party and another, with 95% confidence interval labeled on either side of simulated mean malarkey. Blue bars are when Democrats spew more malarkey, red when Republicans do. White line and purple bar at equal malarkey. (Rounded to nearest hundredth.)

    • 100% certain Romney spews more malarkey than Obama.
    • 55% certain Ryan spews more malarkey than Biden.
    • 100% certain Romney/Ryan collectively spew more malarkey than Obama/Biden.
    • 94% certain the Republican candidates spew more malarkey on average than the Democratic candidates.
    (Probabilities rounded to nearest percent.)

    2012 prez debates

    presidential debates

    Simulated probability distribution of the malarkey spewed by individual 2012 presidential candidates during debates, with 95% confidence interval labeled on either side of simulated mean malarkey. White line at half truthful. (Rounded to nearest whole number.)

    • 66% certain Obama was more than half full of malarkey during the 1st debate.
    • 81% certain Obama was less than half full of malarkey during the 2nd debate.
    • 60% certain Obama was less than half full of malarkey during the 3rd debate.
    (Probabilities rounded to nearest percent.)

    • 78% certain Romney was more than half full of malarkey during the 1st debate.
    • 80% certain Romney was less than half full of malarkey during the 2nd debate.
    • 66% certain Romney was more than half full of malarkey during the 3rd debate.
    (Probabilities rounded to nearest percent.)

    aggregate 2012 prez debate

    Distributions of malarkey for collated 2012 presidential debate report cards and the average presidential debate malarkey score.
    • 68% certain Obama's collective debate statements were less than half full of malarkey.
    • 68% certain Obama was less than half full of malarkey during the average debate.
    • 67% certain Romney's collective debate statements were more than half full of malarkey.
    • 57% certain Romney was more than half full of malarkey during the average debate.
     (Probabilities rounded to nearest percent.)

    2012 vice presidential debate

    • 60% certain Biden was less than half full of malarkey during the vice presidential debate.
    • 89% certain Ryan was more than half full of malarkey during the vice presidential debate.
    (Probabilities rounded to nearest percent.)

    overall 2012 debate performance

    Malarkey score from collated report card comprising all debates, and malarkey score averaged over candidates on each party's ticket.
    • 72% certain Obama/Biden's collective statements during the debates were less than half full of malarkey.
    • 67% certain the average Democratic ticket member was less than half full of malarkey during the debates.
    • 87% certain Romney/Ryan's collective statements during the debates were more than half full of malarkey.
    • 88% certain the average Republican ticket member was more than half full of malarkey during the debates.

    (Probabilities rounded to nearest percent.)

    2012 debate self comparisons

    Simulated probability distributions of the difference in malarkey that a 2012 presidential candidate spews normally compared to how much they spewed during a debate (or aggregate debate), with 95% confidence interval labeled on either side of the simulated mean difference. Light bars mean less malarkey was spewed during the debate than usual. Dark bars mean more. White bar at equal malarkey. (Rounded to nearest hundredth.)

    individual 2012 presidential debates

    • 80% certain Obama spewed more malarkey during the 1st debate than he usually does.
    • 84% certain Obama spewed less malarkey during the 2nd debate than he usually does.
    • 52% certain Obama spewed more malarkey during the 3rd debate than he usually does.
    • 51% certain Romney spewed more malarkey during the 1st debate than he usually does.
    • 98% certain Romney spewed less malarkey during the 2nd debate than he usually does.
    • 68% certain Romney spewed less malarkey during the 3rd debate than he usually does.

    (Probabilities rounded to nearest percent.)

    aggregate 2012 presidential debate

    • 58% certain Obama's statements during the debates were more full of malarkey than they usually are.
    • 56% certain Obama spewed more malarkey than he usually does during the average debate.
    • 73% certain Romney's statements during the debates were less full of malarkey than they usually are.
    • 86% certain Romney spewed less malarkey than he usually does during the average debate.

    (Probabilities rounded to nearest percent.)

    vice presidential debate

    • 70% certain Biden spewed less malarkey during the vice presidential debate than he usually does.
    • 86% certain Ryan spewed more malarkey during the vice presidential debate than he usually does.

    (Probabilities rounded to nearest percent.)

    2012 opponent comparisons

    Simulated probability distributions of the difference in malarkey between the Republican candidate and the Democratic candidate during a debate, with 95% confidence interval labeled on either side of simulated mean comparison. Blue bars are when Democrats spew more malarkey, red when Republicans do. White bar at equal malarkey. (Rounded to nearest hundredth.)

    individual 2012 presidential debates

    • 60% certain Romney spewed more malarkey during the 1st debate than Obama.
    • 49% certain Romney spewed more malarkey during the 2nd debate than Obama.
    • 72% certain Romney spewed more malarkey during the 3rd debate than Obama.

    (Probabilities rounded to nearest percent.)

    aggregate 2012 presidential debate

    • 74% certain Romney's statements during the debates were more full of malarkey than Obama's.
    • 67% certain Romney was more full of malarkey than Obama during the average debate.

    (Probabilities rounded to nearest percent.)

    vice presidential debate

    • 92% certain Ryan spewed more malarkey than Biden during the vice presidential debate.

    (Probabilities rounded to nearest percent.)

    overall 2012 debate comparison

    Party comparison of 2012 presidential ticket members' collective and individual average malarkey scores during debates.
    • 88% certain that Republican ticket members' collective statements were more full of malarkey than Democratic ticket members'.
    • 86% certain that the average Republican candidate spewed more malarkey during the average debate than the average Democratic candidate.

    (Probabilities rounded to nearest percent.)

    observe & report

    Below are the observed malarkey scores of the 2012 presidential candidates and the comparisons drawn from them.

    2012 prez candidates

    Truth-O-Meter only (observed)

    candidate malarkey
    Obama 44
    Biden 48
    Romney 55
    Ryan 58

    The Fact Checker only (observed)

    candidate malarkey
    Obama 53
    Biden 58
    Romney 60
    Ryan 47

    Averaged over fact checkers

    candidate malarkey
    Obama 48
    Biden 53
    Romney 58
    Ryan 52
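
As a quick check on the averaging step, here is a small Python sketch that averages the two observed scores listed above for each candidate. The variable names are mine, and the inputs are the rounded scores shown here, which is an assumption of convenience: the published averages were presumably computed before rounding.

```python
# Observed scores as listed above (already rounded to whole numbers)
truth_o_meter = {"Obama": 44, "Biden": 48, "Romney": 55, "Ryan": 58}
fact_checker = {"Obama": 53, "Biden": 58, "Romney": 60, "Ryan": 47}

# Simple mean of the two fact checkers for each candidate
averaged = {name: (truth_o_meter[name] + fact_checker[name]) / 2
            for name in truth_o_meter}

for name, value in averaged.items():
    print(name, value)
```

This gives 48.5, 53.0, 57.5, and 52.5, each within half a point of the averages in the table.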

    2012 Red prez vs. Blue prez

    Collated bullpucky

    ticket malarkey
    Obama/Biden 46
    Romney/Ryan 56

    Average bullpucky

    ticket malarkey
    Obama/Biden 48
    Romney/Ryan 58

    2012 prez debates

    1st presidential debate

    opponent malarkey
    Romney 61
    Obama 56

    2nd presidential debate (town hall)

    opponent malarkey
    Romney 31
    Obama 33

    3rd presidential debate

    opponent malarkey
    Romney 57
    Obama 46

    collated presidential debates

    opponent malarkey
    Romney 54
    Obama 46

    average presidential debate

    opponent malarkey
    Romney 61
    Obama 56

    vice presidential debate

    opponent malarkey
    Ryan 68
    Biden 44

    collated debates overall

    ticket malarkey
    Romney/Ryan 57
    Obama/Biden 46

    average debate overall

    ticket malarkey
    Romney/Ryan 61
    Obama/Biden 56

    the raw deal

    You've come this far. Why not just check out the raw data Malark-O-Meter is using? I promise you: it is as riveting as a phone book.
