In 2015, I will launch SoundCheks, a revolutionary web application that will measure, compare, and report the truthfulness and logical validity of political statements and the political figures who make them. Note that, unlike most fact checkers--such as PolitiFact.com or The Fact Checker at the Washington Post--SoundCheks will score both the truth and logic in political statements. If an argument's premises are true and its logic is valid, the argument is sound. So when you check both truth and logic, you could call it soundness checking. We'll call such rulings SoundCheks. Get it?

SoundCheks are better than fact checking because fact checkers check logical validity only implicitly. Consequently, they get mixed up about what a fact is, confuse their readers, and make overly subjective rulings (and, as we'll see, overly subjective choices about what to rule on). Earlier today, Glenn Kessler, the Fact Checker at the Washington Post, inadvertently demonstrated the weaknesses of his profession. 

Kessler's recent article explains why his team did not fact check a pair of statements made by Obama and Boehner. Let's focus on the Obama statement, which was about the "transparency" of the Foreign Intelligence Surveillance Court, and which he made in an interview with Charlie Rose about recent controversies concerning foreign and domestic intelligence gathering. Here's what Kessler had to say. 
From our reading, it appears as if Obama is using the phrase “transparent” to mean there is “a system of checks and balances.” In other words, this is a sensitive area and records are available to a select group of trustworthy people.
Kessler argues that you could interpret the word this way because:
“While the details of the day-to-day work of the FISC is necessarily classified, it operates pursuant to public statute,” a senior administration official explained. “The FISC also receives regular reports on the conduct of both disclosed programs, providing more rigorous oversight than was in place before. Congress also receives these reports, and we have also increased the briefings we do for the Hill on our activities and have given them access to information about the FISC’s activities. And the DNI [Director of National Intelligence] has recently released lots of information in recent weeks on how the data is collected and can be used.”
I find it interesting that Kessler premises the supposed double-meaning of transparency on a senior administration official's claims rather than...y'know, the accepted definition of transparency in government. But let's give him the benefit of the doubt that there are two possible meanings for the word in this context. (Still, note that there aren't.)

Why did Kessler's team decide against fact checking Obama's claim that FISA is transparent? Here's what he says:
We certainly could have used the quote to write a column to explore the inner workings of the court. But in the end we thought the exchange was too confusing to warrant the Four-Pinocchio treatment, even after PolitiFact made its ruling. That’s because Obama appeared to be avoiding the question rather than directly answering it. It is not even quite clear whether Obama is saying the court itself is transparent, or something else.
So according to one fact checker (PolitiFact), the discourse about transparency isn't too confusing for the public. (Indeed, PolitiFact gave Obama its dreaded if obnoxious "Pants on Fire" ruling.) But The Fact Checker thinks it is. Why? Because Obama was being ambiguous. 

Well here's the kicker. Purposive ambiguity is a logical fallacy!

Now a soundness checker (we'll call them SoundChekrs because we're uber-kewl) would call Obama out on his ambiguity fallacy, which is--according to yourlogicalfallacyis.com, most logicians, most members of junior high school debate teams, and actually most people I've had an intelligent argument with over drinks--"...a double meaning or ambiguity of language to mislead or misrepresent the truth."

I'm fairly confident that almost all politicians commit the ambiguity fallacy almost all of the time. But you wouldn't know it from the way some people fact check.

Listen, I am a Democrat and a bleeding heart liberal. I voted for Obama twice. If those elections happened again with the same candidates, I'd still vote for him. But I will not accept unchecked falsehood or fallacies in any politician in any party, nor will I accept their tacit approval by any fact checker. That I give to you as a preemptive SoundCheks guarantee. Stay tuned.
 
 
The other day, I did some math to find out how much I'd have to increase my ten year income above an assumed baseline to justify taking out the maximum possible loan from Upstart backers for my fact-checking startup called SoundCheks. I concluded that my ten year income would have to increase by at least 5% above an assumed baseline of $500,000/decade to justify the loan. Not bad.

That said, taking the maximum award comes with an agreement to pay 7% of my personal income over ten years. At this point, I'm not sure if I'm willing to personally take that amount of financial risk for the startup. I'd rather get my name out there somehow so that I can convince someone to invest in me at a return rate determined by their personal opinion of me rather than the opinion of a computer algorithm.

Maybe one way to get my name out there is to ask for a smaller Upstart loan, and try to attract some mentor/backers in addition to investment-only backers. And maybe that smaller award could retire my student loans. Last time, the question was how much my income needed to increase to justify a large loan. This time, the question is whether I should use one loan to retire another, assuming conservatively that Upstart has no effect on my ten year income, which I estimate at marginally above the level at which I could defer my Upstart payments (i.e., ~$30k/year).

If I don't take an Upstart loan, my ten year income is:

G - d

where G is my gross income, which the parameter d discounts for my student loan payments over that ten year period. Given a ~$30k/year gross income, my ten year gross income is $300k. So

G = 300,000

Note from last time that my student loan payments over ten years after graduation are $14,549. So

d = 14,549

My current student loan balance, however, is ~$11,305. If I make a request for that amount, I will have to pay Upstart a proportion p of my income over a ten year period. So my gross income given my decision to take the Upstart loan is

(1 - p)G

I should take the Upstart loan if my ten year income under that scenario is greater than if I didn't take the loan. Expressed as an inequality, where p* is the threshold proportion of my income below which it makes sense to retire my loans, this means

(1 - p*)G > G - d

which, after rearranging, becomes

p* < d/G

Although this is a very simple analysis, it helps to get some intuition from the result. The higher my loan repayment relative to my gross income, the higher the proportion of my income I should be willing to pay to Upstart since I'll end up paying less to them than I would to Uncle Sam. Anyway, plugging in d = 14,549 and G = 300,000 yields the threshold proportion.

p* < 14,549/300,000 = 0.0485

Under my current funding rate, Upstart only asks for 2.39% of my income over ten years, which is lower than the threshold above. So under my current assumptions, it pays to do the Upstart.
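
To make this easy to reproduce (or to rerun with your own numbers), here's the decision rule as a minimal Python sketch, using the figures from this post:

# Decision rule: retire the student loan with an Upstart award if p < p* = d/G.
G = 300_000          # assumed ten year gross income (~$30k/year for a decade)
d = 14_549           # ten year student loan repayment, including interest
R = 4_750            # my funding rate: dollars raised per 1% of income owed
request = 11_304.40  # current student loan balance, i.e., the amount I'd request

p_star = d / G            # threshold proportion of income (~0.0485)
p = (request / R) / 100   # proportion of income owed for a request of this size (~0.0238)

print(f"p* = {p_star:.4f}, p = {p:.4f}")
print("retire the loan with Upstart" if p < p_star else "keep the student loan")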

Note that I have to check up on my ten year loan repayment estimate. But I'm pretty sure that my monthly payment would not be affected by the income-based repayment option. The only question is whether I've calculated the ten year payment correctly. If I underestimated the ten year payment, it's no problem because I would just be underestimating the threshold proportion. If I overestimated the ten year payment, that's a potential problem.

So let's say that I was the luckiest student alive and paid 0% interest. Then the threshold proportion becomes

p* < 11,305/300,000 = 0.0377

So it still pays. But there's another potential problem. What if I underestimated my ten year income? What is the maximum amount of income I can earn before it doesn't make sense to do the Upstart? We can check this by rearranging the inequality, substituting p* = p = 0.0238, d = 14,549, and G = G*, then solving for G*, which is

G* = d/p = 14,549/0.0238 = 611,302
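
And here are those two sensitivity checks, continuing the sketch from above:

# 1. Zero-interest worst case: d is just the current loan balance.
print(f"zero-interest threshold: {11_305 / 300_000:.4f}")   # ~0.0377, still above p

# 2. Break-even ten year income G* = d/p, beyond which the loan stops paying.
print(f"break-even income: {int(14_549 / 0.0238):,}")        # ~611,302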

Interestingly, this is less than the ten year income that Upstart projects for me in their income chart tool (which you can access once they approve your profile). Upstart has incentive to report an overestimate of your income to make you feel good, but it also has incentive to report an underestimate because it would show lower monthly payments and make you more willing to opt in. But let's hope that Upstart is honest and just reports what their algorithm estimated for your future income.

Of course, I'm assuming in this analysis that Upstart will have no effect on my income. So long as I get a mentor/backer, I doubt Upstart will have no effect on my income. The trouble is, I don't know what effect it will have. Probably a positive one if I connect with a good, high profile mentor. Plus, if I get a high profile mentor out of Upstart who could help me raise startup capital for SoundCheks, it gives me a networking opportunity I otherwise wouldn't have.

So here's what I am going to do. I am going to do the Upstart to retire my loans. But the message of my profile will be that, while I'm only using Upstart's loan feature to retire my loans, my real interest is in developing relationships with people who can help me start my business. Because in the end, it's not just about the money. It's also about chasing a dream about a kickass idea.


I hope this post helps other potential Upstarts make their decisions. I also hope it demonstrates to my potential backers that I know how to make good decisions with the help of some simple math.
 
 
Upstart is a very cool company. They give big dreamers the chance to borrow money from backers (who may also serve as mentors) in exchange for a small percentage of the "upstart's" income over a fixed period. Say you want to start a company, and you'd like a lot of time over a year to get it off the ground. Wouldn't it be nice to have, say, $20k-$30k of startup money, and some clout that you could leverage to get further startup funding? But wouldn't it suck if the debt cost of borrowing that money exceeded its benefit to your ten year future income?

Yeah, it's a tradeoff. And you know what is good for examining tradeoffs? Math. I'm going to do some to help me decide whether or not I want to kick off my Upstart profile, and for how much. I developed my Upstart profile as a means to help fund the startup called SoundCheks. I hope that, by doing this analysis, I can help other potential Upstarts make their own decisions.

Funny story. The first time I did this calculation, I made a serious error, which caused me to dramatically inflate the necessary income benefit for the Upstart to make any sense. The thing that makes it funny is that I sent a heartfelt email to Upstart saying I was likely going to withdraw my profile. That's what happens when (a) you are naturally risk averse, (b) you did the calculations on the standing room only bus ride home, and (c) you did them while somewhat emotional that the funding rate you were awarded is lower than average (because you aren't a Princeton grad who scored high on his SAT). Thankfully, I caught the mistake, and that's why you're reading this right now.

The other reason why you're reading this is that this is big money we're talking here, at least to a (currently) broke graduate student who is married and has a three-almost-four-year-old daughter. So I'm not going to make decisions about it lightly. Hopefully, one day it will be chump change.

Here's how Upstart works. They allow you to raise R dollars in exchange for 1% of your income over the next ten years. So if you request mR dollars (and you are funded at that level), you owe your backers (who are qualified investors, a lot of them in the tech field) m% of your income over the next decade. Most Upstarts, says the company's founder, should be able to raise between R = $6,000 and R = $8,000 per 1% of income owed. You know what my funding rate is?

R = $4,750 per 1% income owed.

I was very disappointed. Upstart optimizes their funding rate based on your GPA, SAT scores, schools attended, and some other stuff. I don't know what the optimization algorithm is because, if I did, then I would just launch another Upstart-like company. Understandably, the algorithm is proprietary.

So I bet the reason why my funding rate is so low is because (a) I went to a third tier undergraduate teaching college, (b) I go to UW now, which is a great university, but it ain't Harvard, and (c) I've never averaged much over a 3.7 GPA. Who knows, maybe I'm not all that intelligent. Then again, I have been able to raise over $130k in scholarships and fellowships over the last six years, and that's not even counting tuition and medical insurance benefits. I'm also a Fulbright scholar. But you can only include so many variables in an optimization model, I guess.

Anyway, R = $4,750. That's what I've got.

What I want to do with this money is leverage it to start a company. That means I want as much of it as possible to help keep me and my family housed and fed for a year while I work my ass off (hopefully joined by an awesome team of employees at some point in the near future). That means I would probably want to ask for the maximum possible award, which is:

mR = $33,250, where m = 7

Therefore, I will owe m% = 7% of my income to my backers for ten years. Sounds like a lot. Furthermore, because I'll be paying my backers back for a decade, I might not want to worry about my student loan payments. So I might want to use this money to pay off a portion of my student loans.

Let's proceed.

Let p = m/100 = 7/100 = 0.07 be the proportion of my income that I am required to pay back to my backers if I do Upstart.

Let G be my ten year income if I do not do Upstart, assessed prior to my total student loan repayment over that period. Let's set G = 500,000. I think it is reasonable (if perhaps a little conservative) to assume I can make $50k/year on average over the decade following my graduation from UW.

Let d = 14,549 be my total student loan repayment over that ten year period, accounting for interest. Sad, I know (but better than many Americans have it!).

Let H = bG = b(500,000) be my ten year income if I do Upstart, assessed prior to my total Upstart loan repayment over that period.

Let f = 0.03 be the proportion of my award that I must pay as a fee to Upstart.

I want to solve for b*, the number of times greater my Upstart-influenced ten year income must be than my baseline income in order to justify taking the loan. Let's evaluate this with an inequality.

(1 - p)b*G - fmR > G - d

After some rearranging, we come up with a lower-bound for b*.

b* > 1/(1 - p) + (fmR - d)/(G(1 - p))

Plugging in p, f, m, R, and G and rounding to the nearest hundredth makes the threshold more concrete.

b* > 1.05

So in order to justify taking this loan and spending some of it to pay down my student loan, it needs to increase my ten year income by more than 5%.
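
If you want to check the rearrangement or plug in your own numbers, here is the same threshold computed in a few lines of Python, using the values defined above:

# Take the maximum award only if it multiplies my ten year income by more than b*.
p = 0.07        # proportion of income owed for the maximum award (m = 7)
f = 0.03        # Upstart's fee as a proportion of the award
m, R = 7, 4_750
G = 500_000     # assumed baseline ten year income
d = 14_549      # ten year student loan repayment

b_star = 1 / (1 - p) + (f * m * R - d) / (G * (1 - p))
print(f"b* > {b_star:.2f}")   # ~1.05, i.e., my income must rise by more than ~5%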

Now let's take a look at what I have to work with in terms of adding value to...well...to myself with this Upstart money.

Let's say I get the full funding, $33,250. I have to discount that by the 3% fee to Upstart, which is $33,250(0.03) = $997.50. I also have to discount it by the principal balance of my loan, which is currently $11,304.40. Thankfully, loans are not taxed, so I don't have to lop off any taxes from the award. After all that, what am I left with?

$33,250 - $997.50 - $11,304.40 = $20,948.10

Let's be clear. That's not a high enough annual income for a husband and father in the Seattle area. Not if he wants to stay married, and not if he doesn't want to live off food stamps, and not if he doesn't want to burden his wife with being the primary bread winner. As an annual full-time income, I find that unacceptable. Furthermore, I would be out of the workforce for an entire year. This is just too much risk to accept for a married father. I cannot live in my mother's basement for a year eating Ramen at this point in my life, which is cool because I don't really want to do that.

But what if I could settle for doing the "idea" phase of the project part-time? What if I could raise another $10-$20k to supplement, plus some other funding to recruit a co-founder? And what if my savvy Upstart fundraising, plus the publicity it might help me get, could attract mentors and team members who could help me raise more funding and reach my goals?

I hope this analysis serves as a useful model to others making similar decisions. As it stands, I think this startup loan would be a net benefit to me and my family. I would really appreciate your comments, though. And remember...I need to convince not only myself, but my wife. And my wife is a no bullshit kind of lady.
 
 
[Image: Al Pacino as Corleone]
I checked Twitter this morning and found this from Brendan Nyhan, aka "my hero":
It's a post by John Sides at The Monkey Cage about a recent press release by the Center for Media and Public Affairs (CMPA) on the fact checking reports of PolitiFact. Basically, CMPA researchers cross-tabulated the percentage of statements ranked by PolitiFact fact checkers as true or false with the political party of the person who made the statement. Republicans' statements were rated false at a 3 to 1 ratio relative to Democrats', and Democrats' statements were rated true at a 2 to 1 ratio relative to Republicans'.

CMPA hinted (but, as usual, did not outright say) that these results arise (at least in part) from PolitiFact's liberal bias. U.S. News & World Report contributing editor Peter Roff took that idea and ran with it, writing:
As the first person to empirically demonstrate the liberal, pro-Democrat bias in the Washington press corps, [CMPA Director] Lichter's analysis is worth further study and comment. [my emphasis]
But, as Sides wrote at The Monkey Cage:
Politifact [sic] isn’t randomly sampling the statements of Republicans and Democrats. They’re just examining statements they consider particularly visible, influential, or controversial. The data are consistent with any number of interpretations and so we can’t say all that much about the truthfulness of political parties, about any biases of Politifact, etc.
I've said as much when I calculated the possible extent of both liberal and centrist bias in rulings by both PolitiFact and Glenn Kessler, The Fact Checker at WaPo (who is very sad that Michele Bachmann isn't running again). I also debated with some conservatives about how uncertain we are in the extent and type of bias among fact checkers.

But while I agree that we don't know how biased fact checkers are (simply because we've never measured their bias relative to some reasonable baseline), I've always disagreed with John Sides' and Brendan Nyhan's opinions that the newsworthiness bias among fact checkers makes the sample of fact checked statements too biased for comparisons to be useful.

Fact checkers cover statements that actually matter to people because they want people to read what they write. What matters to the public is the truthfulness of the statements that politicians make about issues that we find important. If we just took a random sample of political statements, we'd come across a lot of innocuous ones about things that aren't that important. While a systematic analysis could come up with some reasonable way to weight statements by the importance of the issue to the public (or by some other rubric), looking at fact checkers' statements is a good first pass.

As I design the methods for SoundCheks, a fact checking (well, fallacy checking) research institute I dream of founding and making successful, I'll think a lot about my statement sampling methods. And I've been thinking all along about how to measure fact checker bias relative to that of nonprofessionals and obvious partisans.
 
 
Last year, while living away from my family for a year to do ethnographic fieldwork in a remote village on a tiny Lesser Antillean island, I kept myself sane and connected to the political news in my home country by creating a new hobby. I applied my knowledge of inferential statistics and computational simulation to use fact checker reports from PolitiFact.com and The Fact Checker at the Washington Post to comparatively judge the truthfulness of the 2012 presidential and vice presidential candidates, and (more importantly) to measure our uncertainty in those judgments. 

The site (and its syndication on the Daily Kos) generated some good discussion, some respectable traffic, and (I hope) showed its followers the potential for a new kind of inference-driven fact checking journalism. My main conclusions from the 2012 election analysis were:

(1) The candidates aren't as different as partisans left or right would have us believe.

(2) But the Democratic ticket was somewhat more truthful than the Republican ticket, both overall, and during the debates.

(3) It's quite likely that the 2012 Republican ticket was less truthful than the 2008 Republican ticket, and somewhat likely that the 2012 Democratic ticket was less truthful than the 2008 Democratic ticket.

Throughout, I tempered these conclusions with the recognition that my analyses did not account for the possible biases of fact checkers, including biases toward fairness, newsworthiness, and, yes, political beliefs. Meanwhile, I discussed ways to work toward measuring these biases and adjusting measures of truthfulness for them. I also suggested that fact checkers should begin in earnest to acknowledge that they aren't just checking facts, but the logical validity of politicians' arguments, as well. That is, fact checkers should also become fallacy checkers who gauge the soundness of an argument, not simply the truth of its premises. 

Now, it's time to close up shop. Not because I don't plan on moving forward with what I'm proud to have done here. I'm closing up shop because I have much bigger ideas.

I've started writing up a master plan for a research institute and social media platform that will revolutionize fact checking journalism. For now, I'm calling the project Sound Check. I might have to change the name because that domain name is taken. Whatever its eventual name, Sound Check will be like FiveThirtyEight meets YouGov meets PolitiFact meets RapGenius: data-driven soundness checking journalism and research on an annotated social web. You can read more about the idea from this draft executive summary.

Anyway, over the next three years (and beyond!), I hope you're going to hear a lot about this project. Already, I've started searching for funding so that I can, once I obtain my PhD in June 2014, start working full time on Sound Check.

One plan is to become an "Upstart". Upstart is a new idea from some ex-Googlers. At Upstart, individual graduates hedge their personal risk by looking for investor/mentors, who gain returns from the Upstart's future income (which is predicted from a proprietary algorithm owned by Upstart). Think of it as a capitalist, mentoring-focused sort of patronage. Unlike Kickstarter or other crowd-funding mechanisms, where patrons get feel-good vibes and rewards, Upstart investors are investing in a person like they would invest in a company.

Another plan is, of course, to go the now almost traditional crowd-funding route, but only for clearly defined milestones of the project. For example, first I'd want to get funding to organize a meet-up of potential collaborators and investors. Next I'd want to get funding for the beta-testing of the sound checking algorithm. After that I'd get funding for a beta-test of the social network aspect of Sound Check. Perhaps these (hopefully successful) crowd-funded projects would create interest among heavy-hitting investors.

Yet another idea is to entice some university (UW?) and some wealthy person or group of people interested in civic engagement and political fact checking to partner with Sound Check in a way similar to how FactCheck.org grew out of the Annenberg Public Policy Center at the University of Pennsylvania.

Sound Check is a highly ambitious idea. It will need startup funding for servers, programmers, and administrative staff, as well as for training and maintaining Sound Checkers (that's fact checkers who also fallacy check). So I've got my work cut out for me. I'm open to advice and new mentors. And soon, I'll be open, along with Sound Check, to investors and donors.
 
 
Science journalist Chris Mooney recently wrote about "The Science of Why Comment Trolls Suck" in Mother Jones magazine. His article covers a study by researchers at the George Mason University Center for Climate Change Communication, who asked over a thousand study participants to read the same blog article about the benefits and risks of nanotechnology. The comment section that subjects experienced varied from civil discussion to name-calling flame war. The researchers found that witnessing flame wars caused readers' perceptions of nanotechnology risks to become more extreme.

Mooney argues that these findings don't bode well for the public understanding of climate science. He also argues that this
....is not your father's media environment any longer. In the golden oldie days of media, newspaper articles were consumed in the context of…other newspaper articles. But now, adds Scheufele, it's like "reading the news article in the middle of the town square, with people screaming in my ear what I should believe about it."
Finally, based on his interpretation of the evidence, Mooney advocates that we ignore the comments section.

I agree with Mooney that flame wars are detrimental to rational discourse, and that the George Mason study highlights the pitfalls of what Daniel Kahneman calls "System 1" thinking. Yet I counter that a purely civil discussion about climate change may also be counter-productive. Furthermore, the comments section doesn't mark a huge departure from "your father's media environment". Finally, the George Mason University study demonstrates that there is good reason to pay very close attention to the comments section, even if it is littered with bridges beneath which trolls dwell.

How can civil discourse be counter-productive? Agreeable people tend to groupthink, which is when people make poor decisions for the sake of harmony and conformity. Another possible consequence of over-civility is the middle ground fallacy, which is the false assumption that the middle point between two extremes must be the truth (medio tutissimus ibis, anyone?). A flame war presses emotional buttons that polarize a discussion and inhibit rationality. But an overly civil discourse massages the human tendency to conform, which may also lead us astray. The key is to balance investment in our beliefs with the willingness to abandon invalid arguments and discard false premises.

As for whether or not it is still your father's media environment, of course it isn't. Still, flame wars are anything but new. Trust me. I just spent eleven months in a rural village where people have limited access to electronic media, much less the Internet. People don't have to hide behind anonymous screen names to behave immaturely during a heated debate, quickly drowning out the radio program that prompted the discussion.

Finally, why shouldn't we ignore the comments section? It's not because we're likely to find high quality debate there. We should steel ourselves against the polarizing effects of flame wars, becoming part of the mob to better understand how people form and defend their beliefs. We shouldn't limit ourselves to observing the effects of flame wars from afar, as did the George Mason research team. There's something to be said for allowing yourself to get baited by trolls a few times to experience for yourself how easy it is to be led astray. If you look at my user history at Reddit or the Daily Kos, you'll see I speak from experience. I've learned valuable lessons about the limits of my rationality on those sites. We anthropologists would call this method of inquiry participant observation.

Apart from these three criticisms, I read Mooney's article and the research it covers in light of one of Malarkometer's missions, which is to quantify and correct for the bias and uncertainty inherent in measures of political figures' factuality. Another of my goals is to eventually host a site where both professional fact checkers and non-professionals engage in fact checking and discourse about fact checking while regularly answering questionnaires about political philosophy. The idea is to compare the influence of political philosophy on fact checking performance of professionals versus non-professionals. Based on the George Mason group's findings, I might also want to experiment with varying the levels of comment moderation across the site, and then examine the influence of forum comment vituperativeness on aggregate fact checking performance. And, yes, I might also want to suggest to my future fact checking staff to kindly avoid the comment section while they write their reports!
 
 
Today, I met with Sarah Stuteville, Lecturer in Communications at University of Washington, and Co-founder/Editor of the Common Language Project (CLP). The CLP is a nonprofit, multimedia journalism organization that focuses on international reporting, local reporting in the Puget Sound region, and journalism in education. I asked for Stuteville's advice about the direction of Malark-O-Meter because CLP has a strategic relationship with the University of Washington that I would like Malark-O-Meter (or whatever it becomes) to mirror.

So the questions I had for Sarah were understandably about how to build an organization that the UW would want to team up with. Based on Sarah's description of the history of CLP's relationship with UW, I gleaned the following valuable pieces of advice.

To attract university partnership, I need to develop credibility and get some independent funding. Makes sense. I also should be sure to have a set of deliverables that the UW would see as providing it with return on investment of resources and time. Also makes sense. 

Furthermore, I should be sure that Malark-O-Meter's deliverables engage not only faculty, but students. One of CLP's strengths is that its multimedia journalism activities are fully embedded into the curriculum of the Department of Communications. I explained to Sarah an idea I have that would engage students in fact checking activities, which would prime a crowd-sourced data collection instrument that I could use to comparatively assess the supposed biases of professional fact checkers relative to nonprofessionals.

Sarah also provided me with an awesome list of highly relevant contacts to reach in the meantime. I cannot wait to meet some of the people whom she mentioned.

Anyway, thanks to Sarah Stuteville for agreeing to a helpful meeting.
 
 
This week, two political science blog posts about the difference between political engagement and factual understanding stood out to Malark-O-Meter. (Thanks to The Monkey Cage for Tweeting their links.) First, there's Brendan Nyhan's article at YouGov about how political knowledge doesn't guard against belief in conspiracy theories. Second, there's voteview's article about issues in the 2012 election. (Side note: This could be the Golden Era of political science blogging) These posts stand out both as cautionary tales about what it means to be politically engaged versus factual, and as promising clues about how to assess the potential biases of professional fact checkers in order to facilitate the creation of better factuality metrics (what Malark-O-Meter is all about).

Let's start with Nyhan's disturbing look at the interactive effect of partisan bias and political knowledge on belief in the conspiracy theory that the 2012 unemployment rate numbers were manipulated for political reasons. The following pair of plots (reproduced from the original article) pretty much says it all.

First, there's the comparison of Dem, Indie, and GOP perception of whether unemployment statistics are accurate, grouped by party affiliation and low, medium, and high scores on a ten-question quiz on political knowledge.
Republicans and maybe Independents with greater political knowledge perceive the unemployment statistics to be less accurate.

Here's a similar plot showing the percent in each political knowledge and party affiliation group that believe in the conspiracy theory about the September unemployment statistic report.
Democrats appear less likely to believe the conspiracy theory the more knowledgeable they are. Republicans with greater political knowledge are more likely to believe the conspiracy theory. There's no clear effect among Independents. What's going on?

Perhaps the more knowledgeable individuals are also more politically motivated, and so is their reasoning. It just so happens that motivated reasoning in this case probably errs on the side of the politically knowledgeable Democrats.

Before discussing what this means for fact checkers and factuality metrics, let's look at what voteview writes about an aggregate answer to a different question, posed by Gallup (aka, the new whipping boy of the poll aggregators) about the June jobs report.
[Chart: Gallup poll results on the June jobs report, by party affiliation]
In case you haven't figured it out, you're looking at yet another picture of motivated reasoning at work (or is it play?). Democrats were more likely than Republicans to see the jobs report as mixed or positive, whereas Republicans were more likely than Democrats to see it as negative. You might expect this effect to shrink among individuals who say they pay very close attention to news about the report because, you know, they're more knowledgeable and they really think about the issues and... NOPE!
[Chart: the same poll results, broken down by attention paid to news about the report]
The more people say they pay attention to the news, the more motivated their reasoning appears to be.

What's happening here? In Nyhan's study, are the more knowledgeable people trying to skew the results of the survey to make it seem like more people believe or don't believe in the conspiracy theory? In the Gallup poll, is "paid very close attention to news about the report" code for "watched a lot of MSNBC/Fox News"? Or is it an effect similar to what we see among educated people who tend to believe that vaccinations are (on net) bad for their children despite lots and lots of evidence to the contrary? That is, do knowledgeable people know enough to be dangerous(ly stupid)?

I honestly don't know what's happening, but I do have an idea about what this might mean for the measurement of potential fact checker bias to aid the creation of better factuality metrics and fact checking methods. I think we can all agree that fact checkers are knowledgeable people. The question is, does their political knowledge and engagement have the same effect on their fact checking as it does on the perceptions of educated non-fact-checkers? If so, is the effect as strong?

I've mentioned before that a step toward better fact checking is to measure the potential effect of political bias on both the perception of fact and the rulings of fact checkers. Basically, give individuals a questionnaire that assesses their political beliefs, and see how they proceed to judge the factuality of statements made by individuals of known party affiliations, ethnicity, et cetera. To see if fact checking improves upon the motivated reasoning of non-professionals, compare the strength of political biases on the fact checking of professionals versus non-professionals. 

What these two blog posts tell me is that, when drawing such comparisons, I should take into account not only the political affiliation of the non-professionals, not only the political knowledge of the non-professionals, but the interaction of those two variables. Then, we can check which subgroup of non-professionals the professional fact checkers are most similar to, allowing us to make inferences about whether professional fact checkers suffer from the same affliction of motivated reasoning that the supposedly knowledgeable non-professionals suffer from.
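
To make that concrete, here's a sketch of how the comparison might be modeled. Everything in it is hypothetical: the column names, the "harshness" outcome, and the synthetic data are stand-ins for whatever instrument such a study would actually use.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: each row is one rater judging one statement by a known Democrat.
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "party": rng.choice(["D", "I", "R"], size=n),               # rater's affiliation
    "knowledge": rng.integers(0, 11, size=n),                   # ten-question knowledge quiz
    "professional": rng.choice([0, 1], size=n, p=[0.9, 0.1]),   # 1 = professional fact checker
})
# Made-up pattern (harsher ratings among knowledgeable Republicans) so the model has something to find.
df["harshness"] = (
    50
    + 2 * df["knowledge"] * (df["party"] == "R")
    - 1 * df["knowledge"] * (df["party"] == "D")
    + rng.normal(0, 10, size=n)
)

# The affiliation x knowledge x professional interaction is the comparison of interest.
model = smf.ols("harshness ~ C(party) * knowledge * professional", data=df).fit()
print(model.summary())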
 
 
Recently, the Nieman Journalism Lab reported on OpenCaptions, the creation of Dan "Truth Goggles" Schultz. OpenCaptions prepares a live television transcript from closed captions, which can then be analyzed. I came across OpenCaptions back in October, when I learned about Schultz's work on Truth Goggles, which highlights web content that has been fact checked by PolitiFact. Reading about it this time reminded me of something I'd written in my critique of the fact checking industry's opinions about factuality comparison among individual politicians.

At the end of that post, I commented on a suggestion made by Kathleen Hall Jamieson of the Annenberg Public Policy Center about how to measure the volume of factuality that a politician pumps into the mediasphere. Jamieson's suggestion was to weight the claims that a politician makes by the size of their audience. I pointed out some weaknesses of this factuality metric. I also recognized that it is still useful, and described the data infrastructure necessary to calculate the metric. Basically, you need to know the size of the audience of a political broadcast (say, a political advertisement), the content of the broadcast, and the soundness of the arguments made during the broadcast.
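
As a rough illustration of Jamieson's suggestion (with made-up numbers and a simple weighting scheme of my own choosing), an audience-weighted malarkey score might look like this:

# Hypothetical broadcasts by one politician: a malarkey rating (0-100) for the
# claims made in each broadcast, and an estimated audience size.
broadcasts = [
    {"malarkey": 70, "audience": 2_000_000},  # widely aired attack ad
    {"malarkey": 30, "audience": 150_000},    # local stump speech
    {"malarkey": 55, "audience": 800_000},    # cable news interview
]

total_audience = sum(b["audience"] for b in broadcasts)
weighted = sum(b["malarkey"] * b["audience"] for b in broadcasts) / total_audience
print(f"audience-weighted malarkey: {weighted:.1f}")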

OpenCaptions shows promise as a way to collect the content of political broadcasts and publish it to the web for shared analysis. Cheers to Dan Schultz for creating yet another application that will probably be part of the future of journalism...and fact checking...and factuality metrics.
 
 
Yesterday, I argued that fact checkers who rate their rulings on a scale should incorporate the number and type of logical fallacies into their ratings. I also argued that the rating scales of fact checkers like PolitiFact and The Fact Checker are valuable, but they conflate soundness and validity, which causes their ratings to be vague. As usual, I syndicated the post on the Daily Kos. Kossack Ima Pseudonym provided valuable constructive criticism, which we'll consider today.

The aptly titled comment by Ima Pseudonym was,
Great in theory, but...

Validity is a nice standard for mathematics and logic but it is not often found in public discourse. Even scientific conclusions are rarely (if ever) backed by valid reasoning as they typically rely on induction or inference to the best explanation.

A few nitpicks:
-Not every claim is an argument. An argument must offer evidence intended to support a conclusion. I can claim "I am hungry" without thereby offering any sort of argument (valid, inductive, fallacious or otherwise) in support of that claim. One cannot test the validity of a single proposition.
-No need to check for "both" soundness and validity. If you check for soundness, then you have already checked for validity as part of that. Perhaps you meant to say you would check for both truth of basic premises and validity of reasoning.
-It depends a bit on which notion of fallacy you are working with, but arguments can fail to be valid without committing a common named fallacy. A far simpler check for validity is simply to find counterexamples to the reasoning (logically possible examples in which the basic premises of the argument are all true and in which the conclusion of the argument is false).

Don't mean to discourage the project - it is a very worthwhile one and one that would be interesting to see play out.
This is Internet commenting at its best: constructive, well-reasoned, and mainly correct. Let's address the comment point by point.

"Validity is a nice standard for mathematics and logic but it is not often found in public discourse."

I can't agree more. This unfortunate fact should not, however, discourage us from specifying and enumerating the logical fallacies that public figures commit. It should encourage us to do so, as it has encouraged the establishment of the fact checking industry.

"Even scientific conclusions are rarely (if ever) backed by valid reasoning as they typically rely on induction or inference to the best explanation."

I agree that scientists stray from valid (and sound) argumentation more often than they should. I do not, however, agree that scientists rarely if ever make sound or valid arguments. I also agree that scientists often use inductive reasoning. Scientists will continue to do so as Bayesian statistical methods proliferate. I do not, however, agree that inductive inference is immune to the assessment of soundness and, by inclusion, validity. Inductive reasoning is probabilistic. For instance, a statistical syllogism (following Wikipedia's example) could go,
  1. 90% of humans are right-handed.
  2. Joe is a human.
  3. Therefore, the probability that Joe is right-handed is 90% (therefore, if we are required to guess [one way or the other] we will choose "right-handed" in the absence of any other evidence).

You can assess the validity of this statistical syllogism by considering whether the steps in the argument follow logically from one another. You can assess its soundness by furthermore considering whether its premises are true. Are 90% of humans right-handed? Is Joe a human? Inductive logic is still logic.

"Not every claim is an argument. An argument must offer evidence intended to support a conclusion. I can claim 'I am hungry' without thereby offering any sort of argument (valid, inductive, fallacious or otherwise) in support of that claim. One cannot test the validity of a single proposition."

I agree that not every claim is an argument, either in the formal or informal sense. Every claim is, however, a premise. In such cases, we can simply determine whether or not the premise is true. Furthermore, many claims that fact checkers care about imply or support an informal (or even formal or legal) argument. In such cases, you can assess the implied informal argument's validity. Lastly, in any case where a public figure makes a claim that ties vaguely to an informal argument, that public figure deserves to be criticized for committing the ambiguity fallacy. Many politicians often commit the ambiguity fallacy. As much as possible, we should call them on it whenever they do it.

"No need to check for 'both' soundness and validity. If you check for soundness, then you have already checked for validity as part of that. Perhaps you meant to say you would check for both truth of basic premises and validity of reasoning."

Correct. To be sound, an argument must be valid. What I should have said is that fact checkers conflate truth with validity.

"It depends a bit on which notion of fallacy you are working with, but arguments can fail to be valid without committing a common named fallacy. A far simpler check for validity is simply to find counterexamples to the reasoning (logically possible examples in which the basic premises of the argument are all true and in which the conclusion of the argument is false)."

I hope that Ima Pseudonym will elaborate on the logical counterexample part of this statement. If it's a viable shortcut, I'm all for it. That said, I suspect that there are many logical fallacies that do not yet have a name. Perhaps Malark-O-Meter's future army of logicians will name the unnamed!

Thank you again, Ima Pseudonym. Your move if you wish to continue playing. I like this game because you play it well. I encourage constructive criticism from you and all of Malark-O-Meter's readers. Cry 'Reason,' and let slip the dogs of logic.
 

    about

    Malark-O-blog published news and commentary about the statistical analysis of the comparative truthfulness of the 2012 presidential and vice presidential candidates. It has since closed down while its author makes bigger plans.

    author

    Brash Equilibrium is an evolutionary anthropologist and writer. His real name is Benjamin Chabot-Hanowell. His wife calls him Babe. His daughter calls him Papa.

    what is malarkey?

    It's a polite word for bullshit. Here, it's a measure of falsehood. 0 means you're truthful on average. 100 means you're 100% full of malarkey. Details.

    what is simulated malarkey?

    Fact checkers only rate a small sample of the statements that politicians make. How uncertain are we about the real truthfulness of politicians? To find out, treat fact checker report cards like an experiment, and use random number generators to repeat that experiment a lot of times to see all the possible outcomes. Details.
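
    A minimal sketch of that idea in Python, with guessed category weights and a simple multinomial bootstrap standing in for the actual method:

    import numpy as np

    # Guessed malarkey weights for the six Truth-O-Meter categories (0 = True ... 100 = Pants on Fire).
    weights = np.array([0, 20, 40, 60, 80, 100])
    # A hypothetical report card: ruling counts in each category for one candidate.
    card = np.array([20, 30, 25, 15, 12, 8])

    rng = np.random.default_rng(0)
    n = card.sum()
    # Rerun the "experiment" many times: redraw a same-sized report card and rescore it.
    sims = rng.multinomial(n, card / n, size=10_000)
    malarkey = sims @ weights / n

    lo, hi = np.percentile(malarkey, [2.5, 97.5])
    print(f"simulated malarkey: mean {malarkey.mean():.0f}, 95% interval {lo:.0f}-{hi:.0f}")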

    malark-O-glimpse

    Can you tell the difference between the 2012 presidential election tickets from just a glimpse at their simulated malarkey score distributions?

    [Image: simulated malarkey score distributions; dark = pres, light = vp]

    fuzzy portraits of malarkey

    Simulated distributions of malarkey for each 2012 presidential candidate with 95% confidence interval on either side of the simulated average malarkey score. White line at half truthful. (Rounded to nearest whole number.)

    • 87% certain Obama is less than half full of malarkey.
    • 100% certain Romney is more than half full of malarkey.
    • 66% certain Biden is more than half full of malarkey.
    • 70% certain Ryan is more than half full of malarkey.
    (Probabilities rounded to nearest percent.)

    fuzzy portraits of ticket malarkey

    Simulated distributions of collated and average malarkey for each 2012 presidential election ticket, with 95% confidence interval labeled on either side of the simulated malarkey score. White line at half truthful. (Rounded to nearest whole number.)

    • 81% certain Obama/Biden's collective statements are less than half full of malarkey.
    • 100% certain Romney/Ryan's collective statements are more than half full of malarkey.
    • 51% certain the Democratic candidates are less than half full of malarkey.
    • 97% certain the Republican candidates are on average more than half full of malarkey.
    • 95% certain the candidates' statements are on average more than half full of malarkey.
    • 93% certain the candidates themselves are on average more than half full of malarkey.
    (Probabilities rounded to nearest percent.)

    Comparisons

    Simulated probability distributions of the difference between the malarkey scores of one 2012 presidential candidate or party and another, with 95% confidence interval labeled on either side of the simulated mean difference. Blue bars are when Democrats spew more malarkey, red when Republicans do. White line and purple bar at equal malarkey. (Rounded to nearest hundredth.)

    • 100% certain Romney spews more malarkey than Obama.
    • 55% certain Ryan spews more malarkey than Biden.
    • 100% certain Romney/Ryan collectively spew more malarkey than Obama/Biden.
    • 94% certain the Republican candidates spew more malarkey on average than the Democratic candidates.
    (Probabilities rounded to nearest percent.)

    2012 prez debates

    presidential debates

    Simulated probability distribution of the malarkey spewed by individual 2012 presidential candidates during debates, with 95% confidence interval labeled on either side of simulated mean malarkey. White line at half truthful. (Rounded to nearest whole number.)

    • 66% certain Obama was more than half full of malarkey during the 1st debate.
    • 81% certain Obama was less than half full of malarkey during the 2nd debate.
    • 60% certain Obama was less than half full of malarkey during the 3rd debate.
    (Probabilities rounded to nearest percent.)

    • 78% certain Romney was more than half full of malarkey during the 1st debate.
    • 80% certain Romney was less than half full of malarkey during the 2nd debate.
    • 66% certain Romney was more than half full of malarkey during the 3rd debate.
    (Probabilities rounded to nearest percent.)

    aggregate 2012 prez debate

    Distributions of malarkey for collated 2012 presidential debate report cards and the average presidential debate malarkey score.
    • 68% certain Obama's collective debate statements were less than half full of malarkey.
    • 68% certain Obama was less than half full of malarkey during the average debate.
    • 67% certain Romney's collective debate statements were more than half full of malarkey.
    • 57% certain Romney was more than half full of malarkey during the average debate.
     (Probabilities rounded to nearest percent.)

    2012 vice presidential debate

    • 60% certain Biden was less than half full of malarkey during the vice presidential debate.
    • 89% certain Ryan was more than half full of malarkey during the vice presidential debate.
    (Probabilities rounded to nearest percent.)

    overall 2012 debate performance

    Malarkey score from collated report card comprising all debates, and malarkey score averaged over candidates on each party's ticket.
    • 72% certain Obama/Biden's collective statements during the debates were less than half full of malarkey.
    • 67% certain the average Democratic ticket member was less than half full of malarkey during the debates.
    • 87% certain Romney/Ryan's collective statements during the debates were more than half full of malarkey.
    • 88% certain the average Republican ticket member was more than half full of malarkey during the debates.

    (Probabilities rounded to nearest percent.)

    2012 debate self comparisons

    Simulated probability distributions of the difference in malarkey that a 2012 presidential candidate spews normally compared to how much they spewed during a debate (or aggregate debate), with 95% confidence interval labeled on either side of the simulated mean difference. Light bars mean less malarkey was spewed during the debate than usual, dark bars more. White bar at equal malarkey. (Rounded to nearest hundredth.)

    individual 2012 presidential debates

    • 80% certain Obama spewed more malarkey during the 1st debate than he usually does.
    • 84% certain Obama spewed less malarkey during the 2nd debate than he usually does.
    • 52% certain Obama spewed more malarkey during the 3rd debate than he usually does.
    • 51% certain Romney spewed more malarkey during the 1st debate than he usually does.
    • 98% certain Romney spewed less malarkey during the 2nd debate than he usually does.
    • 68% certain Romney spewed less malarkey during the 3rd debate than he usually does.

    (Probabilities rounded to nearest percent.)

    aggregate 2012 presidential debate

    • 58% certain Obama's statements during the debates were more full of malarkey than they usually are.
    • 56% certain Obama spewed more malarkey than he usually does during the average debate.
    • 73% certain Romney's statements during the debates were less full of malarkey than they usually are.
    • 86% certain Romney spewed less malarkey than he usually does during the average debate.

    (Probabilities rounded to nearest percent.)

    vice presidential debate

    • 70% certain Biden spewed less malarkey during the vice presidential debate than he usually does.
    • 86% certain Ryan spewed more malarkey during the vice presidential debate than he usually does.

    (Probabilities rounded to nearest percent.)

    2012 opponent comparisons

    Simulated probability distributions of the difference in malarkey between the Republican candidate and the Democratic candidate during a debate, with 95% confidence interval labeled on either side of the simulated mean difference. Blue bars are when Democrats spew more malarkey, red when Republicans do. White bar at equal malarkey. (Rounded to nearest hundredth.)

    individual 2012 presidential debates

    • 60% certain Romney spewed more malarkey during the 1st debate than Obama.
    • 49% certain Romney spewed more malarkey during the 2nd debate than Obama.
    • 72% certain Romney spewed more malarkey during the 3rd debate than Obama.

    (Probabilities rounded to nearest percent.)

    aggregate 2012 presidential debate

    • 74% certain Romney's statements during the debates were more full of malarkey than Obama's.
    • 67% certain Romney was more full of malarkey than Obama during the average debate.

    (Probabilities rounded to nearest percent.)

    vice presidential debate

    • 92% certain Ryan spewed more malarkey than Biden during the vice presidential debate.

    (Probabilities rounded to nearest percent.)

    overall 2012 debate comparison

    Party comparison of 2012 presidential ticket members' collective and individual average malarkey scores during debates.
    • 88% certain that Republican ticket members' collective statements were more full of malarkey than Democratic ticket members'.
    • 86% certain that the average Republican candidate spewed more malarkey during the average debate than the average Democratic candidate.

    (Probabilities rounded to nearest percent.)

    observe & report

    Below are the observed malarkey scores, and comparisons among them, for the 2012 presidential candidates.

    2012 prez candidates

    Truth-O-Meter only (observed)

    candidate malarkey
    Obama 44
    Biden 48
    Romney 55
    Ryan 58

    The Fact Checker only (observed)

    candidate malarkey
    Obama 53
    Biden 58
    Romney 60
    Ryan 47

    Averaged over fact checkers

    candidate malarkey
    Obama 48
    Biden 53
    Romney 58
    Ryan 52

    2012 Red prez vs. Blue prez

    Collated bullpucky

    ticket malarkey
    Obama/Biden 46
    Romney/Ryan 56

    Average bullpucky

    ticket malarkey
    Obama/Biden 48
    Romney/Ryan 58

    2012 prez debates

    1st presidential debate

    opponent malarkey
    Romney 61
    Obama 56

    2nd presidential debate (town hall)

    opponent malarkey
    Romney 31
    Obama 33

    3rd presidential debate

    opponent malarkey
    Romney 57
    Obama 46

    collated presidential debates

    opponent malarkey
    Romney 54
    Obama 46

    average presidential debate

    opponent malarkey
    Romney 61
    Obama 56

    vice presidential debate

    opponent malarkey
    Ryan 68
    Biden 44

    collated debates overall

    ticket malarkey
    Romney/Ryan 57
    Obama/Biden 46

    average debate overall

    ticket malarkey
    Romney/Ryan 61
    Obama/Biden 56

    the raw deal

    You've come this far. Why not just check out the raw data Malark-O-Meter is using? I promise you: it is as riveting as a phone book.
