One reason the media hasn't done a lot of analyses like this is that they are skeptical of what the average truthfulness measures. The thing is, so am I! But I am skeptical of almost everything that hasn't been systematically analyzed.
There is a good article by Michael Scherer on Time's Swampland blog in which he interviews the biggest names in fact checking about the difficulties with measuring individual truthfulness. My first mission with Malark-O-Meter is to eventually address each of the issues the fact checkers raise in that article. And by that, I mean refuting the claim that those issues make truthfulness scores meaningless. My second mission with M-O-M (hee hee, Mom!) is to actually show people, with analysis and graphs, how useful truthfulness scores are and are not, rather than just wave my hands in the air.
So I am glad you're on board! On with the rest of your comments.
1. What I realized right after my last post was, standard deviations from what? Normal truthfulness or political truthfulness? There might already be some data collected through previous research on truthfulness. My background is in psychology, and admittedly I took stats and psych stats as an undergrad and have since not retained as much as I would have liked. The other option would be to do an aggregate of statements from other American political figures or previous presidential candidates to get some normative political data for comparison. This might be more of an undertaking than you would be prepared for, and I am not sure if in fact there is data collected for that.
That is my plan! Eventually, I will write a web scraping script that collects all Pinocchio Tracker, PolitiFact, and other fact checker scores together. Then I can use that data set to create an observed probability distribution of truthfulness. From this observed probability distribution, I could give people percentile rankings. The cool thing is that percentile rankings are themselves a random variable! And I know how to calculate the uncertainty in percentiles, which I would then report. So, the data is there, and I am (eventually) going to use it. For now, we're going to look at absolute bullpucky scores and their comparisons.
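Here is a minimal sketch of what a percentile ranking with reported uncertainty could look like. It is not the final Malark-O-Meter method: the reference scores are made up for illustration, the function names are hypothetical, and the bootstrap (resampling the reference scores with replacement) is just one assumed way to get the uncertainty in a percentile.

```python
import random

def percentile_rank(reference, score):
    """Percent of reference scores at or below `score`."""
    return 100.0 * sum(s <= score for s in reference) / len(reference)

def bootstrap_rank_interval(reference, score, n_boot=2000, level=0.95, seed=0):
    """Resample the reference scores with replacement and return the
    lower and upper bounds of the percentile rank at the given level."""
    rng = random.Random(seed)
    n = len(reference)
    ranks = sorted(
        percentile_rank(rng.choices(reference, k=n), score)
        for _ in range(n_boot)
    )
    lo_idx = round((1 - level) / 2 * n_boot)
    hi_idx = round((1 + level) / 2 * n_boot) - 1
    return ranks[lo_idx], ranks[hi_idx]

# Hypothetical truthfulness scores for ten politicians (made-up numbers).
reference_scores = [0.21, 0.30, 0.35, 0.41, 0.44, 0.48, 0.52, 0.54, 0.60, 0.67]

rank = percentile_rank(reference_scores, 0.54)
low, high = bootstrap_rank_interval(reference_scores, 0.54)
print(f"percentile rank: {rank:.0f}, 95% interval: {low:.0f} to {high:.0f}")
```

With only ten reference scores the interval comes out very wide, which is the point: the percentile is itself a random variable, and a small pool of rated politicians pins it down poorly.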
2. Again I apologize for my ignorance here. I know from my background that a couple of things experimental psychologists have to deal with are rater error and social desirability bias. I guess what I'm accusing PolitiFact of is riding the fence, so to speak. I am presupposing a theoretical mean of .50 truthfulness, and I'll guesstimate a standard deviation of .10 for argument's sake. That would mean that at .54 Obama is just slightly better than average on accurate statements, and at .35 Romney would be more than a standard deviation from normal honesty. Ok, this is all nonsense because I'm making up my own numbers. Anyway, based on your data analysis one could reasonably assume that on a given number of talking points, Obama is more factual than Romney. That being said, the debate data shows both candidates being very close to the mean, my theoretical mean, but closer no matter how you view it. So my question then is, was Romney being more accurate than his statistical history or was there some sampling error in the data? There are ways to test a group of data points for sampling error using a chi-squared test, but again it's been so long. (my underline)
I haven't yet done the analysis using my new scoring and simulation methods, but I'll copy-paste the results from my 1st presidential debate analysis (my first ever Malark-O-Meter-esque statistical analysis) here:
- We can be 95% confident that Barack Obama's credibility was between ~82% of what it normally is and over 2.5 times what it normally is. (So, yes, there is a lot of sampling error)
- The probability is over 80% that Barack Obama's overall credibility is greater than his 1st debate performance. (We can be somewhat confident he was less truthful than usual, but not certain.)
- We can be 95% confident that Mitt Romney's credibility was between ~52% and more than ~2 times what it normally is. (Again, lots of sampling error.)
- The probability that Mitt Romney's credibility was higher than normal is over 60%. (Note: this is different than what I wrote in the article because I accidentally flipped the inference around. LOL. Give me a break! It was the first analysis I did! Anyway, we can't be very certain of it, but maybe Mitt Romney was more truthful than usual.)
I don't do frequentist hypothesis testing (i.e., assign a p-value that says, yup, he's being more truthful!). But I do calculate the probabilities that we would get a certain truthfulness measure or truthfulness comparison measure from a sample of the size we have, given what we have observed (I don't yet make any explicit prior probability assumptions). Basically, I simulate the distribution and either integrate probabilities or calculate percentiles of the distribution.
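To make "simulate the distribution, then integrate probabilities or calculate percentiles" concrete, here is a rough sketch. It is an assumption-laden illustration, not my actual code: the ratings are made up, the 0-to-1 scale is assumed, and the simulation is a plain bootstrap of the rated statements. It compares debate credibility to usual credibility as a ratio, then reads off a 95% interval (percentiles) and the probability that the ratio is above 1 (the share of simulations above 1).

```python
import random
from statistics import mean

# Hypothetical ratings on an assumed 0-to-1 scale (1.0 = true, 0.0 = false).
overall_ratings = [1.0] * 20 + [0.75] * 25 + [0.5] * 25 + [0.25] * 20 + [0.0] * 10
debate_ratings = [1.0, 0.75, 0.75, 0.5, 0.5, 0.5, 0.25, 0.25, 0.0, 0.0]

def simulate_ratios(debate, overall, n_sims=5000, seed=1):
    """Bootstrap the ratio of mean debate rating to mean overall rating."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_sims):
        d = mean(rng.choices(debate, k=len(debate)))
        o = mean(rng.choices(overall, k=len(overall)))
        if o > 0:  # skip the (practically impossible) all-zero resample
            ratios.append(d / o)
    return sorted(ratios)

ratios = simulate_ratios(debate_ratings, overall_ratings)

# Percentiles of the simulated distribution give the interval.
low = ratios[round(0.025 * len(ratios))]
high = ratios[round(0.975 * len(ratios)) - 1]

# The share of simulations above 1 estimates P(debate credibility > usual).
prob_above_usual = sum(r > 1.0 for r in ratios) / len(ratios)

print(f"95% interval for the ratio: {low:.2f} to {high:.2f}")
print(f"probability the ratio exceeds 1: {prob_above_usual:.2f}")
```

Because the debate sample has only ten statements, the interval is wide, which matches the "lots of sampling error" notes in the results above.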
So does this analysis answer your question?
3. I disagree! First of all, the N is so much higher for overall statements. Of course, that means higher reliability statistically. What accounts for this variability in the debates? I agree, more scrutiny. Perhaps Romney chose his answers more carefully than when he is "firing up the crowd" at stump speeches and in advertising. I feel it necessary at this point to say I'm not using your blog to bolster support for either candidate. I am just fascinated by the numbers here.
You know what. You're right. Even though we have a higher sampling fraction for debate statements, it provides no advantage because we still only have a sample of rated statements. It also provides no disadvantage. Since we are sampling with replacement, we don't need to do any finite population correction. The thing that matters, as you point out, is sample size.
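For anyone who hasn't met the finite population correction: when you sample without replacement from a population of size N, the standard error of a mean is multiplied by sqrt((N - n) / (N - 1)), which shrinks it as the sample size n approaches N. A small sketch, with made-up numbers and a hypothetical function name, shows what the correction does when it applies and what the standard error looks like when it is left out:

```python
import math

def standard_error(sd, n, population_size=None):
    """Standard error of a sample mean.

    If `population_size` is given, apply the finite population
    correction sqrt((N - n) / (N - 1)); otherwise leave it out.
    """
    se = sd / math.sqrt(n)
    if population_size is not None:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se

# Made-up numbers: a standard deviation of 0.3 from 25 rated statements.
print(standard_error(0.3, 25))       # no correction
print(standard_error(0.3, 25, 100))  # 25 of 100 statements: smaller
print(standard_error(0.3, 25, 25))   # every statement rated: zero
```

Without the correction, only the sample size n moves the standard error, which is why sample size is the thing that matters here.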
As to your last statement: anything I can do to encourage you to keep it up! It has worked! I don't know why I was surprised by this, but the highest percentage of both candidates' statements was half truths, around 25%. LOL, how political of them. My assumption was more polarizing. I assumed that both were either mostly telling the truth or more full of it than the data shows.
Yes! My research has enlightened someone! This feels good.
I have a crazy hypothesis! Ok, I noticed that 45% of what Romney says is less than half true and exactly 45% of what Obama says is more than half true! Why is this? Well, the obvious answer is Romney is a liar and Obama is a saint, but I am curious if there is a natural skew here. Is it possible that, since Romney is coming from an attacking standpoint and Obama from a defensive one, they are mostly talking about current policy, and that Romney is finding evidence to support his argument that doesn't tell the whole truth but isn't necessarily a lie, and that Obama is finding evidence to support his that isn't necessarily the whole truth? Would the statistics flip-flop if the shoe were on the other foot, and Romney were the incumbent? This is why this blog is awesome, b/c we really don't have any past numbers to base this theory on.
So you're asking if there's any evidence that Romney is less truthful because he is on the attack, and has more at stake. My own analyses can't say much about that (yet), but there's (very weak) evidence refuting your hypothesis. For example, Romney may have been more truthful during the debates, when he had more to prove.
And c'mon man! You can't beat yourself up for the lack of data, and that PolitiFact is the only source. You gotta use what you got. The fact that there is a PolitiFact and a dude crunching the numbers out there... maybe we are getting more scientific in our politics. This is a good thing! We could start looking at all kinds of things. Like, if Romney is out there spitting out facts at a rating of .35 and all of a sudden states facts in the debate at .44, is this why he won the debate in the minds of so many?
Sadly, I think the truth has less to do with why politicians win or lose political debates than we'd like. But yeah, maybe people subconsciously digest some of the fact checking that goes on after the debates. Also, there is some evidence that people can subconsciously detect lies. But I'm just waving my hands at this point. Still, interesting stuff.
Is this why we see him at times in the nominations ("I'll bet you $10,000") as a cartoon character, and now as the next leader of the free world based on the polls? Because we hear things we're not that sure of and then BAM, on public TV we see this guy's not that far out there and Obama is just hanging around .52, well that's "Obama" big deal.
Now there you have something. Because Obama is not really as honest as people think he is, and because there is a lot of uncertainty about his honesty, maybe it is one of the things that left him open. Or maybe Romney just got lucky, as the town hall meeting results suggest!
And people all around me are saying "I'm kinda liking Romney now" or "he won the debate." And I say why? They can't articulate it. You don't just feel something; we are perceiving machines whether or not we can articulate it. Maybe your approach to understanding politics is getting us a little closer. Thanks again, it's really cool and definitely a new idea. Which both sides in Washington could use a few of!
I agree. I hope that Malark-O-Meter catches on.