Data has become the currency of the digital age. Everything we do, everything we post online, every website and business we interact with creates more data. It is estimated that 2.5 quintillion bytes of data are generated every single day. A quintillion is 1 followed by 18 zeros. It is massive.
As a result, data analysis and data science are two of the hottest careers, with almost 100,000 data analysts employed in the US alone.
One of the first principles taught to data analysts is not to apply bias to the data. The goal is to let the data tell the story. Along with the results of their analysis, analysts are meant to include information on data and polling methodologies, data that was excluded, any over-sampling that was performed, and important context.
The power of data to create a narrative that can steer public opinion and government policy guarantees that bad actors will misuse data to spread disinformation.
For decades, companies have paid for studies that give them the result they want in order to prove that their product is safe or that it benefits consumers. Combine that with a media all too willing to report on preliminary studies that have yet to go through the scientific process, and you have a situation where data is misused and misrepresented.
You might remember years ago when the news started reporting that chocolate was actually a healthy food. Millions of people were ecstatic to hear that they could eat their favorite treat guilt-free, so they never looked deeper. Chocolate manufacturers had poured millions into sponsored studies to get the results they wanted. Analysis of these reports found plenty of flaws in the methodologies, as well as studies whose results couldn’t be independently reproduced.
Or maybe you are old enough to remember the cable TV days, when every network seemed to simultaneously declare that it was the most watched network. Looking into the details, most of the networks were selectively choosing data, such as being the most watched network on Thursday nights from 6 to 8 pm, and then declaring simply that they were the most watched.
The reason for this misdirection is clout. At a time before streaming, you had to choose which network to watch at a given time and had to miss the shows on all of the other stations. Hearing that a network was the most watched made people more likely to tune in to that network.
Since then, we’ve seen the use of misleading data become far more nefarious. Despite the overwhelming research and data showing that vaccines are safe, a significant portion of the American population no longer trusts vaccines. This trust was eroded by bad actors using false data to mislead the public.
Climate change is another topic that has been tainted by misleading data. Instead of addressing an issue that the entire world is facing together, many Americans believe it is all a scam despite severe weather getting worse, damage becoming more expensive, and more people being displaced.
The same problem is now happening to polls. There are polling organizations that have an agenda and set up a poll to get a desired result. Other organizations may not intend to introduce bias but have flawed methodologies. In the mix are the polling organizations that make a point of striving for truth and accuracy. But how many Americans know which sources can be trusted?
The issue is that some of the media, and most internet pundits, don’t make a point of reporting only on quality polls. And the public rarely takes the time to investigate the polls presented. When the polls don’t match up with results, trust in polls altogether is eroded.
In the lead-up to the 2016 election, polls had Hillary Clinton leading Donald Trump by a meaningful margin, enough to win the election. When Trump supporters were asked about it, they said they didn’t believe the polls. And when election day came around, Trump emerged victorious.
Pollsters will point out that those polls were showing popular sentiment, not the electoral college, and that Hillary Clinton did win the popular vote. And while the margin of victory in the popular vote was narrower than earlier polls suggested, the margin shown in polls also narrowed just before the election. They’ll also say that pollsters took a hard look at their methodologies after the 2016 election and made adjustments to make polls more accurate.
All of that is true, but nevertheless the age of selectively accepting and dismissing poll results had begun.
The moment when everyone agrees that a portion of the information being used is either wrong or misleading, the door is opened for everyone to ignore any information they don’t like.
Harvard Harris polls are an example of intentionally misleading polling. Because the name “Harvard” is attached to the polls, it elevates them and gives them credibility. But Harvard isn’t involved with these polls. This brand was purchased in 2017 by the Stagwell Group, whose founder is Mark Penn. Penn writes articles for Fox News and the New York Post in which he cites his own polling to make his case. But there are many issues with his polls. Some people won’t even look at a poll conducted by Mark Penn because of its bias.
Here is just one example of issues with Harris polls:
The same December survey that found 60% of 18-24 year olds believed Israel was committing “genocide” in Gaza found 70% also believed Israel was “trying to avoid civilian casualties.”
When a poll creates results that directly contradict each other, the polling is not being performed correctly.
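A quick arithmetic check shows why those two figures are hard to reconcile. The sketch below is illustrative only: the function name `min_overlap` is hypothetical, and it assumes both percentages come from the same group of respondents, as the quoted summary suggests. It applies the inclusion-exclusion bound: when two shares of the same group add up to more than 100%, at least the excess must have given both answers.

```python
def min_overlap(share_a: float, share_b: float) -> float:
    """Smallest possible share (in percent) of respondents who gave both answers.

    Assumes both shares are measured on the same set of respondents.
    """
    return max(0.0, share_a + share_b - 100.0)


# Figures quoted from the survey summary above.
genocide_share = 60.0
avoid_casualties_share = 70.0

both_at_least = min_overlap(genocide_share, avoid_casualties_share)
print(f"At least {both_at_least:.0f}% gave both answers")
```

Under that assumption, at least 30% of the group agreed with both statements. The bound says nothing about why, only that the overlap cannot be smaller.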
That inconsistency shows another problem with polls. The way that questions are written can be leading. There may also be limited options for answers which forces a desired result. Polls made specifically for political groups tend to have these types of questions and answers so that they can produce a poll that shows support or opposition for a particular issue. They can manufacture data to give credibility to their position.
Even when the questions aren’t leading and the answers are set up correctly you can still get incorrect information due to a misinformed public.
A recent example of this is a recurring Gallup poll, administered in November 2023, which asked respondents whether there was more crime in the US than there had been a year earlier.
In that poll, 77% said that yes, there was more crime. But they are wrong.
Crime is down nationwide by a significant amount. Homicides dropped 13% and property crime was down over 6%. Overall, violent crime was close to a 50-year low. And yet polling showed that a large majority believed the exact opposite.
Even when polls are done correctly, the data may not be accurate.
This might not seem significant, because most polls ask how people feel about an issue or who they personally will vote for. Those polls collect opinions, not facts. Here’s the problem: politicians can’t necessarily make decisions based on opinion polls, because the public might not be properly informed on the issue. If people were better informed, their opinions might change.
For example, if the people being polled by Gallup had been asked whether police budgets should be significantly increased, there is a good chance they would have said yes. After all, they believe crime is out of control and getting worse. However, if those same people were shown the data about how much crime had dropped and how low crime levels are historically, they might want those extra funds to go towards fixing other problems rather than to police budgets.
This creates a difficult situation. Which polls do we trust? What results do we use to influence policy? And when is it a case of needing to better inform the public or needing better polling methodologies?
There isn’t a magic solution to this problem. The best we can do is to educate ourselves and help educate others. Most polls that are released link to their sample size, survey demographics, methodologies, questions, and possible answers. This means you can review how the poll was conducted, how representative it is, and how accurate it should be. If that seems too time-consuming to do on a regular basis, you can find various sources of poll rankings online.
For example, FiveThirtyEight, a popular political polling website, has a pollster ranking page that looks at methodology and past accuracy. If you’re searching purely for election outcome polls, this is a good reference for which political polls are more likely to be accurate and which political party their results tend to skew towards.
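If you do review a poll’s published sample size yourself, a rough sense of “how accurate it should be” comes from the textbook margin of error for a proportion. The sketch below is a simplification, not any pollster’s actual method: it assumes a simple random sample, a 95% confidence level (z = 1.96), and the worst case of a 50/50 split. Real polls use weighting and design adjustments that usually widen the margin. The function name and the sample sizes are hypothetical.

```python
import math


def margin_of_error(sample_size: int, proportion: float = 0.5, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion under simple random sampling.

    Returns a fraction (0.03 means plus or minus 3 percentage points).
    Ignores weighting and design effects, which real polls must account for.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Hypothetical sample sizes, chosen only to show the trend.
for n in (500, 1000, 2000):
    print(f"n={n}: about +/- {margin_of_error(n) * 100:.1f} points")
```

A sample of 1,000 gives roughly plus or minus 3 points under these assumptions. Because the margin shrinks with the square root of the sample size, quadrupling the sample only halves it, so a large sample alone does not make a poll trustworthy.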
The best thing we can do to get opinion polling back on track is to combat disinformation. If we look back at the crime poll, the result being so far off from reality is directly due to Republicans discussing day in and day out how crime-ridden the country is, and specifically saying that crime is getting worse when it is actually getting better. Democrats are doing a poor job of correcting the disinformation, and so those lies become reality to the public.
Unless we hold our politicians to a higher standard of honesty and until we directly and repeatedly call out the lies and intentionally misleading polls, there will be an ongoing lack of trust in all information.
Without that trust, it becomes increasingly difficult to convince people of what is really happening in our country and what issues need our attention.
This level of mistrust grinds progress to a halt, which may be a factor in why the current 118th House of Representatives has been the least productive House since the Great Depression.