Over the years I’ve written on this blog about the quality of search. A lot.
Filter bubbles were my biggest concern back in 2011. In 2012 it was social search results diluting real search results. In 2016 it was clickbait clogging up our results. Last year it was AI-written articles offering health advice that could actually kill someone. Those are just a few examples, but if you search on our site for the term “google” or “search engine” you’ll see that the quality of search results has kept me up at night on more than a few occasions over the years.
This week I re-watched Eli Pariser’s sharply awakening 2011 TED Talk about filter bubbles and thought, “Wow. How quaint that filter bubbles were my biggest professional worry back then.”
Especially because there were workarounds then. You could go into Verbatim mode or incognito mode and pop the filter bubble fairly well. You could use multiple search engines and get a wider variety of links to track down information. You could use a VPN to get around virtual geographical walls.
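In fact, the Verbatim trick was simple enough to script. Here’s a minimal Python sketch that builds a Verbatim-mode query URL; it assumes Google’s tbs=li:1 parameter (the one that toggled Verbatim mode back then) still behaves the same way:

```python
# Minimal sketch, assuming Google's tbs=li:1 parameter still toggles
# Verbatim mode the way it did when this workaround was current.
from urllib.parse import urlencode

def verbatim_search_url(query: str) -> str:
    """Build a Google search URL with Verbatim mode switched on."""
    params = {"q": query, "tbs": "li:1"}  # li:1 = "literal" (Verbatim) results
    return "https://www.google.com/search?" + urlencode(params)

print(verbatim_search_url('"filter bubble" pariser'))
# https://www.google.com/search?q=%22filter+bubble%22+pariser&tbs=li%3A1
```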
I’ve got two big concerns now. One is the quality of the search engines themselves, and the other is the quality of the information they’re caching.
Search engine quality
You may have heard about a big study that came out this week on the quality of search engines. Titled “Is Google Getting Worse? A Longitudinal Investigation of SEO Spam in Search Engines,” the academic research was conducted by Janek Bevendorff, Matti Wiegmann, Martin Potthast, and Benno Stein of the University of Leipzig, the Bauhaus-Universität Weimar, and ScaDS.AI in Germany. They studied search engine results pages (SERPs) from the major search engines: Google (by proxy of Startpage), Bing, and DuckDuckGo.
Two caveats before I go further in recommending that you read the report (and if you’re a professional searcher, I really do think you should):
- It’s academic research, so I can’t honestly say that it’s beach reading, and
- They studied web pages geared toward product reviews, so it’s not exactly our use case.
Still. What they studied is enormously relevant.
The tl;dr version comes down to two main takeaways:
- Search engines are concentrating on websites that are optimized for SEO (which makes sense; SEO is “search engine optimization,” after all). What that means in practice is that the most successful sites are stuffing keywords into alt text and other generally unseen areas specifically geared toward getting search engines to notice them (there’s a rough sketch of what that kind of check might look like at the end of this section). As the authors say:
“higher-ranked pages are on average more optimized, more monetized with affiliate marketing, and they show signs of lower text quality.”
- Search engines are trying to fight against caching low-quality pages, and they have brief periods of doing better before the spammers figure out the filtering system and beat the algorithm again. The search engines wage battle after battle, but overall they’re losing the war.
The authors write:
“search engines measurably target SEO and affiliate spam with their ranker updates. Google’s updates in particular are having a noticeable, yet mostly short-lived, effect. In fact, the Google results seem to have improved to some extent since the start of our experiment in terms of the amount of affiliate spam.”
Even so, the researchers say that they “see an overall downwards trend in text quality.” The report concludes with an unsettling thought:
If multi-billion-dollar companies whose specialty it is to stay ahead of spammers can’t arrest the supervolcano of sludge under the surface, what hope do the rest of us have to stay on top of it?
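To make that first takeaway a little more concrete, here’s a rough Python sketch of the kind of signal at issue: a crude check that flags image alt text where most of the words are sales-y keywords. The keyword list and the density threshold are my own illustrative assumptions, not anything from the paper, which uses far more careful measures:

```python
# Illustrative sketch only: a crude keyword-stuffing check for alt text.
# The keyword list and the 0.5 density threshold are made-up assumptions
# for demonstration, not measures from the paper.
from html.parser import HTMLParser

TARGET_KEYWORDS = {"best", "review", "cheap", "buy", "deal", "top"}

class AltTextCollector(HTMLParser):
    """Collect the alt attribute of every <img> tag on a page."""
    def __init__(self):
        super().__init__()
        self.alts = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.alts.append(alt)

def looks_stuffed(alt: str) -> bool:
    """Flag alt text where most of the words are sales-y target keywords."""
    words = alt.lower().split()
    if not words:
        return False
    density = sum(w in TARGET_KEYWORDS for w in words) / len(words)
    return density > 0.5

page = '<img alt="best cheap deal buy top review best blender"><img alt="a blender on a counter">'
collector = AltTextCollector()
collector.feed(page)
for alt in collector.alts:
    print(f"{alt!r} -> stuffed: {looks_stuffed(alt)}")
```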
Information quality
You may have read about the debacle at Sports Illustrated this past November when it was discovered that they had hired an AI company to write articles for their website. The articles had bylines, but the authors weren’t real (even though they had bios!).
Men’s Journal, owned by the same parent company as SI, had an even worse moment back in February of last year, when it published an AI-written article about testosterone use that had doctors immediately jumping up to yell about its inaccuracies and potentially dangerous recommendations.
Ever since I started in this profession, I’ve been concerned about the quality of the sources that we use to provide information to our end-users who need good, solid data they can rely on. It tended to be fairly easy 10 years ago, with only a few extra minutes needed to fact-check or find a confirming source.
But now AI is being used to create articles for sources we’ve trusted for years, like Forbes, the Washington Post, Bloomberg, Reuters, and many, many more. AI is famously prone to hallucinating reference material, and people are using AI-generated references in court cases, dissertations, and more, only to discover that they’re basing their futures (and our health) on stuff a machine dreamed up.
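One cheap defense against hallucinated references, at least for scholarly citations: check that the DOI actually resolves before you rely on it. Here’s a minimal sketch using Crossref’s public REST API; the DOI in the example is just a placeholder:

```python
# Minimal sketch: verify that a cited DOI actually exists via Crossref.
# A hallucinated reference will usually come back 404 here. Network
# errors and non-Crossref DOIs are left unhandled to keep this short.
import urllib.error
import urllib.request

def doi_exists(doi: str) -> bool:
    """Return True if Crossref knows about this DOI."""
    url = f"https://api.crossref.org/works/{doi}"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False

# Placeholder DOI for illustration; substitute the one you're checking.
print(doi_exists("10.1000/example.doi"))
```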
So what can we do?
As much as it’s going to add to the time it takes us to do our work, Brandolini’s Law (the observation that refuting bullshit takes an order of magnitude more energy than producing it) is going to be our companion from now on.
- Check the bylines to see whether a real journalist is behind that bio. If you’re getting vibes that an article you’re reading sounds a little too plastic-y, double down on your bs-ometer.
- Be an informed end-user. If search is your profession, spend some time reading up to find out what the limitations and opportunities are for the tools you’re using, and spend what money you can on reliable, fee-based sources.
- Check out alternative search options – for example, have you tried Perplexity? If you have a New York Times account, check out this article about it.
As always, communication is key
I say this a lot, but now more than ever it’s important to communicate any changes in your workflow to your end users. If maintaining the quality of the information you provide is taking you longer, say so, so that end-user expectations are managed and there’s less frustration all around.
Be safe out there.
FOR FURTHER READING
“Is Google Getting Worse? A Longitudinal Investigation of SEO Spam in Search Engines” by Janek Bevendorff, Matti Wiegmann, Martin Potthast, and Benno Stein. January 2024.
“Researchers confirm what we already knew: Google results really are getting worse,” by Brandon Vigliarolo, The Register, January 17, 2024.