Universities all around the world teach a course called “research methods”. This mixes philosophy of science and epistemology with hypothesis formulation, significance testing and a study of quality, quantity and bias in statistics. I’ve taught it quite a few times. It’s an important bedrock of academic research. But it needs to catch up with the horrors of the modern digital age.
Internet searching has become a central research methodology for all academics. All modern research rests on an assumption of some accessible network of information “out there”. Sure, academics have better tools at their disposal than the average web user, including private databases for searching out papers. Mainly, though, everyone uses the traditional advanced features of common search engines, such as sorting and filtering.
Recently, however, the quality of search results across all popular engines has fallen, and we need to ask whether university research, as currently taught, can survive.
The first website, produced by Cern, depicts a very different paradigm from today’s “web”, describing the pioneering venture as an “information retrieval initiative aiming to give universal access to a large universe of documents”. However, you would be unlikely to find things easily just by following the links from one URL to another, as you did in Gopher, the predecessor of the modern web. Hence, in the early days of the web, search engines built indexes by “crawling” the network with their “spiders”, looking for content.
That worked reasonably well before artificial “intelligence” went mainstream. But, today, content is generated “on the fly”, as it were, and the spiders can’t catch these flies. They are no sooner there than gone. This is partly because of the mushrooming glut of low-grade content spam spewed out through aggressive search engine optimisation (SEO), which is designed to boost the ranking of a particular website, usually a commercial one, in search engine results.
In a longitudinal investigation of SEO spam in search engines, a recently published study of about 8,000 product queries suggested that all search engines have significant problems with this SEO spam drowning out useful information. Google’s response amounted to, “Well, we’re still doing a little better than the other search engines that are failing”.
The spammers can instruct a large language model, such as ChatGPT, to spin a piece of advertising copy into 100,000 articles, each introducing the product under a different subject lead-in, such as cookery, sports, medicine or pet care. They can then create 100,000 websites to host them that are indistinguishable to search engines from organic content, even though they are poorly written and full of errors. Troll farms can do something similar to promote disinformation.
Traditional URLs are “uniform resource locators”, but they are increasingly replaced by internalised tracking links (which redirect a URL so as to track whether it is shared, and with whom) or by shortened URLs, or are simply hallucinated by the “AI” generators, so the number of dead links in circulation is rising. This means that even if Google were the most benevolent of gatekeepers, it still couldn’t sort the wheat from the chaff.
But it isn’t the most benevolent of gatekeepers: in common with other search engines, its business priorities have long since turned away from users toward advertisers.
Google’s flaws matter enormously because of its dominance. Despite being only one window into the vast public network of URLs, through which any traveller could freely walk, Google has set itself up as the sole gatekeeper (apart from its wannabe, Bing). We coined the verb “to google”, meaning to consult one special URL as a path to all others, even if Google’s mighty spider can’t penetrate the worldwide walls built by its Silicon Valley neighbours, the social media companies.
Once Google established its dominance, it’s fair to say that the organic web, and cultural knowledge of roads through it, died. So much so that even in university research methods classes, students still get told to use Google or Bing as first port of call.
But surely this must end, given the increasing uselessness of the results these engines throw up. For too long, we’ve free-ridden on commercial applications instead of building solid, home-grown information systems that also serve the public interest. We need to revise the idea of search and reconsider what tools are really best for it. We need to start confronting students with the reality of the internet as it actually is, rather than as it was idealised in the 1990s.
Andy Farnell has been a visiting and associate professor in signals, systems and cybersecurity at a range of European universities. With Helen Plews and Ed Nevard, he now co-hosts a podcast that seeks to restore understanding, safer use and control of everyday technology to ordinary people.