The burial site of Viking kings, a pagan centre in medieval times, and the seat of Sweden’s Catholic archdiocese since 1164, Uppsala, north of Stockholm, is today watched over by the Carolina Rediviva, one of the world’s great academic libraries, which sits atop a hill above the city.
Among other things, the library of Uppsala University, the oldest university in the Nordic countries, holds every book published in Sweden as well as a huge number of foreign works seized from all over Europe by Swedish kings, including the Codex Argenteus, a Bible transcribed in the 6th century.
But good luck finding what you need from some of the materials that predate the invention of moveable type. Researchers here, as all over the world, have been stymied in their attempts to digitise handwritten manuscripts.
Optical character recognition, which can convert type into searchable digital files, cannot decipher handwriting; handwritten text recognition, which can read simple things such as figures on bank cheques, is not yet sophisticated enough to reliably transcribe archival texts.
Now From Quill to Bytes, a five-year, Skr13.7 million (£1.08 million) project at Uppsala that is referred to as Q2B, or “Google for handwriting”, is seeking to develop ways of deciphering these works.
“If you take a modern printed text and OCR it, you get it right up to 99 per cent of the time,” said Per Cullhed, strategic development manager for the Uppsala University Library. “But if we use the same applications for a text from the 14th century, half of it could be just nonsense.”
The result, he said, is that “all archival materials, everything that’s handwritten, all the books from medieval times, all of that is not available in a way we’d like to have available”. Discovering how to make them searchable, Mr Cullhed said wistfully, is “the holy grail”.
The team in Uppsala is up against competitors from three Swiss universities, who are working on the Historical Document Analysis, Recognition and Retrieval (HisDoc) research project, as well as a European Union collaboration called tranScriptorium, which has brought together experts from universities in Spain, Austria, Greece and the Netherlands, along with University College London and the University of London Computer Centre.
Typical success rates on these projects have ranged from 8.9 to 33.5 per cent of handwritten text being recognised by optical sensors. A big problem is isolating characters, which are often faded, smudged or too close to the “noise” of margins and illustrations.
“The basis of what they’re working on is even more irregular than printed text: different hands, different manners of writing, different writing tools from different times,” Mr Cullhed said.
He is optimistic that these problems will be solved, he said, but “to have something that really opens this up, it’s quite far away”.
When that solution arrives, he said, people will still be needed to check the work, perhaps through crowdsourcing, which the Smithsonian Institution in the US already employs, using human “digital volunteers” to read and transcribe handwritten documents.
Yet even then, he continued, “you will always need the primary sources”. But if one day researchers of ancient texts can quickly home in on what they seek in a vast library, it will mean the holy grail has been found.
POSTSCRIPT:
Article originally published as: View to a quill: illuminating manuscripts for the 21st century (18 June 2015)




