Digitizing real-world documents

One of the new duties I have taken on with this summer internship is assisting with the digitized version of the Newfoundland Collection, mostly by making that material available through my position as CAP Intern. These are historical documents that have to be scanned into a digital format before they can be made available online. But that got me thinking about electronic archiving in general.

As just about every university student knows, libraries have access to massive online databases of materials suitable for research. Memorial University of Newfoundland spends thousands upon thousands a year to have access to some of the best of these databases. Last I checked, there were a few hundred accessible through the university's computers. The same applies to the public library I am working at: it uses one of the larger electronic resource databases, but it only has the one. Now keep in mind there is a vast difference between the QEII at Memorial and the Newfoundland and Labrador Public Libraries Board. One is an academic institution serving a research-based academic community, and the other is a publicly accessible organization. They serve different communities with different aims and needs. But I digress.


In searching the various systems, I find that a lot of the resources are made available in either HTML or PDF format. Now Adobe's PDF format is an excellent way to digitize an article or document so that the end user cannot easily alter it without recreating it whole. This helps protect the integrity of the document and can save on transcription errors. To do this, many magazines and newspapers scan the printed copies of their issues and make them available that way. But I find this a less than acceptable version of the document. Since the original is likely created in digital form using programs such as Microsoft Word or Corel WordPerfect, it should be simple to create a PDF or HTML file directly. But since these publishers want to keep the layout of their paper or magazine intact, they choose to scan it post-printing.

But could the final digital version that is sent to the printers be made available for conversion to PDF, or another format, before printing? This would remove some of the issues that come with scanning a paper version of the document. With a scanned version of a paper copy you lose resolution and sharpness in lettering and images. You also scan the grain of the paper in the process, not to mention other scanning issues. A PDF version created from the original source would retain sharper images and text, and would avoid the dreaded paper “graying” effect seen when scanning newsprint or glossy magazine paper.
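To put rough numbers on why a scan and a source-generated file are such different things, here is a back-of-the-envelope sketch in Python. The page size, scan resolution, and character count are my own assumptions for illustration, and it deliberately ignores compression, fonts, and embedded images, so treat the result as an order-of-magnitude comparison only.

```python
def scanned_page_bytes(width_in, height_in, dpi, bits_per_pixel=8):
    """Uncompressed size of a raster scan of one page, in bytes."""
    width_px = round(width_in * dpi)    # pixels across the page
    height_px = round(height_in * dpi)  # pixels down the page
    return width_px * height_px * bits_per_pixel // 8


def text_page_bytes(characters, bytes_per_char=1):
    """Size of the same page stored as plain character data, in bytes."""
    return characters * bytes_per_char


# Assumed values: a letter-size page scanned in 8-bit grayscale at 300 dpi,
# carrying roughly 3,000 characters of text.
scan = scanned_page_bytes(8.5, 11, 300)
text = text_page_bytes(3000)

print(f"scanned page (raw): {scan:,} bytes")
print(f"text from source:   {text:,} bytes")
print(f"ratio: about {scan // text}x")
```

Under these assumptions the raw scan comes to about 8.4 million bytes of pixels, all fixed at the resolution of the scanner, while the text itself is a few thousand bytes that can be redrawn sharply at any zoom level. Real files will differ a great deal once compression and fonts are involved, but the gap shows why the scan is the poorer starting point.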

Since the world is moving more and more to online resources, it makes sense to provide higher quality versions of documents through these databases and other resource sites. Digitized versions created this way would be easier for readers to work with, and easier to convert from PDF to any future standards that may arise.

Later I will look at some of the issues of long-term archiving of digital documents.

About Christopher 119 Articles
I run this place.