Thoughts on Digitization

This week’s readings focused on the digitization of historical documents and the various considerations that surround this modern approach to preserving the past. While large-scale projects may find outsourcing more cost-effective than scanning materials in-house, OCR accuracy rates are a factor that cannot be overlooked. Across the readings, it is evident that no two projects are the same, yet every successful project must be well planned, organized, and accurate.

The readings interested me because I had a hand in some small-scale digitization projects as an undergraduate student worker in my college’s archives. In one project I did some of the scanning of glass plate negatives, and in another I helped organize back issues of the school’s student-run newspaper as they were being prepared to be microfilmed and digitized at an outside facility. At that time, I had no real idea about the OCR accuracy rate of the newspapers (or whether they would even be using OCR), although I assumed that the digitized versions would be easier to search than the physical collections in the archives. Clearly, a reading like Cohen and Rosenzweig’s Becoming Digital chapter would have helped me understand the overall lay of the land of digitization at the time, but it was written two years too late for that project.

I find Ian Milligan’s Illusionary Order article interesting in that the author stresses the importance of transparency in research. His finding that two digitized Canadian newspapers are showing up more frequently in dissertation citations is definitely worth some consideration. On one hand, it’s great that these collections are now searchable online. On the other hand, researchers appear to be growing dependent on the newspapers that have been digitized and leaving behind print-only materials, which may alter the nature or direction of their research. Milligan points out that dissertations have been leaning towards more Toronto-based research due to the scope of the newspapers available in Pages of the Past and Canada’s Heritage since 1844. The fact that dissertations more often cite just the newspapers, and not the online databases that the authors used to access the articles, is something we briefly talked about in class last week, and I’ve spent a lot of time thinking about it since. While citing the original newspaper is technically correct, I agree that for transparency’s sake it’s important to give credit to the databases that house these sources. I know that I have failed to do this in my own research, but I will now make a more conscientious effort to do so.
