Digitization of the Archives

Introduction

In the IST discipline, digitization of previously manual processes and artifacts can have a profound effect on the organization ranging from more efficient execution of daily tasks to changes in how and what an organization can deliver to stakeholders in the form of products and services. For educational institutions, and individuals pursuing scholarship in disciplines heavily impacted by the existence of archives, the case for digitization can be summed up as such: Digitization of the archives achieves “preservation and access of history for future generations” so that they may “know of our heroism and struggles, our accomplishments and failings. Above all, they should know about the power of little things in our everyday life” (Archive Digitization, 2012).

Data storage and imaging technologies continue to advance rapidly. Increasing efficiency in servers, coupled with advances in virtualization technology, allows a firm to scale data center resources as needed, within hours or days as opposed to months or years. This contrasts with past data center management practices, in which some servers were consistently over-utilized while others remained woefully underutilized, and shifting utilization on a moment’s notice to ease capacity constraints was risky because it could negatively impact other applications and databases sharing server resources. Now, with cloud computing and on-demand server capacity from services like EC2 by Amazon Web Services (Amazon Web Services, 2012), companies can scale their data centers within hours and avoid the hefty architectural build-outs, electricity demands and cooling resources required to maintain their own data centers.

Additionally, standards and best practices have emerged as non-profits like the National Archive and government organizations like NARA have dedicated themselves to the digital preservation of archival materials. Some of these best practices include:

  • Optimal resolution settings for images and film
  • Calibration of monitors and other digitization devices
  • What metadata should be captured and how to do so automatically
    and manually
  • Digitization environment setup based on ISO standards (Puglia et al.,
    2004).

Digitization presents both opportunities and risks that can challenge longstanding values and practices around the role of physical interaction with archival materials as people go through the process of producing advanced scholarship. Scholars can incorporate more sources into their analysis as the size and availability of digital archives increase. Digitization improves preservation efforts by reducing handling of fragile archives without sacrificing their availability for scholarly consideration, especially if the materials are of suitable worth for preservation as opposed to being destroyed after digitization.

Digital archives can also help in recovery when physical archives are lost due to disintegration from age, or from disasters like the fire that destroyed a portion of the film vault at Universal Studios (Cieply, 2008). Additionally, the decreased cost associated with less physical travel can facilitate richer analysis within a domain as scholars are no longer dissuaded from pursuing a knowledge area for lack of funding and/or time.

As our team, which spanned the IST, religion, history and English disciplines, analyzed the ongoing debate over archival digitization, we sought to understand the transdisciplinary concerns that arise as we move towards a more digital experience in scholarship. Advances in technology not only impact the humanities and social sciences, but they also redefine the role of IST in the discussion of archival digitization and management. In this paper, we approach the topic of digitization from an IST perspective of technology as strategic advantage.

Evolution from Technology to Solutions

The value that IST brings to the table then becomes one of technology strategy for the organization and the departments within the organization tasked with preserving archival materials. The Chief Information Officer (CIO) is a relatively new role in organizational structures. The role emerged in response to the realization that technology is not just about solving a problem within one silo of an organization; rather, technology must be structured to support the larger business objectives (Chun & Mooney, 2009). As organizations increased their reliance on technology to manage growing market complexity, it became clear that perceived failures in technology implementation were actually failures in alignment between business processes, other silos interconnected with those processes, and the disparate solutions implemented within different silos (Beath & Ross, 2010).

Since the questions around how we should digitize and where we should store digitized archives have strong and proven options in the market, IST must look at the impact of technology on people, processes and the organization’s ability to execute its business objectives and compete in the market. On a project basis, these questions would be answered by pulling together a multidisciplinary team representing the appropriate range of stakeholders. IST has experience in the more tactical project management, vendor sourcing and technical development concerns that would be involved in a digital project. Ethnography, quantitative analysis, use cases and other requirements elicitation techniques would be used to identify and document the personas, goals, as-is and to-be processes, information needs and resulting technology requirements for addressing those goals and needs.

But it is the technology decisions at the organizational level that have the greatest impact on organizations as these decisions can have intended and unintended impact on day-to-day and long-term operations, customers, constituents and employees. To further discuss the role of IST as strategy, we will speak about digitization of the archives from the perspective of a university. The complexity and reach of the university system allows us to examine the wide-reaching effect IST strategy has on a broad set of decision points including infrastructure, educational policy and practices, management of library facilities and archival stacks, and university culture.

A Strategic Framework Approach

In the IST discipline, frameworks have become important to understanding how a firm can evolve from being a provider of technology or services to a provider of customer solutions. Many frameworks acknowledge that a ‘big bang’ approach is both costly and risky to a firm. This is why frameworks are typically implemented in phases, sometimes referred to as levels of maturity. Likewise, technology should be delivered incrementally to reduce overall project risk.

One of the concerns that arises is the sheer volume of archival materials yet to be digitized. Depending on available resources, even completing routine manual cataloguing activities can consume large amounts of time, thus obscuring valuable materials from discovery until they can be integrated into the circulation process. From an agile framework perspective, IST would address this issue by defining a process of executing digitization incrementally. For example, Stage 1 could be the capturing of basic metadata such as the author, title, year of publication and an estimation of what period the material belongs to within the literary canon. Basic metadata allows the organization to begin reaping certain benefits immediately including increasing its collection of digitally searchable works, and gathering metrics over time of which materials have enough interest and traffic to warrant moving them to the next stage of digitization.
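The staged approach described above can be illustrated with a minimal sketch. It assumes a Stage 1 record holding only the basic metadata named in the text, a simple view counter as the interest metric, and a promotion threshold; the names ArchiveItem, search, promote_popular and PROMOTION_THRESHOLD, and the threshold value itself, are hypothetical and are not drawn from any particular archival system.

```python
from dataclasses import dataclass

# Hypothetical threshold: number of recorded views before an item is
# queued for full digitization. The value is an assumption for illustration.
PROMOTION_THRESHOLD = 25


@dataclass
class ArchiveItem:
    """Stage 1 record holding only the basic metadata named in the text."""
    author: str
    title: str
    year: int
    period: str      # estimated period within the literary canon
    views: int = 0   # interest metric gathered over time
    stage: int = 1   # 1 = metadata only, 2 = queued for full digitization

    def record_view(self) -> None:
        """Count one lookup of this item through the search portal."""
        self.views += 1


def search(catalog, term):
    """Return items whose author or title contains the term (case-insensitive)."""
    term = term.lower()
    return [
        item for item in catalog
        if term in item.author.lower() or term in item.title.lower()
    ]


def promote_popular(catalog, threshold=PROMOTION_THRESHOLD):
    """Move Stage 1 items with enough recorded interest to Stage 2."""
    promoted = []
    for item in catalog:
        if item.stage == 1 and item.views >= threshold:
            item.stage = 2
            promoted.append(item)
    return promoted
```

In such a design, an organization could run promote_popular on a regular schedule, so that the queue for full digitization reflects observed demand rather than the order in which materials happened to be catalogued.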

Metadata increases discoverability, and discoverability increases opportunities for knowledge creation informally as conversations, and more formally as peer-reviewed scholarship. Simple capture of metadata can alert scholars to available work, prompting them to visit the archives or, at a minimum, increase their awareness of other primary and secondary information sources worthy of consideration. The next question that arises becomes one of capturing knowledge about this work in a way that increases the repository of searchable information. Other collaborative efforts can be built around digitized materials including forums, blogs and sharing on social media. This helps to build a web of knowledge and facilitate knowledge sharing among experts and enthusiasts.

More formally, professors and advisors can now guide students towards these undiscovered works through course assignments, or as part of thesis level research and analysis. In this manner, the culture of the organization becomes incentivized to visit these archival works, drive formal and informal discussion, expand our knowledge of their contents and eventually queue these materials for full digitization in the future.

During the panel discussion in the TNDY 404H Interpretive Inquiry seminar, we learned that digitization democratizes information access by reducing the burden of travel and expense for those who study or follow a particular topic (CGU, 2012). Through online portals, more people are able to access archived materials, and thus contribute to greater diversity of thought as well as an expanded pool of knowledge future scholars can draw from.

Easier access to knowledge lowers the overall cost of knowledge acquisition for institutions and people as the cost to create, store and access digital information decreases over time. This is a trend we have observed in organizations, like Accenture, that were successfully able to democratize information needed to serve customers, while also improving their internal agility around core activities like financial reporting (Jeffery, 2010).

Indeed, digitization has the power to shift an organization away from a culture of heroics toward an evidence-based management culture that bases decision-making on optimized processes and business intelligence (Ross & Quaadgras, 2011). However, this shift can also challenge cultural norms of scholarship that have been built around perceived levels of hard work. The A-players within a discipline may be defined by how onerous their research is, owing to a heavy reliance on undigitized archival materials that require herculean effort and substantial cost to access (Huselid et al., 2005).

As archival materials become easier to acquire, an organization may need to redefine the notion of an A-player and incorporate new factors, including how many additional works are included in analysis because of digital availability; whether the work represents new insights into previously undigitized domains; and how a scholar’s work contributes to other strategic objectives of the organization, especially as they relate to becoming subject matter experts in a discipline where the organization owns large quantities of archival material through acquisition.

Finally, digitization can increase an organization’s value within the overall market. If the organization amasses a large share of digitized materials within a specific discipline, it can capitalize on strong scholarly credentials created by a larger population of knowledge workers who are well versed in that discipline. Richer scholarship in less time and for lower cost can be achieved through access to more digitized archival materials.

A digital organizational culture also facilitates rapid publication in the form of digitally connected classrooms that support blogging and microblogging in real time, as well as the processes and expectations for publishing scholarly works more often through digital peer-reviewed journals (CGU, 2012). This increased public knowledge presence in turn can have a positive effect on marketing to interested donors, and on attracting new scholars through recruiting and admissions. Partnerships and alliances with other firms, like Google, can increase discoverability of the digitized archives, thus increasing portal traffic from external stakeholders (Google, 2012; Puglia et al., 2004). This can help to address issues of authenticity, since people would be more willing to confidently incorporate materials acquired from a trusted source and through a trusted portal (Harrell, 2006).

While the benefits of archival digitization are many, there are also risks. The experience of interacting with a physical material can differ substantially from that of interacting with only its digitized copy. Perceptive judgments associated with smell and touch can be lost, along with the human connections established through the process of accessing materials in the archives. Conversely, this can create opportunities to expand transdisciplinary engagement by involving scholars in the sciences who can provide more quantitative analysis around chemical composition and contamination that aids historians in understanding period events (CGU, 2012).

Concerns over lazy scholarship, codes of conduct around plagiarism, and ethical practices when ascertaining the authenticity of digital materials must also be addressed through policies and pedagogy. The ways in which scholars approach study may need to be altered as well, and this can surface generational gaps where educators need to learn advanced methodologies of digital information access in order to relay these best practices to a hyperconnected population of incoming scholars. Additionally, if more information is now available, it can dramatically impact accepted facts within a discipline as scholars make new connections between disparate works. Such connections are facilitated by greater libraries of digitized archival materials, and by more time in which to study those materials in greater detail.

Digitization and the Future of Scholarship

Future scholars seeking to understand and interpret current events will need to rely heavily on digitized information being created today through digital devices, online record keeping, and social media conversations. Having digitized archival data will facilitate historical analysis of how current happenings have been influenced by, or confirm trends of, past historical writings and events. For example, the 2012 presidential election has been dubbed the first ever social media election (Rudenko, 2012). Those studying politics or history will want to understand how the heavy digital presence in 2012 compares and contrasts with elections of the past.

The social and cultural trends emerging from a highly digital population have implications for how technology could influence political events, social movements and government policies moving forward, in ways that differ from pre-digital eras. Another example of the impact of archival digitization is in the study of photography and filmmaking. As traditional film companies, like Kodak, go bankrupt, and as major movie studios transition from film to full digital production and theatrical distribution, scholars may want to study changes in composition and style over time, or ascertain how older styles impact today’s photographers and why (e.g., Instagram filters mimicking different types of color film and processing from the 1930s or the 1970s and 1980s).

Digital motion capture, along with crowdsourced platforms like YouTube and Vimeo, may provide insights to future researchers about historical events captured on digital ‘film’, a record that would not have been possible 50 years ago due to high costs or lack of ubiquitous availability (e.g., key events of the Arab Spring caught on video and shared globally in real time via social media). In addition, the domestic outsourcing of film production away from Hollywood, which has enjoyed a distinct advantage because of its large concentration of robust filmmaking facilities and talent, could change the way major films are greenlit, financed (e.g., Kickstarter.com), filmed and distributed. Studying archives of physical production versus digital production can enhance interpretive inquiry of the visual arts, and their impact on societal trends including social norms, literature, spirituality and religion.

Much the way scientists struggle with getting their message across to the general public, scholars of all fields may need to contend with the notion of professional relevance. Democratization of information also means that authors without academic credentials can become experts in a field if they can reinterpret archival materials in a way that is highly accessible and increases general interest in previously niche domains. As these non-academic practitioners emerge, existing scholars may need to analyze what actions they can take to still create knowledge at a high level, and simultaneously publish knowledge that the general public can partake of, thus increasing awareness, learning and future interest for their domain in spite of the democratization created by archival digitization.

An example of this democratization in IST is the Information Week publication, which is targeted towards IST practitioners like Fortune 500 CIO executives. The publication’s core audience of business technology executives (Information Week, 2012) may not have the time or desire to read a peer-reviewed journal, but would still be interested in a shortened, perhaps less complex, explanation of relevant IST research. Scholars within IST would need to rework their higher-level writing for a CIO audience that wants knowledge in a form that is quickly digestible and actionable, and that clearly demonstrates how they can put theory into practice in a financially viable way.

Conclusion

With digitization comes the pressure for scholars to publish more quickly, since anyone with access to knowledge can write about a knowledge domain by authoring a blog (e.g., WordPress, Tumblr), self-publishing books (e.g., Lulu) or engaging in community editing (e.g., Wikipedia). Democratization potentially levels the playing field between scholars and the general public, allowing someone lacking ‘academic credentials’ to become an expert if they are fast to publication and can interpret the material in a way that the general public perceives to be accessible as well as enlightening. Additionally, the ease of digital publishing today, and the rise of non-academic entrepreneurs like Bill Gates and Mark Zuckerberg, create an environment where people can become historical figures and revered experts without engaging in peer-reviewed scholarship.

Even with the many risks and competitive challenges, digitization of the archives offers a way forward for organizations. We have outlined some of the ways in which archival digitization can confer strategic benefits to the firm. As technology begins to change the way we capture history as it happens, developing a strong culture of digital excellence is required. This involves understanding the business and people aspects of technology as they impact the organization from a strategic perspective. The actual implementation of a specific technology or digitization technique becomes a smaller consideration when compared to the impact of digital archives on the people, culture and day-to-day operations of the firm.

As industry has learned over the past thirty years, the CIO role has become increasingly valuable in understanding where and how to leverage technology for maximum benefit. Similarly, this paper proposes that digitization of the archives necessitates the same strategic-level considerations, where IST is not just a technology provider but a solutions enabler as we move forward into a more digitally connected world of scholarship.



Article References

Amazon Web Services LLC. (2012). Amazon Elastic Compute Cloud (Amazon EC2). Retrieved from http://aws.amazon.com/ec2/

Archive Digitization. (2012). Our Mission. Retrieved from http://www.archivedigitization.org/our-mission

Chun, M., & Mooney, J. (2009). CIO roles and responsibilities: Twenty-five years of evolution and change. Information & Management, 46, 323–334.

Cieply, M. (2008). Large fire strikes Universal studio lot. The New York Times. Retrieved from http://www.nytimes.com/2008/06/01/us/01cnd-fire.html?_r=0

Claremont Graduate University (CGU). (2012). Digital scholarship: Online libraries, archives, and the future of scholarly publishing. Retrieved from http://www.youtube.com/watch?v=My4aGBF5aE4&feature=youtu.be

Beath, C. M., & Ross, J. W. (2010). PepsiAmericas: Building an information savvy company. Cambridge, MA: MIT Center for Information Systems Research.

Google. (2012). Google Scholar. Retrieved from http://scholar.google.com/intl/en/scholar/about.html

Harrell, J. F. (2006). An investigation of fair information practices on American websites. Claremont Graduate University.

Huselid, M. A., Beatty, R. W., & Becker, B. E. (2005). A Players or A Positions?: The strategic logic of workforce management. Watertown, MA: Harvard Business Review.

Jeffery, M. (2010). Strategic IT transformation at Accenture. Evanston, IL: Kellogg School of Management.

Rudenko, A. (2012). 2012 US Presidential Campaign: The first “Social-Media” presidential election ever. Retrieved from http://popsop.com/59456

Puglia, S., Reed, J., & Rhodes, E. (2004). Technical guidelines for digitizing archival materials for electronic access: Creation of Production Master Files – Raster Images. U.S. National Archives and Records Administration (NARA). Retrieved from http://www.archives.gov/preservation/technical/guidelines.html

Ross, J. W., & Quaadgras, A. (2011). Working Smarter: The next change management challenge. Cambridge, MA: MIT Center for Information Systems Research.