By: John Mancini on November 6th, 2017
Revisiting Digital Preservation – a Tale of Two Undergraduate Majors
In the true confessions category, I will admit to not having a formal IT background. Yes, I know. Shocking. As a further step down the true confessions path, I will admit that I was an economics and history double-major as an undergraduate.
For those who have read my materials for a while, the economics major will likely come as no real surprise, and perhaps explains why I am inordinately preoccupied with the business side of technology – how it is procured, how it is implemented, what drives businesses to adopt it, and the benefits (and challenges!) they experience.
As for the history major, Wikipedia says “Historians are concerned with the continuous, methodical narrative and research of past events as relating to the human race; as well as the study of all history in time.” Haha. No hubris there at all.
The history major perhaps explains why I am so intrigued by the question of digital preservation.
There are so many aspects of digital preservation that are important to me as a history major – things like “digital memory” and “preserving our cultural heritage” and libraries and formal archives.
All of which I love. All of which we at AIIM and I personally have tried to advocate for. And will continue to do so.
So let’s start with definitions. I would propose this for digital preservation – “the capabilities to ensure the readability and usability of digital information that must be retained for longer than 10 years.” And while I’m at it, let’s all agree that preservation is more than just a retention period and it’s more than a backup tape – the points about readability and usability are critical.
But moving forward, I’d like us to also adopt the mindset of an economics major with regard to digital preservation. Because truth be told, in the rush to digitize, we’ve left a lot of business risk on the table – and also failed to capitalize on the potential value that resides in all of that dark data hidden away in traditional archiving and backup systems.
It wasn’t always this way. In the pre-digital era at AIIM, microfilm did a pretty good job of taking care of both the short-term information management needs of many organizations (better, faster, cheaper, easier to manage than paper) and their long-term information preservation needs (eye-readable for more than 500 years).
But the two objectives – management and preservation – got separated along the digital journey. And in the process, long-term preservation of digital assets took a back seat to their short-term utilization and optimization in business processes.
The net-net of this is that we’ve tended to adopt an image of digital preservation as something frozen in time and inaccessible and hidden away in the less-traveled parts of the organization.
I’ve been struck by recent conversations that view digital preservation as something different, and frankly, more alive. A set of capabilities that can be used not only to reduce business risk, but also mined to create new value. And – shock of shocks – not located in the basement, but in the cloud.
So let’s look at digital preservation for a moment through the hard-edged lens of a business economist.
Let’s take snapshots of three places where the capability to ensure the readability and usability of digital information that must be retained for longer than 10 years is a business requirement, and where a proactive approach to digital preservation ought to be a strategic priority. We can then use these as a jumping-off point to talk in subsequent Tip Sheets about the business case for digital preservation in particular vertical industries and horizontal applications.
Example 1 – Financial Services Industry
Global financial services companies must meet increasingly stringent privacy rules and regulations from FINRA, the FCA, the PRA and the EU, including MiFID II and GDPR. The retention requirements for data and content to meet these regulations can range from 10 years to 75 years to permanent. The complexity and volume of long-term digital information assets will rise as digital transformation, application retirement and paper-to-digital initiatives gather momentum.
The long-term digital information that global financial services organizations need or want to keep for longer than 10 years can run into millions of files in hundreds of different formats. Core business information that meets the “10 year+” test includes records of corporate governance, contracts and agreements, and audit and financial reports. The longevity of customer accounts is key for most companies. Master policies in the insurance industry need to be kept for 25 years. Pension and beneficiary files need to be kept for 75 years.
Example 2 – Pharmaceutical Industry
The pharmaceutical sector is a high-risk and strictly regulated industry. Secure, immediate access to trustworthy long-term digital information is fundamental to successful product discovery, development, manufacturing and marketing.
In particular, R&D lifecycles can run into decades, and regulations require digital information to be retained for as long as the product is on the market plus decades after. Good Practice (GxP) guidelines and FDA 21 CFR Part 11 require documented access controls, system validations, and audit trails for software and systems that process and maintain electronic data.
The science-based innovation that drives pharmaceutical organizations results in tens of millions of files in hundreds of different, often proprietary formats. This includes laboratory and experimental work, research models, toxicology studies, regulatory submissions and correspondence, clinical trials, predecessor records from acquisitions and mergers, patents and trademarks, clinical safety management, quality control, and marketing assets.
In addition, the widespread use of outsourcing to third parties, such as Contract Research Organizations (CROs), means that vital information is often spread across various networks and jurisdictions – putting it at risk.
Example 3 – HR Administration
Every core business function has a set of information management policies and rules, many of which carry retention and access requirements in excess of 10 years.
According to the HR Specialist, the following common HR documents carry long-term retention requirements. And note that this is just with HR. Multiply these kinds of requirements across multiple functions, applications, and geographies, and you can get a sense of how quickly long-term digital preservation and access requirements can expand. It is also important to keep in mind that retention requirements are only part of the preservation equation, which also includes readability and usability.
| Type of Information | Retention Requirement |
| --- | --- |
| Earnings records | Permanent |
| Employee service records | Permanent |
| Injury frequency charts | 10 years |
| Medical folders, employee | 30 years after termination |
| Paychecks | Permanent |
| Payroll records, after termination | Permanent |
| Pension plan | Permanent |
| Pension plan, applications | Permanent |
| Pension plan, claims | Permanent |
| Pension plan, correspondence | Permanent |
| Salary and rate changes | 10 years |
| Salespeople performance records | Permanent |
| Training manual | Permanent |
| Union (collective bargaining) agreements | Permanent |
| Workers' compensation reports | 11 years |
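For readers who want to see what a schedule like this looks like once it is put to work, here is a minimal Python sketch. It is an illustration only, not a description of any particular product: the names `RETENTION_YEARS` and `earliest_disposal_date` are hypothetical, only a few rows of the table are included, and the choice to round a February 29 start date forward to March 1 is an assumption made so that nothing is released early.

```python
from datetime import date
from typing import Optional

# Retention periods in years for a few rows of the table above.
# None stands for "Permanent". (Hypothetical structure, for illustration.)
RETENTION_YEARS = {
    "Earnings records": None,
    "Injury frequency charts": 10,
    "Medical folders, employee": 30,  # counted from the termination date
    "Salary and rate changes": 10,
    "Workers' compensation reports": 11,
}


def earliest_disposal_date(record_type: str, start: date) -> Optional[date]:
    """Return the first date a record could be disposed of, or None if permanent.

    `start` is the date the retention clock begins (for medical folders,
    the employee's termination date).
    """
    years = RETENTION_YEARS[record_type]
    if years is None:
        return None  # permanent records never reach a disposal date
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # start was Feb 29 and the target year is not a leap year;
        # round forward (assumption) so the record is never released early.
        return date(start.year + years, 3, 1)


if __name__ == "__main__":
    print(earliest_disposal_date("Injury frequency charts", date(2017, 11, 6)))
    print(earliest_disposal_date("Earnings records", date(2017, 11, 6)))
```

Note that a date is the easy part. Knowing when a file may be disposed of says nothing about whether it will still be readable and usable on that day.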
These are just three examples of where the integrity, usability and findability of digital information over long periods of time are mission-critical. An ECM system (or several of them) is not enough to ensure long-term digital preservation. Failure to have capabilities to meet these long-term preservation requirements – in essence, kicking the problem down the road – is a recipe for significant e-discovery and legal risks as well as unbudgeted “big bang” storage migration costs.
The cost-effective way to address these requirements – the way an economist would look at the problem – is by taking advantage of the economics of the cloud and by shifting the migration responsibility and expertise to a third party in order to:
- Eliminate hidden and accumulated risk: digital information stored for the last 10+ years might already be compromised or held in obsolete, unreadable formats;
- Remove the susceptibility of information on shared network drives and SharePoint to loss, deletion and tampering (malicious or accidental); and
- Accelerate retirement of outdated applications – by providing a safe home for the critical long-term information they contain.
About John Mancini
John Mancini is the President of Content Results, LLC and the Past President of AIIM. He is a well-known author, speaker, and advisor on information management, digital transformation and intelligent automation. John is a frequent keynote speaker and author of more than 30 eBooks on a variety of topics. He can be found on Twitter, LinkedIn and Facebook as jmancini77.
Recent keynote topics include:
- The Stairway to Digital Transformation
- Navigating Disruptive Waters – 4 Things You Need to Know to Build Your Digital Transformation Strategy
- Getting Ahead of the Digital Transformation Curve
- Viewing Information Management Through a New Lens
- Digital Disruption: 6 Strategies to Avoid Being “Blockbustered”
Specialties:
- Keynote speaker and writer on AI, RPA, intelligent Information Management, Intelligent Automation and Digital Transformation.
- Consensus-building with Boards to create strategic focus, action, and accountability.
- Extensive public speaking and public relations work.
- Conversant and experienced in major technology issues and trends.
- Expert on inbound and content marketing, particularly in an association environment and on the Hubspot platform.
John is a Phi Beta Kappa graduate of the College of William and Mary, and holds an M.A. in Public Policy from the Woodrow Wilson School at Princeton University.