The AIIM Blog - Overcoming Information Chaos

Revisiting Digital Preservation – a Tale of Two Undergraduate Majors

Written by John Mancini | Nov 6, 2017 3:30:00 PM

In the true confessions category, I will admit to not having a formal IT background. Yes, I know. Shocking. As a further step down the true confessions path, I will admit that I was an economics and history double-major as an undergraduate.

For those who have read my materials for a while, the economics major will likely come as no real surprise, and perhaps explains why I am inordinately preoccupied with the business side of technology – how it is procured, how it is implemented, what drives business to adopt it, and the benefits (and challenges!) they experience.

As for the history major, Wikipedia says “Historians are concerned with the continuous, methodical narrative and research of past events as relating to the human race; as well as the study of all history in time.” Haha. No hubris there at all.

The history major perhaps explains why I am so intrigued by the question of digital preservation.

There are so many aspects of digital preservation that are important to me as a history major – things like “digital memory” and “preserving our cultural heritage” and libraries and formal archives.

All of which I love. All of which we at AIIM and I personally have tried to advocate for. And will continue to do so.

So let’s start with definitions. I would propose this for digital preservation – “the capabilities to ensure the readability and usability of digital information that must be retained for longer than 10 years.” And while I’m at it, let’s all agree that preservation is more than just a retention period and it’s more than a backup tape – the points about readability and usability are critical.

But moving forward, I’d like us to also adopt the mindset of an economics major with regard to digital preservation. Because truth be told, in the rush to digitize, we’ve left a lot of business risk on the table – and also failed to capitalize on the potential value that resides in all of that dark data hidden away in traditional archiving and backup systems.

It wasn’t always this way. In the pre-digital era at AIIM, microfilm did a pretty good job of taking care of both the short-term information management needs of many organizations (better, faster, cheaper, easier to manage than paper) and their long-term information preservation needs (eye-readable for more than 500 years).

But the two objectives – management and preservation – got separated along the digital journey. And in the process, long-term preservation of digital assets took a back seat to their short-term utilization and optimization in business processes.

The net-net of this is that we’ve tended to adopt an image of digital preservation as something frozen in time and inaccessible and hidden away in the less-traveled parts of the organization.

I’ve been struck by recent conversations that view digital preservation as something different, and frankly, more alive. A set of capabilities that can be used not only to reduce business risk, but also mined to create new value. And – shock of shocks – not located in the basement, but in the cloud.

So let’s look at digital preservation for a moment through the hard-edged lens of a business economist.

Let’s take snapshots of three places where the ability to ensure the readability and usability of digital information that must be retained for longer than 10 years is a business requirement, and look at why a proactive approach to digital preservation ought to be a strategic priority. We can then use this as a jumping-off point to talk in subsequent Tip Sheets about the business case for digital preservation in particular vertical industries and horizontal applications.

Example 1 – Financial Services Industry

Global financial services companies must meet increasingly stringent privacy rules as well as FINRA, FCA, PRA and EU regulations, including MiFID II and GDPR. The retention requirements for data and content under these regulations can range from 10 years to 75 years to permanent. The complexity and volume of long-term digital information assets will rise as digital transformation, application retirement and paper-to-digital initiatives gather momentum.

The long-term digital information that global financial services organizations need or want to keep for longer than 10 years can run into millions of files in hundreds of different formats. Core business information that meets the “10 year+” test includes records of corporate governance, contracts and agreements, and audit and financial reports. The longevity of customer accounts is key for most companies. Master policies in the insurance industry need to be kept for 25 years. Pension and beneficiary files need to be kept for 75 years.

Example 2 – Pharmaceutical Industry

The pharmaceutical sector is a high-risk and strictly regulated industry. Secure immediate access to trustworthy long-term digital information is fundamental to successful product discovery, development, manufacturing and marketing.

In particular, R&D lifecycles can run into decades, and regulations require digital information to be retained for as long as the product is on the market plus decades after. Good Practice (GxP) guidelines and FDA 21 CFR Part 11 require documented access controls, system validations, and audit trails for software and systems that process and maintain electronic data.

The science-based innovation that drives pharmaceutical organizations results in tens of millions of files in hundreds of different, often proprietary formats. This includes laboratory and experimental work, research models, toxicology studies, regulatory submissions and correspondence, clinical trials, predecessor records from acquisitions and mergers, patents and trademarks, clinical safety management, quality control, and marketing assets.

In addition, the widespread use of outsourcing to third parties, such as Contract Research Organizations (CROs), means that vital information is often spread across various networks and jurisdictions – putting it at risk.

Example 3 – HR Administration

Every core business function has a set of information management policies and rules, many of which carry retention and access requirements in excess of 10 years.

According to the HR Specialist, the following common HR documents carry long-term retention requirements. And note that this is just with HR. Multiply these kinds of requirements across multiple functions, applications, and geographies, and you can get a sense of how quickly long-term digital preservation and access requirements can expand. It is also important to keep in mind that retention requirements are only part of the preservation equation, which also includes readability and usability.

| Type of Information | Retention Requirement |
| --- | --- |
| Earnings records | Permanent |
| Employee service records | Permanent |
| Injury frequency charts | 10 years |
| Medical folders, employee | 30 years after termination |
| Paychecks | Permanent |
| Payroll records, after termination | Permanent |
| Pension plan | Permanent |
| Pension plan, applications | Permanent |
| Pension plan, claims | Permanent |
| Pension plan, correspondence | Permanent |
| Salary and rate changes | 10 years |
| Salespeople performance records | Permanent |
| Training manual | Permanent |
| Union (collective bargaining) agreements | Permanent |
| Workers' compensation reports | 11 years |
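
For readers who want to apply the “longer than 10 years” test to their own retention schedule, here is a minimal Python sketch that does so for the HR table above. The dictionary encoding (with `None` standing for “Permanent”) and the function names are assumptions made for illustration, not part of any AIIM tool or standard. The sketch also ignores trigger events such as “after termination” and looks only at the length of the period.

```python
from typing import Optional

# Retention periods in years, copied from the HR table above.
# None stands for "Permanent" (an encoding chosen for this sketch).
# Trigger events such as "after termination" are deliberately ignored.
HR_RETENTION_YEARS: dict[str, Optional[int]] = {
    "Earnings records": None,
    "Employee service records": None,
    "Injury frequency charts": 10,
    "Medical folders, employee": 30,
    "Paychecks": None,
    "Payroll records, after termination": None,
    "Pension plan": None,
    "Pension plan, applications": None,
    "Pension plan, claims": None,
    "Pension plan, correspondence": None,
    "Salary and rate changes": 10,
    "Salespeople performance records": None,
    "Training manual": None,
    "Union (collective bargaining) agreements": None,
    "Workers' compensation reports": 11,
}


def exceeds_threshold(years: Optional[int], threshold_years: int = 10) -> bool:
    """Return True for permanent records or periods strictly longer than the threshold."""
    return years is None or years > threshold_years


def long_term_records(
    schedule: dict[str, Optional[int]], threshold_years: int = 10
) -> list[str]:
    """List the record types that meet the 'longer than N years' test, sorted by name."""
    return sorted(
        name
        for name, years in schedule.items()
        if exceeds_threshold(years, threshold_years)
    )


if __name__ == "__main__":
    selected = long_term_records(HR_RETENTION_YEARS)
    print(f"{len(selected)} of {len(HR_RETENTION_YEARS)} record types need long-term preservation")
    for name in selected:
        print(" -", name)
```

Read strictly, the two rows with a retention period of exactly 10 years fall just outside the definition. Whether to treat them as in scope is a policy choice, which is why the threshold is a parameter rather than a fixed value.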

These are just three examples of where the integrity, usability and findability of digital information over long periods of time are mission critical. An ECM system (or several) is not enough to ensure long-term digital preservation. Failure to have capabilities to meet these long-term preservation requirements – in essence, kicking the problem down the road – is a recipe for significant e-discovery and legal risks as well as unbudgeted “big bang” storage migration costs.

The cost-effective way to address these requirements – the way an economist would look at the problem – is by taking advantage of the economics of the cloud and by shifting the migration responsibility and expertise to a third party in order to:

  1. Eliminate hidden and accumulated risk: digital information stored for the last 10+ years might already be compromised or held in obsolete, unreadable formats;
  2. Remove the susceptibility of information on shared network drives and SharePoint to loss, deletion and tampering (malicious or accidental); and
  3. Accelerate retirement of outdated applications – by providing a safe home for the critical long-term information they contain.