Common Problems with Content Migrations
The international standard on records management, ISO 15489, defines migration as the:
"Act of moving records from one system to another, while maintaining the records' authenticity, integrity, reliability, and usability."
But, migration is much broader than just records. Essentially, any time a legacy system is decommissioned, it should be reviewed to determine:
- What information is stored there?
- Does this information still need to be actively accessible?
After asking these questions, you'll be able to determine whether migration is needed.
But before you begin, I want to cover some problems that are commonly encountered with content migrations. Preparing for these challenges ahead of time can save you time, money, and countless headaches.
Migration Issues to Consider
In planning and executing a migration, there are a number of issues to consider. AIIM's training groups these into the following areas:
- File format issues
- System issues
- Data quality issues
- Process issues
Now let's look at each of these in a bit more detail.
File Format Issues
Any migration involving unstructured data, that is, individual files, is bound to run into issues migrating certain file formats. These issues include:
- Proprietary Formats: The older these files are, the more likely they are to have issues. Take special care to confirm that they actually need to be migrated and, if they do, that they were migrated successfully.
- Complex Formats: These are similar to proprietary file formats; in fact, most proprietary formats are also complex.
- Linked Formats: Engineering drawings with linked external reference drawings, spreadsheets or PDFs that link to each other, or any other types of linked files often run into issues with the paths to the linked documents.
- Unknown Formats: Most repositories can store any kind of digital data, but if you run into unknown formats, there is a question as to whether to even bother migrating them.
- Duplicate Files: Depending on which research you believe, organizations have anywhere from 3 to 10+ copies on average of every document they store. This is often because the legacy system in question is a set of network file shares, and when users finally locate a long-sought resource, they download it to their own personal file stores. One of the outcomes of the migration, and frankly one of the first ones, should be to identify those duplicate files, determine which one should be the official copy, and mark the other copies for deletion.
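To make the duplicate-files point concrete, here is a minimal sketch of one common way to find exact duplicates: group files by a hash of their contents. The function names `file_digest` and `find_duplicates` are made up for this illustration, and the approach assumes the legacy content is reachable as an ordinary file tree. It only catches byte-for-byte copies, not near-duplicates or re-saved versions.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group files under root by content hash; keep groups with 2+ files."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Each group in the result is a set of identical files. Deciding which one is the official copy is still a human (or policy) decision; the script only narrows down where to look.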
System Issues
When migrating from one system to another, there are a number of specific issues to take into account. Let's take a look:
- Integrated Systems: Systems that have been tightly integrated may have unanticipated dependencies around data structures and output files.
- Reports: Integrated systems may also be generating reports that rely on data from both/all systems involved. And even single system repositories may be generating reports in a unique way that can't be done identically in the new system.
- Process Dependencies: Work processes, both manual and automated, may rely on how a system works, what its reports contain, how its metadata is structured, and more. This is exacerbated in automated processes where workflow rules are very specific as to the conditions for a particular task or step.
- Bandwidth and Processing Issues: This doesn't seem like it would be a huge issue until you start transmitting 10 TB of data across the network – or from one data center to another thousands of miles away.
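To put the bandwidth point in perspective, a rough back-of-the-envelope estimate helps. The sketch below is only that: `estimate_transfer_hours` is a hypothetical helper, it uses decimal terabytes (10^12 bytes), and the `efficiency` factor is an assumed fudge factor for protocol overhead and shared links, not a measured value.

```python
def estimate_transfer_hours(size_tb: float, link_mbps: float,
                            efficiency: float = 0.7) -> float:
    """Estimate hours needed to move size_tb terabytes over a link.

    size_tb    -- data volume in decimal terabytes (10**12 bytes)
    link_mbps  -- nominal link speed in megabits per second
    efficiency -- assumed fraction of the link actually usable (0 < e <= 1)
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be > 0 and efficiency in (0, 1]")
    total_bits = size_tb * 1e12 * 8
    effective_bits_per_second = link_mbps * 1e6 * efficiency
    return total_bits / effective_bits_per_second / 3600
```

For example, `estimate_transfer_hours(10, 1000, 1.0)` comes out at a little over 22 hours: 10 TB over a fully dedicated 1 Gbit/s link, with no overhead at all. Real links are rarely that generous, and this ignores processing time on either end.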
Data Quality Issues
Data quality issues are a huge concern for a migration project – what's the point of doing it if the end result is inaccurate, inconsistent, or results in data actually being lost? Some of the issues to consider here:
- Redundant, Outdated (or Obsolete), and Trivial Content (ROT): A migration takes long enough without including terabytes of outdated, personal, or irrelevant information. Where that information can be isolated (and it can be), it should not be migrated, and in fact, it should be disposed of in accordance with the records management program.
- Lifecycle Considerations: What we mean here is that if some of the data in the system to be migrated has met its retention requirements and there are no other legal, operational, or historical reasons to keep it, it doesn't really make sense to migrate that data just to turn around and delete it.
- Missing Metadata: This is often the case because new fields were added; in many instances, these new fields are also mandatory. As the target system and its data structures are being designed, attention should be paid to this to determine how best to fill in that missing metadata. This is also known as metadata enrichment.
- Inconsistent Metadata: This is very common as different systems use different data structures. The way to approach this is generally to map the field in the legacy system to the one in the new system, either through a middleware application or by actually transforming the legacy metadata value into the new one during the migration process.
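The mapping approach for inconsistent metadata, along with flagging missing mandatory fields for enrichment, can be sketched roughly as follows. Everything specific here is an assumption for illustration: the field names, the legacy MM/DD/YYYY date format, and the list of mandatory fields are invented, and a real migration tool would have its own way of expressing the same mapping.

```python
from datetime import datetime

# Hypothetical mapping from legacy field names to target field names.
FIELD_MAP = {
    "doc_title": "title",
    "author_name": "creator",
    "created": "date_created",
}

# Hypothetical fields the target system requires on every item.
MANDATORY_FIELDS = ["title", "date_created", "retention_category"]


def normalize_date(value: str) -> str:
    """Convert an assumed legacy MM/DD/YYYY date to ISO 8601 (YYYY-MM-DD)."""
    return datetime.strptime(value, "%m/%d/%Y").date().isoformat()


# Per-field value transformations, keyed by target field name.
TRANSFORMS = {"date_created": normalize_date}


def migrate_metadata(legacy: dict[str, str]) -> tuple[dict[str, str], list[str]]:
    """Map legacy metadata to the target schema.

    Returns the mapped record plus a list of mandatory target fields
    that are still empty and need enrichment.
    """
    target: dict[str, str] = {}
    for old_name, value in legacy.items():
        new_name = FIELD_MAP.get(old_name)
        if new_name is None:
            continue  # unmapped legacy field: skipped in this sketch
        transform = TRANSFORMS.get(new_name)
        target[new_name] = transform(value) if transform else value
    missing = [name for name in MANDATORY_FIELDS if not target.get(name)]
    return target, missing
```

Returning the missing fields, rather than silently filling in defaults, keeps the enrichment decision visible: someone still has to decide how each gap gets filled.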
Process Issues
There are a lot of process-related issues to consider during a content migration, including:
- Accuracy and Quality Control: This is perhaps the one most thought about. That is, was the migration accurate, and how can you verify it? The migration tool and process can provide some metrics, such as the number of items migrated, but in all likelihood, you won't *know* the migration was accurate until your users start to interact with their content using the new system.
- Timing and Duration: How long will the migration take? Longer than you think, but it shouldn't be a never-ending effort. When does it start? Well, when can you freeze any additional changes to the legacy system, and what's the impact of that on your work processes? And of course, when is the migration, and therefore the migration project, complete? And maybe as importantly, when do you cut off all access to the legacy system? Because if you don't, users will continue to use it.
- Communication: It should be clear that the communication of all the previous points to those affected should be a high priority. In the absence of consistent, regular communication, rumors will fly, and users may take counterproductive steps like saving all their content to a less governed location like their own computer or a flash drive.
- Users: Finally, where you have individual users participating in the migration process, getting them to do it is often a challenge, and when you do get them to do it, they often want to keep everything "just in case." You should ask whether they really need all that information, and why. One tool that can really be helpful here is a report showing users that they haven't accessed a particular document, folder, or repository in X number of years, or that it hasn't been accessed by ANYBODY in X number of years. Such reports are easy to generate for many repositories.
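For content sitting on ordinary file shares, a "not accessed in X years" report like the one described under Users could be sketched as below. This is an assumption-laden example: `stale_files` is a hypothetical helper, and it relies on the filesystem's last-access timestamp, which is not always maintained (for instance, on volumes mounted with access-time updates disabled). Many repositories keep their own audit logs, which would be a better source where available.

```python
import time
from pathlib import Path
from typing import Optional

SECONDS_PER_YEAR = 365 * 24 * 3600


def stale_files(root: Path, years: float,
                now: Optional[float] = None) -> list[tuple[Path, float]]:
    """List files under root not accessed in the last `years` years.

    Returns (path, years_since_last_access) pairs, based on the
    filesystem's last-access time (st_atime).
    """
    now = time.time() if now is None else now
    cutoff = now - years * SECONDS_PER_YEAR
    report = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            atime = path.stat().st_atime
            if atime < cutoff:
                report.append((path, (now - atime) / SECONDS_PER_YEAR))
    return report
```

Calling `stale_files(Path("/some/share"), years=3)` (the path is a placeholder) would give a starting list to put in front of users when they insist on keeping everything.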
"What Do I Do with My Old System and Data?"
There's one last common problem to consider. What makes this one unique is that it comes after the migration. Once you're done with the migration (hopefully it was a huge success since you've addressed all of the issues in this post), you may be wondering what to do with the legacy system, and with the data on that system.
In both cases, the answer needs to involve IT, legal, records management, data protection, etc.
- IT will ultimately be executing whatever the decommissioning strategy is.
- Legal needs to determine whether there are any legal or other holds on the system or the data.
- Records management needs to determine whether there are any retention requirements that still need to be met.
- Data protection needs to determine whether there is any personal or sensitive data in the legacy system that needs to be handled differently.
Once all those issues have been considered and addressed, the system can be decommissioned according to the appropriate decommissioning process for that type of system.
But in a lot of ways, we're getting ahead of ourselves. This is one to keep in mind, but your initial focus will be on the problems listed above.
Migrations are a necessary part of business: technologies advance, systems become outdated, and new solutions are added all the time. Working through a migration is one of the key skills covered in our Foundations of Intelligent Information Management Training Course and one of the steps required to support your digital transformation journey.