The AIIM Blog - Overcoming Information Chaos

8 Things You Need to Know about Devising an Information Governance Strategy

Written by John Mancini | Jul 8, 2010 11:41:00 AM

1. Knowing Why You Need It.

Before starting any initiative or technology implementation, it’s essential to know why you’re doing it. What are the business goals? Which stakeholders should be involved? Information governance ultimately means being able to transform unmanaged information into valuable business assets, and it provides enterprise readiness. Readiness means proactively servicing the legal and compliance policies of today’s business, and it requires continuous visibility, trust, and control across all of your digital information. With the combination of new government mandates, increased corporate accountability, and the digital information explosion, a holistic view of all information is a necessity. With the right governance strategy, the business will have insight into unstructured content while complementing existing investments in content management, email, archiving, and storage management.

2. Consider the Source.

When devising an information governance strategy, you must first consider all the different sources of information that reside within your organization. For instance, there are Network Attached Storage devices with potentially hundreds of millions of files, NT file server farms that contain shared repositories of virtually invisible information, ubiquitous SharePoint farms that are sprouting like spring flowers, email repositories, and perhaps several types of content management systems (Documentum, FileNet, OpenText, etc.), to name a few. To govern this information, these sources all need to be accessed, their contents analyzed, and the results made visible from a single interface. In most environments, this single view of data across multiple sources or repositories is impossible to achieve, and its absence leads to incomplete collection and identification of data, let alone governance.
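The idea of a single view across many sources can be sketched in Python. This is a minimal illustration, not a description of any product: `Item`, `InMemorySource`, `unified_inventory`, and `summarize` are hypothetical names, and the in-memory source stands in for real connectors to file shares, SharePoint, or mail stores.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class Item:
    source: str      # which repository the item lives in
    path: str        # location inside that repository
    size_bytes: int


class Source(Protocol):
    """Anything that can list its items without moving them."""
    name: str

    def list_items(self) -> Iterable[Item]: ...


class InMemorySource:
    """Hypothetical stand-in for a real connector."""

    def __init__(self, name, entries):
        self.name = name
        self._entries = entries  # list of (path, size_bytes) pairs

    def list_items(self):
        for path, size in self._entries:
            yield Item(self.name, path, size)


def unified_inventory(sources):
    """Merge every source's listing into one view, leaving data in place."""
    return [item for source in sources for item in source.list_items()]


def summarize(inventory):
    """Return {source: (item_count, total_bytes)} for the merged view."""
    summary = {}
    for item in inventory:
        count, total = summary.get(item.source, (0, 0))
        summary[item.source] = (count + 1, total + item.size_bytes)
    return summary
```

Each repository only has to expose a listing; the merged inventory is what a single interface would display.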

3. Data Analysis Must be Virtualized.

The strategy cannot omit critical sources of information or assume that one single archive will be built (thus doubling the aggregate size of the information involved in the strategy). It is important to have an email archive, but it is incorrect to assume that all the sources of relevant information across an enterprise will be re-committed to an archive for purposes of a governance strategy. The governance of information must be undertaken from a system that can understand data where it lives.

4. Search is Not Enough.

The logical first response by IT people is to specify an enterprise search platform and use that as the basis for the corporate governance strategy. The problem with this is that governance transcends the mere identification of information. A search platform is built to identify results and provide the user with references to them based on keyword search. This approach is fairly inflexible and misses the other aspects of a governance strategy: insight, and control or management of content. A governance strategy must include identification of relevant information, insight into the information that was identified (how does the information relate to other content within the enterprise, regardless of location), and control or management of the information. The first is merely search; the second is accomplished with search combined with analytics.
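The difference between identification and insight can be shown with a small Python sketch. It assumes documents are available as plain text keyed by a hypothetical `repository:path` identifier, and it uses exact-duplicate detection by content hash as one simple example of analytics; real platforms would apply richer analysis.

```python
import hashlib
from collections import defaultdict


def keyword_search(docs, keyword):
    """Identification only: which documents mention the keyword?"""
    kw = keyword.lower()
    return [doc_id for doc_id, text in docs.items() if kw in text.lower()]


def duplicate_groups(docs, doc_ids):
    """Insight: which of the hits are copies of each other, wherever they live?"""
    groups = defaultdict(list)
    for doc_id in doc_ids:
        digest = hashlib.sha256(docs[doc_id].encode("utf-8")).hexdigest()
        groups[digest].append(doc_id)
    # Keep only content that appears in more than one place.
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]
```

Search alone would return two separate hits for a contract stored on a file share and attached to an email; the grouping step shows that they are the same document in two repositories.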

5. Automation is the Necessary Ingredient.

The previous point highlights that governance requires technology to aid the process by automating the identification of similar content. A self-classification capability is the key to making data governance possible by making relevant data visible. Automation has been the missing ingredient that has kept true governance from being possible. A machine-learning platform that can guide the human reviewer to content with similar characteristics is the key to solving the problems that surface when attempting to implement a strategy of governing information. Human beings must make governance decisions but often don’t know where to start; automated learning techniques give them a place to start.
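As a rough illustration of guiding a reviewer to similar content, the Python sketch below ranks documents by word overlap with one document the reviewer has already classified. Jaccard similarity is used here as a deliberately simple stand-in for a machine-learning model; the function names and the 0.5 threshold are assumptions for the example.

```python
import re


def tokens(text):
    """Lowercased set of alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Share of words the two sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def suggest_similar(seed_text, candidates, threshold=0.5):
    """Return candidate ids most similar to the seed, best match first."""
    seed = tokens(seed_text)
    scored = [
        (jaccard(seed, tokens(text)), doc_id)
        for doc_id, text in candidates.items()
    ]
    return [doc_id for score, doc_id in sorted(scored, reverse=True)
            if score >= threshold]
```

The reviewer still makes the governance decision; the ranking only suggests where to look next.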

6. Scale, Extensibility, and Ease of Deployment.

The previous point also speaks to the volume of data being accumulated in enterprise and government organizations. Organizations contemplating a true governance initiative cannot possibly gather, classify (with human intervention), and set policies on all the information that they need to understand to feel confident that they can comply with intelligent procedures for governing content.

To date, there have been appliances that provide some insight into the content being accumulated for a specific purpose or project, but there has not been a scalable governance platform that can aggregate a view of all the relevant data in place. To cost-effectively apply the requisite analysis techniques that allow IT professionals to govern enterprise information, a governance architecture must meet certain criteria. It must be extensible across commodity hardware, fit into the virtual server environments of modern data centers, and deploy easily.

In the modern governance era, solutions will include an ability to extend almost infinitely across larger and larger data sets with little or no provisioning of storage and server capacity. These solutions will also have a portable indexing capability that can be expanded as the data within the enterprise expands.

7. You Need to Go Global.

Data is global, and therefore, solutions must be global in scope. Different strategies for analysis and local control must be supported by technologies being contemplated for governance activities. This is a concept that has eluded the governance solution community as we wrestle with the more exigent issues around the topic. Local control and analysis of content with a minimum amount of data processing is important even though this problem is not easily solved. Data is everywhere and, in some cases, must be indexed where it lives.

Having an additional capability to move an entire index and analyze it without having to remove the data from a particular country is a key component of a governance strategy. Emerging technologies will allow an index, not the data itself, to be moved to a remote location where it can be analyzed without forcing the data to be removed from the local country. These kinds of features are part of a total governance strategy that would be ideal in a global information environment.
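A portable index can be sketched in Python as follows. This is an assumption-laden illustration: the index here is a simple term-to-document mapping serialized as JSON, and `build_index`, `export_index`, and `remote_lookup` are hypothetical names. Whether shipping such an index satisfies a given country's data rules is a legal question the sketch does not answer, since even a list of terms reveals something about the content.

```python
import json
import re
from collections import defaultdict


def build_index(region, docs):
    """Run locally, where the data lives: map each term to document ids."""
    postings = defaultdict(list)
    for doc_id, text in sorted(docs.items()):
        for term in sorted(set(re.findall(r"[a-z0-9]+", text.lower()))):
            postings[term].append(doc_id)
    return {"region": region, "postings": dict(postings)}


def export_index(index):
    """Serialize the index only; the documents themselves stay put."""
    return json.dumps(index, sort_keys=True)


def remote_lookup(serialized, term):
    """Run at the remote site: answer queries from the index alone."""
    index = json.loads(serialized)
    return index["postings"].get(term.lower(), [])
```

The remote site can tell which documents to examine and where they are, while the original files remain in the local country.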

8. Don’t Forget Security.

Data governance, particularly for legal matters, is an ongoing process with a definite life-cycle. Over the course of reviewing content pertinent to legal and other regulatory matters, different individuals with different levels of permitted access will be required to view certain documents. Allowing different classes of users to view data with certain characteristics at different times is a key attribute of a governance platform, and this must be accounted for in a governance strategy.
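One way to picture class-based, time-limited access is the Python sketch below. The `Grant` record, the role names, and the document labels are hypothetical; a real platform would tie these to its own directory and classification scheme.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Grant:
    role: str     # class of user, e.g. "outside_counsel"
    label: str    # document characteristic, e.g. "privileged"
    start: date   # first day the grant applies
    end: date     # last day the grant applies (inclusive)


def can_view(grants, role, label, on):
    """True if some grant lets this role see this label on the given day."""
    return any(
        g.role == role and g.label == label and g.start <= on <= g.end
        for g in grants
    )
```

For example, a grant for outside counsel on privileged documents during a review period allows access only within that window, and only for that class of user.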