8 Things You Need to Know about Content Classification and ECM

Written by John Mancini | Jun 25, 2009 6:30:00 AM

Classification is key to realizing value from your content.

Why is classification important? Any time you want to do more than blindly store your unstructured content, you need to classify and organize it to help those tasks along. Better organized, classified information can be searched more effectively, archived, managed as records, or incorporated into business processes. There's a reason libraries classify and organize all those books.
Timing is everything.

We've been talking about automated classification in the Enterprise Content Management (ECM) community for some time, but it is now a necessary element of your ECM architecture. Why do I think automated classification's time has come? Well, that takes us to #3.
Volume and variety of information are driving the adoption of automated classification.

There are ever-increasing volumes of unstructured information created every day in our organizations. Email growth continues worldwide. And the variety of communication and collaboration methods continues to expand. SMS and instant messaging are in the mainstream. Blogs and wikis are entering it. Twitter is the hot communication tool du jour. And the innovations continue with Google's recent announcement of its Wave product.
Our employees simply can't keep up.

More and more information is being generated, but the number of employees is certainly not growing at the same rate. The human being as the source of all classification decisions simply can't scale. We need to automate the process of organizing this information if we're going to maximize the value we get from it and manage its lifecycle cost-effectively.
Our employees are inconsistent.

Relying on our employees for these content-centric decisions is fraught with problems: they are inconsistent in their participation in these tasks; each employee uses different logic to make a decision; their logic is difficult to audit.
You can trust the folks with Ph.D.'s.

There are a variety of options for automating your content classification, ranging from simple rules to highly sophisticated, training-based approaches. It's easy for the layperson to understand the simpler rule-based methods. It's not easy to understand the more advanced methods. But you should trust them because, guess what, those smarty-pants Ph.D.'s have automated classification methods that are proven to be more accurate and effective.
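
To make the two ends of that range concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any particular ECM product: the keyword rules, the category names, and the helper names (`classify_by_rules`, `NaiveBayesClassifier`) are all hypothetical, and naive Bayes is used only as one well-known example of a training-based method.

```python
import math
from collections import Counter, defaultdict

# Hypothetical keyword rules (illustrative only): keyword -> category.
RULES = {
    "invoice": "finance",
    "contract": "legal",
    "resume": "hr",
}


def classify_by_rules(text, default="unclassified"):
    """Rule-based: return the category of the first rule keyword in the text."""
    words = set(text.lower().split())
    for keyword, category in RULES.items():
        if keyword in words:
            return category
    return default


class NaiveBayesClassifier:
    """Training-based: a tiny multinomial naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word frequencies
        self.doc_counts = Counter()              # category -> number of examples
        self.vocabulary = set()

    def train(self, text, category):
        """Learn from one example that a person has already classified."""
        words = text.lower().split()
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        """Return the category with the highest log-probability for the text."""
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = None, -math.inf
        for category in self.doc_counts:
            # Prior: how common the category is among the training examples.
            score = math.log(self.doc_counts[category] / total_docs)
            total_words = sum(self.word_counts[category].values())
            for word in words:
                # Add-one smoothing so unseen words do not zero out the score.
                count = self.word_counts[category][word] + 1
                score += math.log(count / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_category, best_score = category, score
        return best_category
```

The rule-based function is easy to read and audit, but it only knows the keywords someone wrote down. The trained classifier needs a set of already-classified examples, and in exchange it can assign a category to text that matches no hand-written rule.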
Automated classification will save you money.

Organizations typically take one of two approaches to classification. Let's take the email archiving problem as an example. One typical approach acknowledges that users shouldn't be trusted to determine which emails should be saved -- so the organization saves everything. Rather than solve the classification problem, it avoids the problem altogether and simply saves everything, regardless of its value. Though disk is cheap, it's not free. For these organizations, classification helps select only the information that merits being saved, which cuts storage costs.
Automated classification will save you time.

The flip side of the email archiving argument above is that some organizations do trust their employees and ask them to select emails for archival and management. As we've established above, these organizations are likely to get inconsistent participation and, as a result, low-quality results. Why won't your employees participate? Because they understand the value of their own time. And dreary (though well-intentioned) manual classification tasks are not well aligned with what they are being handsomely compensated to do. It's a poor use of their time, they know it, and they are acting on that implicit ROI analysis.
