4 Information Capture Challenges to Prepare for the Era of AI and Move to the Next Level
John Mancini

By: John Mancini on July 27th, 2017



Capture and Imaging

In a recent AIIM survey, we investigated what information capture “leadership” looks like in user organizations. In particular, what does capture look like in leading organizations that want to position it not only as a source of immediate competitive advantage, but also as a long-term competency critical to the coming era of machine learning?

What are the problems that organizations are experiencing with their capture implementations as they consider this evolution? Here are four key problem areas that surfaced in our survey.

1. Multiple Document Types

Most organizations are struggling with capture complexity that is driven by the sheer number of document types that must be managed. 64% of the organizations in our survey are dealing with more than 10 document types. In reality, this probably understates the complexity that organizations face at scale. As one survey taker commented about the number of document types in their environment: “Myriad, Multiple, Many, Mucho, Massive, Mega.”

New, advanced data extraction solutions move beyond OCR. Using advanced technologies on top of OCR is the only way to properly identify the document type and then accurately locate, extract, and validate the data. The best solutions build on a strong OCR base and then leverage machine learning, content and image pattern recognition, and automated classification to provide a solution that more than satisfies business users’ needs.

2. Uncertainty About Processing Accuracy

Data is at the heart of the Digital Revolution. And data quality is at the heart of the challenge facing organizations as they attempt to make their data fit for purpose and fit for use. According to Gartner, at any moment in time, up to 40% of an enterprise’s data is inaccurate, missing, or incomplete. When Parascript asked leaders at the top Business Process Outsourcers how they rated the accuracy of the data produced by their document processing, 10% rated their results “very low” and 50% rated them “somewhat low.”




65% of organizations do not approach accuracy by statistically predicting how the system as a whole will perform. Instead, they measure accuracy at the individual document level by manually inspecting a small production run or a small sample. Measuring and tuning the capture system itself, not just the processing of individual documents, must be automated so that the system continues to classify, locate, extract, and verify data accurately over time. This is challenging in a dynamic production environment where documents and images continually change and new types are added to the system.
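To make the difference between spot-checking and statistical prediction concrete, here is a minimal sketch of one way to turn a manually verified sample into an estimate of system-level accuracy. It uses the Wilson score interval, a standard method for estimating a proportion; the survey does not prescribe any particular method, and the function name and sample figures below are illustrative assumptions.

```python
import math
from typing import Tuple


def wilson_interval(correct: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Estimate the plausible range of true accuracy from a verified sample.

    correct: number of sampled fields the capture system extracted correctly
    total:   number of sampled fields that were manually verified
    z:       normal quantile (1.96 corresponds to roughly 95% confidence)
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")

    p = correct / total                      # observed accuracy in the sample
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical figures: the same 95% observed accuracy, two sample sizes.
small_sample = wilson_interval(95, 100)
large_sample = wilson_interval(1900, 2000)
print("100 fields checked:  %.3f to %.3f" % small_sample)
print("2000 fields checked: %.3f to %.3f" % large_sample)
```

The same observed accuracy supports a much narrower claim when the sample is larger, which is why a small manual inspection says little about how the system will behave in production. Re-running a check like this on a schedule is one simple way to notice drift as new document types are added.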

3. Poor Usability

OCR software by itself is inadequate for businesses that want to use extracted data to efficiently process transactions, organize their documents for better control and governance, search important documents quickly and easily, access the right data for decision making, and find the content necessary to support the business. OCR software supplies text and numbers devoid of context. This data might be useful for full-text search. However, as so many businesses have already realized, full-text search is insufficient and fails to provide a basis for knowledge management and information governance.

62% of respondents in our survey rate their capture software “very difficult” or “somewhat difficult” to configure. 44% of organizations do not have the expertise or staff available to tune accuracy, and thus “out-of-the-box” functionality is important.

4. Information Capture is More Complicated Than Imaging

Capture is often assumed to be synonymous with scanning. The reality is that most organizations need to do far more than just process images. According to AIIM, 42% of organizations will be spending more on inbound workflow automation over the next 12 months.

Information and data are coming into business organizations from all types of devices and in all types of formats. In fact, when you look at the broader spectrum of the Internet-of-Things, information sources are now extending to remotely connected devices that include security systems, health monitors, and more. Consider the percentage of organizations that are trying to automatically extract data from the following “non-image” document sources:

Document type                 % trying to extract data from this source
PDFs                          97%
Active PDF forms              62%
Excel spreadsheets            76%
PowerPoint (PPT/PPTX)         53%
Word documents (DOC/DOCX)     87%
E-Forms                       70%

31% of organizations say they approach this challenge by “Processing digital documents in a different system from the one that handles scanned documents.” The same percentage say they convert digital documents to images and process them in the same workflow as scanned images. Clearly, simplifying, standardizing, and automating this process is key to improving performance.
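As a rough illustration of what a single, standardized intake might look like, the sketch below routes every incoming file through one entry point and picks an extraction path by file type, rather than running a separate system for digital documents or converting everything to images. The extension lists, path names, and function name are hypothetical and not drawn from the survey or any specific product.

```python
from pathlib import Path

# Hypothetical groupings. A real system would also inspect file content,
# since a PDF, for example, may hold either a scanned image or native text.
IMAGE_EXTENSIONS = {".tif", ".tiff", ".jpg", ".jpeg", ".png"}
DIGITAL_EXTENSIONS = {".pdf", ".xls", ".xlsx", ".ppt", ".pptx", ".doc", ".docx"}


def choose_extraction_path(filename: str) -> str:
    """Return the name of the extraction path for one incoming file."""
    extension = Path(filename).suffix.lower()
    if extension in IMAGE_EXTENSIONS:
        return "ocr"             # scanned images need OCR before extraction
    if extension in DIGITAL_EXTENSIONS:
        return "native_text"     # digital documents may already carry text and structure
    return "manual_review"       # unknown formats are flagged rather than guessed at


for name in ["invoice_scan.TIFF", "claims.xlsx", "contract.docx", "reading.dat"]:
    print(name, "->", choose_extraction_path(name))
```

Whatever path a file takes, the downstream classification, validation, and workflow steps can then be shared, which is the standardization the paragraph above argues for.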

 


About John Mancini

John Mancini is the President of Content Results, LLC and the Past President of AIIM. He is a well-known author, speaker, and advisor on information management, digital transformation, and intelligent automation. John is a frequent keynote speaker and author of more than 30 eBooks on a variety of topics. He can be found on Twitter, LinkedIn, and Facebook as jmancini77.

Recent keynote topics include:
The Stairway to Digital Transformation
Navigating Disruptive Waters: 4 Things You Need to Know to Build Your Digital Transformation Strategy
Getting Ahead of the Digital Transformation Curve
Viewing Information Management Through a New Lens
Digital Disruption: 6 Strategies to Avoid Being “Blockbustered”

Specialties:
Keynote speaker and writer on AI, RPA, Intelligent Information Management, Intelligent Automation, and Digital Transformation.
Consensus-building with Boards to create strategic focus, action, and accountability.
Extensive public speaking and public relations work.
Conversant and experienced in major technology issues and trends.
Expert on inbound and content marketing, particularly in an association environment and on the Hubspot platform.

John is a Phi Beta Kappa graduate of the College of William and Mary, and holds an M.A. in Public Policy from the Woodrow Wilson School at Princeton University.