This is an article about names and whether or not they matter. My first intuition was to open with the quote, “What's in a name? That which we call a rose by any other name would smell as sweet,” but I thought better of it. Not because it was too cliché, which it was, but because it immediately transported me back to my sophomore year of high school, when the greatest achievement of my life up to that point was winning the role of Romeo in the drama club's production of Romeo and Juliet.
Please spare me the reminder of my nerdy teenage years!
Anyway, onto the matter at hand.
The purpose of this article is to begin a conversation about data and content and whether or not the long-held distinction between the two is still important. I'm admitting to you up front that I don't have the answer. Instead, I only aim to share what I know.
To give us a starting point, let's first examine how AIIM and our community have been using these terms.
What is data?
Values, facts, figures, or any other form of information that is organized in a specific format, making it easy for computers to process and analyze. Data is structured, meaning it lives inside a database.
What is content?
Text, images, audio, video, or any other form of information that is unstructured, meaning it lives outside a database.
What is information?
There's less consensus on a definition of information. Some would argue that information is data and content. Others might use content and information interchangeably. Still others would argue information is the context or meaning behind the data.
It's important to note that data analysts, CIOs, other executives, and line-of-business types all have their own definitions of these words. Acknowledgment of this fact will be important later in this conversation.
AIIM published an eBook written by John Mancini in May 2017 called From ECM to Intelligent Information Management. In it, Mancini identified several emerging trends:
Explosive growth in volume and variety of data AND content. Billions of new connections between objects — the Internet of Things.
Rise of new data-centric technologies — Hadoop, NoSQL, Blockchain.
Incorporation of core content management capabilities directly in file platforms themselves (Office365, Amazon, Google, IBM/Box, DropBox) and collapsing prices for storage.
Expanding and increasingly challenging national and regional compliance and regulatory demands — and the growth of cloud and privacy “nationalism.”
A clear shift among the leading solutions providers to cloud-first R&D investment strategies; large-scale end users with major on-premise legacy systems are left playing catch-up.
Reflecting on these trends, Mancini said:
This is not to imply that the world of content management is the same as the broader world of data management. Nor is it to imply that the emergence of Hadoop and Blockchain and other analytics and cognitive technologies replace the need for organizations to understand how, where, and why they must manage their information.
But we do need to understand that the broader enterprise technology story influences the more niched content management story. The broader story influences how content and information management capabilities are utilized and consumed by real-life end users trying to solve ever-more complex problems.
As time goes on, content management capabilities are going to be viewed much less as a monolithic “solution” and much more as a set of capabilities that will be consumed in a much more modular fashion — tied to the needs of particular business processes. Content capabilities will be tied to processes — both custom and SaaS. In the end analysis, this is the world that Gartner now calls Content Services.
It was AIIM's opinion at the time that the separate worlds of data and content, while different, were on a path of convergence. We envisioned a new world that is all about Data AND Content, not Data OR Content. "We’ve operated in the past with a convenient dichotomy between data management and content management. If this dichotomy ever made sense, it makes less and less as time goes on. The kinds of customer-centric problems that must be solved require competencies and technologies from BOTH the data management and content management worlds," Mancini said.
A lot has changed across the enterprise IT landscape since Mancini's comments in 2017. The COVID-19 pandemic saw a massive shift to remote and hybrid work environments. Advances in technology, specifically in the realm of AI, are being made at breakneck speed. In October 2022, Microsoft rocked the content management world with the announcement of Syntex. Analyst Alan Pelz-Sharpe of Deep Analysis called it "the most significant Microsoft news drop since the launch of SharePoint in 2001."
All of this radical disruption seems to correlate with a shift in how data and content are being managed inside organizations and by whom. Member organizations, from Pacific Life to Merrick Bank and beyond, report that data and content are being managed by a single unified department.
In AIIM's annual State of the Intelligent Information Management Industry report published in April 2022, we found that 38% of organizations do not make the distinction between information and data, while 45% still do.
Here are some real quotes from survey respondents from the 38% of organizations that don't make a distinction between data and information:
“Data is raw while information is analyzed data with insights for business.”
“Information is amassed in reports that are understandable to the users. Data is just ‘out there’; it may or may not be integrated into the organization.”
“Data includes the individual pieces that make up information and that is most often viewed as systems-related and in the hands of technology while information is what the business user needs.”
“We are looking at info packages in context; it does not matter if it’s data or information or both.”
“We have definitely improved our alignment with the business through the increased use of analytics.”
“We’ve been able to make huge strides in the use of our data and data analytics to really tell our story internally, which in turn drives program acceptance.”
Meanwhile, the 45% of organizations that do make a distinction between data and information refer to data as being the sole purview of IT.
“Information management does not make a distinction. IT does, and believes they are best equipped to manage data. The distinction they’ve made is information means records, anything else is simply data.”
“Information is for compliance and data is for enablement.”
“IT prioritizes data in all areas mentioned and barely recognizes information to be the same.”
“Information is generally still seen in everyone’s mind as a paper file folder at a desk that just needs to be thrown in a bin when it piles up.”
Interestingly, although perhaps not surprisingly, our research suggests key differences between the information management teams at organizations that do make a distinction between content and data and those that don't.
The information management teams that manage content separate from data are focused more squarely on compliance, and the information management strategy is not aligned with the business strategy. Meanwhile, the information management teams that work alongside IT to manage content and data together prioritize their internal and external customers' experiences and have an information management strategy that is aligned with the business strategy.
This correlation will need to be explored further in future AIIM research, but one can start to form a hypothesis.
In April 2022, I sat in on a number of roundtables at The AIIM Conference, where two common pain points emerged:
Getting users to comply with information management policies was hard.
Getting executives to support information management initiatives was even harder.
Earlier, we acknowledged that different functions have their own definitions of data and content. Business executives and line-of-business types, most notably, are using data and information interchangeably. Often, they're not using the word content at all.
If information management teams want to improve policy compliance and gain support from the C-suite, a good place to start might be to speak in terms they understand. The findings of AIIM's 2022 State of the Intelligent Information Management Industry report suggest some benefit in adopting the language our key stakeholders are already using rather than trying to make them fluent in ours.
Still, others disagree, and there's likely an equally compelling case for maintaining the distinction.
Clearly, this is an important dialogue to continue within the extended AIIM community so we can decide whether our convenient distinction between structured and unstructured information still holds up. Do we need to revisit our long-held definitions of these terms, or should we give in and use them as interchangeably as our executives and internal customers do?