A Step-by-Step Guide to Metadata Management
“For me, context is the key – from that comes the understanding of everything.” – Kenneth Noland, American painter
Effective metadata management gives enterprise data the right context and description. To understand and trust data, we need to know its background: how it originated and how it has been used so far. We also need to know which decisions were based on this data and how we can leverage it for a competitive advantage.
To succeed in this new digital age, organizations need to create meticulous data products. Data products are not just reports or analytics; they are comprehensive solutions. They present analytical, comparative, and insightful information to the right people at the right time and on the right device.
Without a complete metadata management solution, it is difficult to create these data products. With growing data volumes and an explosion of big data technologies, CDOs (Chief Data Officers) must look at managing their data more efficiently through metadata. According to the latest estimate, the metadata management industry will be worth about 7.85 billion by 2022 and will grow by 27% year over year.
What is Metadata?
Metadata is “data [information] that provides information about other data. This understanding comes from setting the data in context, allowing it to be reused and retrieved for multiple business uses and times.” According to Indiana University, “metadata is data about data. It is descriptive information about a particular data set, object, or resource, including how it is formatted, and when and by whom it was collected. Although metadata most commonly refers to web resources, it can also be about physical or electronic resources. It may be created automatically using software or entered by hand.”
Some typical metadata elements for structured or unstructured data are:
– Title, description, and abstract
– Tags and categories
– When it was created and by whom
– Who last modified it and when
– Who can access or update it
Beyond these, we categorize metadata in an enterprise as:
Metadata for Structured Data
It includes the column structure of a database table, the header row of a CSV file, and column definitions from JSON, XML, and Avro files.
Business Metadata
It includes security levels, privacy levels, and acronyms. Both IT and business need quality metadata to understand the information on hand. Without useful metadata, the organization is at risk of making the wrong decisions based on faulty data.
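As a rough illustration of how the two categories fit together, the sketch below reads column names from a CSV header row and stores them next to a few business attributes. The `DatasetMetadata` class, its field names, and the sample values are hypothetical and used only for illustration; a real catalog would define its own schema.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetMetadata:
    """Hypothetical record combining structural and business metadata."""
    title: str
    columns: List[str]                       # structural: taken from the CSV header row
    tags: List[str] = field(default_factory=list)
    security_level: str = "internal"         # business: illustrative default only
    privacy_level: str = "none"              # business: illustrative default only


def read_csv_columns(text: str) -> List[str]:
    """Return the column names found in the header row of CSV text."""
    reader = csv.reader(io.StringIO(text))
    return next(reader, [])                  # empty input yields no columns


sample_csv = "customer_id,email,signup_date\n1,a@example.com,2021-01-05\n"
meta = DatasetMetadata(
    title="Customers",
    columns=read_csv_columns(sample_csv),
    tags=["crm"],
    privacy_level="pii",
)
print(meta.columns)
```

Keeping both kinds of metadata in one record means a reader who finds the data set sees its structure and its handling rules together.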
What is Metadata Management?
The library catalog is a classic example of metadata management, and one of the oldest. To find a book, one looked up the author or topic in the library catalog. Next came the Yahoo! search engine, which indexed the metadata from various websites. Finally, Google brought about a revolution by deriving metadata from the actual data.
This gave users an in-depth search experience like never before and let them search within the desired context. Enterprise metadata management, however, is still either at the library catalog level (done manually) or at the Yahoo level (done with various metadata management products).
An ideal metadata management program should be data-driven and derived from context. Metadata management means answering the common questions about data: who, what, when, where, and why.
How Should We Do Effective Metadata Management?
Here are a few steps to ensure it:
Lay Out Policies & Procedures
Effective metadata management starts with the policies, procedures, tools, and human curation of metadata. Employees are at the center of metadata management, so a company needs tools that let them communicate smoothly about data and metadata. The following roles are needed for effective metadata management:
Role of CDO & Executives
CDOs and executives define the rules for metadata management and use tools to enforce them. These rules should cover security and the methodology for changing metadata.
Role of an Analyst and Other Data Citizens
Analysts should follow the rules of metadata management. When they ask insightful questions about data and metadata, those questions and comments should be saved so that other analysts researching the same data can benefit from them later.
Features of an MDM tool
There should be robust tools to provide access to metadata and they should enforce all the rules defined by executives. Some of the features these tools can provide are:
1. Sample Data
Here we turn the tables: instead of metadata describing data, sample data gives metadata a concrete context. Generating sample data for a data set enriches our understanding of its metadata.
2. Data Stats (Profiles)
Stats answer common questions about the data, such as its count, distinct values, most frequently used values, null count, and maximum and minimum values.
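The statistics listed above can be sketched for a single column as follows. This is a minimal example that assumes the column is held in memory as a Python list and that missing values are represented by `None`; the function name `profile_column` is made up for illustration.

```python
from collections import Counter


def profile_column(values, top_n=3):
    """Compute simple profile statistics for one column; None counts as null."""
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    return {
        "count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct": len(counts),
        "top_values": counts.most_common(top_n),   # (value, frequency) pairs
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
    }


ages = [34, 27, None, 34, 41, 27, 34]
stats = profile_column(ages)
print(stats)
```

For large tables these numbers would more likely be computed inside the database or by a profiling tool, but the questions answered are the same.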
3. Lineage
Lineage helps you understand where data originated, how it traveled, and which transformations it went through before it reached you. It also shows you where else the data is being used.
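One common way to represent lineage is as a directed graph in which each edge is a transformation from a source data set to a target. The sketch below is an assumption about how such a store could look, not the design of any particular product; the class name, method names, and data set names are all hypothetical.

```python
from collections import defaultdict


class LineageGraph:
    """Toy lineage store: nodes are data set names, edges are transformations."""

    def __init__(self):
        self._downstream = defaultdict(set)   # source -> data sets derived from it
        self._upstream = defaultdict(set)     # target -> data sets it was built from
        self.transformations = {}             # (source, target) -> description

    def add_step(self, source, target, transformation):
        self._downstream[source].add(target)
        self._upstream[target].add(source)
        self.transformations[(source, target)] = transformation

    @staticmethod
    def _walk(start, edges):
        """Collect every node reachable from start by following edges."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def origins(self, dataset):
        """Everything this data set was derived from, directly or indirectly."""
        return self._walk(dataset, self._upstream)

    def impacted(self, dataset):
        """Everything that uses this data set, directly or indirectly."""
        return self._walk(dataset, self._downstream)


lineage = LineageGraph()
lineage.add_step("crm.customers", "staging.customers", "deduplicate rows")
lineage.add_step("staging.customers", "warehouse.dim_customer", "join with regions")
lineage.add_step("warehouse.dim_customer", "report.churn", "aggregate by month")

print(sorted(lineage.origins("warehouse.dim_customer")))
print(sorted(lineage.impacted("staging.customers")))
```

Walking upstream answers "where did this come from?", while walking downstream answers "where else is this used?", which is the basis of impact analysis.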
4. Previous Communication
Communication is the key to effective metadata management, so it is important to keep all the conversations related to a piece of metadata in one place, along with all the comments and remarks about it.
5. Relationship with Other Metadata
For MDM tools it is crucial to find relationships among data so that data search becomes possible. There are several ways to achieve this: manual human curation, automatic semantic matching of metadata, or automatic matching of the data itself.
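To make the two automatic approaches concrete, the sketch below uses deliberately simple stand-ins: normalized column-name equality in place of real semantic matching, and Jaccard overlap of distinct values for data matching. The function names, the 0.5 threshold, and the sample tables are assumptions for illustration only; commercial tools use their own, more sophisticated algorithms.

```python
import re


def normalize(name):
    """Lowercase a column name and drop separators, so 'Customer_ID' matches 'customerId'."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def value_overlap(values_a, values_b):
    """Jaccard similarity of the distinct values in two columns (0.0 to 1.0)."""
    a, b = set(values_a), set(values_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_relationships(table_a, table_b, min_overlap=0.5):
    """Return (column_a, column_b, reason) for column pairs that look related."""
    matches = []
    for col_a, vals_a in table_a.items():
        for col_b, vals_b in table_b.items():
            if normalize(col_a) == normalize(col_b):
                matches.append((col_a, col_b, "name"))      # metadata matching
            elif value_overlap(vals_a, vals_b) >= min_overlap:
                matches.append((col_a, col_b, "data"))      # data matching
    return matches


orders = {"Customer_ID": [1, 2, 3], "ship_country": ["DE", "FR", "US"]}
customers = {"customerId": [1, 2, 3, 4], "country": ["DE", "FR", "US", "IN"]}
relationships = find_relationships(orders, customers)
print(relationships)
```

Name matching is cheap but misses columns that are named differently, whereas data matching catches those at the cost of scanning the values, so the two are usually combined.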
Best Metadata Management Tools
A metadata management tool provides access to crucial metadata, categorizes it, and organizes it in a secure, risk-averse manner. A good metadata management tool will enforce the strategy, policies, and regulatory processes laid out by business executives.
A metadata management tool usually includes a series of critical, yet common, features. It will create sample data sets to put metadata into context so business users find it easier to understand. It will provide statistics surrounding common data points such as maximum or minimum values.
It will give users access to the lineage of your data so they can get a better understanding of where it has originated from and what has happened to it during its life cycle. It will tie together all of the communication that has been made surrounding the data, including any comments or suggestions. Finally, it should be capable of finding and mapping any relationships between data sets.
According to Gartner and our own research, these are some of the metadata management tools available in the market:
OvalEdge is a complete metadata management tool that also supports ETL. OvalEdge offers a next-generation UI to aid collaborative efforts and includes a proprietary algorithm that detects and maps the relationships between your company's data assets. There is even the option to predefine specific access and usage rules to ensure your organization remains compliant without hindering user access to data assets.
The Alation Data Catalog enables users to complete many of the essential metadata management tasks required. Alation is a relatively small player in the industry, but since launching its data catalog, the company has received positive feedback on the technology. One drawback is a lack of data lineage and impact analysis functionality.
Collibra Connect is the data governance tool provided by the major data company Collibra. With regard to metadata management, the tool is well equipped to support compliance, but it struggles in some other areas. For example, its support for semantic frameworks, tracking the life cycle of data, and impact reporting could be better.
Informatica has a series of metadata management solutions including its Business Glossary, Axon, Metadata Manager, and Enterprise Information Catalog. Collectively, these solutions provide all of the technical support a user could ask for when it comes to managing metadata. However, the solutions are dispersed and users looking for a single, integrated platform where all of a company's metadata management tasks could be completed will need to look elsewhere.
Some organizations still rely on spreadsheet software such as Microsoft Excel to manage the metadata in their organization. Although capable of storing and organizing metadata effectively, this route provides none of the automation, advanced reporting, or regulatory support of the solutions we have mentioned above.