Data lineage is about understanding where data originates, how it is transformed, who uses it and for what purpose, and everything in between. Data typically travels between business applications, either to support business functions or for analytics. A good data lineage tool should depict all of this data movement graphically and automate at least part of building the lineage.
When you are making an important decision for your organization, it is crucial to convert data into knowledge and knowledge into wisdom. That can only happen when you trust the data, and trust comes from understanding the data's roots and its transformations.
Most BI transformation processes are complex, hard to understand, and written and maintained by multiple people. Any change to these processes affects downstream data objects and reports. Change is the nature of business, and stalling changes directly hurts business performance. So whenever you make a change, you need to understand its impact.
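To make the idea of change impact concrete, here is a minimal sketch, assuming lineage is stored as a simple mapping from each data object to the objects it feeds. The object names and the `downstream` helper are hypothetical, made up for illustration; they are not taken from any particular tool.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the objects listed in its value.
LINEAGE = {
    "crm.customers": ["staging.customers"],
    "erp.orders": ["staging.orders"],
    "staging.customers": ["dw.sales_fact"],
    "staging.orders": ["dw.sales_fact"],
    "dw.sales_fact": ["report.monthly_revenue", "report.churn"],
}


def downstream(graph, start):
    """Return every object reachable from `start`, in breadth-first order."""
    seen = set()
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


if __name__ == "__main__":
    # Objects that could be affected by a change to staging.orders.
    for name in downstream(LINEAGE, "staging.orders"):
        print(name)
```

A breadth-first walk lists the nearest affected objects first, which is usually the order in which you would want to review them before releasing a change.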
Data lineage lets users understand where data comes from and how it is used in other business units they may know nothing about. This understanding in turn improves data literacy.
Business processes in a company are complex, and so is the journey of its data. A lot of transformation happens before data appears in reports or data applications, and knowledge about that transformation lives either in the source code or with multiple people. OvalEdge algorithms parse various kinds of source code to build the lineage automatically; experts then enhance it with proper descriptions.
A data catalog holds all the information about the data and its statistics
Build lineage automatically by parsing source code
Experts learn, validate and rectify these lineage entries
AI algorithms help humans to correct these lineage entries
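As a rough illustration of building lineage by parsing source code, the sketch below pulls source-to-target edges out of a single SQL statement with regular expressions. This is not OvalEdge's algorithm, which the text does not describe; the `extract_lineage` helper and the table names are assumptions made for the example. A real parser would also need to handle subqueries, CTEs, comments, quoted identifiers and other SQL dialect details.

```python
import re

# Matches "INSERT INTO <target>" and "CREATE TABLE <target>".
TARGET_RE = re.compile(
    r"\b(?:insert\s+into|create\s+table)\s+([\w.]+)", re.IGNORECASE
)
# Matches "FROM <source>" and "JOIN <source>".
SOURCE_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def extract_lineage(sql):
    """Return sorted (source, target) pairs found in one SQL statement."""
    target = TARGET_RE.search(sql)
    if target is None:
        # Statement does not write to a table, so it adds no lineage edge.
        return []
    sources = sorted(set(SOURCE_RE.findall(sql)))
    return [(source, target.group(1)) for source in sources]


if __name__ == "__main__":
    statement = """
        INSERT INTO dw.sales_fact
        SELECT o.id, c.region, o.amount
        FROM staging.orders o
        JOIN staging.customers c ON c.id = o.customer_id
    """
    for source, target in extract_lineage(statement):
        print(source, "->", target)
```

Edges produced this way are only candidates. As the list above notes, experts still need to validate and correct them before the lineage can be trusted.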