• Flexilant Works Pvt Ltd, 4th Floor, Rajalakshmi Mahal, 108, Sir Thyagaraya Rd, Pondy Bazaar, T. Nagar, Chennai, Tamil Nadu 600017

FAQ

Data Management

We strive to produce an implementable Enterprise Data Reference Architecture (EDRA) by carrying out a pre-defined set of activities in partnership with the respective business SMEs.

Let us take a look at common myths that confound data management.

Myth 1 - There is such a thing as critical data.

What was critical during the MIS era and the EDW era is vastly different from what is critical in the Big Data era. For example, the classic definition of an EDW included only summarized data; transaction-level data was not considered part of the EDW. Today, even application logs are considered critical data.

Myth 2 - Data Management applies to EDWs and Data Lakes only.

In fact, organisations are increasingly focused on Data Analytics at any cost, with the result that more silos are being created. Data should be managed from its birth. Regardless of its usage, data acquires its characteristics at birth, in the application where it is created; that is where it becomes an asset. For example, Data Quality can only be fixed at the origin; fixing it anywhere else is only temporary. Similarly, the Metadata of an asset should be completely specified at the point of origin. Any data processing performed without accurate metadata at the origin can result in Data Debt.
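The "fix quality at the origin" principle can be sketched in code. This is a minimal hypothetical example (the field names and rules are illustrative, not from the source): a record that fails validation is rejected in the application that creates it, so no downstream system ever has to cleanse it after the fact.

```python
def create_customer(record: dict) -> dict:
    """Admit a customer record only if it is valid at the point of origin."""
    errors = []
    if not record.get("customer_id"):
        errors.append("customer_id is required")
    if "@" not in record.get("email", ""):
        errors.append("email is malformed")
    if errors:
        # Reject at creation: the fix happens here, not in the EDW or lake.
        raise ValueError("; ".join(errors))
    return record
```

Downstream cleansing, by contrast, would patch one copy of the data while the origin keeps producing bad records.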

Myth 3 - Expediency in enabling data scientists is of paramount importance.

A common complaint these days is that "data management activities take too long". Sourcing data, cleansing it, integrating it, and then making it available for consumption is too time-consuming to meet analytical needs. Many organisations therefore procure a Big Data platform, load data into it, give data scientists access, and let them go about building analytic solutions. Most organisations that jumped on this bandwagon soon realised that their ROI, which initially showed signs of being ahead of the curve, never met its targets. According to Forbes, data scientists spend around 80% of their time preparing and managing data for analysis, and 76% of data scientists view data preparation as the least enjoyable part of their work.

On the other hand, if organisations went back to the wild-west scenario, where analytic solutions are created with scant regard for Data Management, it would only lead to more data quality issues and a proliferation of data orphans, and turn the vision of a data-driven organisation into a data-swamp reality.

It is worth looking at a real-life example. An organisation that rents out vending machines wanted to build an analytical model to plan efficiently when vehicles should go out to collect cash and replenish merchandise. The project team felt it was time-consuming and unnecessary for their use case to meet all the prerequisites prescribed by the Data Management office, and chose not to address the Reference Data and Master Data concerns raised because doing so would take too long. The project rolled out the analytic model and saw efficiency jump 50% in the first year. In the second year, however, efficiency improved by only 7%, and in certain months the team did worse than the median. Root-cause analysis revealed that the process for maintaining vending-machine Master Data was fraught with inconsistencies: it was manual and produced erroneous data.
There were instances where vehicles turned up to replenish a machine only to find there was no machine at the specified location. In other cases the vending-machine model information was wrong, so the vehicles could not replenish fully and had to make an extra trip. In the end, it was concluded that the organisation could not reach its target of 90% efficiency unless it dealt with Master and Reference Data strategically; it would reach that 90% efficiency two years later and a few hundred thousand dollars over budget.

Myth 4 - Data lake is the answer to all data problems.

In many organisations, Data Management is synonymous with EDW or Data Lake. Issues that confound the management of data as an asset are to do with methodology and not technology. Be it MIS or EDW or Data lake; they don’t solve data issues by themselves. They all should play the right role in the Data Management methodology.

Myth 5 - Metadata is a documentation exercise.

Enough cannot be said about this myth. The single largest myth contributing to the downfall of most data initiatives is that Metadata is merely a documentation exercise. Data as an asset will have its share of issues. It is never going to be pristine; there will be inconsistencies and there will be data quality issues. The key to getting the most out of the asset is knowing it well, including its flaws. Getting Metadata accurate is the single most important goal that every data initiative should have. Consider the specification of a Bluetooth speaker. There are two key observations on this product specification:
  • It fully describes the product.
  • The details were actual design inputs. It was not the case that the product was made and then the specifications were written; the product was made to meet these requirements.
A data asset should be no different. Without all of its metadata attributes specified, it should not be possible to create a data asset.
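The "no asset without complete metadata" idea can be made concrete. The sketch below is a hypothetical illustration (the metadata attributes chosen are examples, not a prescribed standard): metadata is a mandatory design input to the asset's constructor, so an asset simply cannot be created without it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Metadata:
    """Every attribute is mandatory; none has a default."""
    name: str
    owner: str
    description: str
    classification: str   # e.g. "Public", "Internal", "Secret"
    origin_system: str

@dataclass(frozen=True)
class DataAsset:
    # No default value: constructing a DataAsset without complete
    # Metadata fails, mirroring the product-specification analogy.
    metadata: Metadata
    payload: object = None
```

Here the specification drives creation, rather than being written up after the fact.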

Myth 6 - Master Data Management and Reference Data Management are not important.

This myth holds that Master Data Management and Reference Data Management are only for large organisations, or that "our use cases don't use them much". In reality, 80%–90% of data quality issues arise from these two data domains, not to mention the redundant cost, time, and effort wasted on workarounds. IBM has estimated that in the US alone, businesses lose $3.1 trillion annually due to poor data quality.
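A large share of these quality issues are simple reference data mismatches, which a managed reference set makes trivial to catch. This is a hypothetical sketch (the country-code reference set and record shape are invented for illustration):

```python
# A managed reference data set: the single agreed list of valid codes.
VALID_COUNTRY_CODES = {"IN", "US", "GB", "SG"}

def find_reference_violations(records: list) -> list:
    """Return records whose country code is absent from the reference set."""
    return [r for r in records if r.get("country") not in VALID_COUNTRY_CODES]
```

Without a managed reference set, each consuming system invents its own workaround list, and the same mismatch is "fixed" many times over.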

Myth 7 - Keeping it simple means doing less.

Data management activities are unfortunately still very manual; automated mechanisms for managing data are in their infancy. For example, capturing business metadata is a manual exercise. The simplest way to capture the metadata is to get as many pairs of hands on deck as possible and get it done. Instead, many organisations adopt an approach of capturing only a few pieces of Metadata, only to realise later that it does not suffice. In the end, despite the effort spent, the data is not understood well enough and expensive retrofitting is needed.

Myth 8 - Technical Debt and Data Debt are considered as one.

Technical Debt arises when a technology component is used in the knowledge that it will have to be replaced at some point. Data Debt arises when a data element is processed erroneously in the knowledge that it will need to be remediated. Many organisations treat the two with the same severity. For example, if we wrote a piece of Java code and need to replace it with Python code, there is effort involved, but the underlying data remains the same. However, if a piece of data was classified as "Public" but later needs to be classified as "Secret", the impact is multifold: it percolates from the source into data repositories, engines, analytic models, reporting systems, BI tools, and so on. Data Debt can therefore cost many times more to remediate than Technical Debt. (Sourced from the book Enterprise Data Reference Architecture by Muralidharan Govindaraajan.)
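Why a reclassification percolates so widely can be sketched with a lineage walk. The sketch below is illustrative, not a real system: the lineage edges are invented, and the point is only that one reclassified element implicates everything downstream of it.

```python
# Hypothetical lineage graph: element -> downstream consumers.
LINEAGE = {
    "source.customer_ssn": ["lake.customer_ssn"],
    "lake.customer_ssn": ["model.churn_features", "report.customer_360"],
    "model.churn_features": ["dashboard.churn"],
}

def impacted_by_reclassification(element: str) -> set:
    """All downstream assets that must be remediated if `element`
    is reclassified (e.g. from "Public" to "Secret")."""
    impacted, stack = set(), [element]
    while stack:
        for child in LINEAGE.get(stack.pop(), []):
            if child not in impacted:
                impacted.add(child)
                stack.append(child)
    return impacted
```

Swapping a Java component for a Python one touches one node; reclassifying a source element touches every node reachable from it, which is the multiplier behind Data Debt.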

What We Offer

Architecture identifies components and formalises their specifications. In the case of Data Architecture, Data Topics are the components and Metadata is their specification.
Our goal is to enable a guided path towards building the Data Management process.
  • The Strategic Advantage of AI in Data Management
  • As we continue to navigate the complexities of modern data landscapes, the strategic implementation of AI in reference data management becomes not just beneficial but essential. The integration of AI extends beyond mere efficiency; it prepares organizations to handle the exponential growth of data in a scalable and sustainable manner. Looking forward, AI’s role is set to expand into predictive governance, where systems not only react to changes but anticipate and prepare for future data trends and regulatory shifts.

  • Enhancing Data Accuracy With AI
  • Streamlining Data Governance With AI
  • AI-Driven Automation for Real-Time Data Updates