How to Start Doing Data-driven Supply Chain Optimization

What is Supply Chain Optimization (SCO)?

A supply chain is the sequence of processes and activities involved in the production and distribution of a commodity to the ultimate consumer. Supply Chain Optimization (SCO) is the series of steps we take to reduce the costs incurred in carrying out those processes and activities. It includes the optimal placement of inventory and resources within the supply chain, minimizing operating costs.

Let’s take the example of a hair salon. Even a salon has a supply chain to manage: it needs more stylists on weekends and fewer on weekdays. If it plans poorly, customers sometimes endure long wait times, and at other times the salon pays stylists who sit idle. If the salon can predict demand, it can schedule its stylists accordingly and thereby optimize its supply chain.

Importance of Data in Supply Chain Optimization

Let’s consider the same example, a hair salon. If it keeps a history of its existing customers, sales by time slot, and missed opportunities, and also collects data on weather and local events, it can predict demand and schedule its stylists accordingly.
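To make the salon example concrete, here is a minimal sketch in Python. It is not a real forecasting model: it simply averages past visits per time slot and rounds up to a stylist count. The visit history, the `average_demand` and `stylists_needed` helpers, and the assumption that one stylist serves two customers per hour are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

# Hypothetical visit history: (weekday, hour, customers served in that hour).
HISTORY = [
    ("Sat", 10, 8), ("Sat", 10, 10), ("Sat", 10, 9),
    ("Tue", 10, 2), ("Tue", 10, 3), ("Tue", 10, 1),
]


def average_demand(history):
    """Return the average number of customers per (weekday, hour) slot."""
    totals = defaultdict(lambda: [0, 0])  # slot -> [sum of customers, observations]
    for weekday, hour, customers in history:
        totals[(weekday, hour)][0] += customers
        totals[(weekday, hour)][1] += 1
    return {slot: total / count for slot, (total, count) in totals.items()}


def stylists_needed(expected_customers, customers_per_stylist=2):
    """Round up so expected demand is covered; always keep at least one stylist.

    customers_per_stylist is an assumed capacity, not a figure from the article.
    """
    return max(1, math.ceil(expected_customers / customers_per_stylist))


demand = average_demand(HISTORY)
schedule = {slot: stylists_needed(expected) for slot, expected in demand.items()}

for (weekday, hour), count in sorted(schedule.items()):
    print(f"{weekday} {hour}:00 -> {count} stylist(s)")
```

A real salon would likely replace the simple average with a model that also uses weather and local events, but the shape stays the same: historical data in, a staffing decision out.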

Data-driven optimization is the new age of supply chain optimization. If a hair salon has dozens of data points to consider, imagine the scope for a retail company (Walmart/Amazon) or an auto manufacturer (Tesla/GM). At that scale, supply chain optimization is not achievable without sound data management.

Challenges in dealing with Data Diversity

Any organization uses various applications to manage its operations. A company typically stores sales data in an ERP system, customer information in a CRM system, and employee information in an HR system. A recent merger with another company complicates matters further. Consolidated sales data may then be spread across two ERP systems from different vendors, such as SAP, Oracle, or PeopleSoft. As a result, the existing supply chain data warehouse or data lake might not have all the data required to create a predictive model.
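As an illustration of the merger scenario, the sketch below normalizes sales rows from two ERP exports into one common shape. Everything here is assumed for the example: the field names (`order_id`, `DocNum`, and so on), the date formats, and the sample rows do not come from any real SAP, Oracle, or PeopleSoft schema.

```python
from datetime import datetime

# Hypothetical exports from two ERP systems that name and format fields differently.
ERP_A_ROWS = [
    {"order_id": "A-1", "order_date": "2023-01-05", "amount_usd": 120.0},
]
ERP_B_ROWS = [
    {"DocNum": "B-7", "PostingDate": "05/01/2023", "Total": "80.50"},
]


def from_erp_a(row):
    """Map a row from the first (hypothetical) ERP into the common schema."""
    return {
        "source": "erp_a",
        "order_id": row["order_id"],
        "order_date": datetime.strptime(row["order_date"], "%Y-%m-%d").date(),
        "amount": float(row["amount_usd"]),
    }


def from_erp_b(row):
    """Map a row from the second (hypothetical) ERP into the common schema."""
    return {
        "source": "erp_b",
        "order_id": row["DocNum"],
        # This system is assumed to write dates as day/month/year.
        "order_date": datetime.strptime(row["PostingDate"], "%d/%m/%Y").date(),
        "amount": float(row["Total"]),
    }


consolidated = [from_erp_a(r) for r in ERP_A_ROWS] + [from_erp_b(r) for r in ERP_B_ROWS]
total_sales = sum(row["amount"] for row in consolidated)
print(f"{len(consolidated)} orders, total sales {total_sales:.2f}")
```

Keeping a `source` field on every row makes it possible to trace a consolidated figure back to the system it came from.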

At other times, the data is poorly structured or poorly understood. Data scientists and analysts have to spend many hours discovering data and incorporating it into the model. Let’s divide the challenges into two categories.

  1. Challenges in developing or enhancing the supply chain optimization model

  2. Challenges in operationalizing a model into production.

Challenges in developing or enhancing the supply chain optimization model

  1. A company stores data in myriad systems

  2. Not all employees know their data or how to interpret it

  3. The application team and the supply chain optimization team differ in working styles and opinions

  4. It is not common knowledge who the data steward of a specific dataset (table or file) is

  5. Getting support across the organization within the project timeline is difficult

  6. Once the data is found, it takes a long time to move it to a warehouse where it can be analyzed

  7. Involving IT in data movement adds dependencies and slows things down

  8. Cleaning up all the data is a daunting task

Challenges in Operationalizing the model

  1. The supply chain data warehouse cannot scale, and queries take a long time to complete

  2. Data movement takes a long time

  3. If a query fails for any reason, its impact is not known
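One way to make the impact of a failed query visible is to record which datasets feed which, and then walk that graph. The sketch below assumes such lineage information is available. The dataset names and the `downstream` helper are hypothetical, and this is only one possible approach, not a method the challenges above prescribe.

```python
from collections import deque

# Hypothetical lineage: each dataset maps to the datasets built directly from it.
LINEAGE = {
    "erp_sales": ["sales_daily"],
    "crm_customers": ["customer_dim"],
    "sales_daily": ["demand_forecast"],
    "customer_dim": ["demand_forecast"],
    "demand_forecast": ["staff_schedule"],
}


def downstream(dataset, lineage):
    """Return every dataset that directly or indirectly depends on `dataset`."""
    seen = set()
    queue = deque([dataset])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:  # guard against revisiting shared dependents
                seen.add(child)
                queue.append(child)
    return seen


# If the query that builds sales_daily fails, these datasets are affected.
print(sorted(downstream("sales_daily", LINEAGE)))
```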


An organization can deal with most of the challenges faced in creating a model by adopting a data catalog. The team should devote a week or two to organizing all the data in a data catalog. Next, the organization should conduct a workshop for the SCO team. If the data catalog is an IT department initiative, then multiple teams (Supply Chain team, M&A team, Analytics team) can benefit from it.

The more teams use the data catalog, the better for the whole organization. A data catalog enables the supply chain team to look at the data comprehensively and understand it quickly. The team can then readily discover the data sources it needs. Once the team has identified these sources and data, it can bring all the data into a central and dynamic data lake and perform an in-depth analysis.
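To show what a catalog entry might hold, here is a minimal sketch of a data catalog as a searchable list. Real catalog products offer far more. The `CatalogEntry` fields, the sample entries, and the `search` helper are assumptions made for illustration, with the steward field included because unknown data stewards were one of the challenges listed earlier.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    name: str          # dataset (table or file) name
    system: str        # source application, e.g. ERP, CRM, HR
    steward: str       # who to ask about this dataset
    description: str   # plain-language meaning of the data
    tags: set = field(default_factory=set)


# Hypothetical entries; a real catalog would be populated from the source systems.
CATALOG = [
    CatalogEntry("sales_orders", "ERP", "sales-ops team",
                 "One row per confirmed order", {"sales", "orders"}),
    CatalogEntry("customers", "CRM", "marketing team",
                 "Customer master records", {"customers"}),
    CatalogEntry("staff_roster", "HR", "hr team",
                 "Employee shifts and availability", {"staffing"}),
]


def search(catalog, keyword):
    """Return entries whose name, description, or tags mention the keyword."""
    keyword = keyword.lower()
    return [
        entry for entry in catalog
        if keyword in entry.name.lower()
        or keyword in entry.description.lower()
        or keyword in {tag.lower() for tag in entry.tags}
    ]


for entry in search(CATALOG, "sales"):
    print(f"{entry.name} ({entry.system}), steward: {entry.steward}")
```

Even a simple structure like this lets an analyst find out where a dataset lives and who owns it without having to ask around the organization.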