Challenges moving data science proof of concepts (POCs) to production
By Julien Kervizic

Many companies want to be able to leverage the power of data and look to invest in data science proof of concepts as a way to tiptoe into it. Unfortunately, a high number of proof of concept initiatives fail to make it to production. From my experience, there are multiple reasons why this happens. The challenges with operating data science are about more than purely creating a predictive model out of sample data. There are organizational, project, data, and infrastructure challenges that an organization must face along its journey to becoming data-driven.

Organizational issues

Multiple organizational factors can impact how likely data science projects are to work out. Having teams empowered to put application code into production, the composition of the teams, and the organization’s mentality all contribute to the success of the project.

Empowerment

In large traditional companies, the data science team is often not empowered to put models into production, from not having access to production data to not being allowed to push code or applications to production environments.

Some organizations put specific restrictions on pushing code to production: requiring a support contract, specific S.L.A.s to be respected, and, frankly, someone to blame so as not to bear full accountability if something goes wrong.

This situation can lead to unreasonable conditions for data scientists putting code into production. For instance, maintenance contracts may only cover specific programming languages, for example, C# rather than a language typically used for data science like Python or Scala. Some organizations might effectively be asking teams to reinvent the wheel to put models into production.

This difficulty of access and limited ability to put code into production in some large companies contrasts with technology companies such as Facebook, where data scientists are introduced to putting code into production and accessing the company's data within their first month.

Empowerment and accountability are particularly important for giving larger data science teams the ability to deploy their models to production.

Organizational compartmentalization and office politics

ML and A.I. projects tend to span multiple departments and can face quite a high degree of compartmentalization and politics, even more so than I.T. projects.

Large enterprises typically have an Enterprise Architecture function and rely on an approval process before setting up applications in production. The E.A. function might not have been particularly exposed to data science in the past and might not recommend the most pragmatic way forward.

Stakeholders can also be an obstacle to moving models to production. There are different reasons why stakeholders might lose appetite. Often, data science projects are pushed down from the top, and stakeholders can become wary of the impact of their implementation.

Stakeholders can lose appetite when the decision process is handled by a machine rather than controlled by them. Take, for instance, the use case of personalized communication: the CRM team in charge might be open to leveraging predictive models in production, but when faced with the reality of operationalizing them, they might be stuck in the old ways of segmentation.

Another reason stakeholders might lose appetite is that the success factors for the P.O.C. might include a "time saved" metric. Stakeholders might be unwilling to commit to a metric that could be translated into an FTE reduction.

Team / Organization

The team in place and its dynamics affect the chances of putting models into production. The inclusion of third parties, as well as the presence of functions supporting the initiative, can drastically impact the likelihood that a data science application ends up in production.

Third parties

Dealing with third parties can complicate the process of moving data science applications into production. Agency, coordination, or administrative issues can all get in the way.

Agency issues: Relying on third-party/external consultants rather than in-house teams for development can create agency issues. With multiple companies working on the same project, some might attempt to jeopardize parts of the project to get a more significant piece of the pie. Another thing to consider is that some of the companies involved might lose future business should the project prove a success.

Coordination: Coordinating with multiple third parties can become problematic, especially when people are not working full time on the project, or when layers of project managers are involved.

Paperwork: Leveraging third parties can slow down the process of putting models into production due to administrative procedures. It might require requesting additional budget, defining specific statements of work, signing contracts, etc.

Supporting functions

Functions other than data science play a large part in being able to productionalize models. A productionalization effort can require input from product/project management, data engineering, software engineering, DevOps, Q.A., and more.

Often, when building a proof of concept, these functions are either not present within the project or, if present, not included in the conversation at the early stages. This lack of involvement can make it particularly difficult to convert the P.O.C. into a production-ready application. Including these functions early can be key to ensuring the success of the P.O.C.

Mentality

The mentality of the organization is essential in achieving data science success. Two main factors contribute positively: agility and willingness to experiment.

Embracing agility within the organization, and a culture of providing incremental value rather than wanting the project to be perfect from day one, is a core factor in being successful with data science. Agility is required both on the side of the stakeholders and of the data science team. Stakeholders should not demand too much predictive power or automation, and data science teams should not develop overly complicated models from day one.

Another important enabler of successful data science projects is the organization's willingness to experiment. Data science requires a leap of faith that data and prediction models will eventually create a positive impact on the organization. The results might not always be positive or controllable, but iterative development should bring the project to a phase of control. Risk aversion is one of the reasons why some organizations have decided to split development activities between an I.T. and a digital department.

Organizational Knowledge

The knowledge required to move an application to production might not be readily available within the organization. It might require specialized application knowledge, such as integration with a legacy point-of-sale system.

Data issues

Different data-related issues may arise when developing the application or integrating it into production systems. The overall data quality might not prove appropriate, there may not be enough data, or there may be differences in the underlying data between when the P.O.C. phase took place and when it is being integrated into production.

Data Quality

The data used might not be "quality data." The data used for the proof of concept might not have been previously looked at. There could be systematic gaps, or there might have been specific events within the training datasets that were not initially well documented and that can influence the initial training.

Data can also be too noisy to extract anything of value. The right information might not be surfaced, for example because it was not captured as part of the dimensions, or because it relied on a manual capture mechanism; as a result, the data ends up not having enough predictive signal. Obtaining the information in a way that provides enough signal can require specific development. Acquiring information from both internal and external sources can help close the data quality gap. Depending on the issue at hand, another approach is to rely on additional processing; one example is merging customer data to provide a holistic view of customer behavior, as sketched below.
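As a hypothetical illustration of that kind of processing, the sketch below merges CRM records with web analytics events into a single customer view. The file names, column names, and join key (crm.csv, web_events.csv, customer_email) are assumptions for the example:

```python
# A minimal sketch, assuming two internal sources keyed by email,
# of merging customer data to build a more holistic behavioral view.
import pandas as pd

crm = pd.read_csv("crm.csv")          # hypothetical: customer_email, segment, lifetime_value
web = pd.read_csv("web_events.csv")   # hypothetical: customer_email, page_views, last_visit

# Normalize the join key first; inconsistent casing/whitespace is a
# common source of silent mismatches when merging sources.
for df in (crm, web):
    df["customer_email"] = df["customer_email"].str.strip().str.lower()

# Aggregate event-level data to one row per customer, then join.
web_agg = (
    web.groupby("customer_email")
    .agg(total_page_views=("page_views", "sum"), last_visit=("last_visit", "max"))
    .reset_index()
)
customers = crm.merge(web_agg, on="customer_email", how="left")
```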

Data Volume

The overall data volume might not be enough to detect statistically significant changes and train machine learning models effectively. Some models, such as deep learning models, need significantly more data to gain prediction power.
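As a rough illustration of the first point, a standard power analysis estimates how many samples are needed before a change can be detected with statistical significance. The effect size, significance level, and power below are illustrative assumptions, not recommendations:

```python
# A minimal sketch of estimating the sample size needed to detect an
# effect of a given (assumed) size in an A/B-style comparison.
from statsmodels.stats.power import NormalIndPower

required_n = NormalIndPower().solve_power(
    effect_size=0.1,          # assumed small effect (Cohen's h)
    alpha=0.05,               # significance level
    power=0.8,                # desired statistical power
    alternative="two-sided",
)
print(f"~{required_n:.0f} samples needed per group")
```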

Besides the overall data volume, enough historical variation within the training data is needed to avoid regressions when pushing the model into production. Think of a model meant to detect apples that has only been trained on data captured during the summer: it might be good at detecting ripe apples but unable to identify unripe ones. The data might simply not contain enough diversity to have predictive power.

Data has changed

Between the time taken to move the proof of concept to production, user behavior or conditions might have changed. In machine learning, we refer to this change in the underlying data as data drift. Data drift causes a decrease in the performance of prediction models and typically requires retraining models when performance metrics drop below a certain threshold.
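A minimal sketch of such a threshold-based retraining trigger is shown below. The model, data loader, retraining routine, and the 0.75 AUC floor are all hypothetical placeholders:

```python
# A minimal sketch of a drift check: evaluate the model on freshly
# labeled production data and retrain when performance drops too low.
from sklearn.metrics import roc_auc_score

AUC_THRESHOLD = 0.75  # assumed acceptable floor; tune per use case

def check_and_retrain(model, load_recent_data, retrain):
    X, y = load_recent_data()        # hypothetical: recent labeled data
    auc = roc_auc_score(y, model.predict_proba(X)[:, 1])
    if auc < AUC_THRESHOLD:
        model = retrain(X, y)        # hypothetical: refit on recent data
    return model, auc
```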

There can also be differences between the way data was collected for the proof of concept and in production. In such a case, it will be necessary to redo some of the pre-processing steps, recompute the different features, and retrain the model.

Project issues

Multiple issues can occur at the project level that might make it improbable that a P.O.C. gets pushed to production. Foundational, scope, or integration issues, wrong data uplift assumptions, or relying on manual processes are all factors that affect how likely the P.O.C. is to be productized.

Foundational issues

Issues often arise from not starting the project on the right foundation. Data science projects are often pushed top-down, which doesn't leave the opportunity to properly structure them and ensure that the project can be based on the right foundations.

One such foundational issue is starting the project with data collected ad-hoc rather than relying on existing data pipelines. There could be differences between the datasets, and when moving the product to production, the data pipeline to collect this information will need to be built before the model can be pushed live.

Another major foundational issue is not making sure that there is a feasible path to automation before beginning the project. When possible, it is also necessary to make sure the necessary access or permissions have been granted prior to kicking off the project. Not having them from the start can cause delays and place the project at risk.

Scope issues

Scoping issues at the proof of concept stage can also be a reason why applications don’t end up in production.

One scope issue that can arise is not tackling a big enough scope at the proof of concept phase. For those who have watched the series Silicon Valley, think of Jian-Yang's application "Hot dog / Not hot dog." The app is good at demonstrating the concept of identifying a specific type of food, but not at identifying a variety of foods. For many companies, the proof of concept phase is a time-bound endeavor, and it might not be possible within the allocated time-frame to tackle a scope big enough that the P.O.C. could be pushed to production without significant risks.

Manual process

Relying on manual processes for applying the decisions made by machine learning models proves detrimental. People might not necessarily follow the model's recommendations, making performance evaluation particularly difficult.

Data uplift

Sometimes the P.O.C. doesn't bring the expected uplift and doesn't justify pushing the application to production. Lower than expected results can arise for multiple reasons: the method used to come up with the estimate might itself be faulty, or it might take more than just better data input to obtain the expected uplift.

Relying on industry benchmarks or uplift numbers provided by consultants, without considering your full business context, can lead to substantial differences between the expected uplift and what is obtained during the P.O.C. For example, the average e-commerce website reports a significant increase in revenue through personalized product recommendations. But if your website doesn't have a discovery problem and carries only 10–20 items, it is unlikely that personalized product recommendations would significantly increase revenues.

Sometimes the gap in uplift is not due to a data problem; rather, what is done with the data might not be appropriate to generate the expected improvement. For example, this can be the case when, after identifying the gender of users, only the colors and pronouns in the communication are changed rather than the offers/products suggested.

Integration issues

Not looking at integration from the start can raise issues when attempting to move to production. The integration might prove too costly or complex, or it might not even be possible.

There are multiple reasons why integration costs might increase when productionalizing the models. Fragmented I.T. systems or differences in business processes between different parts of the organization can make it cost-prohibitive to deploy the application across the entire company. Integration might also need to rely on expensive resources that were not foreseen to be that costly, and the complexity of integration might prove higher than expected when digging into the details.

There might also be other hurdles in trying to put the models into production. The integration might prove infeasible, either due to limitations in the current systems, or because specialized resources needed for the integration (for instance, third parties) might not be available.

Infrastructure issues

Without infrastructure, it would be quite challenging to operate machine learning models in production. Doing data science in production relies on infrastructure for processing and serving data, as well as for handling the deployment and monitoring aspects.

Data processing infrastructure

To easily productionalize and extract value out of data, there needs to be an infrastructure in place that allows for data processing.

In terms of needs, this means computing capacity for training models and computing aggregates and predictions. For cloud-friendly businesses, this usually means access to a cloud subscription or some VMs. For companies in more regulated industries, it might mean purchasing additional servers.

Besides raw compute capacity, infrastructure for data storage and transfer, as well as processing frameworks (e.g., Spark) and frameworks for training models (e.g., Uber's Michelangelo), help lower the cost of productionalizing data science proofs of concept.
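For instance, a processing framework makes the recurring feature computations behind a model cheap to schedule and scale. The sketch below shows a hypothetical PySpark job; the input path and column names are assumptions:

```python
# A minimal PySpark sketch of computing per-customer aggregates that
# could feed model training or scoring. Paths/columns are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("feature-aggregates").getOrCreate()

orders = spark.read.parquet("s3://bucket/orders/")  # hypothetical path
features = orders.groupBy("customer_id").agg(
    F.count("*").alias("order_count"),
    F.sum("amount").alias("total_spend"),
    F.max("order_date").alias("last_order_date"),
)
features.write.mode("overwrite").parquet("s3://bucket/features/customer/")
```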

Data Serving infrastructure

On top of a data processing infrastructure, there needs to be the necessary infrastructure to serve model predictions. Exposing model predictions is not just about exposing a SageMaker model directly as an API, but about being able to marry the data with the model.

Features need to be available within the production environment and need to match those available during training. Different approaches can make this data accessible within the application ecosystem, from a fit-for-purpose or microservice approach (like Amazon) to a unified data access model (like Facebook). Having data readily accessible in production for modeling purposes significantly decreases the cost and complexity of integrating the model in production.
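As a sketch of what "marrying the data with the model" can look like, the service below looks up a customer's stored features and scores them with a trained model. The Flask endpoint, pickled model file, and in-memory feature store are hypothetical stand-ins:

```python
# A minimal sketch of a prediction service that joins stored features
# with a trained model instead of exposing the model alone.
import pickle
from flask import Flask, jsonify

app = Flask(__name__)
model = pickle.load(open("model.pkl", "rb"))  # hypothetical artifact
feature_store = {}  # stand-in for a real feature store, keyed by customer id

@app.route("/predict/<customer_id>")
def predict(customer_id):
    # Features must match those used in training, in order and definition.
    features = feature_store.get(customer_id)
    if features is None:
        return jsonify(error="unknown customer"), 404
    score = model.predict_proba([features])[0][1]
    return jsonify(customer_id=customer_id, score=float(score))

if __name__ == "__main__":
    app.run()
```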

Data/ML Ops

There are also operational aspects to running models in production, from being able to monitor and evaluate model performance to handling model deployment, versioning, and experimentation.
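One common way to cover the versioning and monitoring side is an experiment tracker such as MLflow. The sketch below logs a model version with its parameters and evaluation metric so deployments can be compared over time; the run name and toy dataset are assumptions:

```python
# A minimal MLflow sketch: train a model, then log its parameters,
# metric, and artifact as a tracked, versioned run.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=200).fit(X_train, y_train)
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

with mlflow.start_run(run_name="churn-model-v2"):  # hypothetical name
    mlflow.log_param("n_estimators", 200)
    mlflow.log_metric("auc", auc)
    mlflow.sklearn.log_model(model, "model")       # versioned model artifact
```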

Summary

There are multiple obstacles to productizing data science applications, from organizational or project issues to data and infrastructure. However, a couple of things can be done to give the project a better chance of success.

Setting up a multi-disciplinary team dedicated to the project and giving it the mandate to act in a domain area. It is necessary to empower the team and hold it accountable for the success of the initiative.

Taking an incremental approach to development, and prioritizing automation and integration before looking at building predictive models.

Gradually setting up the infrastructure along with the use-case development, with the infrastructure helping decrease the cost and complexity of future data initiatives.
