Why Data Scientists Are Your AI Project's Biggest Enemy

Forbes Technology Council
POST WRITTEN BY
Jeff Catlin

It’s obvious that machine learning (ML) can’t exist without data scientists. But some of the patterns I’ve noticed recently suggest that maybe it can’t live with them, either. Data scientists may be working to advance ML and artificial intelligence (AI), but in doing so, they’re working at cross-purposes with their employers. The results aren’t sustainable.

To see why, we only have to compare how data scientists work with how others in the tech space do -- software engineers, for example.

We’ve seen amazing advances in software in the last decade. We have software libraries and services to thank for that, as well as software engineers’ willingness to use those libraries and services to quickly and easily build powerful new applications. Rather than reinvent the wheel, software engineers are trained to use existing libraries whenever possible. Building on top of existing good code is something they’re used to doing and see as valuable.

Data scientists, on the other hand, often avoid using existing tools and services. There are a bunch of reasons for this, and many of them aren’t the fault of the data scientists.

So why is “roll your own” the norm among data scientists? The relative newness of the field, the open-ended projects they work on and the lack of organizational understanding of how an AI project works all contribute to data scientists being given free rein to play in the AI sandpit. That can be fun and educational for data scientists, but it is an expensive lesson for organizations.

Given the huge budgets and relative autonomy they enjoy, data scientists aren’t necessarily in a position where they need to consider efficiency and return on investment. Without a system of checks and balances in place, it’s tempting for them to build a model from the ground up rather than look toward an existing AI toolkit that could handle a lot of the scaffolding needed to make headway on a problem.

I’ve talked to data scientists who argue they can’t find tools that do everything they need. But ML and AI now have their own software library equivalents, and there are plenty of vendors who provide frameworks to manage hyperparameter optimization, over-tuning and algorithm selection. These tools are new and have lots of room for improvement, but if machine learning is to fulfill its promise, these tools are required components that data scientists need to embrace. They are certainly a more efficient and effective option than creating your own, especially as a project scales.

While a large part of the “roll your own” thinking is on data scientists, organizations also need to develop their understanding of what goes into an AI project and what they should expect of their data scientists. They need to know that it’s viable to say “buy it, don’t build it” and to push back against a data scientist who argues otherwise, because these tools, when combined with measurable key performance indicators, can help prevent an AI project from becoming a financial black hole.

Accountability, together with a cultural shift toward processes that more closely resemble how software engineers approach problems, will help data scientists deliver the value they need to justify their long-term place on the team.

Machine learning can’t become a go-to approach for solving business problems until data scientists stop trying to reinvent the wheel -- and organizations stop letting them. Until then, data scientists are the biggest threat to the viability of your AI project.

If your company is new to AI and you’re worried about time and costs blowing out, my suggestion is to start small. You can use the following checklist to determine whether AI is a good fit for your problem:

Are you replacing an existing system or process that is well-understood?

Is this system or process heavy on human oversight and/or tasks?

Are these tasks fairly easy but time-consuming for people?

Is this system or process well-documented with lots of examples of results?

Do you have ready access to the results of the existing system or process to use as training data?

You should be able to answer yes to every one of the above questions. If you can, you have a good candidate project that is likely to succeed, and that will give you the experience you need to tackle harder projects in the future.
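If it helps to make the checklist concrete, here is a minimal Python sketch of the "yes to every question" rule. The short keys and the function name `assess_ai_fit` are hypothetical labels chosen for illustration; only the question wording comes from the list above.

```python
from typing import Dict, List, Tuple

# The five checklist questions, keyed by hypothetical short names.
CHECKLIST: Dict[str, str] = {
    "well_understood": "Are you replacing an existing system or process that is well-understood?",
    "human_heavy": "Is this system or process heavy on human oversight and/or tasks?",
    "easy_but_slow": "Are these tasks fairly easy but time-consuming for people?",
    "well_documented": "Is this system or process well-documented with lots of examples of results?",
    "training_data_available": "Do you have ready access to the results of the existing system or process to use as training data?",
}


def assess_ai_fit(answers: Dict[str, bool]) -> Tuple[bool, List[str]]:
    """Return (is_good_candidate, questions answered 'no').

    A project counts as a good candidate only if every question is a yes.
    """
    missing = [key for key in CHECKLIST if key not in answers]
    if missing:
        raise ValueError(f"No answer given for: {', '.join(missing)}")
    unmet = [CHECKLIST[key] for key in CHECKLIST if not answers[key]]
    return (len(unmet) == 0, unmet)


# Example: everything is a yes except access to training data.
answers = {key: True for key in CHECKLIST}
answers["training_data_available"] = False
is_good_candidate, unmet = assess_ai_fit(answers)
print(is_good_candidate)  # False
for question in unmet:
    print("Still a no:", question)
```

Returning the unmet questions, rather than a bare yes or no, shows a team which gap to close before committing budget.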

Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.