Given that data science has been dubbed “the most promising” career by LinkedIn and the “greatest job in America” by Glassdoor, many in the industry find it difficult to comprehend how something as lucrative-sounding as data science can ever be considered dead. By definition, “Data science is the field of study that combines domain expertise, programming skills, and knowledge of mathematics and statistics to extract meaningful insights from data.”
So, unless we find a way to stop using data altogether, data science as a field is not going to become obsolete anytime soon. However, many believe that since a data scientist’s daily tasks are quantitative or statistical in nature, they can be automated, and there will be no need for data scientists in the future.
Domain expertise matters
The notion comes from the fact that some of a data scientist’s tasks, such as data cleansing, data visualisation and model building, can be partially automated by AutoML tools. However, while the tools might be able to do these tasks efficiently, many overlook the “domain expertise” part of the definition of data science.
Domain expertise refers to the extensive knowledge of a particular field that data scientists apply alongside their data science skills. So, even if a large part of the data pipeline and workflow is automated, you still need a data scientist to translate the business problem being solved into the correct format.
Furthermore, it is not easy to identify which data science model to apply in a given industry, particularly when industries differ so widely; a recommendation algorithm built for the health industry would not be helpful for a video streaming platform.
Tina, a former data scientist at Meta, believes that a very underappreciated part of a data scientist’s job is applying the correct context to a model. Talking about her experience at Meta, where she worked on Instagram’s integrity, she said, “There was a machine learning module which screened for content integrity and subsequently demonetised them if they broke the rules. My job, apart from getting the data from the model, was to determine what is even considered as breaking the rule.”
According to Tina, “The issue is that ML models can’t detect ‘unknown unknowns’, and if you can’t even measure something, how can you determine if it’s even breaking the rule? There is always a balance between free speech and integrity.”
As Dr Vaibhav Kumar, senior director for data science at the Association of Data Scientists (ADaSci), puts it, “Data science is a field where only 50% of the potential has been realised.”
“The field, in my opinion, still needs a lot of work and is far from being over. Machine learning may be used in various tasks of the workflow, but data scientists are still required to determine what to do next. What do these results of the model mean? How do you determine if the model is even doing a good job? What’s the metric?” he asked AIM.
“There will always be a need for human assistance in the field of data science, which machine learning alone cannot provide,” said Dr Vaibhav.
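Dr Vaibhav’s question about the metric can be made concrete with a small sketch. The numbers below are invented purely for illustration and do not come from Meta or any real system: assume 1,000 posts, of which 20 break the rules, and two hypothetical screening models. The helper name `evaluate` is likewise an assumption made for this example.

```python
def evaluate(y_true, y_pred):
    """Return accuracy, precision and recall for binary labels (1 = violation)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Guard against division by zero when nothing is flagged or nothing is positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical data: 20 violating posts followed by 980 clean posts.
y_true = [1] * 20 + [0] * 980

# Model A never flags anything.
pred_a = [0] * 1000

# Model B catches 15 of the 20 violations but also flags 40 clean posts.
pred_b = [1] * 15 + [0] * 5 + [1] * 40 + [0] * 940

result_a = evaluate(y_true, pred_a)
result_b = evaluate(y_true, pred_b)

print("Model A:", result_a)
print("Model B:", result_b)
```

In this made-up scenario, Model A has the higher accuracy even though it never catches a single violation, while Model B trades some accuracy for actually finding most of them. Which trade-off is acceptable is a judgement about the business problem, not something the tooling decides by itself.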
So, is it dying?
A similar fear surfaced a few years ago in the accounting sector, when it was claimed that AI might replace accountants and auditors. However, even if an AI programme can do pretty much everything an accountant can, you still need the accountant’s expertise for tax exemptions, credits, etc.
In a similar vein, a data scientist may rely on AutoML models to collect, visualise, and clean data so that they can concentrate more on business needs. Additionally, the demand for data scientists will only increase in the future, as data science is still in its infancy in many conventional areas such as finance, healthcare, defence, and governance.
Ironically, for AutoML data exploration to even occur, it needs data first, which is something that data scientists themselves gather.