If you follow the MBA USP/Esalq blog, you already know a good deal about Data Science, right? But do you know exactly why data engineering matters so much to it?
We talked to Jeronymo Marcondes, a professor in the MBA in Data Science and Analytics at USP/Esalq, who explained why data engineering is so important for Data Science projects.
But first, if you haven’t seen it yet, how about checking out the 3 pillars of Data Science for those who want to stand out?
What is it?
So, do you know what data engineering is? According to Marcondes: “It is the name we give to the area that prepares the data for the data scientist. Basically, data engineers carry out the processes of data extraction, transformation and loading (extract, transform, load, or ETL).”
It is the data engineer who ensures that the data reaches data scientists ready to be used.
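To make the three ETL steps more concrete, here is a minimal sketch in Python using only the standard library. It is an illustration, not a description of how Marcondes or any particular team builds pipelines: the sample data, the `sales` table and the function names are all hypothetical, and an in-memory SQLite database stands in for a real destination.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; in practice this would come from a file, an API or a source system.
RAW_CSV = """customer,amount,date
 Ana ,10.50,2021-03-01
Bruno,,2021-03-02
ana,4.50,2021-03-03
"""


def extract(text):
    """Extract: read the raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize names, drop rows without an amount, convert types."""
    clean = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        clean.append((row["customer"].strip().lower(), float(amount), row["date"].strip()))
    return clean


def load(rows, conn):
    """Load: write the cleaned rows into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


# In-memory database used here only as a stand-in for a real destination.
conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)

# What a data scientist might then query: total sales per customer.
total = conn.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer"
).fetchall()
print(total)
```

After the pipeline runs, whoever queries the `sales` table no longer has to deal with the stray spaces, inconsistent capitalization or missing values that were present in the raw input.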
What is it for?
And what do data scientists do with the information that reaches them from data engineering? Data Science is a broad and complex concept that is hard to summarize, but Marcondes clarifies some important processes.
“In general terms, data scientists work with predictions and inferences. Drawing on multidisciplinary knowledge, such as mathematics, computing and statistics, the data scientist seeks to capture, store and process information from data so that these predictions and inferences can be generated”, he explains.
The relationship of data engineering with processes
Data engineering is responsible, then, for creating the processes that build databases through ETL.
“Maintaining these processes, optimizing data response and designing the architecture that feeds the systems are some examples of the data engineer's other responsibilities. In short, the data engineer ensures that the data used by data scientists is up to date, available and organized in an efficient architecture, which makes their queries easier”, Marcondes explains.
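As one small illustration of what "an efficient architecture that makes queries easier" can mean in practice, the sketch below adds an index to a table and asks SQLite how it plans to run the same query before and after. This is only an assumed example: the `sales` table, the index name and the generated data are hypothetical, and real optimization work usually involves much more than a single index.

```python
import sqlite3

# In-memory database with a hypothetical sales table of 1,000 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(f"customer_{i % 100}", float(i)) for i in range(1000)],
)

# A query a data scientist might run often.
QUERY = "SELECT SUM(amount) FROM sales WHERE customer = 'customer_7'"


def query_plan(conn, sql):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn, QUERY)

# The data engineer adds an index on the column used for filtering.
conn.execute("CREATE INDEX idx_sales_customer ON sales (customer)")
plan_after = query_plan(conn, QUERY)

result = conn.execute(QUERY).fetchone()[0]

print("Before:", plan_before)
print("After: ", plan_after)
print("Result:", result)
```

The query and its result stay the same; what changes is that, with the index in place, the plan reports using `idx_sales_customer` instead of reading the whole table. The exact wording of the plan depends on the SQLite version.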
The data engineering professional
And if you are wondering which characteristics matter most for a data engineer, the professor gives some tips!
“I believe that the main characteristics are: analytical thinking, deep knowledge of databases and of data storage and transfer methods and, above all, the will to solve problems. That last one is the main characteristic, in my personal opinion.”
Data engineering in crises
According to Marcondes, the Covid-19 pandemic was a good example of the importance of data engineering. “Many people started working from home and needed the company's data to be available remotely”, he says.
How do you ensure efficient databases for data scientists? How do you ensure that the data used by employees and applications can still be queried efficiently?
The professor answers: “Despite the indispensable role of IT (Information Technology) infrastructure professionals, the role of data engineers was fundamental at these moments.”
Did you know about the work of data engineering professionals? What did you think? Leave a comment!