Why I Left Data Science - And Picked Data Engineering Instead

Video Statistics and Information

Captions
Like many people who started their data career around 2012, I was definitely hit by the HBR article about data science being the sexiest job of the 21st century. Yet here I am, nearly a decade later, not doing data science, at least not as directly as I used to. So I want to talk about why I left data science and why I started pushing more towards data engineering early on in my career. To do that, let's go back a bit and talk about my experience in college, the courses I was taking, and how that pushed me towards data science first.

Going back to my college experience, I first started interacting a lot more with data, at least applied data, in an epidemiology course. For those of you who haven't taken an epidemiology course, it's basically a branch of medicine that focuses on all of the math and statistics around the transfer, control, and management of disease. You learn things like incidence, distribution, and things of that nature, so it's how you can mathematically plot how disease is transferred and how you can actually manage it.

One of the examples you go through is that of John Snow. Nope, not that Jon Snow. No, this John Snow knew a few things, and one of the things he discovered through his research was that a recent cholera outbreak hitting the city wasn't caused by what people at that point thought was bad air, but by bad water. He used different data points, including where in the city people were getting sick, as well as the fact that the beer brewers weren't getting sick, and used this analysis to figure out that it was the water making people sick. That, along with the computer science course I was taking at the time, pushed me to want to use computer science and statistics together, specifically towards healthcare, and that's really what I was trying to figure out.

After that, I took some courses in bioinformatics and a couple of other spaces, just because I wanted to know more about how I could apply programming and math towards healthcare. Eventually I came across the HBR article, and that's what pushed me towards data science.

The first job I got was at a hospital, or more specifically a hospital network, and I was really pushing hard at that company to work as a data scientist. When I was hired I was an analyst, and I pushed to be involved in some of the data science projects. We worked on things like readmission. That was a really common project to work on. It might still be, but especially back then I think that was one of the first places where people were trying to apply the idea of data science to healthcare problems: can we detect why people are having readmission problems? Readmission just means patients being readmitted to the hospital 30 to 60 days after a visit. We were using different socioeconomic data to try to detect whether there were factors outside of the hospital at play, to figure out how we could make sure that every patient is taken care of. I didn't have the greatest experience there, because honestly the data scientists who were also hired were new to the field as well, and no one really knew how to drive a project forward. That gave me a bad taste in my mouth.

When I went to my next company, which was also a healthcare company, in this case a healthcare analytics company, I saw how you could use data science, or just basic stats, to actually drive value and build a product. I think this is what really started pushing me towards the data engineering route. At this company we were really focused on things like fraud detection, general health care (how healthy are populations?), as well as detecting things like opioid over-prescription. These were all really interesting things that we didn't just do research on; we built solid products around them.

This was the next domino to fall, the first being, again, working for companies that many times don't even know how they want to use data or find value from it. I think that's a very common thing. Some people just find that they want to build things that are permanent. Yes, the research is fun, and there was someone who did all the research, but that's where their project ends. They don't get to code things; they don't get to actually make things permanent. I actually got to take those models, reconfigure them so they were more performant and output what we wanted, and make them a permanent fixture of our solution. That was what first started to kick off the feeling that maybe I prefer data engineering.

There were a lot of other little things, too. It was this desire to make things permanent and make things more of a product. It was this desire to not be stuck in a forever research loop, where you're constantly asking "what's this next question, what's this next question, what's this next question" and rarely get to deliver anything. That pushed me more and more towards the data engineering world.

I think another facet of it was that data engineering tended to be a larger problem set, and I think this is why some, or maybe many, data scientists end up becoming data engineers whether they want to or not. They just end up having to do data engineering work rather than data science work. Instead of getting to do some research, you have to create your data sets in such a way that they are clean and reliable, so you have to create data pipelines that are reliable, and you basically just become a data engineer. This has been a problem ever since at least 2012, maybe before. Back in the days of Hadoop, when that was really popular, a lot of data scientists were expected to write MapReduce jobs, and even now I'm seeing data scientists have to do Airflow work just to create their clean data sets.

So there is something there: a lot of the work that gets done in the data world is around the data infrastructure and the actual data pipelines, just to get us to the point where we can rely on that data. I think that's another reason many people have been pushed this way, one way or the other. They may or may not like it. Some people lean towards it. Some people are very happy in this space; they like the fact that they don't always have to present to the board, because they just get to create the tables and then someone else gets to create the actual work. It's an interesting flow, and I'm honestly continuing to see it today, again nearly a decade later: people are still experiencing these same issues in data science, and they end up becoming data engineers over a slow period of time whether they want to or not.

I became one because I like the work, but I really do think there's a lot of work that needs to be done in company culture, as well as in clarifying roles, so that we can make sure people do the work they like. I know there are plenty of people who think you should be able to do both, and all, and everything, and I get it. But especially if you can afford to have two teams that do this kind of work, it's nice to let people focus on what they're good at.

So that's why I ended up leaving data science for data engineering. I just tended to like the work that was a little more permanent, a little more focused on delivering a product, even if it's just a table, and making that permanence an actual thing rather than getting stuck in these research cycles over and over again. To some degree, it was just a necessity of the data world.

With that, I'd love to hear your thoughts on why you became a data engineer, or maybe why you're looking at data science and thinking that might be your career path. Whichever it might be, please leave your comments below, and I will see you in the next video. Thanks, and goodbye.
Info
Channel: Seattle Data Guy
Views: 15,626
Keywords: data science, data science vs data engineering, should i become a data engineer, should i become a data scientist, how to become a data engineer, what is data engineering, why i quit data science, why i became a data engineer, becoming a data engineer in 2023, is data engineering a good career choice, how to become a data scientist, data scientist
Id: hMRZmN4LK5U
Length: 7min 13sec (433 seconds)
Published: Mon Jul 03 2023