Interview: Amazon Data Engineer (Majoring in Computer Science to working as Data Engineer)

Video Statistics and Information

Captions
Well, I have a few resources which helped me become successful while cracking the data engineering interviews.

Thank you, Rahul, for your time; really, really appreciated. Would you please tell me a little bit about yourself?

Firstly, Young, thank you for the opportunity. I'm really interested and excited about it. I'm currently working as a data engineer at Amazon, and have been for about two years. I would like to start with a little bit about my education background and how my journey has been so far. I did my undergrad in computer science, focusing on distributed systems, in India at SRM University, Chennai. I came to the United States to pursue my master's, and I did that at Rutgers, the State University of New Jersey, in information technology and analytics; my primary concentration was data science and data management.

Before coming to Amazon, I was working as a business intelligence developer at [unclear], and I was quite exposed to the business intelligence role: how things were being done, and how the business was able to use data to make better decisions and make the turnover grow exponentially over the past decade. After coming to Amazon as a business intelligence engineer, I got an idea of how to deal with tremendous amounts of data, how to pull data from different places, and how to get the business essence and give reports centric to the business, so that the business could get the most profit out of the data or the reports I provide.

After that, the main reason for me to pursue my career as a data engineer is that, being a business intelligence engineer, I was quite close to the business rather than the engineering side. The ratio was perhaps 30/70: 30 percent towards the engineering side and 70 percent towards the business side. I think I'm technically more interested in doing a significantly larger amount of engineering, and I'm
also interested in the business too. So data engineering is more like 50/50 or 60/40, more inclined towards the engineering part than the business part. That is the reason I was pursuing my career as a data engineer.

I'm kind of curious. A lot of the people who pursue computer science in their bachelor's degree tend to go into software development roles, for example, and that is a pretty prestigious role in the tech industry, in the US especially. Is there a reason that you have chosen to come into the data side of the equation versus the software development side?

Sure. Before I address that question, I would like to give a little clarity about what the data engineer role is, how it is addressed in different companies apart from Amazon, and what exactly the work of a data engineer is. Based on that, I'm going to tell you why I wanted to pursue my career as a data engineer.

Sweet.

Data engineering is a subset of software engineering. A data engineer is a software engineer who is completely focused on data. At Amazon, a data engineer is a person who moves data from one place to another; that is one of the responsibilities. Apart from that, there are a few data engineers who work on the platform side. When I say the platform side, they provide the platform for business intelligence engineers and data scientists to run their queries or to train and test their models. Even to store the data, even to process the data, we need to have a specific platform, so that also comes under the data engineering responsibility.

Apart from that, there are many different aspects of dealing with live data. For most of the analysis, especially on the business intelligence side and the data science side, we deal with batch
processing of data, that is, historical data or data which has already been captured. But capturing live streaming data also comes under data engineering. For that we use an entirely different set of tools and an entirely different set of platforms, and that is also a responsibility of a data engineer. In a few companies, like Uber, Lyft, and Stripe, this specific role is called software engineering in data.

Okay, yeah.

So at Amazon it's one combined role called data engineering, but in different companies it comes with a different name. To summarize, it's a subset of software engineering which is completely centered on data.

Coming to why I would like to pursue my career as a data engineer: especially in the past decade, and especially when I was going to school, I saw the impact live, how data was driving businesses. Looking from 2010 to 2020, the storage of most companies has been growing exponentially, not even linearly. That is the impact data has been making.

Let's take a few examples here. How does Google run as a company? How does it function? It's just user data.

Yeah.

And most of the free websites, like WeTransfer and many that offer free cloud storage: how do they actually run? They're providing a service for free, at what cost? At the cost of gathering data. So gathering data has been a very basic and vital element of any company's growth. If it is playing such a crucial role, I definitely wanted to be a part of that, to make an impact on society. With increasing data needs, we are specifically tailoring a product to our individual users' requirements, so ultimately we are making the user experience way, way better. I definitely wanted to be a part of the core impact team, and that passion is what led me to pursue my career as a data engineer.

That makes sense. So I know data
engineers, or the data family, actually have a very different interview process versus, for example, a marketing manager, which I am. Can you tell me a little bit about your process of getting into Amazon, starting from the application, whether you had any sort of assessments prior to the interview, and the interview process itself, including any assessments you had?

Sure. I would like to give an elaborate explanation of how the interview process for a data engineer happens and what the key competencies are. In parallel, I would also like to cover business intelligence engineers and data scientists, because it is easier for us to compare them side by side.

I'm going to start with the business intelligence engineer. For a business intelligence engineer, the two main things are product sense and an understanding of SQL: how SQL works, how a database works, and what the key elements behind it are. Apart from that, depending on the team and the role, Python is also somewhat mandatory for cracking the business intelligence role. But product sense plays a vital role here.

Coming to the data scientist role, we would like to have product sense, though not as much as a business intelligence engineer. We need to have exposure to SQL, but again not as much as a business intelligence engineer. At the same time, we would like to have exposure in areas like statistical analysis, machine learning, and time series forecasting; more of the machine learning side. That is what makes a data scientist.

Now, coming to the data engineering side, there are six different elements which are the key concepts here. One is dimensional modeling, the second is SQL, the third is query fine-tuning, the fourth is Python, the fifth is system design and architecture, and the sixth is ETL. These six are the key components to crack a data engineering interview, and mainly, a candidate, or
even I myself, was assessed on these six concepts when I was interviewing to become a data engineer.

On top of that, I know Amazon leans heavily on the leadership principles and the behavioral questions. So those were also some of the questions that you got during the interview as well, right?

Absolutely. The interview was structured this way: each interview ran for an entire hour, and for the first 15 minutes I would focus on the behavioral questions, based on the projects I did, which would help the interviewer assess the amount of data exposure or data engineering projects I'd had. The first 15 minutes would be the leadership principles. I think the leadership principles vary from role to role, but I was interviewing for the data engineer role, and my key principles were Ownership, Deliver Results, Customer Obsession, Have Backbone; Disagree and Commit, and Learn and Be Curious. These were the five principles my behavioral questions focused on.

Apart from that, my interview was very broad, and I was also questioned on a few data structures and algorithms. To be specific, the data structures focused on were arrays and dictionaries, and a few of my questions were based on recursion as well. So that is the breadth of [unclear].

Thank you for explaining that; that was really robust. Moving on, can you tell me a little bit about what you do as a data engineer?

Yeah. The way it is structured at Amazon is that a team of data engineers, or a single data engineer, owns an entire product. When they're owning the product, they're building many pipelines to capture the metrics related to that product. In my case, as a data engineer, my key stakeholders are three different groups: the product managers, the data science teams, and the
business intelligence team. Of course, all these teams and all these people require data in different formats. For example, the business intelligence engineers want aggregated tables to run their reports on, or to build the tables they need. A product manager would want the data in an analytical form, because he would want to see directly how this data is driving the business. The data science team would want the data in a very optimized form, and they would also be looking at historical data, so that they could train their models on it to predict upcoming periods. So, based on what my key stakeholders need, I move data from one place to another, I capture data, I build models, I build aggregate tables, and I provide the data to my stakeholders. That is my key responsibility as a data engineer.

Gotcha, that is very interesting. So you work a lot with the BIEs, the data scientists, and the product managers. Gotcha. Was there a learning curve when you joined Amazon?

Absolutely, yeah. As a business intelligence engineer I had a really great team, and I improved as an engineer a lot. The amount of learning I did, both as a data engineer and as a business intelligence engineer, was tremendous. Especially in an organization like Amazon, the work is at a very high pace, and I think the amount we learn in one year at Amazon is almost equivalent to three years of work elsewhere.

Yeah.

And I'm definitely not exaggerating. I've worked in other firms and I've worked at Amazon too; I'm just comparing, in retrospect, how things happen, what the pace is, and how steep my learning curve was. I think it was a great opportunity to be in both of the teams, to be at Amazon, and to improve my skill set a lot.

Yeah, I know, because we're in the same team. What do
you like the most, and what do you like the least, about your role as a data engineer?

I think I enjoy almost every bit of being a data engineer, but to be specific about a few of them: one thing I like is capturing new metrics when a product is being made. Identifying the new metrics with the PM is a very tricky thing, and it's very interesting, because each of us has a different perspective on capturing the metrics or measuring the success rate. You feel almost everything is important, but you can't actually gather everything, because it's going to be too much to analyze. That is one bit I thoroughly enjoy. On the technical side, I also like pre-processing the data and capturing the live data. These two are my favorite things, but I like almost everything.

Okay, that sounds awesome. Tell me about Amazon's culture, and maybe a little bit about your work-life balance.

Amazon's culture, I think you're quite aware of it. The work is definitely fast-paced, and with that, at times it could be quite tedious too. But at least where I am in my career, I can kind of enjoy the pressure, because it helps me get prepared for the future. Once you have handled working under pressure at a really good organization like this, you develop the confidence and self-belief that you'd be able to perform well at any other place. At times work-life balance could be a little affected, like at every other company, especially close to launches or when there are too many things on your plate. But that's just a by-product of having a steep learning curve; that's how I would put it.

Gotcha. I mean, no one joins Amazon for the nine-to-five work-life balance, so I think how you put it is very accurate. So, the last thing is advice for those who want to join Amazon as a data engineer.
Well, I have a few resources which helped me become successful while cracking the data engineering interviews. But apart from that, I think, as for a software engineer, practice is very, very important for all the coding questions, and for the product sense questions too. Each of the biggest successful companies in each domain has a different way of looking at the product and a different way of collecting the various metrics of a product. So it helps to get exposed to the different models which exist, and when I say models, it's not just the business model; I'm talking about the data model too. It also helps to get accustomed to the way things are done. A few companies use only Spark; Facebook, let's say, only uses Hive to process data in Spark, whereas at Amazon we use a wide range of AWS products. So getting familiar with the big data technologies currently available in the market, and how things are done, plays a vital role there.

To be honest, I think experience plays a vital role in becoming a data engineer, and practice as well. I mentioned earlier the six different aspects of data engineering. We need to be thorough, and we need to be good with almost all of the parts. But if you want me to be specific, product sense is something which could be compromised a little, depending on the team, the company, and the interview. On the technical side, though, you need to be super strong to be able to crack the data engineering interviews.

For those technical parts, what are some resources or practices you have used? Is there something readily available that people can just look up on Google, or something like that?

Sure. I'll start with product sense. For product sense, there is this website called StellarPeers. It is a really good website which helps us understand the product; it's
really about making a good product manager, project manager, or program manager. That was super helpful for me in understanding product sense: what questions they're looking for and what areas an interviewer is assessing. It will give you a few examples of how things need to be done, or at least the path or the thinking on how we need to measure success. I think it could be a good place to start, and once you are into it, you automatically understand what you are being assessed on.

For Python, I used LeetCode, like every other individual. I did many easy and medium problems, and I was filtering them for Amazon, because I was interviewing for Amazon; that helped me. I've said it earlier and I'm going to repeat it: the key data structures being focused on were arrays and dictionaries. Depending on the interviewer and the team, they could ask you trees as well, like binary trees. Forget about graphs and heaps; let's not go there. There could be a few questions based on recursion as well. I would recommend using Python, because Python is widely used by data engineers to process data and to run many PySpark jobs and things like that.

For SQL, I think it's been my bread and butter for a while. LeetCode has SQL, HackerRank has SQL, and there are many online SQL learning platforms. But to crack an interview at Amazon especially, we need to be pretty familiar with the Redshift nomenclature of SQL. Trust me, Redshift has a lot of key attributes apart from basic SQL. Getting an understanding of Redshift and its key SQL gives you a lot of understanding of how the Redshift architecture or infrastructure works, and that would be really helpful.

For data modeling, or dimensional modeling, I've gone through all the different basics which are available, and
they could be found on Google. There's also this website called datamodeling.com, which would help you understand the different data models used in different places.

For system design and architecture, there is this course platform called Educative.io. On it there are two specific courses for system design and architecture, including Grokking the System Design Interview. I've thoroughly completed those two courses, and they helped me a lot in understanding how the high-level architecture and the low-level architecture of a product are built.

Don't worry, I'm going to share the links and everything with you guys later.

Yeah, this is awesome, Rahul. Thank you so much for your time. Those were all my questions for today.

Thank you. And for the ETL part, I think I've been through Amazon load jobs and Amazon's existing wiki pages. Again, ETL is very specific to the product we are using, and that is totally based on our experience.

Awesome. Thank you so much, Rahul, I really appreciate your time.

Thank you, Young, for this opportunity.
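
The aggregated tables mentioned in the interview can be illustrated with a small sketch. The following is a minimal Python example and not the speaker's actual pipeline; the event rows, their column meanings, and the function name `build_daily_aggregate` are all hypothetical, chosen only for illustration. It uses a dictionary, one of the data structures named in the interview, to roll raw rows up into one total per product and day.

```python
from collections import defaultdict

# Hypothetical raw events: (product, day, units_sold). Made-up sample data.
RAW_EVENTS = [
    ("widget", "2021-08-01", 3),
    ("widget", "2021-08-01", 2),
    ("gadget", "2021-08-01", 1),
    ("widget", "2021-08-02", 4),
]


def build_daily_aggregate(events):
    """Roll raw event rows up into one total per (product, day) key."""
    totals = defaultdict(int)
    for product, day, units in events:
        totals[(product, day)] += units
    # Sort by key so the result is stable, like an ordered reporting table.
    return [(product, day, total) for (product, day), total in sorted(totals.items())]


daily_units = build_daily_aggregate(RAW_EVENTS)
for row in daily_units:
    print(row)
```

A dictionary keyed by `(product, day)` keeps the rollup to a single pass over the rows. In a real warehouse the same grouping would more likely be expressed in SQL or a distributed engine; that choice depends on the platform and is not specified in the interview.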
Info
Channel: Career School
Views: 1,182
Rating: 5 out of 5
Keywords: amazon, data engineer, amazon data engineer, amazon data engineer interview, amazon interview topics, dsa round of amazon, data engineering, data engineering vs data science, data engineering career path, data engineer vs data scientist, data engineering explained, data engineer skills, what do data engineers do, what do big data engineers do, what is data engineering, big data engineer career path, data engineer career
Id: e3LonY618P8
Length: 19min 43sec (1183 seconds)
Published: Tue Aug 03 2021