Data Architects Vs Data Engineers - Is There A Difference?

Captions
Hey there, guys, welcome back to another video with me, Ben Rogojan, AKA the Seattle Data Guy. Today I want to talk about data architects versus data engineers. Honestly, this is a question I get a lot; a lot of people want to know the difference, and sometimes I have a hard time answering it, because at some companies the data engineers are stuck doing data architect work. So it's kind of an interesting place. In fact, I just talked to Jeff, who is a director of architecture at Disney, and one of the things he pointed out is that some companies just have the principal data engineer or a very senior manager act as the data architect. So not all companies have a data architect, but it is still a very common role, especially at large enterprises. So I do want to talk about the general differences between these two terms, as well as how they work together, because they do tend to work very closely together.

Now, the simplest way to understand data architects versus data engineers is that it's not that different from a general architect and builders. You generally have someone plan out what is going to happen: the rules you're going to follow, what something is going to look like, the high-level concepts, right? The architecture, the rules of how something will be built, the actual principles that you will follow for this specific project. That way, even if you change out builders, the end vision is still the same. And I think that's what's important: at the end you have this thing that followed through. There's this example from the book The Mythical Man-Month that covers this: how old building projects that took like 50 years to build, and had multiple architects or multiple people build them over that time period, still managed to keep that core essence of being the same project, because they followed a certain set of
principles and rules, with that vision, to actually build something that looked correct, that looks like it all fits one vision. That's generally the architect's goal, on top of the fact that they're also obviously trying to meet all the requirements and things of that nature. So they really set the rules of engagement, and then the data engineers implement them. That's the basic way of understanding it, but now let's talk about the specific roles, and then we can go a little deeper.

First of all, let's talk about data architects. When they are at a company they obviously have a certain role, and as I outlined earlier, the role is that they are the ones who create the architecture and design the rules; everything is basically focused around defining how things will be built. They're not the ones building; they're the ones defining how it will be built. This is very typical whenever you hear the term architect, right? It's more about the vision and how things will be built.

Skill-wise, that means they obviously need to be good at data modeling. This is why it's sometimes interesting, because a lot of data engineers have to data model at many companies, but at larger enterprises you most likely have an architect planning out the data model. That also means they need to have a good understanding of the business. So there's this need to not just be good technically but also be very good at understanding what the business needs. This is why you'll generally see a principal or staff data engineer, or someone a little higher up, take more of this role, because they're generally also good at understanding the business. They're also likely very familiar with different technologies, at least at a high level, so that they know, hey, if we use SQL Server, or if we use this or that, what the trade-offs are, so that when they're picking these different components they know
why they're picking it and they know why you shouldn't pick it. So that's their role: their role is really planning things out, and then, while the thing is being developed, making edits or changes based on different nuances that come up when the data engineers see something and say, hey, this won't work for whatever reason. They try to continue, with that same vision, approaching and solving this problem. Because what would end up happening if every data engineer solved these little problems in their own way is that, in the end, the infrastructure would look like (and I've seen this) 10 people built it, right? There was no single design behind the system, which makes it difficult to maintain. But when you have an architect, and one vision, and one set of rules, architecture and principles that you're following, whether it's Data Vault or a data warehouse, whether you're using camel case or snake case, whatever you're doing, it will all feel like it's the same project, and that's important.

So that, to me, is the data architect role, right? Those are the skills they have: again, data modeling, being good at taking business requirements and translating them, being good at understanding the big picture of technologies, what each of those technologies does, and their pros and cons. From there they take those skills and they're the ones actually making the designs. They're defining how the data will be stored, where it's going to be consumed, all of these things that are very high level.

Now, when you compare that to a data engineer (and if you watch this channel you should know the skills of a data engineer), you'll see there's some crossover. That's because at some companies, like I referenced, these skills are forced together: you've got a very senior data engineer who needs to know data modeling, and so on and so forth. But in its purest form, if you look at data engineers, it's
more of a craftsman approach. It's more of a discipline and less of a creative approach; it's more about following a very disciplined approach to doing things and actually doing the work, building pipelines. This is why I think some people don't always find data engineering fun: it can sometimes be a little repetitive, in terms of building the same pipeline over and over again, to a degree, even when you use out-of-the-box solutions. It's kind of the same with some software roles; depending on the role, you can just keep building CRUD applications. But it all depends, again, on the company you work for. That means your skills are going to be things like SQL and Python; you're likely good at working with Kafka or Spark; maybe you're using Snowflake, maybe you're using Databricks. But you're good at actually doing the thing. The person before you planned how to do the thing; that's the data architect's role. You, the data engineer, are now doing the thing, the same way you might have an engineer or an architect plan out the design of a building and then have carpenters and craftsmen actually build said building. These are just two different roles. Neither one is necessarily better; they're just very different, in theory and when you're actually applying them.

At this point, the data engineer's role is to actually develop the pipelines and develop the tables that the data architect has said should be developed. Now, in the process of developing said tables and pipelines, that data engineer will come up against issues, right? Like, oh, we can't pull out this data, or this data has duplicates. There are all of these nuances that data engineers need to deal with. In some cases they will be responsible for dealing with them on their own, but in some cases there are issues so large that they will impact the design and architecture of what you're building, right? It's going to completely change
everything; that table that you thought you could build very easily is not easy to build. That's when you go back, bring in the architect again, and revamp what you planned out.

I think another way you can look at this is how each role acts when you're doing a project, so let's talk through that with an example. The first step is planning, requirements and design. The data architect will likely be the one going to the business, asking what they need and trying to understand requirements, and they'll be the one defining, based on all that, how things will be stored and which solutions will be best, based on cost, the amount of data coming in, all of these things. The data architect will then go to the data engineers, and they'll work together to talk about feasibility, right? I don't think the data architect should do it in their own little bubble and then come to the data engineers and tell them, do this. They should go to the data engineers, because, like in any other profession, the people doing the work generally know the nuance-level details. They're going to know when it's like, I get what you're saying, in theory that works great, but it won't hold up in practice. If you've ever seen car mechanics complain about certain ways that engineers have designed cars, you'll see that they're like, this makes no sense, because I can't maintain it, or I can't fix this thing without taking all these other pieces out. So sometimes the architect might design something that works but might be hard to either implement or maintain, and that's where a data engineer might come in and say, okay, the big picture makes sense, but at the nuance level we need to change this around for XYZ reason.

Now, in the implementation phase, this is where the data engineer
really starts shining. They're the ones doing the work; they're the ones implementing the actual pipelines, right? They're building the pipelines and building the tables, so they're taking the lead in that regard, and they're talking with the architect to make sure they understand what's going on. The architect is likely reviewing the implementation, seeing how things are working, maybe looking at speed tests or whatever they're doing: how fast are queries running, does everything make sense, do we have any bottlenecks that we need to fix right now? Especially if you're doing a migration, you're often comparing how the old pipeline ran compared to the new one, and that's something the architect is likely paying attention to, to call out where things need to get fixed.

Then, once it's implemented, you've got this maintenance phase, right? Okay, we've implemented the thing, we've worked together, we've fixed all the problems and all the weird bugs, we've worked together to make sure it's one architecture, one design. From here on out, the data engineer will generally provide some operations and maintenance. There's not always a DataOps team, so more than likely it's the data engineer who's making sure the pipelines are running and, if the data changes, they're the one making those ALTER statements to add a new column. But they're still collaborating with the data architect, who might be providing advice on how to improve certain pipelines. Again, you might find out that, hey, this one pipeline is getting slower and slower, we've tried to fix it, and we're going to need to do an architecture change, right? Because it's no longer handling what it was handling before. That's generally where the architect will come in.

So, as you can see, these two roles work together, and sometimes it's very easy for these roles to morph, and in some companies they do. Some companies, they
prefer having one role. It's kind of like how I've occasionally referenced data engineers needing to know data governance, and someone once responded, like, what are they going to do, throw Python and SQL at it? It's not that data engineers need to do data governance; it's that some companies kind of expect them to, at least at some level, maybe not at the same level as a pure professional, but sometimes there's this expectation that you think about it. Same thing with data architecture and data modeling: sometimes companies don't hire data architects, and that's just you, and you need to do the data architecture work.

So that's the difference between data architects and data engineers. Again, they're sometimes conflated together into one role, or maybe into more of a senior data engineer role, but sometimes they're separate. Hopefully that's helpful for everyone watching. If anyone has any questions about data architects or data engineers, feel free to ask me; I'd love to answer more questions. Thanks so much, everyone, and I will see you. Goodbye. [Music]
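
Editor's sketch (not part of the video): the captions describe the architect setting one set of rules, such as "camel case or snake case", and the data engineer handling details like "this data has duplicates" while building tables. A minimal Python illustration of that split might look like the following. The snake_case rule, the function names, and the sample table and column names are all assumptions made up for illustration; the speaker does not describe any specific implementation.

```python
import re

# Hypothetical standard an architect might publish. The video only mentions
# the choice "camel case or snake case", not this exact pattern.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def naming_violations(table_name, column_names):
    """Return the names that break the assumed snake_case convention."""
    names = [table_name, *column_names]
    return [name for name in names if not SNAKE_CASE.match(name)]


def drop_duplicates(rows, key):
    """Keep the first row seen for each value of `key` (illustrative dedupe)."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


if __name__ == "__main__":
    # Engineer-side check of a proposed table against the architect's rule.
    print(naming_violations("customer_orders", ["order_id", "orderDate", "Total"]))
    # Engineer-side cleanup of a duplicate record before loading.
    rows = [{"id": 1, "v": "a"}, {"id": 1, "v": "b"}, {"id": 2, "v": "c"}]
    print(drop_duplicates(rows, "id"))
```

The point of the sketch is only the division of labor: the rule lives in one place (the architect's decision), and every pipeline the engineers build is checked against it, so the result reads as one design rather than ten.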
Info
Channel: Seattle Data Guy
Views: 11,172
Keywords: what is a data architect, data architect vs data engineer, what is a data engineer, what skills do data architects need, salary of a data engineer vs salary of a data architect, data architect salary, data engineer salary, should you become a data architect, seattle data guy, ben rogojan, skills for a data engineer, skills for a data architect, are data architects different than data engineers, data scientists vs data architect
Id: YzORhWjktwY
Length: 11min 4sec (664 seconds)
Published: Fri Sep 22 2023