How to Do a Data Governance Assessment

Captions
Hi everyone, my name is George, your friendly data guy, and I'm here today to showcase how you can do an assessment of your environment before you start implementing data governance. This is really one of the first things you need to do: an environmental scan to understand where you stand and what the challenges and priorities are. All of this information will also help you put your business case together for data governance. So let's see in this lesson how you can do just that: how you can do an assessment for data governance.

Yes, there are a few other things that we need to do before this assessment, which I cover in my course on practical data governance. But for now, let's understand where stakeholders are, what their main pain points are, and maybe even what our technical, information, and data landscape looks like. In this lesson I'm going to cover it through a high-level ad hoc assessment. We're just trying to get an understanding of what's going on in the organization in which we're trying to implement data governance. Another way to do this is to adopt a data governance maturity model, which I have a whole course about and which is covered in detail in other free videos as well.

Now, what do we want to do with the ad hoc assessment? Well, we first want to understand the pain points. I like to note these pain points from a data perspective, but at the same time really understand what the impact to the business is, because that's what we care about. Looking at this from a data perspective, I like to divide it into the three streams that you're seeing right here: data acquisition, which also includes data creation; data maintenance; and data dissemination. Why? Because at a high level this is the process that data goes through. You can also include data destruction and archival as another stream, but those pain points tend not to be as dire as the other three
streams. That being said, if the main driver for your data governance program is regulatory compliance, I would also add data destruction and archival to this list of pain points, because those regulations tend to cover retention policies and procedures, the right to be forgotten in the case of GDPR, and so on.

Regardless, how do you uncover the pain points in these categories? Well, I would start in an informal fashion through meetings and interviews. This could be done in person, by email, or by phone call. I haven't done this through text messaging or WhatsApp yet, but who knows, maybe that's going to be on the table in the future. Some also like to do job shadowing for a few days with certain individuals. Even though that's insightful, it can take quite a bit of time and can also be a little nerve-wracking for the person you're shadowing. So instead I prefer organizing drop-in sessions and workshops where people just join in and tell you their pain points, their struggles, and so on.

Let me give an example of a workshop that you can organize. I invite people from different parts of the business who interact with the data in one form or another. Those invited usually include a mix of business functions, from support roles to management roles, so you have the whole gamut there. I also give them a set of Post-it notes. Yes, Post-it notes: very helpful, very tactile. Using the Post-it notes, I ask them to outline at a high level the challenges and issues they are facing when it comes to the three streams I mentioned: data acquisition, data maintenance, and data dissemination. There's no limit on how many they can put up; I guess the only limit is the time. I also recommend letting people know about the exercise well before the workshop so that they have time to prepare. In fact, in certain workshops
I'm asking them to come with the Post-its already written down, and then, inspired by what other people have written, they can keep adding more during the workshop. I usually run a few of these workshops with different sets of people, but in each one I also tend to have someone from IT, from the technical side. Then, as a group, we categorize these entries and figure out which ones are duplicates, and there will surely be some overlap.

After the workshops are done, you will end up with something like this. You have the three streams, and in each one of them you have different issues listed. For data acquisition, as we have it here, people are complaining about multiple data sources for the same thing, the manual processes involved in dealing with the data, the lack of standards, redundancy of effort, multiple ways of storing the same information in different areas, no data validation, missed opportunities, and so on.

For the data maintenance piece, I see people complaining that it's highly manual and that there's no data classification in place, so you don't quite know what's high-risk, sensitive information. Maybe you can tell when you spot it, but you usually can't quickly point and say, "Yeah, that table does contain sensitive information; we shouldn't share it that way with those employees." There are data integration issues. Data cleansing is lacking, it's not consistent, and it's not timely enough; it takes a while for things to happen. There's a lack of data accessibility: it always takes a long time to get access to something.

For data dissemination, there's no golden record, no master record that can be easily identified and put together. It's usually bottlenecked by IT. I love IT, but they tend to be the bottleneck because they lack the resources. It's not easy to reproduce some of the results, reports, and data that's
being pulled out, and you might also get inconsistent information. Data cleansing sometimes happens after the data pulls are done: after the data is extracted out of a system and prepared for other processes, it gets cleansed in that transition phase, but the cleansed data never makes it back into the source system. There's a lack of definitions, so two people are looking at the same thing but interpreting it differently. These are just some examples of what comes out of these workshops.

So what do you do next? Here's what I like to do. You can use the same workshop to prioritize these issues. Let's say that we have three issues noted: lack of standards, duplicate records, and misuse of business terms. Here's what I recommend: give everyone 10 stickers, those green dot stickers that you see there, and they can use them to vote on the issues they think should be addressed first. They can use their stickers to vote for the same thing multiple times, or you can allow only one vote per issue; it's your call. I like the first option. If they think lack of data standards is that important, have them put all 10 stickers on that Post-it note, if that's how strongly they feel about it. Afterwards you just tally it up, and from the votes you can see the working group's perspective on which issue should be tackled first.

I think this is a fun exercise. Again, it's highly tactile and people feel engaged. That being said, the groups are not the ultimate decision makers, but it's good to get them involved, even though you might come back to them and say, "Okay, the misuse of business terms is definitely high on the list here, but we should actually tackle the lack of standards first, because that will help us identify the business terms we need and define them." More on that in another lesson. Okay, so that's how this works. It's, again, a fun
exercise, a fun workshop idea, fun and insightful, I would say, at least from my perspective, for identifying the issues from a data perspective across the three streams that we went over.

Next on the list is an ad hoc assessment of people. We should try to gain an understanding of who the sponsors are, who is most affected, and who the champions are. How will you identify all of these? Through the same methods I mentioned before for identifying the pain points: meetings and workshops, for the most part.

Let's start with the sponsor. When you're doing this assessment, you might already have a sponsor who has tasked you with doing it, but there are many instances when that's not the case and the sponsor needs to be determined. Usually you have one sponsor for data governance, but there are data governance programs that have multiple sponsors, and that's even better. If you have three major lines of business, it's great to have a sponsor from each line. It's like having your program endorsed three times. So you can have several; you don't need to stop at one. More on that in another lesson.

The most affected individuals are those who are most affected by the state of your data and by the lack of a data governance program. For this, it's great to look at who is creating the data, who is managing it, and who is ensuring its quality, security, and so on, but also who is consuming information based on that data, who is analyzing it, who has business processes that depend on it, and ultimately who is complaining the most. These are the people you want to have in the workshops I mentioned before, and you want to keep engaging with them throughout your data governance implementation.

Similarly, your organization will have some champions in its ranks. Maybe that's you; maybe you're one of those champions. These champions tend to be the
unsung heroes who go above and beyond and really care about the quality of the data. And yes, they're not always in IT. They're people on the business side who take it upon themselves to manually correct the data they work with. They're people who have a lot of business knowledge and understand why things are the way they are and how they should be. They're people who usually create workarounds and fix the current situation to meet their needs. They tend to be data stewards without having the title of data steward or the responsibility in their job description. These champions are also people who are already voicing the importance of managing and governing data, maybe without that clear articulation, but that's what they are wishing for. These are your biggest supporters. They don't need any convincing as to why data governance is needed, and moreover they will be there to help you promote it. So make a list of these individuals; they're highly important.

The last piece of your ad hoc assessment is to understand the technical and information environment. It's not that important at this stage of your pre-implementation work, but it's definitely nice to have, and you will uncover all of this along the way while you're implementing data governance. So what are these things? The data sources and systems that work with your data, any data management and governance tools, and any other artifacts that we should care about. Let me give you some examples.

First, the data sources and systems. It's good to be aware of what systems and databases are currently in your environment, at a high level for now, and the same goes for the data flow: again, a high-level data flow. There are different asset management tools that can track all of this, sure, but if you don't have one, you can always track it in Excel, and here's the template that I use
for that. There you go. I'll post a link to it if you'd like to download it. As for the data flow, here's an example of a high-level diagram. If you create something like this, you have a great win. It will give you an understanding of the different data sources and systems in your ecosystem and how they interact with one another, again at a high level. If you don't explore this at this point, this is definitely an artifact that you can put together at a later date, as it will help you identify some of the technical data stewards and the system and business owners who should be engaged for various projects, and so on.

All right, let's move on. Next on the list are the data management and data governance tools, such as a business glossary, data dictionaries, and a data catalog. Do they exist? For the most part, probably not. But is there anything on data lineage? Is there anything on data profiling, data classification, reporting tools, data visualization tools, data security, and so on? There's probably some level of all of these, but this is really just for you to have a high-level awareness, so it's a nice-to-have. I like to keep reinforcing this message: data governance is a business function. I know we're talking a lot about technical stuff right now, but don't forget that data governance is a business function, and all this technical stuff just supports the application and enforcement of data governance.

Lastly, any other artifacts that we might have: a report catalog, meaning a catalog of all the reports the organization has, a data model, any scorecards, and data quality standards. Again, most likely these won't exist, but it's good to confirm. You never know what's happening at a local level. Sometimes you'd be surprised: some departments take it upon themselves and are already documenting some of these things. They are already putting
some effort into it, so you can definitely try to capitalize on that. You also want to know how much information is documented and how much lives in people's heads. And if things are documented, how are they managed? Who's maintaining these documents? Chances are that most of these, if they exist, are being maintained by the champions we identified earlier, but they're just not shared broadly.

Lastly, here's my conclusion for the ad hoc assessment. We're not really doing these things sequentially. We're not finding out what the pain points are first, then who the people are, then what the tech and information environment looks like. No, usually you start doing all of these in parallel, a little bit of each, and as you've seen, they feed into each other.

So this is how we conclude our ad hoc assessment, which is something necessary, something crucial, that you need to do before implementing data governance. Again, another way of doing it is through data governance maturity models, but that's a topic for another lesson. Thank you so much. I hope you found this helpful, and if you'd like to learn more, please check out lightsondata.com and the courses that I have on data governance. Cheers.
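The sticker-voting step from the workshop can be tallied by hand, but if the ballots are typed up afterwards, a few lines of Python give the same ranking. This is only a minimal sketch: the three issue names and the 10-sticker limit come from the talk, while the voter names, the ballot contents, and the `tally_votes` helper are hypothetical and made up for illustration.

```python
from collections import Counter

STICKERS_PER_PERSON = 10  # limit mentioned in the talk


def tally_votes(ballots, stickers_per_person=STICKERS_PER_PERSON):
    """Sum the stickers placed on each issue and rank issues by total.

    `ballots` maps a voter to a dict of {issue: stickers placed}.
    A voter may put several stickers on the same issue.
    """
    totals = Counter()
    for voter, ballot in ballots.items():
        used = sum(ballot.values())
        if used > stickers_per_person:
            raise ValueError(
                f"{voter} placed {used} stickers, limit is {stickers_per_person}"
            )
        totals.update(ballot)  # adds this voter's stickers to the running totals
    return totals.most_common()  # highest total first


# Hypothetical ballots, invented for this example.
ballots = {
    "analyst": {"lack of standards": 10},
    "manager": {"duplicate records": 6, "misuse of business terms": 4},
    "it_rep": {
        "lack of standards": 3,
        "duplicate records": 2,
        "misuse of business terms": 5,
    },
}

ranking = tally_votes(ballots)
for issue, votes in ranking:
    print(f"{votes:>3}  {issue}")
```

The ranking only reflects the working group's view. As noted in the talk, the final order can still change, for example when one issue has to be solved before another can be addressed.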
Info
Channel: Lights OnData
Views: 1,709
Rating: 4.95 out of 5
Keywords: how to do a data governance assessment, data governance assessment, data governance maturity model, data governance best practices, data governance examples, what is data governance framework, data governance pain points, assessing your data environment, practical data governance, lightsondata data governance, data governance course, how to do a data goverance assessment
Id: zvOEt6OatbA
Length: 18min 54sec (1134 seconds)
Published: Wed Jun 02 2021