GIS Tech Tips: Understanding a Shapefile: What is it and How to use it

Captions
[Music]

Hello everyone, welcome to another informative video from Surveying and Mapping, also known as SAM. In this video we'll be talking about shapefiles: what they are and how we can use them.

Shapefiles were first developed in the early '90s by Esri to use with their new product, ArcView GIS. ArcView was very innovative in that it was Esri's first application developed to use a graphical user interface, or GUI. This is common in all applications that we use nowadays; however, back then this was a big step forward. Prior to that, everything was command-line driven, so you had to be a GIS expert in order to access GIS data. ArcView allowed us to create maps, perform edits, and perform analysis in a very user-friendly environment.

Now, a single shapefile will store either points, lines, or polygons. In addition to that vector data, it also stores attributes that are associated with those features. Here we see a shapefile that's been brought into a map in Esri's ArcGIS Pro product. Here's the shapefile that you can see down here. It's the parcels shapefile, and I know that because it's showing an SHP extension. We'll talk more about the file extensions associated with shapefiles a little bit later in the video, but you can see the polygons being displayed in the map.

What you don't see, at least initially, is the attributes that are associated with them. So if I right-click here, I can open the associated attribute table. This is a database table that is tied to the spatial data you see here. If I go up and select a parcel here, it highlights, or selects, the record in the attribute table. This is additional information about this specific parcel, so I can see things like the area, the perimeter, a parcel inventory number, a parcel ID number, the zoning, and so on. As I create the shapefile, I get to decide what fields, or columns, are in this attribute table. So we see the link that happens between the spatial data and the attribute data.

Now, because shapefiles have
been around for a long time, most applications that have any sort of GIS capability are able to read, and sometimes write or export, shapefiles as well. This includes applications such as Esri's ArcMap and ArcGIS Pro, but it also includes open-source applications like QGIS and PostGIS, or applications from other vendors such as Autodesk, which would include Civil 3D and Map 3D, or Bentley, or Intergraph, Leica, Trimble, and the list goes on. Pretty much any of these that have GIS capability are able to make use of shapefiles, and as a result shapefiles have become the de facto data-sharing format between all of these various applications, just because everything can read them, which makes them somewhat versatile. There are limitations, which we'll also talk about later in the video.

If you're going to use shapefiles, you need to understand the structure. Contrary to what the name would imply, a single shapefile actually consists of multiple files. At a minimum, there are three mandatory files that you must have in order to have a working shapefile: the SHP, the DBF, and the SHX.

The SHP file stores the spatial data, so this is going to be the coordinates that make up the points, lines, or polygons that are part of the shapefile.

The attribute data is stored in the DBF file, so those attributes that I was just showing you are coming out of the DBF file. Now, the DBF is based on an old database format called dBASE, specifically dBASE IV. dBASE was originally developed back in the late '70s and continued to evolve into the '80s and early '90s. Because of that, there are going to be limitations on what you can do and store within the attribute tables associated with a shapefile, because it's built on that older technology. We'll be talking about those limitations a little bit later in the video, so make sure you stay tuned and keep following.

Lastly, you have to have an SHX file. This is an index file that links the SHP and the DBF together, so when I do
things like you saw just a second ago, when I select, in that case, a parcel polygon, it also selects the related row, or record, in the attribute table. Or vice versa: I can select a row or record in the attribute table and it will also select the associated spatial feature. The SHX is the file that links that together so that happens.

Now, there are a lot of other files that can be associated with a shapefile. Here we're looking at a shapefile in Windows File Explorer, so this is what you would see if you weren't looking at it in any sort of GIS application. You can see the files we were just talking about: we have the DBF, we have the SHP, we have the SHX. Those are the three mandatory files. But you can see other files here as well, such as a PRJ or an XML. You'll see SBN, SBX, ATX, IDX, just to name a few. Each of these additional files, though not mandatory, does serve a purpose.

For example, the PRJ file identifies the coordinate system or projection that the shapefile is stored in. This could be WGS84 latitude/longitude; it could be Web Mercator Auxiliary Sphere, which is used by many web applications including Google Earth or Google Maps; it could be UTM or your local State Plane coordinate system. That information is going to be stored in the PRJ file. Now, there is one caveat to this. In the case of most shapefiles, that's going to be assigned manually, so it's not unusual for it to be assigned incorrectly. I've encountered many shapefiles that I've gotten from other places and other people that had the coordinate system and projection assigned incorrectly, which meant the PRJ file was wrong. What that means is that when I bring that shapefile in with other GIS data, it will not display in the correct location. It will be shifted and potentially scaled in some form or fashion.

The XML file is another good one to have. It stores the metadata associated with the shapefile. If you're not familiar with the term metadata, it means data about your data. So if somebody's taken the time to
create the metadata for the shapefile, you should be able to see things like: Why does the shapefile exist? How often is it updated? How was it originally created? Are there any restrictions, such as copyrights, associated with the use of that shapefile? That can all be stored in the metadata. You'll find that a lot of shapefiles don't have metadata. That was a relatively new concept compared to the age of shapefiles, so it will often be missing.

There are several types of indexes that can exist with a shapefile: a spatial index, an attribute index, a geocoding index, and there are going to be files for each one of those if they exist. The nice thing with shapefiles is that if an application needs one of these indexes and it doesn't exist, it will often create it automatically. So if somebody sends you a shapefile and it doesn't have any of those indexes, you should still be okay. It should still work. It may not perform optimally when you do searches and queries and things of that nature, but it'll still display, you can still open the attribute table, and you can still select features.

And there are going to be others; this is just a small sampling. For example, you may have a CPG file, which is a character encoding file that would be associated with it. Again, there will be multitudes depending on how you use the data. The key to remember, though, is that you have to have those three mandatory files, the SHP, the DBF, and the SHX, and they must not be corrupted.

And you may ask yourself, well, how can they get corrupted? Well, some of these files can be opened by other applications that are not GIS. For example, a DBF file can be opened in Microsoft Excel. If somebody were to open a DBF that's part of a shapefile and edit it in some way, say they delete a record, meaning a row, or they delete a column, or field, and then save that back into the DBF format from Excel, that's going to break the shapefile, because that DBF no longer matches up with the
SHP or the SHX. It's important to know that: don't go trying to edit or work with these files outside of a GIS application. A GIS application like ArcGIS Pro or QGIS understands the relationships between these files and is going to allow you to work with them in a way that doesn't corrupt them, so they still work. It's important to remember that.

Now, several times I've mentioned that shapefiles have limitations, and they do, because they are built on old technology such as the dBASE IV that we've talked about. We have to think back to what computers were like in the 1990s. I bought my first computer somewhere around 1994, and I remember it was DOS based, with Windows 3.11 running on top of DOS, and it had a 428-megabyte hard drive with 8 megabytes of RAM. The smartphones we use have more power and more storage than that computer had. So it makes sense that any format developed during that time frame is going to come with strings, right?

So what are some of the limits associated with the shapefile? Well, we've already talked about the first one: it's a vector-only format, so it doesn't store raster at all. It's going to store those points, lines, or polygons, and it's further restricted in that each shapefile can only store points, or lines, or polygons. Going back into ArcGIS Pro, we can see this. I've got a folder here called ArcView, and in it I have a bunch of individual shapefiles. You can see by the icon that this is a point shapefile, this one's a polygon shapefile, this one's a line shapefile, and you see that continuing on. This is not a shapefile; this is a raster. I know this is a shapefile because, in the case of ArcGIS Pro, and the same is true of ArcMap, it's highlighted with green icons. That lets me know I'm talking about shapefiles, and of course there's the fact that it's showing this SHP extension.

So, unlike other more modern GIS storage formats like Esri's geodatabase or even AutoCAD DWG
files, which can store multiple entity types, whether it's points, lines, or polygons, or in some cases even raster, in a single storage format, shapefiles only store that single entity type. It's either going to be a point shapefile, a line shapefile, or a polygon shapefile. That's important to note.

Also, it's limited in size. Again, thinking back to computers in the mid-'90s and their capabilities, they did not have big hard drives and they didn't have a large amount of RAM, so storage formats like shapefiles reflected that. The maximum size is 2 gigabytes, and that applies to any file that's associated with the shapefile. So if the DBF gets to 2 gigabytes and the others haven't, your shapefile won't store any more information. The same is true if the SHP gets to 2 gigabytes: you're not going to be able to put any more information into the shapefile.

Another issue we have is how it deals with null values. Null values are those that are left blank, that shouldn't have any value stored in them. Well, shapefiles don't leave those blank. A zero is going to be inserted where null values are present, or I should say where values don't exist, right, because it's blank. That is problematic for a couple of reasons. It may seem like that shouldn't be an issue, but if you try to go back and identify where data may be missing because somebody didn't completely fill out the attributes, there's going to be a zero in that field. Well, is that one somebody forgot to fill out, or is zero the correct value? How do you sort through that? How do you identify incomplete data if it's not allowed to be blank or null? So this is an issue that you need to make sure you're addressing if you work with shapefiles: how do you identify something that somebody hasn't looked at? Maybe instead of leaving a zero there, you default to something like 99999 or some other value that would flag it, depending on what type of field it is.

Another limiting factor that really goes back to the
operating systems of the time, which were primarily DOS, or Windows running on top of DOS, which were 8-bit operating systems. These had limitations that were then imposed on the applications and the file formats of the time. One of those is the length of both the file names and, in the case of the attribute table, the field names, or column names. They typically need to be limited to eight characters; that's as long as they can be. They also must start with a letter and contain no special characters, meaning they can't have spaces, percent signs, hashtags, slashes, or dashes. The only special character that is supported is an underscore.

Another limitation concerns the number of fields that can be included in an attribute table associated with the shapefile, which is 255. Fields are columns, so the maximum number of columns in that attribute table is 255.

Another restriction you're going to encounter that you don't see in things like geodatabases, Oracle Spatial, or SQL spatial, all of which are other GIS storage formats, is that shapefiles don't support rules or behavior natively in the format. So what am I talking about? Well, with more modern formats I can apply rules to my data, so that I can say things like a manhole must be connected to a sewer line, or parcel polygons must not overlap. That helps with my data quality. But I can also look at attribute data and assign things like domains, which are drop-down pick lists, so they limit the values somebody can pick when filling out that field. I can do subtypes, which allow me to create subgroups within my data to which I can then apply different rules. I can also do things like contingent values or attribute rules that constrain the values associated with a field based on other values, or calculate the value based on several other fields, again to help with data clarity, data accuracy, and data completeness. Shapefiles do not support that as part of the shapefile. It's not to say
you can't potentially do other things programmatically that would apply that type of behavior, but it's not going to be native to the storage format itself.

So there you have it: that's what a shapefile is. Hopefully you now have a much better understanding of this format and how you can best use it if you choose to use shapefiles as one of your primary GIS storage formats. If you happen to have any questions, please feel free to leave a comment down below or shoot me an email. I'll do my best to respond as quickly as possible. Of course, don't forget to give the video a like if you found it helpful at all, and also share it with others that you think may benefit from the information here. Also remember to subscribe to our channel and enable those notifications so you'll be notified when we post new videos here on the SAM YouTube channel. We really appreciate you watching, and thank you for all your support. We'll see you in the next video.
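
The file structure described in the captions (three mandatory files, an optional PRJ, and a 2-gigabyte cap on every component) can be checked with a short script before a shapefile is shared. The following is only a rough sketch using the Python standard library, not something shown in the video. The function names `sidecar_files` and `check_shapefile` and the message wording are made up for illustration, and the script only looks at file names and sizes; it does not read the shapefile contents or detect a DBF that was edited out of sync.

```python
from pathlib import Path

# The three mandatory components named in the video.
MANDATORY = (".shp", ".dbf", ".shx")
# 2-gigabyte per-file limit described in the video.
MAX_BYTES = 2 * 1024**3


def sidecar_files(shp):
    """Map lowercase extension -> path for files sharing the shapefile's base name."""
    found = {}
    if not shp.parent.is_dir():
        return found
    prefix = shp.stem.lower() + "."
    for path in shp.parent.iterdir():
        name = path.name.lower()
        if not path.is_file() or not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if "." not in rest:
            found["." + rest] = path      # e.g. parcels.dbf, PARCELS.PRJ
        elif rest == "shp.xml":
            found[".xml"] = path          # metadata is often named parcels.shp.xml
    return found


def check_shapefile(shp_path):
    """Return a list of human-readable problems; an empty list means none found."""
    found = sidecar_files(Path(shp_path))
    problems = []
    for ext in MANDATORY:
        if ext not in found:
            problems.append(f"missing mandatory {ext} file")
    if ".prj" not in found:
        problems.append("no .prj file: coordinate system is not recorded")
    for ext, path in sorted(found.items()):
        if path.stat().st_size >= MAX_BYTES:
            problems.append(f"{ext} file has reached the 2 GB limit")
    return problems
```

Calling `check_shapefile("parcels.shp")` on a folder that holds the SHP, DBF, SHX, and PRJ files returns an empty list; otherwise each missing or oversized component is reported as one string. A present PRJ can still be wrong, as the video points out, so this check only says whether a coordinate system was recorded, not whether it is the correct one.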
Info
Channel: SAM
Views: 1,104
Keywords: Geospatial, Land Surveying, Surveying, BIM, Utility Engineering, Subsurface Utility Engineering, Aerial Mapping, LiDAR, GIS, Construction Services, Transportation, Utilities, Energy, Architecture, Land Development, Federal, Esri, ArcGIS, ArcGIS Pro, ArcMap, qGIS, PostGIS, Tripp Corbin, Data, spatial, geospatial, basics, how to, what is a shapefile
Id: CEUzbggzir4
Length: 19min 36sec (1176 seconds)
Published: Mon Oct 30 2023