Securing PII Data in Palantir Foundry

In order to provide the services that we rely on day to day, plenty of organizations have a genuine need to work with PII data, that is, personally identifiable information. Think of banks or hospitals, insurance companies, telecoms, credit card processors. Limiting the exposure of such sensitive data to only those individuals who need access, when they need access, to do their jobs: that's a standard we should hold all these organizations to.

In this tutorial I'll show you how to use a subset of Foundry's data security capabilities in order to bring a common customer service workflow more in line with best practices for handling PII data. You'll learn how to use markings, encryption channels, and checkpoints in order to restrict access to sensitive data and track when it's used for legitimate reasons under appropriate circumstances.

What we've got here is a rudimentary inbox app for customer service agents to handle complaints. As with my other tutorials, all of this is notional data for the imaginary Ontologize theme park. Now, most people who visit the theme park have a wonderful time, but some don't, and they file complaints. Our customer service team is responsible for addressing and resolving those complaints. For example, this person lost their ice cream cone to a giraffe, this person had bed bugs, this person is complaining about our high prices. All pretty typical stuff.

I do see one issue here, which is that all of this PII data (last name, gender, date of birth, etc.) is plainly visible to all customer service agents at all times. And because customer service agents need access to the guest object type, they can see everybody's information all the time. This is anything but best practice when it comes to handling PII data.

Now, you may be looking at this and saying that's not especially sensitive data, and compared to bank accounts or medical records, you'd be right. Part of that has to do with the backdrop of this tutorial being an
amusement park, and I sure hope no amusement park has that sort of data. But I would also assert that the principle of least privilege should guide all organizations in how they manage so-called regular PII data, and all the capabilities I'll be showcasing here are just as applicable to higher-stakes data as they are to phone numbers and birthdays.

So let's start by taking a look at a version of mandatory controls called markings. In Foundry, a marking is a way of restricting access to some resource. A resource could be a data set or a code repository or some analysis; basically anything in Foundry is a resource. In order to do something with a resource that has a marking on it, including even seeing it, a user needs to have access to that marking.

A version of this that we're probably all a little more familiar with is top secret documents in the intelligence community. You can tag a document as top secret, and in order to have access to that document you need to have top secret clearance. Foundry's markings are just a more general-purpose version of that. You can tag a resource with a marking, which is like putting a top secret stamp on a document, and you can grant access to that marking to individual users, which is like giving someone top secret clearance.

We're going to use markings to restrict access to the data backing the guest object type, so let's see how we do that and what the consequences of doing that are.

Now I'm back as my regular user, which you can tell because I'm using different color banners to differentiate when I'm acting as the data administrator versus acting as the customer service agent. To create a marking you need to go to settings, and obviously you need to have permission to create and manage markings to do this. I've already set one up here. Markings have marking categories and individual markings, so I have a marking category of sensitive information, and within
that we could have a marking for employee PII data, we could have one for guest PII data, we could have other types of sensitive information, and that's how I would group these. Another way of grouping markings that's pretty common is by geography. Maybe your organization has some operations in Europe, and there's a European marking and there's a North American marking, and those might be in the category of geographies, for example, or regions.

Here we have a guest PII marking, and its details include a little description here; marking permissions, which are permissions about who can manage this marking; and then members, that is, who is a member.

To apply a marking we need to go find a resource to apply it to, and in our case that resource is the data backing the guest object type, which is in this project here. It's that file right there. Now, we also have this downstream data set, which we'll get to in a minute, called guest encrypted, which is in fact not yet encrypted. We will be making it encrypted. Let's just take a look at the data and see what it looks like now. This is the raw data as it came in from whatever source this guest data comes from, and nothing is encrypted. We can see all the PII data right now. Again, this is notional data.

So let's go back, and then if I select this resource I can manage access, because I am the owner of this file. Other users might be able to have view access but not the permissions to manage who else has access to this file. You need the right permissions to apply markings to a given resource, in addition to the permissions to apply markings in general.

So I'm going to click View, and then, okay, you have to be a member of the Ontologize organization, but currently no markings are required to have access to this resource. I'm going to click add, and then we're going to search for the guest PII marking, click that check box, hit save, and now the marking has been
applied. So we have this little badge here indicating that a marking has been applied to this resource, and you'll see this badge in other applications like Pipeline Builder or the Data Lineage app. It's something that shows up pretty much everywhere in Foundry where you can see a representation of this resource.

And if we refresh the page, we see that this other data set now has that marking too. This is because guest encrypted, which again is not yet encrypted, is downstream of the data set that we've applied the marking to. Foundry propagates markings through the dependency tree of various resources, so if you apply a marking to one data set, all the downstream data sets, even if they're in different projects, will continue to have that marking applied to them. That means the restrictions that marking imposes will continue applying on all those downstream resources.

Let's see what the impact is on the customer service user. Now we're back as the customer service agent user in Object Explorer, just like before. However, now we don't see any guests, and that's because this user, as I mentioned previously, doesn't have access to the guest PII marking, so they can't access any of the data in the data sets that marking has been applied to. That means that in the ontology, this object type, which depends on that guest encrypted data set, is not going to show up for this user anymore.

If we go to the inbox app we should see a similar thing. Great, we can still see the complaints because we have maintained access to those, but we can't see any values for the guests. Even if we click into here it just says "No object selected", and that's because we don't have access to the underlying object type anymore.

Now we've gone a bit too far, because obviously the customer service people can't do their jobs if they don't have any access to guest data. To fix this we're not going to totally remove the marking; however, we will stop propagating it, and
in order to continue protecting the PII data, we're going to introduce a new capability, which is encryption channels. Let's go see how those work.

Now we're back as my regular user (you can think of it as an admin user), and we're going to create an encryption channel through this application called Cipher. To do that I'm going to click new, search for Cipher, and create a Cipher channel. A Cipher channel is a mechanism for encrypting data with a particular key and then assigning licenses to various users or user groups to do certain decryption or encryption actions based off of that key.

So I will give this a name; I'll just call this guest PII channel. I'm going to go with deterministic encryption, hit next, and then I'll let Foundry generate a key for me. Great, and now we're done.

The next step is to create a license. The immediate thing I want to do is encrypt those fields of that data set, and to do that I need an admin license, because I'll be using Python transforms to do this. So I will just call this guest PII admin license. I don't need to allow decryption, I do need to allow encryption, and I acknowledge the risk here. Hit submit. Great, now we have this license.

Going back to the project, we now have two new resources here: one is the guest PII channel and one is the guest PII admin license. Using that license I can then go into my Python transforms repository and start encrypting these columns.

Right now this transform is not doing any encryption. It's just kind of an identity transform, taking the input data set and passing it as the output data set. Let's change that. I'm going to create a new branch, "Taylor encrypt fields", and then I'm going to grab this code that I already wrote. What this code does, in lines 18 through 21, is say: for each of the columns that I want to be encrypted, use this particular channel in order to encrypt them. And that particular
channel is specified in the transform decorator here, where encrypter equals EncrypterInput, and then I just give the path to that resource, which is in the same project. So it'll use that channel and its key to do the encryption for these columns.

Now, the encryption will only take place after I commit this, create a PR, merge it back into master, then build it on the master branch. So let's do that now.

All right, my PR has been closed, and if we go back to the master branch here we see that, yep, the code is updated and these columns will be encrypted now. So let's build this and see what effect that has.

Okay, our build's done, and now we can see that the data looks pretty different. Instead of being able to read the last name or the gender or date of birth, we have this CIPHER prefix, then this RID; that's information pertaining to the channel that was used to encrypt this data. So now this is what people see for this downstream data set.

The next step is to stop the propagation of the guest PII marking. We've encrypted these values, so we no longer need to restrict access to the entire data set, because those values are protected now. To do that we're going to modify this input data set argument of the transform decorator here, and all we're going to do is tell it which marking should stop being propagated and on which branch. So let me update the code to do that now.

With this code updated to stop the propagation of this particular marking on the master branch, let's build the data set and observe the impact on the customer service user. Once this build finishes, the marking will no longer be propagating to downstream data sets, and then we can take a look at what's changed.

All right, build's done. Looks the same from this vantage point. Let's go to the project view, however, and see what's changed. This data set no longer has the marking badge, and that's because we've stopped propagating the marking to downstream data sets. Now let's
switch over to the customer service agent view and see how their data access has changed.

Now we're back in the same project but as the customer service user, and you'll notice a couple of different things. For one, we can now see this data set. For another, although we can see the channel, we can't see the admin license, and that's because (I didn't mention this earlier) I also applied the same guest PII marking to that license, so that only the fairly small number of admin users who need to be able to encrypt that data have access to that license. Additionally, as before, we don't see the guest raw data set, because that still has a marking on it.

If we click into the guest encrypted data set, we can see that we have the same view as what we were seeing in the transform as the admin user. We can see first name because we didn't encrypt that, and customer ID for the same reason, but last name has this CIPHER RID here, and so do gender and date of birth and phone number.

In order to restore full functionality to the inbox app we need to do two more things. One, issue an operational license to the customer service agents so they can decrypt individual values on an as-needed basis. For example, if you're going to call someone to talk to them about a complaint, you probably need to decrypt their phone number first.

Second, we need to go into Ontology Management app and change the schema of this object type. When we encrypted those fields, we changed their type from a date or a string to a new type that takes into account the fact that they're encrypted. This introduces a schema mismatch with the data set that's syncing into the object backing store, and that causes the indexing job to fail. So even if we issue that license, since the new data can't get synced into the object backing store, there just won't be any data in the app. So let's go quickly take care of those two things.

Okay, now let's create a license for operational users to
be able to decrypt certain fields within the inbox app or Object Explorer. To do that we're going to go to the channel and we're going to create a new Cipher license, this one an operational user license. I'll call this guest PII ops license. We don't need to allow encryption; we do need to allow decryption.

On top of that I'm going to enforce a rate limit, that is, a quota on how many times per day a given user can decrypt values. So let's set that at maybe 70 values per day. This counts individual fields, so that's probably two per any given ticket, phone number and last name at the very minimum, so 35 tickets over the course of the day. That seems pretty reasonable, and above that, well, they just can't decrypt any more values.

While we're here I just want to emphasize how I've got the other permissions set up for this project. For this project, if we go to access, we can see that everyone in the organization is granted the Discoverer role, which means they can see that files exist here and that the project exists, but they can't click into the files and see their content. So most people can see that guest encrypted exists but they can't see its content, and they can see that this channel exists but they can't edit it, they can't view it, things like that.

What I've done to enable the customer service workflow is go to this particular data set, guest encrypted, and add that group, the customer service agent group, as a viewer of the file, which will let them operate on that data in the ontology. I also need to add them as viewers of the guest PII ops license.

Now we can move on to reconfiguring the ontology. Here we've got the guest object type, and we can see that under data sources the index failed. Again, that's because there's a schema mismatch between the newly encrypted data that is being fed into the ontology data store and the configuration we had previously in the Ontology Management app. To fix that, let's go to
properties and change some of the types of these properties. We'll start with date of birth. Previously this has been a date, but now it's an encrypted value. It's not a date anymore, at least from the perspective of all the applications that engage with that encrypted value instead of the underlying decrypted value. So to change this we're going to pick cipher text, and we're going to do that for all the encrypted fields, so gender is also cipher text, as are last name and phone.

Okay, we'll save our changes to the ontology. That'll kick off a new index, as happens whenever you change the schema of an object type. Now that we've fixed the schema mismatch problem and issued a license to operational users, our customer service agents can finally resume using the complaints inbox app.

Now we're back as the customer service agent user, and as a quick aside, here's what it looks like when you have discover access to a project but not view access: you have the ability to request project access, but you don't have any other abilities here. And again, because the customer service agent group was granted view access to the ops license and the guest encrypted data set, you can view those even though your default role for the entire project is just Discoverer.

Again, we've got the encrypted values here, and we can see the rest of the data; that looks pretty fine. If we click on guest, this will take us into Object Explorer, where we can see all the results, and we can see that, yep, we can see some of the fields but we can't see all of them.

However, now that I have a license to decrypt data (again, me being the customer service agent user), I can decrypt some of these values. I just click there, and it says, hey, here's a decryption request. It shows my email, my username that is, and here's the time I did it, or will do it. I have to click submit to actually decrypt the value, and it tells me how many requests I have left, because, again, I set
a rate limit, a quota per day of how many decryption events one of these users can do. So I'm going to click submit just to show what this looks like, and then the real value is shown here for me. And again, I can call this person to resolve a complaint if necessary.

Let's pivot to the inbox app and see what that looks like. So yeah, we can see the guests now, and let's try resolving one of the complaints. Click on this one from Selena, who is annoyed that the wait for the roller coaster was for nothing because it broke down. To access the phone number I would just click on the encrypted value. It tells me how many decryption requests I have left for the day, I click submit, and now I have the phone number to call this guest and maybe offer a coupon as an apology. Then I can click resolve complaint, change the status from open to resolved, and say what the resolution was: called Selena and offered a 50% off coupon for the next park visit. Resolve complaint. Back here, that one disappears from the list, because we have a default filter for only the complaints that are open, but we can see that if we go click on the resolved ones, here are the resolved ones.

This is much better than it was before. No longer do all customer service agents have access to all guests' PII data by default. It's only on an as-needed basis, and per field, so even if you need to decrypt the phone number for someone, that doesn't mean you need to decrypt the gender or the birthday. If you need those you have access to them, because that's how I set this license up, but you don't have view access by default.

Although this is much better, we can do better still, and that's by adding what's called a checkpoint. So let's go take a look at checkpoints and see what they can do for us. The Checkpoints application in Foundry allows you to track certain actions users take throughout the platform and prompt them for a justification for
that action. Common examples include downloading a file or decrypting data. By prompting users, we can remind them of data handling expectations and record a reason why they needed to take that action. This information is logged for review if needed.

I'm going to set up a checkpoint for decrypting guest PII data. To do that I go to the Checkpoints application, go to configuration, and click configure new checkpoint. The checkpoint type will be Cipher decrypt, but as you can see we have all these other checkpoint types available to choose from. It's for the Ontologize organization, that's correct. I'm going to add a condition, where the condition is that it will apply to the users in the customer service agent group. Hit next.

And this will be the guest PII... oh, this is the title of the prompt, not of the checkpoint. So: "Why do you need to decrypt the guest PII?" For the prompt: "Please provide the reason." I think this is pretty self-explanatory; we don't need a description. Hit next.

Now you can choose between an acknowledgment, which is useful for a message like "oh, by the way, you're not allowed to upload certain types of data through the front-end import"; a dropdown, which could work for our case if there were a set number of reasons why you could decrypt this data (I think that's actually pretty realistic for this workflow); and a free response, which is what I'm going to go with. I don't need any validation for that, and I don't think I need any placeholder text. Again, it's pretty self-explanatory what's going on here.

I'm going to hit next. The checkpoint title is for people reviewing the logs; it's how this is going to appear there. So I will say guest PII decryption checkpoint, and create checkpoint.

Okay, now let's take a look at what this means for the customer service user. All right, I'm back in the complaint inbox app. Let's try to resolve one of these. I think I'll click on this one, "I got pickpocketed." So,
like before, I'm going to decrypt the phone number. Let's see what happens here. Okay, so like before it tells me how many requests I have left, but instead of just giving me a submit button here with my username and the timestamp, I now have that prompt, and I need to provide a reason why I need this data: "I need to call the guest to resolve the complaint." I'll hit submit, and now that is logged and I have access to the phone number.

Now let's go see what that looks like in the Checkpoints app. We can see there's one more event in the checkpoints log, and it's by the customer service agent. If we click on that we can see the user justification provided, and we can see other details as needed, such as metadata like which Cipher channel was used and what the checkpoint said to the user. All this is very useful if you need to review the events in the checkpoint log. Additionally, you can filter by date or by resource or user, and all of that will aid in whatever auditing workflow you may need to do.

Our complaints inbox app and the guest object type in our ontology are now both in much better shape when it comes to how we're handling PII data. To recap: we added markings to protect the unencrypted raw guest data. We used Cipher encryption channels to encrypt certain columns in that data set and properties in the ontology. We issued licenses to an admin user to use transforms to do that encryption, and to operational users, the customer service agents, to decrypt those values in the context of their workflows. We added rate limits to that decryption so that no one user can start trying to abuse the system. And we added a checkpoint so that we can prompt users for why they need to decrypt that data, and we can track those over time.

Markings, encryption channels, and checkpoints are just a subset of Foundry's data security capabilities. There are also projects and organizations and roles and more, and all these interact to offer a very
flexible and powerful data security framework. If your organization wants to learn more about data security in Foundry, feel free to reach out to my business, Ontologize, which offers live trainings and self-serve courses that teach people how to use Foundry effectively. As always, if you have any suggestions for future tutorials or questions about Foundry, email me through the contact form on my website, ontologize.com, or message me on LinkedIn. I'll see you next time.
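
The per-user daily quota and the checkpoint justification described in the recap can be sketched in plain Python. This is a toy model under stated assumptions, not Foundry's Cipher or Checkpoints API: `DecryptLicense`, `QuotaExceeded`, `tickets_per_day`, and the `vault` dictionary (a lookup table standing in for real decryption) are all hypothetical names introduced for illustration. Only the figure of 70 values per day and the estimate of two fields per ticket come from the video.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


class QuotaExceeded(Exception):
    """Raised when a user has no decryption requests left for the day."""


@dataclass
class DecryptLicense:
    """Toy model of an operational decrypt license (hypothetical, not a Foundry API)."""

    daily_limit: int = 70  # values per user per day, the figure chosen in the video
    usage: dict = field(default_factory=dict)  # (user, date) -> decryptions so far
    checkpoint_log: list = field(default_factory=list)  # one record per decryption

    def remaining(self, user, day):
        """Number of decryption requests the user still has on the given day."""
        return self.daily_limit - self.usage.get((user, day), 0)

    def decrypt(self, user, token, justification, vault, now=None):
        """Return the plaintext for `token`, enforcing the quota and logging the reason."""
        now = now or datetime.now(timezone.utc)
        day = now.date()
        if not justification.strip():
            raise ValueError("a justification is required to decrypt")
        if self.remaining(user, day) <= 0:
            raise QuotaExceeded(f"{user} has no decryption requests left today")
        plaintext = vault[token]  # dict lookup stands in for real decryption
        # Count the request and record the justification only after it succeeds.
        self.usage[(user, day)] = self.usage.get((user, day), 0) + 1
        self.checkpoint_log.append(
            {"user": user, "time": now, "token": token, "justification": justification}
        )
        return plaintext


def tickets_per_day(daily_limit, fields_per_ticket):
    """How many tickets a quota covers if each ticket needs this many decrypted fields."""
    return daily_limit // fields_per_ticket
```

With the video's numbers, `tickets_per_day(70, 2)` gives 35, matching the estimate of 35 tickets per agent per day. Keying usage on `(user, date)` is one simple way to get a quota that resets daily; how Foundry tracks the window internally is not covered in the video.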
Info
Channel: Ontologize
Views: 338
Keywords: palantir, palantir foundry, foundry
Id: a6evESSwZaA
Length: 24min 8sec (1448 seconds)
Published: Tue Aug 01 2023