The AI Data Protection Platform 2024 | Auto discovery and in-depth control over AI Apps

Video Statistics and Information

Captions
Hello everybody, welcome to this breakout session on protecting data with AI and securing data going to AI applications. My name is Vian Suban, and I'm Senior Director of Product Management at Zscaler. The Zscaler Data Protection Platform is a comprehensive platform that looks across multiple channels, both for data in motion and data at rest. We focus on five channels: data inline; data sitting at rest in SaaS and public cloud applications; data sitting at rest on the endpoint and getting transferred to endpoint devices such as printers, USBs, and so on; email DLP; and DSPM, data security posture management.

The first part of our data protection platform is comprehensive, AI-powered discovery and classification for all data across all channels. We discover what kind of data it is and which applications the data is going to, and give you complete context across it. For data in motion, that is, all data going to your public cloud applications, SaaS applications, and internet-bound applications, we provide full visibility into the data and which application it is going to, and full control across those applications. This control involves classification, coaching, as well as the admin experience. We also scan data at rest sitting in SaaS applications and public cloud applications, as well as data sitting on the endpoint itself.

Next, AI-powered classification. We're making a fundamental change in how data can be classified. We're moving away from just simple regexes, EDM, IDM, and OCR, the legacy techniques, and adding ML-based classification on top. This ML-based classification provides contextual classification for both text and non-text data. First, for text, we have a new BERT-based contextual engine which understands the complete context of the data. It's not looking for specific keywords, patterns, or regexes; it understands the meaning of the document and the meaning of a particular sentence in the document. That way, you get very high accuracy and no false positives. For image analysis, we use a BLIP-2 classifier which understands an image: rather than looking for a particular driver's license number, it looks at the context. Does it look like a driver's license? Does it have a picture of a bear? Does it have a picture of a human being? Are there lines of text written in it? We have improved our classification to include new categories across source code, technical documents, legal documents, diagnoses, prescriptions, and much more, and we have expanded both our data protection capabilities and our ML capabilities to support non-US geographies as well as non-English languages. Our ML classification also supports feedback, so if something has been misclassified, an end user or an administrator can provide feedback to say so. You can also drive classification from a customer-defined set of categories: you as an admin can provide a list of documents to learn from, and based on that we will provide ML classification and protection. We also provide different thresholds for tuning the ML classification in your policy.

So let's take a simple example. In traditional DLP classification, if you wanted to classify a financial document, maybe you started with a simple regex for the word "bank." The first sentence below is from a financial document: "The borrower hereby agrees to repay the loan in full to the bank." This is something you would want to classify. But the word "bank" can also occur in other sentences, for example: "We had a great party on the river bank," or "The plane banked to the left before it landed." The bottom two cases are typically false positives, and you do not want those kinds of alerts; this is the bane of regular expressions. Now, with our new ML-based classification, we understand the context. In the very first sentence, we understand "bank" refers to a business establishment used for saving money or for commercial purposes. In the second example, we understand "bank" means the land along a river. In the third example, we understand "bank" means inclining or tilting. So from a full-context perspective, the first sentence will get classified as a financial document, but sentences two and three will not. By using this machine learning we provide greater accuracy, and we also provide visibility instantaneously, without a customer or an administrator having to predefine policies.
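The contrast above between keyword matching and contextual classification can be made concrete with a small sketch. The snippet below is illustrative only, not Zscaler's engine: it compares a plain regex against an off-the-shelf zero-shot text classifier, and the model name, candidate labels, and 0.5 threshold are assumptions chosen for the example.

```python
# Illustrative only: contrasts a keyword regex with a contextual zero-shot
# classifier on the "bank" sentences from the talk. The model and labels are
# assumptions, not Zscaler's production engine.
import re
from transformers import pipeline  # pip install transformers

sentences = [
    "The borrower hereby agrees to repay the loan in full to the bank.",
    "We had a great party on the river bank.",
    "The plane banked to the left before it landed.",
]

# Legacy approach: a regex flags every sentence containing "bank".
keyword_rule = re.compile(r"\bbank", re.IGNORECASE)
print([bool(keyword_rule.search(s)) for s in sentences])  # [True, True, True]

# Contextual approach: a zero-shot classifier scores each sentence against
# candidate categories; only the first should clear a "financial" threshold.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
labels = ["financial document", "leisure", "aviation"]
for s in sentences:
    result = classifier(s, candidate_labels=labels)
    top_label, top_score = result["labels"][0], result["scores"][0]
    is_financial = top_label == "financial document" and top_score > 0.5
    print(f"{s!r} -> {top_label} ({top_score:.2f}), financial={is_financial}")
```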
From an image perspective, in this particular case, instead of just saying it's a driver's license, you will know this is a Pennsylvania driver's license, that it features a man's picture, and that it contains his personal information. This goes well beyond typical OCR, where you just extract numbers; rather, it understands the color, the context, the presence, and the count of things within an image. This can be used to detect things like a passport, or an image of a check that might be sitting in S3. So again, both the image analysis and the contextual ML-based categorization of text data are available in Zscaler across all channels, and by default this classification provides instantaneous visibility into sensitive data.
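The session names a BLIP-2 classifier for image understanding. As a hedged sketch of that general idea, rather than Zscaler's implementation, the snippet below asks the publicly available BLIP-2 checkpoint what kind of document an image shows; the checkpoint name, prompt, and file name are assumptions.

```python
# Illustrative use of the open BLIP-2 model for contextual image understanding.
# The model choice, prompt, and "scan.png" are assumptions for this sketch,
# not details confirmed by the talk.
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")

image = Image.open("scan.png").convert("RGB")  # hypothetical file found during discovery
prompt = "Question: What kind of document is shown in this image? Answer:"

inputs = processor(images=image, text=prompt, return_tensors="pt")
generated = model.generate(**inputs, max_new_tokens=20)
answer = processor.decode(generated[0], skip_special_tokens=True)
print(answer)  # e.g. a description such as "a driver's license" rather than raw OCR text
```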
Now, the AI classification is automatically present in all our inline channels. That means that as data is uploaded, without defining any policy and with zero regexes, we can automatically give you visibility into what kind of data is getting uploaded to the various SaaS and public cloud applications: is it a medical document, a mortgage document, a Securities and Exchange Commission form? Not only do we classify the documents, but we also give you context about which applications, and which instances of those applications, this data is going to. From an application perspective, we also provide visibility into how risky that application is: is the application evasive, is it PCI certified or not? This way, you can combine the context of the document's classification with the risk associated with the application to determine the risk of data exfiltration. This provides instantaneous visibility, and you can then create policy to control what data should go to which applications or application instances, and in other cases provide coaching as well as blocking of the exfiltration of this sensitive data. On the UI you can clearly see the total number of files, which applications these files are going to, and which users are exfiltrating these files, and you can drill down to get the complete context of the data sensitivity, the application, and the user; this is a three-dimensional matrix through which you can pivot.

We are now introducing this AI discovery in our endpoint DLP. Zscaler Endpoint DLP does not require any new agent; it's already built into the Zscaler Client Connector that is already installed on most of your laptops and desktops. We provide protection across four channels: removable media, printing, network shares, as well as personal storage applications, which may be applications that are SSL-pinned and cannot be inspected in the proxy. This protection is provided on both Mac and Windows, so it delivers unified data protection across multiple channels. You can leverage the existing DLP policy that you built for other channels such as API CASB or inline DLP, you have the exact same incident management, you have the exact same experience for the end user, and you build your policy once and deploy it across multiple channels.

Now, with this visibility, we can automatically classify what kind of sensitive data is being exfiltrated. Again, in endpoint DLP you do not have to deploy any policies or any engines; out of the box, using the same ML and BERT algorithms, we will automatically classify all the documents. We classify documents which are being copied to external channels such as USBs, printers, network drives, or shared SaaS drives, but this classification is now also available for data at rest. That means that even if the end user is not copying any data, the data sitting on their laptop will automatically get classified. So across all the Macs and Windows machines in your organization, and across all users, you can automatically classify all the documents sitting at rest on the endpoint. We've also used AI to make sure there is no degradation of the user experience: we look for things like low CPU utilization and scan data during those times.
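As a rough sketch of scanning data at rest only during quiet periods (the talk does not describe Zscaler's actual scheduler), the loop below classifies files only while CPU utilization is under an assumed threshold; the threshold, scan root, and classify_file stub are hypothetical.

```python
# Hypothetical sketch of low-impact, at-rest scanning: walk a directory and
# classify files only while CPU utilization stays under a threshold.
# The 30% threshold, scan root, and classify_file() stub are assumptions.
import os
import time
import psutil  # pip install psutil

CPU_BUSY_THRESHOLD = 30.0  # percent; assumed value for this sketch

def classify_file(path: str) -> str:
    """Placeholder for a call into an ML/BERT-based classifier."""
    return "unclassified"

def scan_when_idle(root: str) -> None:
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Back off while the machine is busy so the user isn't impacted.
            while psutil.cpu_percent(interval=1.0) > CPU_BUSY_THRESHOLD:
                time.sleep(30)
            full_path = os.path.join(dirpath, name)
            print(f"{full_path} -> {classify_file(full_path)}")

if __name__ == "__main__":
    scan_when_idle(os.path.expanduser("~/Documents"))  # assumed scan root
```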
In this example, you can see we have 563 users with sensitive data. We show you the total files scanned across all these users, and we also show which files have sensitive data. You can also see a list of users, and for each of these users you can see how much sensitive data they have. We can also classify the data sitting at rest using our AI and ML categories, and you get visibility into the sensitive data, how it is trending across different days and different users, as well as the distribution by file type.

Now let's go and look at a demo of this AI discovery on the endpoint. Let's take a break here; I'm going to show a demo and then we'll come back to the slides. This is our AI Discovery. Here you can see the list of users, the departments they belong to, their endpoints, and how many files they have on each particular endpoint. You can click through to a particular user; here you can see Armor has 56K sensitive files, and he also has two incidents. We have discovery by the engines as well as by the AI categories of what kind of documents are sitting on his endpoint, and you can also see the distribution of these documents by file type. For each of the files you get the path and the DLP engines they match. You can see various activities, and from these activities you can see where the data is going and trigger incidents, which can then be tracked via our incident workflow management tool. So you get full visibility across all the data and incidents associated with this specific user. You can also go to the dashboard, which gives you a holistic view, and you can export the data in this dashboard to a PDF. You can give this PDF to an exec to see a trend of the data sitting across all the laptops in your environment, a trend across the sensitive data, as well as the distribution of file types and the different engines and policies that are matched. So that is a quick and simple demo of our endpoint DLP scan. Okay, thank you; I'm going to break here and then go back to the presentation.

Let us now take a look at the discovery of data in the public cloud. In this example, you can see all the sensitive data sitting across your public cloud applications, and this is again through classification by AI and ML. Here the classification happens for both structured data and unstructured data: structured data is data sitting in various databases, and unstructured data is data sitting on your virtual machines or in unstructured storage, for example S3 or Azure Blob Storage, as well as any of the disk drives within the virtual machines. Here you can also see, for each piece of data, what its exposure is: is the data publicly available, or is there a different canonical user in AWS who has access to this particular data? Are there any risky credentials associated with it? So you get the full context of the data. It is also important to mention that this discovery of data across the various databases and storage is available across multiple clouds: AWS, Azure, as well as GCP. You can also look at the list of alerts, and each of these alerts gives you contextual information across the AWS role, the data element, the context around the data, and the sensitivity of the data.
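To make the exposure check concrete, here is a minimal sketch, not Zscaler's DSPM, that uses boto3 to test whether an S3 bucket's ACL grants read access to all users; the bucket name is a placeholder, and a real posture check would look at far more signals (bucket policies, block-public-access settings, credentials, and so on).

```python
# Minimal, illustrative public-exposure check for an S3 bucket using boto3.
# This is not Zscaler's DSPM; the bucket name is a placeholder.
import boto3

ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"

def bucket_is_publicly_readable(bucket_name: str) -> bool:
    """Return True if the bucket ACL grants READ/FULL_CONTROL to AllUsers."""
    s3 = boto3.client("s3")
    acl = s3.get_bucket_acl(Bucket=bucket_name)
    for grant in acl["Grants"]:
        grantee = grant.get("Grantee", {})
        if (
            grantee.get("Type") == "Group"
            and grantee.get("URI") == ALL_USERS_URI
            and grant.get("Permission") in ("READ", "FULL_CONTROL")
        ):
            return True
    return False

if __name__ == "__main__":
    name = "example-finance-exports"  # hypothetical bucket
    print(f"{name} publicly readable: {bucket_is_publicly_readable(name)}")
```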
Now let's take a look at how we can secure the usage of gen AI applications. Most users in most of our organizations today are using gen AI applications, and one of the risks associated with this is that your sensitive data can be learned by these gen AI LLM models; once it has been learned, that data cannot be unlearned, and we've seen this in many data breaches. It is therefore very critical that our sensitive data is not learned by these models.

First, Zscaler provides a comprehensive view of all the gen AI applications being used in your environment: how many transactions are going to these gen AI applications, what sensitive data is being uploaded to them, and which users are using them. You get this comprehensive visibility out of the box. Next, you get an industry peer comparison: how are your users using these gen AI applications compared to users in other, similar organizations? You also get a list of each individual application being used in your environment, how much data is being uploaded, and how much data is being blocked or prevented from being sent to these applications. You can get this visibility by application, prompt, user, as well as data transferred. You can see the sensitive data being uploaded by app; that means for each category of data, whether it is legal data, financial data, or real estate data, you can drill down into each application, such as ChatGPT or Gemini, and see what kind of sensitive data is being sent to it. You can also get visibility into which departments in your organization are using gen AI: is HR using it, is marketing using it? Most of the data you're seeing is resumés being written in some of these gen AI applications. You get complete visibility across which departments, and then which specific users, are using these gen AI applications. You can also see the prompts, which means you can go and see what data the end user is entering into these gen AI applications. From a privacy perspective, we have a setting that controls whether you see these prompts; by default you do not see them, so you have a choice in Zscaler. If you opt to see the prompts, then for each prompt that your end users enter into these gen AI applications, you will get visibility into the prompt in both allow and block modes, and you will understand what data is being sent to each of these applications.

Once you have this visibility, you have the opportunity to go and control these applications. You can filter on a cloud application; for example, you might filter on ChatGPT and say that for ChatGPT you want controls where certain kinds of data are allowed, but maybe source code cannot be cut and pasted, and certain kinds of sensitive financial data cannot be uploaded. We provide several actions: allow; caution, where you coach the user; block; as well as isolation, which is a great option where people can still use these gen AI applications but through an isolated browser, so that, for example, a large piece of code cannot be cut and pasted into them. We also provide the context: for example, if I determine that source code is being uploaded to ChatGPT, then I can take an action to block the source code from being uploaded, and I can also notify the end user and notify an auditor. So this provides visibility across all the gen AI applications being used, and we give you control over what data can go to gen AI applications, both from an allow, block, and coaching perspective as well as from an isolation perspective (see the policy sketch at the end of these captions).

For your next steps, please get a free trial of Zscaler Data Protection. You can explore our ML-powered discovery across multiple channels, gain visibility into your generative AI applications, understand which applications are used by which departments and which users, and, once you have this visibility, protect your data going to these gen AI applications. Thank you for attending this breakout session.
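As a closing illustration of the per-application control model described in the session (allow, caution, block, isolate), here is a minimal, hypothetical policy table and lookup; the application names, data categories, and schema are assumptions, not Zscaler's policy format.

```python
# Hypothetical sketch of per-application gen AI data controls
# (allow / caution / block / isolate). The policy table and category names
# are assumptions for illustration only.
from typing import Dict

# action per (application, data category); "default" covers everything else
POLICY: Dict[str, Dict[str, str]] = {
    "ChatGPT": {"source_code": "block", "financial": "block", "default": "caution"},
    "Gemini": {"default": "isolate"},
}

def decide_action(application: str, data_category: str) -> str:
    """Return the configured action, falling back to caution for unknown apps."""
    app_policy = POLICY.get(application, {"default": "caution"})
    return app_policy.get(data_category, app_policy.get("default", "caution"))

if __name__ == "__main__":
    print(decide_action("ChatGPT", "source_code"))  # block
    print(decide_action("ChatGPT", "legal"))        # caution (coach the user)
    print(decide_action("Gemini", "financial"))     # isolate (isolated browser)
```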
Info
Channel: Zscaler Inc.
Views: 256
Keywords: security as a service, cloud security, zscaler, sase, secure access service edge, digital transformation, secure cloud transformation, zero trust security, zero trust exchange, zscaler private access, zscaler internet access, data protection
Id: NsKhjjEf_wU
Length: 18min 20sec (1100 seconds)
Published: Wed May 15 2024