Intel® SGX: Privacy Preserving Multi-Party Data Collaboration

Good afternoon. My name is Yulia Gontar and I work at Intel. With me today is Nukri Basharuli, the founder and CEO of Aggregion. In partnership with Intel, Aggregion has developed a decentralized confidential computing protocol and successfully deployed it at a number of major companies in various verticals: retail, telecom, and banking. Today we would like to talk about decentralized confidential computing of big data and its use in business. Here are some of the topics we will focus on with our business and technology partners: how first party data is collected and how it may be used for cooperation in a transparent and secure manner; how confidential computing delivers a personalized shopping experience; how telecom companies, banks and insurance companies go about developing cooperation and building their ecosystems; a deep dive into Aggregion's decentralized confidential computing protocol; and representatives of technology companies will speak about off-the-shelf products for confidential computing. Nukri, could you please tell our audience what the difference is between centralized and decentralized computing?

Thank you, Yulia. In the centralized model, there is a party that serves as a representative of all other parties involved. This central party receives and stores actual data on its own servers. And there are risks associated with this model. The main risk for data owners, the companies that collect first party data, is that once the data has left your servers, you lose control of it. It's an issue of trust. This is very sensitive personal data and, in my experience, major companies are not willing to take the risk that their data may be misused or used in ways that were not agreed. Now, what we have done at Aggregion is we have mitigated the issue of trust through decentralization. In our model the data never leaves the owners' servers. There is no central server. Our protocol is deployed as separate instances on each participating company's own servers. Each instance of the protocol has its own set of rules written in blockchain smart contracts.
The rules determine who can have access to the data and how this data may or may not be used. Now the big question, of course, is how do these separate instances actually work together to collaborate on data? This is where Intel SGX technology comes in. Because there is no central server, the actual confidential computing takes place in secure enclaves in each of the participating instances. And even the owner of the instance cannot access the data in use while it's in the enclave.

Thank you, Nukri. Yes, the Intel Software Guard Extensions (SGX) technology has been developed specifically to protect data while it's in use. I have asked Jesse Schrater, the Security Manager at Intel Data Platform Security, to tell us about Intel SGX and how confidential computing works in a Trusted Execution Environment.

We've been really impressed with what Magnit has done with Aggregion to protect their customer data in the loyalty programs, and they're using SGX in some very creative ways to make sure that they can leverage the customer data in terms of analytics, spending patterns and interests to better present options to their customers and yet not expose their customers' private data to anyone else. And we've seen them actually be able to share data with other parties and yet maintain the privacy of their customers' data. So, it's a great example of where Intel SGX is operating in the real world and is enabling that multiparty compute type of usage to really do new things that couldn't have been done previously without violating customer privacy. And yet now they are able to do it and mitigate the risks associated with any potential exposure or attack that might happen on that data and keep it safe. Intel SGX is a Trusted Execution Environment (TEE). There are other TEEs out there in the industry. Intel SGX really is the most proven of the TEEs for the data center. It's been around since 2015.
It's had a lot of opportunity to be tested and researched and hardened over time, and it also has the most granular set of protections, because those protections are all the way down at the application level. The application can decide exactly which code and which data is isolated within that enclave, and literally everything else in the system is outside of that enclave, so very granular controls are placed in the developers' hands to make sure that they can lock that trust boundary down. What Intel SGX does is it allows an application to speak directly to the CPU and the associated memory, bypassing the entire rest of the system: the operating system, other applications, the hypervisor, and anything else that's there. So, what it does is it creates an encrypted enclave in memory, and the application's code and data operate within that enclave. Any breaches in any other part of the system, or any escalated privileges that anybody might have in other parts of the system, would still not expose the contents of that enclave, because it sits directly between the application and the processor, leaving everything else out. So, in that way it creates an added level of protection for your sensitive data, which is really separated and isolated from the rest of the system while it's being processed in memory.

So here it is. Now, let's go over the global trends in terms of data collaboration. I have put together some numbers on B2C and B2B behavior change according to McKinsey surveys. 71% of consumers are ready for more-integrated offerings, according to consumer sentiment surveys. B2B collaboration has evolved due to remote work requirements and the importance of digital channels, with companies rating the latter as approximately twice as important now as they were before, which is a 30% increase since the start of the pandemic.
Over 40% of companies are developing or considering developing some form of data collaboration in various areas, including marketing and value chain optimization. We expect the trend of big data collaboration between companies to explode upwards. I have talked to market leaders in different industries such as telecom, retail and finance, and to Aggregion technology partners, to learn more about their needs. They are Alfa Group, Beeline, Magnit, Mail.ru Group, the Otkrytie Bank, and Microsoft. Nukri, your company has many clients from different verticals. They collaborate with each other in specific business cases such as advertising, data analysis and market research. In the course of our conversation, I would like to get your comments on some topics that were raised during the interviews.

Sounds great!

Thank you, Nukri. The question people often ask when we discuss big data is about the use of personal information. That was the first question I asked our guests. Let's hear what they had to say about it.

We would like to talk about confidential computing, and the first question I would like to ask you is probably somewhat personal. So, we live in the world of digital transformation. A lot of information is being gathered, stored, processed and used by corporations, but on the personal side this is information that is potentially being gathered from us, right? So, for you personally, on the individual side, what is the most important thing in terms of the usage of data by these aggregators: banks, retailers, telecom providers?

So, for me personally, in the contemporary world we see lots of informational pollution. And to me this is the biggest challenge overall. A whole bunch of different information comes to you; some of it is relevant, some is absolutely not. It comes from different sources, and the volume is so huge that you cannot really observe and understand what's really relevant for you and what is not.
So, for me, secure computing, proper computing, proper usage of a big data recommendation engine is actually the way to combat this informational pollution, because this is the biggest challenge. To me the challenge is not personal data, privacy, government control of the data, or control of the data by different organizations; it is what you are actually using this data for. And if it is used to clean the world of informational pollution, it is actually going to make our world much better.

I am willing to share my data, but in return I want the aggregators to make the user experience a bit more pleasant for me. In the beginning this is just better targeting of existing products, but then, down the road, I do have the vision that some products will become specifically tailored to me.

I like it when there is a contract. When I have something for you, and you have something for me. That gives me a feeling that we have a deal. Imagine that I simply handed over data with no idea what that data is for, or that someone just took data from me without even asking, and that's something that happens pretty often with careless market players. Personally, I prefer to hand over data knowing why, so I really like it when I get recommendations on good music from an app, like a music recommendation service. I like it when artificial intelligence helps me to learn. I think that's great, and so I consciously hand over my data, because in turn it creates greater value for me. If it turns out that there's no value, I would like to have a chance to tear up that contract.

As someone who was responsible for this issue with banks and telecom companies, I understand perfectly that they collect huge amounts of information about us. For me this is just a fact of life. Now, of course this is something you may find scary, it might make you nervous and cause you to worry, but in fact this is what enables transformation and usability.
If data is gathered, it doesn't necessarily mean that someone has access to it. These are two very different issues that are not connected to one another. Data is gathered by someone whom you originally trusted to collect it. You've chosen a mobile operator, and obviously they are aware of every call you make, because they are obliged to serve that call and to make a record of it. So yes, the fact that you have placed a call is recorded, in order to handle billing and invoicing and to comply with certain laws. The same applies to financial information that is recorded by a bank.

We leave behind a digital footprint, we do it all the time. A digital footprint is extremely useful. This is the case in many countries around the world, where a user makes a decision to share important or sensitive data to get more benefits, income or more relevant information. It just becomes a new way of life.

Nukri, you work with different companies that collect personal data. Can you please share your view?

It is very important that the companies collecting personal information have direct control over where and how they collect data and where this data goes. This is the only way to ensure security and remain compliant. Each company collects data only from its own clients. And in our Aggregion protocol, this user data is always kept separate from the data of other partners; it remains within their own premises, never mixing in a central storage. The rules written in smart contracts govern how this data may interact, as if it were a license to a digital object. And this license can be withdrawn at any time. We use decentralized confidential computing to securely match data IDs, and if the data owner wants to withdraw the consent to use data, then it becomes unavailable for matching, it becomes "invisible".

Yes, George and Fabian have also spoken about the importance of security and of forming a 360° customer portrait.
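
To picture the consent-aware ID matching Nukri describes above, here is a minimal Python sketch. It is not Aggregion's implementation and does not use Intel SGX: the `PartnerInstance` class, the `blind` and `match_count` helpers, and the shared session key are all hypothetical names introduced for illustration, and an ordinary function stands in for the code that would run inside an enclave.

```python
import hashlib
import hmac


def blind(customer_id: str, shared_key: bytes) -> str:
    """Keyed hash of an ID, so raw IDs are never compared directly (illustrative only)."""
    return hmac.new(shared_key, customer_id.encode("utf-8"), hashlib.sha256).hexdigest()


class PartnerInstance:
    """Hypothetical stand-in for one company's deployment; its data stays inside it."""

    def __init__(self, name, customer_ids):
        self.name = name
        self._consented = set(customer_ids)

    def withdraw_consent(self, customer_id):
        # After withdrawal the ID is simply no longer offered for matching.
        self._consented.discard(customer_id)

    def blinded_ids(self, shared_key):
        return {blind(cid, shared_key) for cid in self._consented}


def match_count(a, b, shared_key):
    """Stand-in for the enclave computation: only the size of the overlap is returned."""
    return len(a.blinded_ids(shared_key) & b.blinded_ids(shared_key))


key = b"demo-session-key"  # assumption: a key agreed between the two instances
retailer = PartnerInstance("retailer", ["u1", "u2", "u3"])
telecom = PartnerInstance("telecom", ["u2", "u3", "u4"])

before = match_count(retailer, telecom, key)
retailer.withdraw_consent("u2")
after = match_count(retailer, telecom, key)
print(before, after)  # 2 1
```

In this toy version, withdrawing consent for `u2` makes that customer "invisible" to the next match, so the overlap drops from two customers to one while neither side ever sees the other's raw list.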
Each of us has a certain element of customer data. Telcos: we know where people are making phone calls and how they are traveling. Retailers know what people are buying in the stores. Banks know what people are spending their money on. But nobody has the so-called 360° view of what the customer profile is.

We know what they put on the table for the family. We need to help them make better decisions, we need to help them create or get more value out of the shopping experience, but at the same time the customers trust us with a lot of very personal information. And we need to make sure that this trust is not violated.

Let's see what Microsoft had to say on this subject.

So, today as an industry we do a great job of protecting our data at rest and our data in motion. At rest we use technologies such as BitLocker to make sure that if the hard disks that hold our data ever get compromised, the data itself does not get compromised. Similarly, anytime that data passes between two endpoints, which we typically call "in motion", we encrypt that data with technology such as SSL or TLS. But when we actually operate on the data, when applications actually use the data to compute on it, that computation happens entirely in the clear. What we're trying to do with confidential computing is protect that data in use as well, such that when your applications use the data to get a business result, that data stays encrypted and completely hidden from any of the attack vectors that would want access to it. So, now with confidential computing we can help you protect your data in use, thus completing the data protection lifecycle: protecting data at rest, in motion and now, with Azure confidential computing, in use.

Nukri, we talked about personal data. Now, let's discuss how businesses use this data, in your experience.

The most frequent request to Aggregion is related to personalized shopping.
For example, if a customer purchases baby products, a retailer can use data analytics to predict their product needs and then create personalized discounts. The mobile operator also has services based on working with big data. And they know a lot about us: budgets, locations, internet searches, social connections, socio-demographic data. One of the goals is to predict the needs of the client and offer a relevant service. But cooperating on data services with business partners is also important. Through cooperation a bank can, for example, assess the risk of loan default or help increase sales of insurance.

Nukri, more and more data is being collected about us. And George in his interview mentioned that he personally went even further in data personalization. Let's have a look.

We can only deliver really meaningful and relevant products to the customers when we deliver them at the right time, one hundred percent personalized and properly positioned for them. And this is what our customers want. And this can only be done if we know a lot about the customers, when we combine the data. The communication should always be very unintrusive. It should always be in the background, it should always be coming from a trusted friend, it should appear at the right time, in the right format and the right form factor. In the contemporary world, we're always thinking that it needs to come to your smartphone, but in the IoT world, while you're talking, why not use this wall? And the message actually just appears on this wall. So, you can just talk to the wall, because a normal IoT device can use this wall as a microphone, or it can use, like, I have a chip, for example, in my hand.

Do you?

Yes, I do have a chip. I'm one of those few lucky guys who have a chip built into my hand. I did this when I lived in Dubai, and we did lots of really cool things with my passport, medical information, all my blood tests, all this stuff.
You want to become very paperless. What was the reason for that?

I want to become like a cyborg, I like this concept of a cyborg.

Wow! What an amazing experiment. George is the man of the future. The future of the Internet of Things, where everything around us (gas stations, parking lots, locks, coffee makers, medical devices) will be connected to the Internet and George can control it with his chip. The Digital Renaissance Man.

Yes. There is the topic of IoT devices used in various areas of healthcare. This technology helps to remotely monitor patients and diagnose them more effectively. Different IoT devices must interact with each other and with different IoT systems. So much data is being generated, and all of it is highly sensitive, of course. This is why one of the areas of usage for the Aggregion protocol is healthcare, where we securely match big data between IoT providers through confidential computing.

Nukri, this is a very interesting example. We at Intel also see medicine as one of the important use cases in confidential computing. But let's get back to the interviews and see what Fabian and Alexander had to say about their cases for using data.

Now, especially since the launch of our loyalty program, we are able to understand what the customer responds to well much better than before. So, I think there are many initiatives ongoing in all of those consumer goods brands where we can help them interact with our customers more directly, whether it's initiatives on more eco-friendly packaging, or trying to understand trends in regard to vegans and vegetarians in the market. Being able to identify those customer segments for which these topics are relevant, and to communicate with them in a much more targeted way, is a service that we can offer to those partners.
We are currently a brick-and-mortar retailer who is heavily investing in bringing some of the services online, and this combination of online and offline is something that we need to master. It needs to become a true omnichannel experience for us to serve our customers as well as possible. So, there's a ton of things that we can improve by simply leveraging our data a lot better, but then in the next step there is certain information that we can only explore in close collaboration with other players in the market.

If we boil it down, financial organizations have several goals in data collection. The first is risk assessment: assessing risks when issuing loans and during some credit transactions. The second is attracting new clients. The third relates to existing customers. This is the third set of data, the so-called Customer Value Management. Yes, it increases the value of what the bank does for our customers.

Nukri, I know you have many advertising-related cases. Do you?

Yes, Yulia. One of the most popular requests we get is the use of our Customer Data Platform for marketing. The platform is decentralized and built on the Aggregion protocol, using the big data of our clients as the source. Advertisers can target audience segments based on this data and then launch marketing communications through various channels. The platform can be used for advertising, surveys, coupon offers, advanced data analysis and market research. The data between partners is matched through confidential computing. It's a great case of data cooperation.

Thank you, Nukri, for this example. You told us about advertising, and I remembered the bit in George's interview about a couch.

You're searching in Google for a couch, and for the next six months you're gonna be bombarded in all the communication media about the couches which are available.
And it doesn't matter whether you bought this couch or you didn't, or maybe right now you're not actually interested in a couch anymore; for six months you are going to receive a whole bunch of communication.

Yulia, yes, this is a great example of why data cooperation is needed. Internet sites and social networks show you ads all the time for what you have already purchased: a smartphone, a sofa, a washing machine. We have developed a sales uplift analytics module that securely matches ad campaign performance data with purchase data. So, the protocol would know that a specific person saw the ad and then went to the store and bought the product. This is also done through confidential computing, so no actual data changes hands.

Sounds great! Another case where data cooperation is extremely important for trusted communication with customers. By the way, I have another story from George about a proven and effective way to jumpstart cooperation between companies.

I'll tell you the real story. We usually don't tell this story in public, but I'll allow myself.

As an exception.

As an exception, we'll do it. Yes. So, Alfa-Bank is the leading consumer bank in the country. Beeline is one of the leading telcos. Two big companies, and they are part of the same group. And for years, really, for years, we were trying to cooperate. And it was always the same challenge. "We're bigger, we're smarter." "No, no, we are bigger, we are smarter." And at the end of the day, nothing ever happened. So, around two years ago, our two CEOs met, and they put together a couple of guys in a team and said: "Look, guys, we want you to get married. And how you're going to do this is your problem. But in six months, we want you to come back and present the first fruits. Maybe it's gonna be garbage, maybe it's gonna work, maybe not, but start doing something." And this was a real mentality change for us.
Because we had been going in one direction for many, many years, and a couple of years ago came this mentality change. So, what happened then is this, and it's all about the people. This is actually very interesting. So, on one side, it was me, on the Beeline side. And there was a gentleman called Michael Tuch on the Alfa-Bank side. Very bright, very much an American kind of executive. And we met together and said: "Look, we have no option." And in reality, it doesn't matter who's bigger or who is smarter. And we all have similar MBAs, similar experience globally, we all play tennis and all this; there are lots of similarities between us. So, on the personal level, we met, we went to a restaurant, we had a couple of drinks. Okay, not a couple, we had lots of drinks. At the end of this dinner, we agreed that, you know, we're gonna deliver it. And it's going to be a working product addressing the real needs of the customers. And we'll figure out what the profit sharing will be, what the liability shift will be, what the marketing budget will be; let us first deliver the product for the customer. And after this, we'll figure it out. To tell the truth, the product is live. It has been live for a couple of months already, and we have hundreds of thousands of customers. And we've just finished the legal agreements between us a couple of weeks ago. So, in reality, it's all about people. So, at a certain point in time, you have to lock yourself in a room and make an executive decision that this is how you are doing it. And in our case, that meant going out and having lots of drinks, let's say. So, when operators start cooperating, and we see some cases where operators are starting to share data, we're gonna put the puzzle together and will be able to have the full 360°.

Ha-ha, this is a great story. We see that despite all the digital innovation in the area of collaboration, some things are done just as well through old-fashioned human contact and a few drinks.
That is true.

Let's see what our next guests have to say on the subject.

Today the landscape is changing not just for intragroup cooperation, but all across the board. If you want to make a product, you need to stop thinking just from the point of view of your industry. You need to start thinking from a wider perspective. To ensure the value of this kind of horizontal cooperation and partnership, it must never be closed, meaning you can cooperate within a group, but it is just as important not to be closed, to be ready for open cooperation. Based on what has been done, you need to immediately consider what opportunities it can give to potential external partners.

So, now it looks like, for instance, the banking sector will increasingly exist in the backend. It must be integrated with the services that I truly need as a consumer, to become complementary. For example, I need an apartment, and so I need a mortgage, or help to pay the rent. These are the kind of services that are needed. Direct banking services that are not tied in to real-life scenarios must disappear. As a customer, I think I want to see fewer services that are intermediary, non-transparent, irrelevant, and over-regulated. I'd want more services that help me live and deal with my real-life problems.

What do you think, what kind of new partners would you like to see here in your ecosystem, the group of companies, in order to move in the direction of what the customers want and need today?

I'd say there are probably two main trends. One of them, in simple terms, is being complementary in nature. This is something that helps to build the customer experience and create the final product. The second aspect is when someone else can do something better than you can. That's a really important aspect of a partnership, since it is not only difficult for an organization to try to learn to do everything itself, but it is also a dangerous idea.
We truly believe that we need not only to allow for ideas, but also to start from the assumption that someone else might be able to execute them much better than you can yourself. So, you can just enter into a deal, or even better, a long-standing partnership, and these skills are passed on, or even just the result of the work is passed on, as it's not even necessary to pass on the skill itself.

Our goal is to remain relevant and needed. On the whole that's all that the customers expect from us. They expect a bank to be useful. Furthermore, if the benefit goes slightly beyond a strictly banking product, beyond just a simple lockbox where they safely keep their money, that's great; the more extra value, the better. This means that loyalty to the bank increases. Therefore, our goal is to solve the simple everyday problems of our customers. This can be very straightforward and the bank has every opportunity to do this, but there is simply no chance to succeed if we limit ourselves only to the data that's available inside the bank. Thus, the next stage is the transformation of the business to a data-driven model. After this data lake has been established, we now need to go through the research and development stage, to build the architecture, find partners and establish collaborations. Today we are constantly seeing ads based on rate, discount and cost, and yes, given the new logic, the ads will be personalized to solve your problems based on what is relevant to you.

We do know that there is certain information about our customers which other parties might be able to tell us a lot better. So, for example, a telecom provider can tell us why a customer is not buying from us on the weekend. Is it because he just doesn't shop on the weekend, or is it because he's driving to a dacha where we don't have a store yet? In the long term, we clearly want to collaborate with anybody who can bring value either to us or to our customers through data. In the short term there are limited resources.
And when we decide which bank or which telecom provider to collaborate with, there is obviously a strategic component when it comes to companies that are in one way or another affiliated with us. Obviously, we work very closely with companies such as VTB on some data topics, and that might be a higher priority for us than working with another, more independent bank which could provide a similar benefit in terms of enriching the data that we have on our customers with the data that the bank might have on them. I think when we talk about these collaborations right now, we are very often focused on who can help us understand our customer better, and we are approaching the topic from the customer's point of view, right? However, there might be companies, whether it's Intel or partners like Aggregion, who actually help us on the technology side of making use of the data. Because I think there's quite a lot of data which we're not able to explore fully yet, and the technology partners are for me as important as the partners who can enrich our data.

The Magnit case that we've been working on with Aggregion is a very innovative use case, because it helps us achieve that business outcome through secure analytics, where our customers can now run business analytics on a joint dataset and combine this dataset for richer insights in scenarios such as a combined loyalty card. We have worked very closely with Aggregion to make this use case possible, and it is one of the innovative end-to-end implementations that we're seeing to enable confidential data sharing. From an architectural perspective, the data is stored and processed in the partners' cloud data lakes. The processing nodes themselves are not determined beforehand. The partners in this case release the joint analytics products and grant access to specific data areas on specific terms that are set by policy.
And in order to do the actual computations, dedicated secure enclaves are used for these sensitive joint calculations, where you need to do operations such as ID matching or joint processing on a common dataset. These enclaves are signed for execution using strictly defined scripts, such that no one can get access to the data inside the enclave other than the code that is authorized to operate on the combined dataset. All data input and output is encrypted. All operation logs are stored and secured in an irreversible distributed ledger and, in addition, this use case uses Azure Kubernetes Service (AKS) for executing and orchestrating the containers, scaling up and down based on the data size, such that these workloads can be expanded horizontally as the business requirements go up and down. The result is a matching ID, a joint distribution dataset, which contains anonymized customer attributes sourced from multiple data suppliers. The solution that we worked on with Aggregion and Magnit is very innovative, and it is just as applicable to other industries such as financial services or government.

To sum up, we can confidently say that large companies are actively looking for opportunities for data cooperation and, apparently, this trend will only grow with time. Nukri, maybe you could give us a case example of such cooperation from your experience and illustrate how data is used, step by step?

Ok, I will not name specific companies, but let's say that there are four main participants involved: an FMCG brand, a major retail chain, a telecommunication company, and an advertising platform. For many commercial and legal reasons, it would be very difficult to combine the three databases together directly, by simply taking data from these companies and placing it in a single storage. Let me tell you about a typical project that we implement.
Let's take a simple case: marketing and advertising on this joint first party data. The retail chain and the telecom have all this data that they get from their customers. It's both transactional data, such as purchases, and also data enriched with attributes produced by their data science teams: interests, social demographics, budgets, events, preferences, hundreds of attributes. The advertisers also have their own data, such as customer lists in the CRM. This data is used to target audiences, which will see the ads through advertising platforms such as Facebook and Google. The retail chain, the telecom and the brand all deploy the Aggregion Customer Data Platform (CDP) within their premises. The CDP contains the user-facing applications for marketing and advertising as well as the data management services. The platform allows advertisers to target audiences based on attributes from multiple first party data sources, show ads through the advertising platforms, and receive uplift reporting of campaign performance against offline sales. The CDP is powered by the Aggregion protocol, which contains all the necessary API integrations, the SDK and the decentralized confidential computing components. The confidential computing is processed in Intel SGX protected enclaves, which allow customer IDs to be securely matched between datasets. The blockchain nodes contain special smart contracts that store information about the available scripts and which enclaves are allowed to run them, as well as logs of all requests for issue resolution.

Let's do a simplified walkthrough. In step 1, the raw data is converted via ETL into the CDP data model. In step 2, the advertisers go into their own CDP and use the interface to choose the desired customer attributes from multiple data partners. In step 3, each CDP instance processes the request to apply the selected attributes to the data and then places the resulting customer IDs in its own enclave. The CDP verifies with the blockchain that the scripts have the permissions to run. Then the IDs are matched inside the Intel SGX enclaves and the results are returned to the advertiser's CDP. Then in step 4, the customer IDs are securely matched with the advertising platform as an external segment, again verifying permissions with the blockchain. In step 5, the advertising campaign is launched and ads are shown to the selected end customers.
And finally, in step 6, the logs from the advertising platforms, that is, the actual views and clicks on the ads by specific customers, are matched with the sales in the retail chain. This is how the system knows that a specific consumer saw the ad and then bought the product. The advertiser receives an aggregated uplift report.

Thanks for the detailed explanation. We have discussed the Aggregion protocol with George. I’ve heard that you have already implemented the Aggregion protocol. So, as a company, do you plan to cooperate on data within your group as a whole, or are you already planning, or at least considering, certain activities with external partners?

This is actually a very good question. To me it's a major philosophical one. And the game comes back to the story of the ecosystem. At Beeline, we are part of Veon Group. So, on one side, we are part of a group of 11 different operating companies across the world, ranging anywhere from Pakistan to Algeria. So, from this perspective, we are cooperating between the companies. But, at the same time, we are part of the so-called Alfa Group. Alfa Group consists of the leading retailer in the country, the leading commercial bank, the insurance company and, obviously, the telecom operator. So, we have started to cooperate already. And I can share with you some numbers, which are really amazing. We cooperate with the partner bank, with Alfa-Bank, and we issued a whole bunch of different products together with them. Kind of standard credit card, debit card and all the stuff. But what’s very unique is that it's all developed based on a real understanding of the customer, and we connected our data platforms, so we can share the data. Right now, we are doing around 89 transactions per second. Yeah, 89 transactions per second to address the proper needs of the customer, to properly develop the product.
Because to create a credit card or to sell a credit card is a simple process, but the hard part is to know to whom to offer this card, to have it pre-approved, to have a credit limit set on it, and to offer it at the right point in time, when he's doing certain activities or when he is not doing them. And we are doing the same with the insurance company, where one of the products we created, for example, is medical insurance for migrant workers. It is very unique, but again you need to have lots of data to know who the migrants are, where they are, what their purchasing power is, how often they go to hospital, and what rates you can offer them. We want to have 360-degree information about the customer, that’s why we are cooperating through the Aggregion protocol with the other retailer.

Nukri, I know that you have also discussed plans for cooperation with the Mail.ru Group in confidential computing.

Yes, Yulia. Through our clients, we see that almost any medium or large business uses cloud technologies to one degree or another. Companies are moving to the cloud-only principle, that is, they are considering exclusively cloud options for building infrastructure as a scenario for their business development. And the issue of the security of these clouds will increasingly be on the agenda. The most frequently asked questions are how to protect data from leaks and from the malicious actions of a company employee. Also, customers want to get a service in which even the administrators of the cloud provider don’t have access to their data. Obviously, confidential computing is already a fairly tangible trend. It makes it possible to spot different kinds of reporting and therefore, of course, it’s a part of our strategy. I lead the Mail.ru Cloud Solutions platform. We present our cloud platform to major and medium-sized enterprises in Russia, and in this context, we are considering certain pilots and existing technologies. As you know, we work primarily with Intel.
We’ve already made the first installation, and to make the adoption of any technology easier we need partners to help with the adoption and to bring the product to market faster. This is why we work with partners like Aggregion. We have a technical deployment period, then the pilots with the business. Next comes the question of scalability. As for our service, we think that by the summer we’ll have passed the pilot stage and received some feedback, a kind of case study. Next, we’ll look at the other industries in which we can also work in this country, and by the end of the year, depending on how successful we have been, we can move to the production stage, when the service can be delivered fairly seamlessly from the cloud and used to deliver on business objectives.

Nukri, we talked about personal data and about how data is used in the companies. Much attention was paid to the value of the data cooperation itself. I really liked how George described the four levels
of cooperation readiness. Let's listen to him and then further discuss the technological readiness of data cooperation.

You see, to me, the game is actually built on four levels. The first is the foundation, the technical platform. Thanks to Intel, it is in place! The second is sharing the data. And of course, you need to have lots of partners who are able to share the data. The third layer, which is usually very popular with journalists but usually has nothing to do with real life, is about personal data and personal privacy: that we intrude on personal data and all this stuff. Look, there are instruments, there are ways you can manage your personal data, and if the services you're getting actually help you to live better, there is absolutely no reason to be afraid of sharing it. This is about proper education and, probably most importantly, delivering
the meaningful services. But the fourth layer, which to me is probably the most complex, and which we as a human race are really not ready for, is the ethical level. Because whenever I talk about this, I always say that we need to be non-intrusive, be friendly and stay in the background, just recommending, because there are lots of things we can do with big data. I'll give you one of my favorite examples. My data scientists jokingly say that they can predict when a girl is going to dump her boyfriend, actually, 3 days before she does it. And data scientists perfectly understand how it is done. Because the girl changes her behavior: she starts to wear different makeup, she spends more time at home, she doesn't go to the restaurant, she listens to different music, she does different shopping, she speaks longer with her friends. So, lots of different parameters from the 360-degree life of the customer. And the rest is just a proper mathematical model. So, mathematically, we can predict all this stuff; the challenge is, who are we on this earth to actually do this? So, this is the big challenge. This is a big challenge, actually, for the human race. Where are we going with this, and how can we predict, and recommend, and play a role in this? And only when all 4 of those elements are in play will we be able to properly transform the world. So, the foundation is in place. The second part is about data sharing. Thanks to Aggregion technology, we don't have to worry about this anymore. So, it is done. I don't need to worry about this. I'm sure that with engagements like the one we're having right now, we'll be able to help the industry open its eyes and see that, actually, it's not about you and your small island where you lead, it's all about the customer and delivering meaningful stuff to the customer. I would say that there are positive things, which we are already starting to see.
And in different markets, we see lots of elegant and proper technical solutions starting to change. We see green shoots. We see our project with Alfa as the first green shoot in Russia, where we exchange the data. Your Aggregion protocol and the stuff we are already doing with one of the biggest retailers and with one of the pharmacy companies, we see as more green shoots. More of those green shoots are going to be there. Very soon we're gonna have lots of grass growing. And after this there are going to be trees, and this is how we're going to be changing lives.

Nukri, if companies collaborate on big data and build joint products, do I understand correctly that there should be a data scientist workplace, which should also be secure?

Yulia, this is a very good question. On the one hand, the data scientist needs to be able to work on this joint data, and on the other hand, the data must be protected from leakage. To solve this problem, we have developed Data Lab and Cleanroom technologies, based on Intel SGX. This technology allows for advanced analysis, hypothesis testing and model building on partner data. The data scientist can build a data sample on joint data, create scripts in a virtual machine, and then run them on joint data. Due to the use of Intel SGX, the actual data never changes hands.

Nukri, we have recently launched the latest 3rd Generation Intel® Xeon® Scalable Processors, code name Ice Lake. I have asked Jesse to highlight the changes coming to SGX.

So, we're really excited about Intel SGX coming to the 3rd Generation Intel Xeon Scalable processors, because we've had SGX in the environment since 2015, but it's been on single-socket processors with small enclaves of only a few hundred megabytes. And that's been a great proving ground for the technology and for its security properties. But now, what we're doing with Ice Lake is we're bringing this technology to the mainstream: multi-socket, cloud-scale processors that can really handle large workloads.
And we're doing it with much, much larger enclaves, all the way up to a terabyte. So, that can handle, you know, the big cloud workloads like AI and databases, and really enable this move of sensitive data into the public cloud, and also multiparty compute usages that enable, you know, good analytics and exchanges of data, leveraging the security properties that a trusted execution environment really makes possible.

I'm sure the data scientists will be happy! More memory, more speed - they should really like it! And we see how Intel technologies are evolving for confidential processing of big data, and cloud partners are beginning to support them. For example, the Mail.ru Cloud. Yes, I heard they are actively developing their Platform as a Service and confidential computing will be a part of it. Let’s hear from Ilya.

We are highly focused on PaaS - Platform as a Service. This is when you provide not just the bare infrastructure, virtual machines or S3, but also the services that help solve the end task. Databases, Kubernetes clusters, Big Data, and Confidential computing, for example, are all parts of a PaaS solution, and not just the infrastructure. It’s infrastructure plus software that helps to deal with certain business objectives. This is naturally an area where all the major players, like Amazon, Azure
and others are competing. We are looking at the most popular technologies on the global market, looking at how they are used in Russia, and launching PaaS solutions too. To be an effective cloud provider that reaches many companies, you need to have two offerings. These are a digital offering, which is very technical and provides self-service, and an enterprise offering, which is different in terms of security. You need to provide services, consulting and migrations, and you must provide expertise, such as how to rewrite an application. You need to provide more customized services, since ‘off-the-shelf’ services are not always a good fit, and there are a lot of additional requirements
on how your cloud solution should be set up to meet the requirements of enterprise customers and to demonstrate how you can meet their needs in practice. For example, we launched the private cloud product line that allows the customer to install our platform on their hardware infrastructure, while getting the benefits of our PaaS solutions at the same time. This means they can use our technology in-house if, say, they need to ensure their data stays within their premises. We are now developing this model as quickly as possible, trying to deliver new features for the digital segment and, at the same time, driving our developments horizontally and in depth, creating enterprise-centric services.

Let’s see the interview with Microsoft.

Azure has been one of the lead partners with Intel in bringing the SGX technology to our customers. In April, we announced that Azure will be bringing the 3rd Generation Intel Xeon processors to Azure, enabling much larger scenarios that can run at scale because of the orders-of-magnitude greater capabilities that the 3rd Generation Xeon processors bring,
that customers can use. We are very excited to be partnering with Intel to enable this capability in Azure, since we are building on the existing investments we've made with Intel Coffee Lake processors, which we are now bringing to Intel Ice Lake processors, the code name for the 3rd Generation Xeon processors. In addition to IaaS virtual machines, Azure also offers the Microsoft Azure Attestation service, which developers can use to make sure that the code and the data that they run in Azure are actually the code and data that they expect to be running in Azure. We also offer key management capabilities, such as the Azure Key Vault Managed HSM service, so that customers can be confident that they own the key management lifecycle and that the keys themselves are confidential, because our AKV Managed HSM service is itself built on confidential infrastructure. In addition to virtual machines and key infrastructure services, Azure also provides developer tooling, such as the Open Enclave SDK or the Mystikos library OS, and Azure Kubernetes Service integration, so that containers can be deployed in Azure onto the DCsv2 virtual machines, which run on Intel SGX hardware. We've taken our capabilities further by enabling our platform-as-a-service capabilities to be confidential. We have started with Azure SQL Always Encrypted, which makes sure that your data in Azure SQL stays protected even from the database administrators. We recently released a confidential machine learning inferencing SDK based on the ONNX Runtime, which developers can use to make sure that the inferencing that happens inside these DCsv2 SGX VMs stays
confidential as well. These are all the capabilities that Azure provides to make sure that your workload stays confidential, completely end-to-end, which means you can start from an IaaS-based perspective using one of our DCsv2 VMs, or you can use our developer tooling to make sure that you have complete control over the data protection life cycle, and you can also leverage our PaaS services, such as SQL Always Encrypted, to make sure that even the services that your solution depends upon stay confidential.

And returning to the discussion of big data cooperation, it seems to me that Alexander also raises the topic of how progress in technology is making this cooperation possible.

If we focus on the maturity of technology, I have to say we’ve made a giant step forward. Five years ago, we couldn’t speak about external collaborations: we needed to build a black box, which was very expensive and time-consuming, and rarely economically feasible, because no one had a true business feel for it yet; it was quite difficult to estimate and was a pretty abstract concept. We had a lot of discussions underway, but very few actions were taken in this area. Now everything has changed and we have a platform, Aggregion in particular, to make it possible. And there are technological solutions that allow us to set up this collaboration quickly, cost-effectively and with far greater efficiency.

Yulia, yes, that's right. To add to Alexander's interview, I can tell you that the Aggregion protocol supports a distributed data storage system. Integrating Intel SGX and blockchain with distributed storage made it possible to safely store scripts, models, and calculation results, all protected by confidential computing and smart scripts. This data is transparently encrypted, and only the participant who has the key to this data can access it.

Nukri, you're right, Fabian also talks about the need for ecosystem building and integration.
There clearly are certain risks in the collaboration, right, because when you open up your data to other companies, you need to make sure that you can really trust this partner. At the moment, as we are at the beginning of the journey, I think we're still very much focused on how we extract the most value for us, and typically this is building applications for our own business. In the next step, I do think that it will become very important how we open this up to someone external. So, another partner in the ecosystem needs to be able to integrate, he needs to be able to help us deliver more value to both of our customers as well as to our partner, and this integration needs to be a little bit easier than it currently is. Now, you need to start focusing on what else you might be able to know about your customer that you don't get from your own data, and that's why data collaboration, I think, becomes more and more important. I think in a way we as an industry need to define the ecosystem more from a customer point of view, because I don't think there's room for 5 different Amazons who build their own super powerful ecosystems and, as a result, you will not be able to offer something attractive to your customer if you don't integrate well with others.

Barriers created by some people are to be overcome by others. Marat, for instance, believes that every professional who wants to resolve issues of collaboration between companies should be, at least in part, a technology professional.

Interestingly, most top managers are not “digital natives”. As they say, ‘they are not from around there’ - and yet they lead companies. They are trying to quickly explore and understand the fundamentals, unlike at companies in the new economy, where someone like a programmer can become CEO. I think in the next stage it will become almost a basic skill set. Maybe you don’t write the code, but you understand very clearly how to interact with it, and it becomes a kind of standard.
Why do you need a marketing person who does not have any idea of what digital marketing is? They could be the perfect marketing professional of yesterday, but if they are not really into digital marketing,
not immersed in digital, then it seems like they’re not really up to the task at hand, no matter how creative they might be. Some time ago, I would have called my IT director and said: there’s one specialist here and another one over there, would you please sit down together and solve this task - if it’s a challenging task that has an IT component. Now, I think, this component needs to shift from being something introduced to employees toward being something that is inside each employee. Everyone has their own internal IT Director, which makes up a part of their professional identity.

Yulia, I think that today’s journey with you, and remotely with other great experts from both business and technology, has clearly demonstrated the value of data cooperation. Alfa Group, Bank Otkritie, Beeline and Magnit are all either considering or already working on data collaboration through the Aggregion protocol. Many more are to come. Technological readiness for secure deployments of the Aggregion protocol was also confirmed by a Russian CSP, Mail.ru Group, as well as a global one, Microsoft Azure.

Security is foundational to digital transformation. And rooting security solutions in hardware is the best way to help ensure a trusted foundation. Intel has committed to putting Security First. All the new features and capabilities we are bringing with the recent announcement of Intel’s 3rd Generation Xeon Scalable processors
and the Aggregion protocol on top should help businesses embrace the full potential of trusted multi-party computation. Let’s hope to see the results of such data collaboration
in the near future. Be healthy and stay safe!