ChatGPT in Localization

[Music] If you want to reach a new audience, you should be aware that the right communication is key, and choosing your partner becomes crucial. A project of expansion to international markets requires a long list of tasks and finds an answer in a one-stop destination solution, the only one capable of giving the right answers to the multiple challenges of reaching the desired markets. Save time, gain quality with AP Portugal Tech Language Solutions and its comprehensive range of high-quality linguistic and technological solutions that will deliver the right message to the right audience in any language. Get in touch. [Music]

All right guys, hello. Let's begin; we can dispense with the advertising. I'm very happy to see a lot of people here already in the first minute of the event: more than 500 people in the room. It's been a conference that went beyond our wildest expectations. More than 2,500 people have signed up as of this hour; we never expected this. So we are working together with AP Portugal here, who are very proficient in handling events of this size. It's new for us, and I hope you feel the same excitement as we do at the tremendous success of this event even before it has started. It shows how important, how exciting the topic of ChatGPT and large language models is for the language industry. And suddenly we language professionals are in the limelight and the spotlight. Finally I can
explain what I do to my grandmother, right? Before that I would say "language technology" and she would just shrug. Now she has heard about it, I don't know, maybe on the radio.

I know many of you are having conflicted feelings. There's a feeling of a huge opportunity, of a gold rush, and also anxiety, right? Maybe there's a fear of technology: whether it's going to take away our jobs and make a revolution in the language market, and only the guys who are controlling the AI will be getting all the spoils of war. Today I would like to ask the speakers and everyone involved to focus on the opportunity. I think this is the first industry forum dedicated to ChatGPT, but many will follow; I think the whole year we're going to speak about ChatGPT and large language models. So today I want to focus on how to use it, where to apply it, and how I can benefit from it in a variety of ways.

Now, house rules. This is an event which is probably going to be overbooked: we already have 700 people in the room and the limit is one thousand. So if anyone cannot make it, please join our stream on YouTube or on LinkedIn; there we have the extension, so everyone will be able to join. This is number one. For questions: if you have questions for the panelists, ask them in the Q&A section. The chat is going to be thousands of messages long, so we'll pay attention to the questions which are in the Q&A. The program is very packed, so we might only be able to select a few questions to ask the panelists. Feel free to ask as many as you'd like, though, and there'll be a voting mechanism to promote what is most interesting to the audience. The questions from LinkedIn and from YouTube will be copied into the Q&A section of Zoom for the moderators to see what's going on. It's a big crowd today.

Okay, final thing: the program overview. So what do we have today for you? We will start with a keynote by Marco
Trombetti, a visionary in our industry, and then we have three panels. One panel is from the vendor side: the companies in technology and language services that have already adopted ChatGPT in some way and integrated it into their systems. You know the vendors, they move really, really fast to seize a place under the sun, and already something is going on, something practical, so we can see some of the early developments today. Keep in mind this is all very, very early. It's a technology which came around two or three months ago, and throughout the year everyone will be adopting it in some way or another.

The second panel is what we call research. This will be a panel where we have a few people working on large language models here in Europe, discussing how an alternative to ChatGPT is being built and maybe how you can take part in that process. So bear with us and see how you can join the revolution.

The final panel of the day is the buy side. On the buy side it's a bit too early for the implementation, so we'll have localization leaders from prominent companies and with prominent profiles sharing their early experiences and views on how it can be adopted, how it can go in there and change what people do in the enterprise. After all, this is not an event about technology. It is an event about people using that technology, and how it is going to change our practices, our careers, our jobs, our businesses.

So shoot your questions, let's begin, and for those of you who stay throughout the whole event till the end, there's going to be a surprise guest, hopefully; it's still in the making. We went for a fast approach for this conference, to be the first forum to go into ChatGPT as a dedicated event, so bear with us if some things go in a slightly chaotic way. This is how we like it. With this I adjourn my part, and please welcome on stage, with your smileys and emojis, our visionary of today, Marco Trombetti, entrepreneur and
investor. Marco is the CEO of Translated, one of the most exciting companies in our industry, and he is also a CEO of ModernMT, one of the leading machine translation providers, the leader according to IDC. Marco has been involved with ChatGPT from the beginning in many, many ways, so let's hear about it from his personal experience. Marco, the floor is yours. Excite us, enlighten us, astonish us.

There we go, and you're on. No pressure. Not yet, we can't hear you yet, so let's see what we can do. Can the technical team deal with this? Hold on, Marco, we can't hear you. No sound. You can see that going fast sometimes has its bumps. Google made a mistake in their Bard presentation, and this mistake cost them 100 billion in value; hopefully, Marco, this won't cost you 100 million.

And finally it works, I think. Does it work now?

Yes, we can hear you loud and clear.

Perfect. Okay, do you hear any echo also?

No, it's good enough. We came here for you, not for the echo.

Okay, perfect. So I'll mute, and wave my hands if there is any problem.

Okay, so thank you so much. Today we talk about AI, singularity and machine translation, and obviously about the big topic that Constantine was so good at intercepting: ChatGPT in localization, something that can change some of the things that we do. I don't want to start by talking about technology; I want to start with humanity. And I don't want to do this by myself: I'm asking for the help of a little girl, a very gifted girl, who will remind us why our work is so important. So let's see what Lara has to say to us.

[Video plays: music and partly unintelligible dialogue in several languages, including: "Hello ladies, the special of the day is eggplant parmigiana." "We'll take it." "Okay, sounds good." "Welcome."]

And yes, this video was dedicated to all of you, all the people who work in translation, machine translation,
localization, because of the impact that you're creating on humanity.

To start with the presentation: there are so many people here that I guess some may not know me or Translated yet. It is a language service provider, started in 1999, different from all the other companies because of the symbiosis between human and machine. We've been working on this symbiosis between professional translators and machine translation since 1999, and two years ago we signed what is considered by many the largest contract in the history of translation. We are the makers of MyMemory, Matecat and ModernMT, and we also invented adaptive MT in 2011. So the reason why Constantine wanted me to give this keynote is probably the leadership position that Translated has in machine translation, but also, I think, the balanced view: the fact that I have been involved with language models for a long time, and also my direct and indirect connection with all the work that OpenAI is doing.

To connect you to this part I want to tell you a very brief story, because very few people understand that translation was the inspiration for creating GPT and many other models. The technology and the ideas behind them were all solutions invented for translation. Many of you are looking at what is happening as something distant, and you probably don't see that this is really happening because of the work that has been done in our industry by all of you.

This story started one day in March. On the same day I was meeting two sets of people who were completely unrelated at that time and who represent two different areas of my business: my scientific interest in machine translation, and my background as an investor (I have a venture fund called Pi Campus). In the morning I was meeting with this guy, Łukasz Kaiser. We were discussing machine translation architecture. I was really
working on how to create the next-level neural network for adaptive machine translation, and Łukasz was trying to create a larger model that was able to understand context at a larger scale in machine translation. What happened is that three months after that meeting in March, Łukasz co-invented, together with a few more people, the Transformer. If you don't know what the Transformer is: it was the first model to improve neural machine translation a lot, but also, if you look at the name GPT, generative pre-trained Transformer, it is the technology behind GPT, ChatGPT and all the very large language models that you see out there. Łukasz then became the main mentor behind the school of artificial intelligence that I co-founded. We had worked together before in machine translation. He was at Google at that time; he left Google to go to OpenAI to continue his research.

In the afternoon I was meeting two other people, Michael Seibel and Sam Altman. Because of my investment work I partner a lot with Y Combinator, an accelerator with which we do a lot of co-investment. Michael is the CEO of Y Combinator and Sam Altman is the president. What happened is that shortly after, Michael became the first investor in Translated, the very first, and Sam left Y Combinator to go and work full time at OpenAI, the company that a few years earlier he had founded as a non-profit.

So two worlds that were completely unrelated, the scientific interest in machine translation and the investment community, were all going in one direction. And because all my connections were going there, I paid a lot of attention to what was happening, and I've been paying attention for a long time.

So, translation: why does it matter so much? We've seen it being the inspiration behind some of this great technology, but the true reason is that this is one of the most important problems we have to solve. So,
first, language is the most human thing out there. It's very simple for humans and the most complex task for machines. Language differentiates us from all the other species. Every species has developed motor control, the capacity of going around, but only humans have developed complex language. It is the way we evolved our species: without language we were not able to plan a future and cooperate toward that future. Bees and ants can cooperate on solving short-term problems for survival; through language, we humans were able to predict the future. Language is probably one of the most important elements behind human evolution.

So allowing everyone to understand and to be understood is one of the biggest problems for humanity, and we cannot do it simply by teaching people English, for example, or Esperanto. We have to allow everyone to speak their own language, to preserve their culture, and that would be the ultimate goal that we have in our industry, because if we achieve that goal we can unlock the next level of human evolution. I think that going to Mars is important and climate change is important, but if we don't solve language first, we're not at the level of global cooperation we need in order to achieve those goals.

So what is our progress in that direction? Are we solving the translation problem? And why are we so interested in understanding at what speed we are going there? It is because translation, being the most complex activity in artificial intelligence, is also a great proxy for general artificial intelligence: the more progress we make in machine translation, the sooner this will be applied to the other domains. Translated collected data that is extremely interesting about the time to edit, the amount of time translators spend correcting each word. We have this for the last 15 years in Matecat, and I'm presenting here the data since 2015.
And, as you can see, the time for editing each word goes up and down, but the trend is quite linear and it's going down: machine translation is improving. What is the singularity point? The singularity is when a machine is better than a human at translating. To test this, we gave professional translators 100 exact matches that did not require any edit and measured how much time they need to say "this is perfect", even if they don't apply changes. That is one second per word. So we don't need time to edit to go to zero; we just need it to get to one, and as you can see, this is happening month over month.

Many people say to me when they see these slides: "Marco, yes, we see progress, but we know language is so complex, and it will be difficult for machines to make more progress in the coming months, because in order to improve now, machines need to understand reality. They need to have a great model of the real world in order to do great translations." So we asked ourselves: is that information available? Does the machine have the information it needs in order to translate correctly?

We made a quick experiment. We take a sentence and translate it with machine translation into another language. Then we went into the center of the neural network, the encoder, and trained a model that takes the semantic representation of the model (what the model is thinking, what is in the brain of the machine) and created a text-to-image system in order to see what the machine is thinking. And this is the result: there is a tie, a telephone, a CRT monitor, probably a paper, a dictionary. That information is not there in the text; the level of understanding the machine has of reality is much greater. By looking at this experiment we understood that the model is able to have a great understanding of reality; it is just that we are not able to extract this information yet.

Because this result was so impressive, I wanted to go a little further, so I asked it, obviously: what is a
successful language translator in the future? I wanted to see what the machine is thinking, because maybe the machine knows something more than we do. And boom, this is the answer. I guess that this is not a telephone; this is probably a device that extracts the information directly from the brain, a brain-to-computer interface. The mind of the translator is the entire planet Earth, and he, or she, is dressed as in Star Trek. Incredible.

So, to get to the topic of today: ChatGPT, what it is, and what the opportunities are for us. First, it is an unreasonably good next-word predictor. A little history: the Transformer, as I mentioned before, was invented at Google in 2017, and very recently the CEO of Google said, "We're not going to offer this technology to users because it poses a reputational risk to Google." In the meanwhile, in November 2022, ChatGPT launches and goes from zero to 100 million users in two months, making it the fastest-growing app in the history of the internet. Google, immediately after, launches Google Bard and loses 100 billion dollars in market valuation for a small, small error. ChatGPT made at least ten times more during the last year. What is the moral here, which is extremely interesting for all of you? In startups, the market will remember your successes and forget your failures; in corporates it is just the opposite. This is great news for any of you who would like to get into this business: there is opportunity, and you have an unfair competitive advantage over the monopolies.

But what actually is ChatGPT? I tried this example to explain it to you in simple terms. Language models have existed forever, and they are really about predicting the next word. In 2000, if you were the author of Harry Potter and you were writing a book, it doesn't matter how many chapters you had already written: as you are writing this text, the only thing the machine sees is the last three words, "no finer boy", and you want something that predicts the next word. Can you guess
what the next word is? "No finer boy": it is extremely difficult. And even though it is difficult, this was used to do translation in 2000, and it was not that bad. We waited until 2010 in order to get a five-word context. So now the system still cannot see the earlier content, but it gets two more words: "there was no finer boy". For me it's still hard to predict.

What is the big difference with ChatGPT, GPT and large language models? Now there is a 1,500-word context. Here I'm just showing the last hundred words; there are 1,400 words even before that which the model is able to look into. And now, with all this context, you can say that the word that's missing is "anywhere". This is the difference. There is no secret magic, no secret technology behind it, no tricks. There are no specific components, no human knowledge. A large language model is simply a next-word predictor.

You may ask: why is the system able to answer my question, then? If I ask "How old is Barack Obama?", it is because it predicts the next word: if someone is asking how old Barack Obama is, then the next words are "Barack Obama is" followed by an age. So it is just a prediction. And when I say it is unreasonably good: it was never expected that a language model would be able to start discussing with us and giving us answers by just predicting the next word. That's fascinating.

So how do we train these models? This is a sample of the data from LLaMA, the model that was trained by Facebook, and this data is shared among all the models: 95 percent of the models use the same data. So there is Common Crawl, a snapshot of the internet; C4, which is a filtered version of Common Crawl; GitHub, so code; Wikipedia; books; scientific papers; and more code. This amount of data, as you can see, is about four or five terabytes, roughly a trillion words. It's very big, it's a lot of knowledge, but not so much bigger than what you can actually store on your computer. This is ChatGPT: this data, training on it, and a next-word predictor.

So, to bring this closer
to our world, the world of translation, I can make this other representation that may help you. A basic language model, on the left, is something unconstrained: it generates the next word without any kind of constraint, just the most probable word. It generates very fluent text, but it's not accurate at all. What the model says may be completely false; it is just something probable that exists in the language. At the other end of the spectrum, on the right side, basic machine translation technology is highly constrained: you generate the next word by looking at what is in the source. It needs to represent the source, so it is highly constrained by the source content. This gives you very accurate translations, but sometimes too literal.

The magic that happened in these years is this. ChatGPT went from here to here: it became more accurate compared to GPT-3 and is still fluent, so fluent and quite accurate. On the other side, with adaptive MT we went from literal to accurate and quite fluent. So these two worlds, starting from two different directions, are coming closer. Obviously in our industry, because of the work we do, we have to start from the right side, because translations cannot lie. We cannot just generate things that are fluent but don't represent the original.

This is another way of viewing the comparison. Language models are great at certain things and weak at others if you use them for translation. In translation quality overall they're quite bad: if you do a proper human evaluation you will find a lot of errors, many more errors than you find in Google Translate or Microsoft. And if you look at accuracy, how much trust you can give it, it is actually very low: in general it doesn't make a lot of mistakes, but the mistakes it does make will sometimes be considered very severe errors in our industry. It's also very slow, five seconds to generate a translation, whereas for the use cases in our industry it is acceptable
to have systems that answer in about 300 milliseconds. But there are a few things that they do very well. One is context capability. Think about gender or plurals: the model is not translating sentence by sentence, it works at the document level, so it can handle this extremely efficiently, and it practically never makes mistakes in gender and plurals. The other thing that the model does extremely well is rephrasing, something that you cannot do today with MT. You want to say "translate this in a very formal way": the model is great at this. There are very few use cases for this now, but it can be useful in the future. And one great advantage is noisy source. If you have content with a lot of errors in the source, large language models are better at translating that kind of content, because they can easily correct the errors, because they have a better representation of reality, and because they take some risk. To be more fluent you take some risk, and when the source is ambiguous it's okay to take risks.

So there are many things to do, and I predict that language models will work on solving those problems, and that adaptive MT and static MT will also work on implementing more context and these capabilities. The future looks extremely interesting, and we will probably have multiple solutions to choose from, coming from these two different angles.

If you think that ChatGPT is the only solution out there, what I want to tell you is that you have multiple options that you can use. The first I want to mention is BLOOM, by a consortium called BigScience. They trained a GPT-3-style model and released it completely open source, free to download, with the model ready and pre-trained. The license is very permissive; you can do commercial applications, whatever you want. The quality is quite good. It's not as good as the other models that we'll mention in a second, but it is a great solution to start with. There is Megatron from Nvidia, there is Chinchilla, and LLaMA. LLaMA is probably my favorite
one because it outperformed GPT-3 in most benchmarks. It was released by Facebook last week, and Facebook was so kind as to release it for free download. The license does not allow commercial applications, but it's completely free. So there are great alternatives out there. Don't think that this is a monopoly; this is something where you have multiple options if you want to start using these technologies.

So how are they going to affect our industry? Translation exists when it's more convenient than creating the content from scratch. We exist because we're convenient: fast, accurate and low cost compared to rewriting the content from the beginning. If you understand this, you understand what kind of things might change in the future.

One important thing that may happen is that people will start using these tools to generate content directly in the target language, and then, instead of hiring a translator, they will employ a copywriter to fix the content. And it is very probable that this approach will create better-quality content than a translation: it will be less literal, more adapted, more fluent, as a copywriter can make it. So this is a situation where, if you're doing translation, you're probably losing money, and if you're doing copywriting, the opportunity is bigger for you.

The second thing that could happen is content restructuring, which is super interesting and, to me, probably the best application of large language models right now: summarization. Let's think about this. You take 1,000 reviews of a product on Amazon and you create new content that describes what the users think about this product in a human-readable way. This is new content that will require new translation, and because it's so easy to create, I guess that there will be many applications and much more content created by restructuring existing content.

The last one is that in general there will be more content. So nobody is
using language models to create content per se; they're using these tools to help humans create better content at a lower cost, faster. And if you remember my previous slide, translation exists because it's more convenient than creating the content from scratch, so we need to adapt. There will be a lot more machine translation and a lot more human translation with AI, so we need to design new solutions that are faster, cheaper and better quality. If you look at the graph of the singularity in AI, I think this is possible, because the technology is also helping us to do that, and we have to stay in line. There is a big opportunity there as well.

Now, outside our industry: why do the smartest people in the world love ChatGPT? First, because everyone thinks that it will improve very quickly. The big change between GPT-3 and ChatGPT was a model called InstructGPT that was trained on 33,000 human examples. Only 33,000 human examples: work that is, let's say, an investment of 100,000 dollars. Compared to a training cost of 10 million and a research cost of 100 million, it is nothing. So this is the big opportunity: with a small amount of data we made a huge increase in quality. So everybody is thinking that trust and freshness can actually be solved in the future.

Second, people want it. 100 million users in two months: that's a world record. So there is a market, and the market is extremely big. This is not a better way of doing search; the people who work on language models believe that this is a completely new system that is attacking the use cases behind search in a different way.

And the last thing is that it is an open competition. There are few barriers to entry. The data is free and available for everyone. The architectures are free and available for everyone. The barrier is the training cost. They started with 10 to 15 million to train GPT-3. For last week's model from Facebook I made some calculations, and I think it was 1.5 million dollars to train it. That's basically ten times less, and I think we can go very quickly to
numbers that will make it accessible to anyone. So it is a very interesting space; it's not a space for one player.

But the big, big problem we have with generative systems, I think, is that they are breaking what I call the internet deal. Why should I create content if I don't get visitors? The fundamentals of the entire internet until now are that you create content so that you get indexed by search engines and people can find you. So there is a clear benefit, an incentive, in creating content: you get visitors, then you sell advertising, you sell services, you sell products. Language models are breaking the deal, because if I search in the language model I will get the answer there, from the combined knowledge of everyone else. So this is breaking the deal.

So what could happen? My first personal reaction was: you know something, I will ban OpenAI from crawling my website. I don't want my content to be taken if they do not send me users. This lasted for me about a day. Then I asked myself: how can I influence that? If I change my robots.txt files so that they cannot index our website, am I really stopping them? No. So I took a completely different approach and I said: you know something, I will generate a massive amount of content on our website, tens of millions of pages explaining why Translated is the best translation company in the world, so that it can influence the model. If you influence the model, then basically you will get the model to say what you want. Obviously OpenAI and many others are fighting back: they are creating AI text classifiers because they want to identify content that is created automatically. And very honestly, I don't want to pollute the internet with massive amounts of low-quality content. I think that people may do things like this, but then there will be convergence. And convergence is about two things, basically. The first is LLMs driving some traffic, and you can test this on
perplexity.ai. It is a search engine: you ask questions as in ChatGPT, but in addition to the answer it will also give you the references where this information is coming from. This is called a reverse model, an inverse model: from the text you predict the URL that generated that information. It will drive some traffic.

But the other, most interesting thing is that people will start opening up their websites through APIs, so that language models can actually interact with your website and buy things and services. So I want Translated to be able to offer human translation services through OpenAI; I will create an integration. Let's call it the language store: the language app store is a new level of app store where we will all publish a way for language models to interact with us. So I want the new iPhone: I just ask for it. I want to find an Airbnb within 200 kilometers, on the beach, with four bedrooms: it is a much better way of searching than just using a location-based search in a human UI. So there is enormous opportunity, and I think that if language models succeed and progress as they're doing, this could be a nice new organization of the internet.

But we need to move fast. If you think that language is the new paradigm, that we're searching through language and so on, I want you to think about this: you need to act fast, because large language models will also soon be replaced in some use cases. Language is very compressive, but it's very low bandwidth. It's convenient for humans, but it's not good for human-computer interaction, and brain-computer interfaces can replace it in some use cases. You may think: "Marco, you're talking about the future, who cares about that?" But I want to tell you something. Elon Musk resigned from OpenAI and founded Neuralink, a brain-computer interface company. So there are people who think that this may happen even sooner. So we need to act now. The opportunity is now; it is
going to last some time, but not an infinite amount of time.

So, to conclude. Translation is by far the hardest problem in AI, but it was the main source of inspiration for all the work that you're seeing in large language models. For the first time a non-big-tech company is leading machine translation, and this is not about Translated: as you saw in the IDC report, it is a success for the entire industry. We're getting back a piece of the market that we should own. OpenAI reminded us that new ideas have an unfair competitive advantage. They are replacing some big tech, they're winning users every single day, and it was a startup with a new idea. And AI and language are what everyone is talking about today. Everyone. So you are in the right place at the right time. All the people who work in machine translation have these two things together, AI and language. You are in the right place at the right time, so if you want to have an impact on the future of humanity, you absolutely can. At Translated we say, "We believe in humans", and what I want to tell you personally is that not only do I believe in humans, I believe in you. I believe in the capacity that this community has to invent new solutions to allow everyone to understand and be understood. And if we solve this problem, it will be the most important achievement of our industry. Thank you.

So many emojis were streaming across the screen when you said "I believe in you". That was electrifying. It was really insightful to find out that ChatGPT is not an omnipotent deity that knows everything about the meaning of life, the universe and everything, but actually a word predictor, and that there are people working actively not only on integrating it but also on fooling it in many ways, and that you can write your history in any way you want if you have computer skills. That was fantastic, thank you so much. I understand we are a little bit behind time. We don't have the opportunity to ask you a thousand questions, but if there is sufficient
interest, would you be open to holding a follow-up session, an "ask me anything" with Marco, sometime later this week or maybe next week?

Konstantin, whatever you ask, I'll do.

Okay, I'll take you up on your promise. So send us some questions for Marco, and we'll send you a date when we can follow up. Marco, again, thank you so much. And with this I bid you... not adieu for the moment, because I'm sure you want to see what others have to say. And I would like to welcome on stage Manuel Herranz and the panel from the vendor side, the companies that have already adopted ChatGPT in some way. Let's see what's cooking. Bye.

Hello, everyone. I think my video is... my video, okay, start my video. Here we are now.

It's wonderful, perfect. Hey, Manuel, I can hear you, we can see you.

Good, good, good.

And it's a much better Steve Jobs look, with the turtleneck.

Yeah, different shirts. I'm in Paris, so it's a little bit cold; you need this kind of clothing.

All right, so the floor is yours. Please introduce the panel, and we'll have of course some examples of how ChatGPT has already been put into different solutions. Learn from the other vendor companies and see what you can do in your business or in your organization.

Indeed. A very enlightening talk by Marco, and many questions were coming in already, which I will share with the panel. We have three panelists here, people that I personally admire a lot, and I'm going to present all three of them. There is Olga from Smartling, and there is Frederik, with whom we had the pleasure of having a little talk the other day; the work that they're doing is very enlightening. And I'm going to ask the panelists to introduce themselves, to give a short introduction to who they are. You can start with opening statements, share slides as Marco has done, and then I will pose some questions, some challenging questions, to you, and then also questions
from the audience. But I don't want to take any more of your time. I'm going to introduce Olga first, if you are here.

Hello. So the floor is mine?

The floor is yours.

Okay, wonderful. So, I'm with Smartling now, running machine translation and AI vision for the company. I've been in the industry... you know those intros that you send to conferences? I pretty much reuse the same intros; at most, the number of years is just increasing and increasing. So the latest numbers are close to three decades, close to three decades in language technology and in the machine translation and natural language processing space. So why am I here, why am I so excited about the panel? Things change so insanely fast. In the past it was like, oh cool, we have Levenshtein distance, we have fuzzy matching, cool, here is translation memory, right? And machine translation was evolving at a certain pace. Now you fall asleep, you wake up, and a new LLM was born, and then the question is what you do with it. And we at Smartling, since I'm representing the vendor side of the panel, always keep that balance of test, validate, deploy, rather than just following the hype and trying to boil the ocean. So, to go into practical examples: first things first, I dare to disagree with Marco a little bit. In August 2022 Gartner published a report saying that 85 percent of AI initiatives fail. If so, I would argue that natural language processing tasks are probably the most successful space for AI: while it's challenging, language is structured and there is a plethora of data. So I think our playground, and this is where I agree with Marco, is probably the hugest playground for large language models. So, back to practicalities. First things first, maybe I'm a little bit itchy about this: yes, we talk about ChatGPT, but why only ChatGPT? First of all, and this is what we do at Smartling, we pay attention to different large language models,
possible applications of them, and equally to different flavors of GPT. And it's very important: even between flavors of Davinci, you get better results for certain tasks, and for certain tasks you get lower performance. In terms of deployment, we started with BERT years ago, and then with RoBERTa, which we were able to train with our own performance data, so that's one of the applications we started with. And again, and I'll repeat it again, our vision is that, just like we have selection mechanisms to select the best machine translation output, it's most important to capitalize on what's available and not stay myopic on one application. Obviously ChatGPT is the fastest-growing application, but let's make sure we pay attention to others. Now, Manuel, I can speak a little bit more about practicalities or I can leave it to questions.

We need to stop here and then introduce... then have an open discussion. Wonderful. So the next victim on the panel is Diego Cresceri, a person that I admire a lot, who has a very interesting, up-and-coming LSP in Italy. So, Diego, the floor is yours.

Thank you, Manuel. Hope you will be fine. Yeah, my name is Diego Cresceri. I own Creative Words, an LSP based in Italy. We are a small company, still growing fast. The reason why I'm here is because Konstantin loves me, that's probably the biggest reason, and we've been testing and using ChatGPT in different scenarios for our own company. So I guess I will be able to share the point of view of a small company, a small LSP. I'm not using it for translation, honestly, at this point in time, but I think there are many other ways we can leverage this technology in our company, so this is what I'm here for. Happy to be here, thank you everybody. It's an amazing number of people here. Thank you for the comments on the glasses, I was expecting that. So, the floor to Frederik, I guess, and
happy to answer any questions.

That's right. And the third person is Frederik, from EasyTranslate, a company that has recently received some funding in order to implement and augment the capabilities of ChatGPT in their offering. So we're eager to hear more about you, Frederik.

Yeah, hi. My name is Frederik Pedersen and I'm the co-founder of EasyTranslate. We have existed for 13 years, but in the past three years we have basically been pivoting the whole business model, going into a hybrid of translation management software and a marketplace where we facilitate the connection to freelancers, freelance translators and copywriters. And in this case we are not actually charging anything for the actual translations; we are only providing the payments and the calculations directly between the freelancers and the customers. I was told to also do a short demo of our current implementation of ChatGPT, of OpenAI. I don't know if I should just jump straight to that now, or when would be the perfect point.

I think, with everybody being in the panel, this is the time. I'd like, before we do that... oh no, let's go for it, because then the questions will come later. I want to talk about emerging use cases and examples of ChatGPT being integrated, so it's wonderful, it's great that you have that already. Perfect.

Okay, so let me know if you are able to see my screen. So basically here we are in our software, and what we've done is create the content generation tool here, where you have some predefined prompts that you can give the product. You can also use the advanced settings here to set the creativity and to control the maximum length, among others. So if you just use the current one we have here, it will straight away start to write a blog post about the importance of localization. But what we have learned from this, and we have only had it live for about a month and we have already been
generating more than a million words, is that this is pretty much just a party trick. It seems super nice, but how can we actually implement this into an enterprise content creation flow? So what we're building now is... basically we are looking at releasing this prototype, which I will disclose for this little crew of a thousand people, where you are actually able to build templates to create hundreds of content samples, for products or for category texts or others. How you build this is basically by using a content reference, where you say "this is something similar to the text I'd like to have generated, I would like to define the main subjects, I would also like to include keywords for SEO optimization," among others. When you have set that brief, you're able to save the template, and thereafter you can go back to plain text, you generate the content, and it will be similar to the content you gave as a reference. What we think is super important here is that this is always a foundation for the content, and what we have learned from our customers is that actually sourcing it to freelancers is an essential part of the workflow. So we also use this as a general content brief, so all the details here are included for the copywriters to start directly on improving the content. But about improving the content, it's important to learn from it, which is why we have also released a fine-tuning model where we fine-tune at customer level on the changes that are made, so that the content is getting more and more refined towards the audience group of the specific company. So what we have seen is that this is working in a lot of different languages: we have tested on Danish, we have customers using it in Finnish. And where we really see the big opportunity here is actually being able to create content for your local markets from scratch, and we also
believe that this might be a competitor to actually translating the content, because you're basically starting the opposite way. Also, as Marco mentioned, there is of course some content that by nature will always have some kind of source, but at least for marketers and for product descriptions and others, it would make a lot of sense to create the content from scratch with these AI models.

Very interesting, very interesting, Frederik. So now I'm going to open the table to the three of you, with some questions that we had ready and then some questions specifically for each one of you. We've touched upon some emerging use cases. A very good use case, Frederik, that you have there; it goes quite a bit in line with what Marco was mentioning, not so much translation but content generation and pre-drafting. So what impact do you think ChatGPT can have on business relationships in the translation industry? Is there a role, for example, for a prompt engineer, for someone who can ask the software the right questions? Who's taking this one?

I guess that... how we approach it is that the templates in our universe are basically the prompts, and there are other big players that have done a lot of research into prompts and how to generate the best possible outputs. I think this will be some of the secret ingredients for a lot of companies: how are you actually making sure that, in each customer-specific case, you generate the most comprehensive content? So that is at least my comment on that side.

Good, good, excellent. Let me move back to Olga, you've been quiet for a while. Olga, anything to mention on that front?

Yeah, I think first of all, what it's going to do to business engagement, right? As much as we love our industry, translation is still, not an afterthought, but it's a
second step, right? I think now the opportunity for us, be it LSPs or technology providers, is to come in as content owners and end-to-end content partners, covering everything from source generation, as Frederik just said, to source pre-optimization for further translation, to target generation where translation is not applicable, all the way to validation, right? As Marco rightfully said, reliability and trust are yet to be gained. So now you actually come in as a full-scale content partner, and I think that's the huge business changer, and also a pricing model changer: you don't price per word for being a content partner. So in terms of business models, that's what I want to say, and maybe later when we get to jobs I also have a few things to say about what I think the next generation of jobs related to deployment of language models is going to be.

Indeed. And Diego, what about you? Because you mentioned you're a small LSP: do you see yourself competing with marketing agencies? Could that be possible?

Well, if you think of the name of my company, that was my dream from the start, actually, to compete with marketing agencies. And yeah, I do see myself working on content generation on top of translation, absolutely. We've been doing that for a long time already, be it product descriptions or marketing content, emails, similar campaigns. With this tool I guess we can leverage it and do more of it. I want to mention that I think technology needs to solve a problem, and the reason why I started to use ChatGPT was to solve the marketing and content production problem for my own company. So this is what I'm doing at the moment. I'm using it not for the customers... well, I do use it for the customers also now, but I started to use it for my own marketing needs. We do produce a lot of content; not every time do we use English mother-tongue, native-language copywriters, and for small content [unclear]. So the product we are building with my
other company, ktvi, is for helping small community businesses to be there, to be online, to produce this content at the pace that is needed nowadays.

Yeah, funny you mention that, because I was in a meeting here in Paris earlier this morning, speaking to a very large think tank, and they said: we're going to revamp our whole website, we're going to improve our titles and our descriptions and meta descriptions, and everything is going to be made more efficient with what we've tried with ChatGPT over the last month. Okay, so let me move into some of the questions that we have here. There are a lot of questions from the audience. Many won't be able to be answered live. There are people already complaining that some questions have been marked as answered when they are not. Many are for Marco, so I'm afraid we can't answer for him. But Olga, back to you: what kind of implementation of ChatGPT do you think there will be in translation management platforms and, subsequently, in the enterprise-scale translation process?

Sorry, just so I don't forget anything: I admit I have a little cheat sheet on the side. I have a lot of thoughts, so I had to keep myself organized. So I think, again, just like with any translation process, you need to design your translation strategy, if you're a large enterprise, in collaboration with your translation management platform or LSP, or both partners. And I think it's all about not breaking what works, but identifying opportunities where you can actually optimize, streamline and automate. So when it applies directly to ChatGPT, I think the touch points are pretty clear, at least from what I know about the enterprise translation process. As we already spoke about, possibly source generation, so on the authoring side, right? But then your source, be it source generation or be it source... we
already know what controlled authoring is, so I guess this is the new face of controlled authoring, or whatever the right word is. Then obviously it all needs to connect, so your source repository still needs to connect to the TMS. And then what can happen on the TMS front? Again, source pre-edit, right? So that's one touch point. Then from there you go to machine translation; I think we all would probably agree that ChatGPT is not quite there for machine translation. So from there you go into machine translation, and then there is this whole area of APE, automated post-editing. And again, we touched on it briefly: there are tons of things that you could do with, be it ChatGPT or other flavors of GPT or other large language models. Okay, machine translation output is still a little bit subpar, right, we're not there with human parity, but because of the fluency there are tons of things, such as improving tone of voice, correcting grammar and a lot of other things, that you can do. So I think that's another touch point, possibly. And then there is the whole LQE: we invest a ton of money into language quality estimation and language quality evaluation, and I think the capabilities, the actual possibilities of utilizing large language models, ChatGPT included, for validating the quality of the final output, that's another one. So I think I've identified at least five.

Good, excellent, a very good answer, Olga. And I see that you agree with Marco on the fact that adaptive or customized machine translation can beat ChatGPT, so people will be building hybrid solutions here. Very good.

Just maybe to add, I think the key takeaway is that there is not one technology that wins. I think it's basically about how you combine the technologies to achieve your goal. And I very much believe that we should, when working with
enterprise clients, not only listen, of course, to how the workflows and processes are, but also actually support them in improving workflows, because I think that is some of the most interesting things when it comes to specific company use cases: how much you can actually benefit from supporting the whole content life cycle. And I think that is what we have really been trying to do, to go in this direction, because now we are not just focusing on the translation part. We are focusing on saying: okay, if you are prototyping in Figma, how could you use a content generation tool to generate the content, mirror it into your localization flow, use machine translation, custom machine translation, all these kinds of aspects? And there are always all these different MT solutions out there, and it's never a one-stop-shop solution. So maybe Google's custom machine translation solution would work very well for X, Y, Z, and then in other cases it would be DeepL or others. So we very much believe in supporting a broader area, to support all the use cases.

Good points, Frederik. In fact you were my next in line for questions, and I have one here. But I'm receiving tons and tons of questions: about 26 are Marco's, and some of them were about sound and video, but I don't think we'll have time to run through them all. About 16.
Some of them you have already touched on, and the issue is privacy, a lot about privacy, which perhaps we can touch upon later. But Frederik, do you think GPT will be able to learn my company's tone of voice, for example, or become better at generating content specifically for my audience?

Yeah, so, back to the demo of the current version of the OpenAI, ChatGPT integration in our software: it was incredible how quick the feedback was from our customers. I mean, "it's good, but for us to use it in an enterprise setup it would actually need three things." It would need to have templates, so that they can structure and make sure that every time they generate content it is in the same layout, including maybe specific keywords and others. It needs to learn from the edits they make, or else it would be too time-consuming. And the third thing is combining it with a marketplace of freelance copywriters who can take on the task, either when there are bottlenecks or maybe if they want to create copywriting content for new markets. So what we've done here is basically test how we could train on specific customers' changes. Basically how it goes is that you write a prompt, you get an output, and then you have a final output, which is the one where there have been human modifications, and it's incredible how quickly it actually learns from those outputs. And also, I think Marco pointed out that the number of prompts that they trained ChatGPT with is actually not that large, and I think that I have created at least 1,000 prompts in our own service. But yeah, it's actually pretty significant how much you can gain from structuring it into workflows and gathering the data for the customers to benefit from in the actual workflows. So yes, I think that
this would definitely be possible to some extent. And especially with the current model available, it has been trained, I think, up to 2021, so there is some data that is just incorrect. So here there is a big need for human fine-tuning, I would say.

Yeah, I think that this answers some of the questions from the Q&A about customization, or fine-tuning the model. So what do you think it will take for ChatGPT to become relevant in an enterprise content flow?

Sorry. So what we have done is we have basically made a beta version, where we have selected some customers that are now participating in the beta program, to be able to put numbers on what it actually takes to bring it from the more generic version you have now to actually, if you will, understanding the company's tone of voice, and also being able to better streamline the communication towards the audience. And I think that there will be a big impact from which language you are doing this for. So I'm pretty confident English will go pretty fine. If you go to Danish or Finnish, I think it will be less improved. But time will tell.

Okay, now, Diego, back to you. That was very interesting, Frederik. It also raises how well or badly ChatGPT is doing in certain languages. I do have feedback from users in other languages that are not English, not large languages, not French and Spanish, that the information is not that great. So in your case, Diego, what can a company like yours, like Creative Words, do to leverage the power of ChatGPT, both inward and customer-facing?

Yes, so let's talk about customer-facing. The last experiment we did was product descriptions for a shoe company, and it was actually great. It was actually a mixture of, let's call it, technology, so putting things together, putting keywords together, shoe features and so on, and
ChatGPT, and it was actually great, and that was for Italian and for English. This is the case I presented... well, I've been experimenting without telling the customer for the moment, but I will soon: next week we have a meeting, and I want to present this. And it's amazing, I mean, the level of creativity that it's showing goes far beyond what a human can produce. This is what I'm experiencing at the moment. In terms of inward-facing, for our marketing, I have to say that English was great compared to non-native English copywriting. For Italian I don't like it very much for the time being, but the truth is we've not been training it for the time being, so I guess it will become better with time and with training. But it's a big time saver for us already, even without spending a lot of effort on it.

Yeah, well, we know that version four, to be released next year... I talked about that in my vlog last week... is 500 times more powerful. I mean, this one has 175 billion parameters and the next one is in the trillions, 100 trillion. It's a number that escapes my mind. Well, ladies and gentlemen, if we have completed our own questions and explanations and demos, we can move to the part that I like most in round tables, which is live Q&A. And there are tons and tons of questions; I have to apologize to the people who are going to be left out with their interesting questions. I've been browsing through them as we spoke, and a common theme that I'm finding in many of them is privacy, the concern about privacy. And I think Marco touched upon this as well: you know, I don't want OpenAI to crawl my websites, but on second thought maybe I do, because I want to influence the way AI is built. But the other type of privacy issue, from the questions that I see here, is: my personal data is flying around, my clients' details, product names of stuff that doesn't
exist yet, that's going to be launched in the Christmas campaign, are going to be released. What are your thoughts on that?

Can I say something that is not really very popular here? I mean, with machine translation we are concerned about putting the client's data on DeepL or whatever, for content that will be published the next day. I mean, what do we think these companies are doing with our data? What's the actual risk here? That's a question, it's not an answer, but are we not worrying too much about privacy for some scenarios? I can mute myself.

I would have something to say about that, but I'm just a moderator. I will let Olga and Frederik...

I think as far as machine translation goes, Diego, to some extent we have solved it through the service agreements, right? If you pay for your API key, then certain things just do not happen, right, and your data is basically passed through; it's not used for anything else, or at least that's the easiest message to deliver, right? So I think that's part of it, and at some point we'll talk about SLAs and we'll talk about compliance, just like we already went through this era with machine translation, and I think we're at a pretty healthy place there. So I think the same cycle just needs to repeat itself. And you know, there is always this balance: do you want to drive innovation and contribute, or do you want to hold on to your data? So again, I've lived it a couple of times, so I think it already happened. But Manuel, I have a strong suspicion that we would also want to talk about anonymization, right, and how we can handle named entities, do we not?

Maybe, you know, that's my pet subject, but I'm the moderator here and I'm trying to avoid it.

Right, but actually, just to add to this, I think that ChatGPT is a great way of
controlling how you process named entities and what you do with them, and how much you keep and how much you filter out and how much you anonymize. And I'll stop here.

I mean, from our side, we take data processing as one of the most important parts of our business; that's also why we got ISO 27001 in December. But when it comes to content generation, it's about creating non-existing content. So, I mean, if you're planning to dismiss a colleague, I'm not saying you should put the social security numbers in there. So I think there are of course some elements where some care needs to be taken, also on the customer side, and there is probably an educational aspect from us as vendors to the customers. But I think that, comparing generating content with machine translation and others, it's probably not as problematic as running all the other systems where there's maybe sensitive data being processed and it's not always possible to anonymize it. And as you also mentioned, Olga, there are some great opportunities using the new OpenAI solutions to do so. So yeah, that was a big answer, I know.

I'm sorry, can I interrupt just for one second? I have a short announcement for our audience. We have an upvote function here in the Q&A chat. We have 30 questions and very limited time, so please upvote the questions you like the most and we will take them first. Okay, thank you, sorry.

Yeah, sure. So I'm moving through the questions, and there's a very good one from Rafael, about plagiarism. Do you think any copyright infringement issues can arise from the use of ChatGPT in content that was co-created or pseudo-created with ChatGPT?

I think that, if you have tried... they have this DALL-E feature, and I guess that was also what Marco used to explain, kind of
like, the thoughts of the engine. But here you can ask it to make something in Picasso's style, and it basically gets this blurry area where the signature should have been. So I think that it could potentially create some issues for the providers of the solution over time, I would say.

What also happened, probably, is all of this in the visual space, right? There was a conversation, when the whole AI art thing popped up, and the question was what you do with attribution, and how you compensate the artists whose art was actually used to produce AI art. I don't know where it landed, but I know that this conversation was definitely taking place in the visual arts area specifically, especially when social media got completely taken over by AI art. So I think it's a very, very fair question what to do with attribution, right, because it's trained on something, and this data was created by someone.

So let me ask you another one, this is a good one: will ChatGPT complement or replace existing CAT tools? I think this points to the workflows as well.

I would say complement. I don't see it replacing CAT tools. I mean, I see it replacing maybe the kind of service we provide, but not the CAT tools as a support to translation.

I think it will probably seriously modify the user experience, right? Because we're working in the one-to-one translation unit space, and suddenly we're in the generation, right, copy pre-creation space. So does the CAT tool as we know it address that? Probably not, and that's something we're already thinking about seriously at Smartling: how do you tackle all the new content scenarios, and can you even do it within what exists? It's similar to what the question was about CAT tools and transcreation, right?

Yeah, and I also think the CAT tool is probably a pretty broad term. So if it is the old-school editor with the bi-column layout, I think that will, at some point... I mean, I think for
instance Smartling has done a great job, being one of the first to have in-context. I very much believe that we are moving towards generating and translating everything in context, because here you get way better quality output in the end, because you have the whole context of the content: is it a button, and what is next to the button, what are the pictures, what is the whole storyline of whatever you're translating or creating content for. That's it.

Yeah, there's another question here, quite interesting, from Mohammed: will ChatGPT be able to change the normal, traditional translation cycle, or will it have any added value beyond MTPE?

I mean, from my point of view, I think it might replace MTPE.

Wow.

Yeah, because either you do MTPE or you create the content from the ground up. And again, you cannot just apply this in any use case, but I guess in some use cases, if you're doing a LinkedIn post in several languages, why not generate the content for several languages and modify it, rather than generating it and translating it again?

That's related to a very interesting question from Serenella: what's the risk from a content creativity point of view? Because if we're all using ChatGPT, is there not a risk that we will write similar content, or that different companies will write similar content?

I've not tested it enough, sorry, by the way. What I saw is that when I gave the same prompt over and over again, it would generate different content, so I don't see that risk. I mean, it's much more creative than a human can be with the same prompt, is what I experienced.

I very much... but you said that is where the human superpower comes in, because I don't believe that this will replace humans. I think that this will basically create a better foundation to optimize your content even more towards your audience, because you have removed
kind of a lot of the, I mean, I think a lot of us have been sitting with the blank piece of paper and trying to get started on a blog post or whatever. Here you can kind of get some kind of structure, you get some information that you could use or not use, and I think that is where it really comes into play. If I may add to this: I was talking to a colleague of mine and she came up with this great concept. I was talking about prompt engineering as the way to retain creativity, and she said no, it's not prompt engineering, it's actually the new job of content engineer. So it's not content creation, it's not content translation, it's actual content engineering, and I think this is where we're going, right: engineering the content for cultural relevance. Fluency will be addressed for us, but how to engineer the content life cycle to be most culturally relevant and actually also inclusive, because with that now you can actually tailor it to areas, geos, groups that you could not tailor to before with more general content. Yeah, in fact there's a question here about cultural adaptation using ChatGPT: did you feel it's good? Concepts can be transferred, but does ChatGPT do a good job transferring concepts from one language to another? Somebody is mentioning Squid Game, for example. I mean, indeed it has been trained on content, for instance Spanish content, so at least from my side, I don't know if you disagree, but I think that you will be way more localized creating the content from scratch in the Spanish market, for instance, than translating it with machine translation. What I've observed is that, because the data is predominantly English, it does great linguistically but it does not do that great with cultural phenomena. So you can get great output, but culture, like, it insists, for instance, to me that a Russian song was written by a band that it was not written
by, and there is nothing I can do about it. So, Frederik, to your point: depending on the data. Yeah, and also I'm really excited to see GPT-4, which has been trained on way more, and hopefully it will succeed on this point here. But yeah, let's see. Okay, some more questions. About two minutes remaining, and you have a few questions with more than 20 votes in the chat. I know, I have tons of questions to choose from. One from Giovanna: considering the knowledge date of ChatGPT, how do we deal with potentially misleading information that changed in recent years or months? If ChatGPT is to be used as a paid service, should we expect this data set to be updated more regularly? Well, yes, I think so, we should expect that, absolutely. Yeah, but fact-checking is key, right? I mean, I wouldn't trust it to start with. That's the human in the loop, right, fact-checking being one of the new roles. Yeah, I've heard about the paid services as well. We have a question here from Constantine, no less: what are the top five ways LSPs and small businesses can make business around ChatGPT? Not on top, but around. Well, adding copywriting to the stack of services that we offer, that would be my first. This is what they're doing. Okay, from Stephen: so far the only use case I'm hearing about is content creation. Translators don't create content as much as we transform it. Okay, to a point, but how can we use ChatGPT to transform content instead of creating ads? I think it's all about our idea of quality. If we don't expect one-to-one translation but we are comfortable with transposing source concepts to the target, I think that's where ChatGPT would come into play, probably still with a bit of fact-checking until the models are updated. How about adding an additional layer of proofreading, done when we talk about transcreation and we want to be more creative in the output? Yeah, I also think it could
also come up with suggestions for other phrasings and similar things, and in QA, and the gender issues that we have with MT, doing some post-processing there with OpenAI or ChatGPT. I think that there are a million ways that you can kind of build it in and connect the dots. Yeah, I believe Dominica was also commenting that he uses it for glossary checking as well. Yeah, and for building glossaries it's actually really good, also if you want to do SEO terms: you can actually prompt it to give you the most searched terms and stuff like this. So it's really good at creating a foundation in this manner. A question for you: how do you ensure that independently generated content for different languages will be the same? Yeah, I also saw that question, actually, and I'm not 100 percent sure I understand it correctly, but I'll just try to answer. I guess that probably the point is that it shouldn't be the same. I mean, I think that what you probably would like to achieve is that you would like to write something about a specific topic or whatever, and you want it tailored to your market and to your audience in this market, and then be more flexible on the creativity. If you want to have a one-to-one of your English text in German, for instance, then I would go with translation, of course. Okay. Diego, one question: do you think freelance translators or linguists will disappear as a consequence of using large language models? From a real Baldwin. Right, it's yet another technology to help freelance translators and the entire industry. I don't think they're going to disappear. Absolutely. Yeah, a question from Stephen Holmes. Good. When do the panelists feel it would be cost-effective to generate custom large language models for commercial gain? Obviously building a model, if the cost is a million and a half or 12 million, it's not going to be within the
reach of many LSPs. Would it be... I mean, do you feel like competing with OpenAI, Google, Meta, or will you use whatever is out there and build it into your workflows? Okay, and back to customization, right: if something has already been created, I'd much rather focus on what you can do to customize it and to tune it to your particular needs, rather than try to recreate it and train it from scratch. Well, we unfortunately need to wrap up in the interest of time. If you'd like to recap, maybe, or summarize the most important takeaway from your panel? Yeah, I'm sorry we could only take care of so many questions. I think there are still about 50 questions open; we could be here until midnight and still be discussing the possibilities. My takeaways from the panel and from Marco's keynote are that the game has changed. Initially, the game has changed in the way we search. I like his point about traffic being diverted from search engines into more private dialogues with ChatGPT or with a large language model; not AI, not a general AI. From the panel, I take the fact that ChatGPT is a great tool for marketing, that savvy companies are already applying it, that it can transform the way content is delivered, or the content workflow, so the way we gather content from our clients and the way we deliver it, with machine translation being included as part of the offering. From ChatGPT it could be a different machine translation. I don't think freelance work or linguists will disappear. It's going to change; there will be some adaptation there. And I think I would leave it there. I don't know whether you have any more to add, Olga, Fred, Diego? Thanks for having us. Yeah, exactly. Exciting. Great questions. Amazing. We should do a follow-up, of course. Yeah, you can count on this. Thank you very much. Now we're going to switch to the next topic on the agenda, and before we jump into
the research panel, which talks about creating the next European large language model, I would like the technical moderators to run a poll. We're still 750 people in this room. Let's see who is on, let's see what you are doing with ChatGPT. So the first poll is your profile: are you a translator, a language services company, a technology company, are you coming from the buy side, or maybe you're an investor? Please let us know, and I will give the audience about 30 to 40 seconds to vote on this, then we'll move to the second question. For myself, I think the important questions to address in the previous vendor panel were about what we can do with ChatGPT, right? There are many things that we cannot do because of privacy, because of the established workflows, but things are changing and it's important to focus on the things that are available to us. Jeff, how are you feeling? Are you feeling energized? Do you need a break for one minute? Let us know. In the meanwhile, let us bring up the results of the poll, if the technical moderators are ready. Paolo, Gonzalo, what do we have here? Let's see the results if we're ready. Thank you. It's good to see that everything's good. Paolo, Gonzalo, I would like to see the results of the poll. Whoops, the poll has disappeared but the results have not appeared. All right, I will give you the chance to show the results later on. In the meanwhile... aha, voilà, excellent, excellent. So let's have a look. We have almost 500 people who have voted. We have a majority of vendor-side population but also a hundred buy-side professionals, zero investors in the room, all right, and some people from research and other. And on whether you have tried GPT already, we can see that almost 30 percent of the people have not tried GPT. What are you doing here? Stop everything and go and try it out before you continue with the next talk. I don't know. And out of those who have tried, only a small percentage have found a way to test it for a
business need, not just for playing around and asking it questions about the meaning of life or having it compose a song in your honor. So a lot is in the hype and very few people are actually doing something very practical. Well, this is a refreshing piece of information. We can see that the panelists are further ahead than anyone else on the curve, and this is good: these are the right people to listen to. Let us proceed with the research panel. I would like to welcome on stage Gema Ramírez-Sánchez, Nicholas... let's see if the moderators can bring up everyone in the gallery view so that we can all enjoy your lovely faces. We will begin with introductions, and Jochen, why don't you be the bravest person in the room and tell us a little bit about yourself. We only know you as the creator of Trados, the most popular translation tool in the world. What else do you do? Yeah, Jochen. I'm CEO of Coreon, and I'm very thrilled to be here in that context today, because quite some data which is required to make AI work is a product of content creation and content translation. So my company Coreon, for example: we bring knowledge and terminology management together in one repository, and these multilingual knowledge graphs create medicine against AI hallucination. I'm also very pleased about the huge number of registrations. I mean, this is incredible, but it makes a lot of sense, because the roles of linguists and knowledge workers in our industry will dramatically change with this technology. I think we won't work in translation and documentation departments anymore, or for LSPs doing words, but rather in something which I call language operations, LangOps, creating the knowledge and creating the content which is needed to enable multilingual AI like ChatGPT and all the great products still to come. Excellent, thank you. Ariane, you are an expert with the European Commission on language technology and you're
one of the most influential women in language technology, I think, in Europe, maybe globally. What is your profile, in one minute? Okay, thanks. Well, I come from research, so I did my PhD on speech recognition and on natural language analysis and understanding, and I joined the industry after that, so I've been in the industry for 20, 25 years now. Sorry for the sound here in the Parisian streets. So I've been working at companies like Nuance and others, busy with developing applications for contact centers and customer relationship services. And at the moment I'm in charge of AI innovation at ViaDialog, and we're boosting customer relationship management with a lot of AI: analyzing the dialogues, automating part of it, also analyzing human-to-human conversation in contact centers. So it's a big boost now with all the recent AI developments, both in speech transcription and natural language processing, especially with these large language models. And yeah, it is changing a lot of things and we will be talking about it. I would also like to bring the industrial point of view of what the impacts are on sovereignty, on deployments, and what vision we have about that. Awesome. I see many hearts in the chat; people probably love your French accent, or you use AI to enhance it. Now, from gloomy Paris to sunny Alicante. Gema, you are actually now working on a European project to build the next large language model, you and Nicholas, so maybe in one minute you could summarize your profile. Gema, a gem, an undiscovered gem in our industry, it's time to shine. It's like Aladdin going into the Cave of Wonders. Thank you, Constantine, you're a charm, and I apparently don't look good on a green background, sorry about that. So indeed, hello everyone. I'm a trained translator. My parents paid for my
education because they wanted to communicate with everyone sitting every Sunday around a paella. My parents were inviting foreign people my whole life, so that's why I entered this world. I'm a self-made computational linguist, I can say, but at the end of the day I do nothing of this: I run a company called Prompsit. It's not about prompting; the spelling is a little bit different. And what we have been doing for the last 17 years is providing machine translation services. So we are a spin-off from a research group in Alicante and we try to turn the hypes into hopes. So I have my clients saying, how can I implement machine translation in my company, and nowadays, how can I implement ChatGPT in my company to do translations? And between the hopes and the hypotheses from my research peers, who want to advance and to see what this technology can do better today, from this departing point, we are usually part of European projects. High Performance Language Technologies, HPLT, is one of them. We want to make machine translation, still machine translation, and large language models efficient, sustainable and at scale. We need data, big amounts of data. We will talk about it and many other things later on. Excellent. And Nicholas, I know you're not from the localization and translation industry, but we're very, very lucky to have you today, because you are the man of the hour. You're building the largest or most prominent European project to have an answer to GPT, so we're very excited that you agreed to join us, and it's a huge stroke of luck for us. Tell us a few things about yourself. Thank you, Constantine, and thank you to everyone who's listening. I'm very happy and honored that I can participate here. I'm the project manager of OpenGPT-X. We're actually training a large language model. I wish there would be so many more projects like these
as we find out that the application scenarios are very, very wide, and we have a very diverse landscape of applications in Europe. So I really wish there would be many Nicos and Nicolettes and whatever. And yeah, so that's me. I work for Fraunhofer IAIS, and I'm also a team lead for conversational AI and document analytics, and I'm happy to be here too. Thank you for joining us. I have a premonition that Volkswagen GPT, Air France GPT, Ikea GPT and every other European GPT has a chance to stem from your project in the next three or four years, so it's a great beginning here. Now, we had the introductions and we proceed to the opening statements. The first opening statement comes from Jochen. Jochen, tell us in a very simple way why Europe must not and cannot ignore this challenge of large language models being made in the United States. What is in it for us? Why should we be excited about this, why should we cough up money for it, taxpayer money or otherwise? Why do we need large language models built here after all? Well, thank you for the intro, well said. ChatGPT has created quite some wave. I mean, we have almost 700 people here now, so it has created quite a wave, and it has done the same as what Google Translate actually did: it was popularizing a very complex expert technology. Suddenly millions of people can play with it and understand what we are facing, and I think what we're facing is nothing less than an industrial revolution. I don't know what you do, but I spend quite some time of my working hours researching, reading, studying things, and then summarizing them, writing text, and I guess that's also true for you. And now you have an AI which can do this for you. And the state of the art is, I don't know how much you have played with ChatGPT, but it's doing this on a kind of junior expert level, so somebody who has two or three years of experience in a certain field. And except for my own domain areas, it's already beating me in
almost everything. So, similar to what we are facing now, web search was also a game changer, and what we have allowed to happen in web search, Google search or the others, is that this is monopolized by a few private companies. Google, for example, knows exactly what the world is looking for, which is giving this company an incredible amount of power, but it also gives them a lot of revenue. So 75 percent of all online ad revenue just goes to four big IT companies. These are global platforms, and platforms enjoy valuations in trillions and not in billions like the other companies. And mind you, search is rather simple: it's keyword-driven, but the reader then has to make sense of the content and the articles the search is giving back. It's also already multilingual, so if I search for a German term I quite likely will find German articles. Now imagine the same would happen with ChatGPT or similar APIs. You interact with the AI and you're not only searching for some keywords: you are talking to the AI, you're engaging in a dialogue, and it is reading for you. So you provide way more information about yourself, about your topics, about things you are interested in, about your needs, than you have provided to Google before by simply giving some keywords. So can we leave this power to a few private companies? I would say no way. Already search was way too much, but leaving the power of a conversational AI to a few companies, I think, is just prohibitive. And why do we get these monopolies, how were these monopolies created? Well, it's a platform, and platforms enjoy network effects. This means the more people are working on a platform, the better the platform works, and that eventually results in a winner-takes-it-all game. In the end, with network effects, like Facebook in social media or Google in search, there's one big one left and then maybe a few others in protected markets. It's also the cost of scaling. I mean, if you want to attack Google at its
game, the amount of money you would need to get in there is just cost-prohibitive, even for a company like Microsoft. So all this is even worse with foundation models. Therefore I believe that these models should be treated like an infrastructure, a public infrastructure. So GPTs, conversational AIs, are some sort of new autobahns: they are an infrastructure, and if they are only provided by a few US and Chinese companies, the rest of the world will completely lose its digital sovereignty. They have lost it almost already, but now we're talking about even the next big step. Another problem is of course that these models are primarily trained with US content. And although these systems are able to translate, so they can give the results back, at least for the bigger languages, in fairly good quality, in German, French, Spanish, but less so with smaller languages, the content which is coming back, what it bases its information on, is culturally US content. So of course again it will be less relevant, or sometimes even disturbing or not even productive for you, if you are then just getting the translation of a conversational AI which has been trained primarily with US content. But there are many more issues. We have data protection issues. As I said, already giving keywords is giving Google a lot of information; now imagine you are engaging in a dialogue. So the GDPR issues are much more burning, and with it of course also data security issues. When you start to engage with ChatGPT in writing articles or content, then maybe there's a lot of data which you shouldn't really share with such an application. The copyright issues: I mean, the EU has a pretty tough copyright law, and what many are not aware of is that commercial machine translation in Europe is actually somewhat illegal, because you're only allowed to use data mining for academic purposes. And even in the US, where they have the much smarter copyright concept of fair use, there are companies already suing because
they think that ChatGPT and what it's doing is not covered by fair use anymore. And then last but not least, ethics. So the AI will talk to you, the AI will write content, and there are a lot of ethical questions, and ethics are different from nation to nation, from business bloc to business bloc. So there are many, many issues if you leave this kind of application and large language models in the hands of a few larger US or Chinese companies. Also, smaller languages will be left behind. A bit like the autobahn: I mean, many autobahns are built into certain areas where it doesn't pay off; private companies won't do that. So it is somewhat comparable with public infrastructure. An industrial revolution requires very serious industrial policies. I mean, imagine the time when coal and steel became big: whole wars were even fought over this. It requires an industrial policy, but I'm not really seeing this happening in the EU, and that was already true before ChatGPT, I would say, because also in natural language processing and other areas we have the problem that smaller languages are at a disadvantage. I mean, if you take just simple database search, even today it works well in English, maybe German, French; in smaller languages you do not get the same results. And the EU has done very little in this area in spite of legal requirements. I mean, there are laws in the EU to treat every language the same way, but independent of the laws, just the cost savings the EU would enjoy in its own administration and the efficiency gains would justify investments of hundreds of millions every year to handle multilingualism. And I know this pretty well, because I used to be chairman of LT-Innovate, and I've lobbied a lot to raise awareness of this topic with leading commissioners and responsible people, but, to my shame, with very little result. They always give their Sunday speeches, like they're saying data is the new oil, but they don't say that this oil is only drilled for on English soil. And in
general multilingualism, although, I mean, Marco said it before, language is the base of our communication, the base of how we transport knowledge. I don't know whether you know the Danish series Borgen, kind of a Danish version of House of Cards, and there a political opponent is humiliated by being sent to Brussels as commissioner for multilingualism. I mean, it's that bad. So from the EU I think there's not much hope, but there are national initiatives. A good month ago, on January 24th, there was a meeting in Berlin of an organization, or an initiative, called LEAM, large European language models, which is headed and organized by the German AI Association, where they were launching a feasibility study on what a large AI model for Germany could look like. There were about 180 people from industry, politics and academia, and they were launching their feasibility study and trying to get the support of the government to invest serious money. And we do need serious money, because it's not only about the tech and the data but also about the infrastructure, the computing infrastructure. So you heard these numbers on how much it's costing to train models. It's expensive, although there's progress, but on the other hand there's also more data, so I don't know whether we'll really see the costs going down as Marco showed in the forecast. But you also need a lot of scale for querying the system. I've read something like it costs a couple of cents per request to ask ChatGPT, and if it will be complementary, or let's say a quarter of the Google requests will instead be ChatGPT requests, you can just multiply the numbers: the Google queries by a couple of cents. It's a lot of money. And a lot of money is going there; you surely heard Microsoft is putting billions, like 10 billion, into OpenAI. So scale is a huge, huge issue. So I hope this gets funded by the German government, that the feasibility study will result in real actions, and eventually then also by the EU and by its member states, and
the question is what should be the ambition. I mean, should Europe try to also have an OpenAI or a ChatGPT-like me-too product? Well, I think we should do much more, because there are still many problems with AI. One problem is you cannot really trust ChatGPT. As mentioned before, it's hallucinating: it tells you a lot of stuff which is simply not true, and for commercial enterprise use you need trusted AI. Many people in this LEAM meeting in Berlin said we need AI we can trust. And then of course we need to customize the AI. So people want to customize the content, and not necessarily by putting it out and sharing the content, but by customizing with content which you have behind the firewall, content which you want to keep inside of the company, but still you want to customize with it. And then we'll go into these details, I think, in the questions. Jochen, would you like to move on, or do you have... Okay, I see Anne-Marie already saying she can listen to you for hours, and the audience is loving you. Okay, give the ladies some space. I'm almost done. So, and then you have the human in the loop, which I think for our LSP audience is key, right? So I believe the ambition should be not just to catch up but to try at least to lead, because then Nico's powerful vision... Nico, who's also on our panel, I really loved his line that you will eventually be able, with ChatGPT, to talk with your data, and that's a great vision to come true. Awesome. Well, I think we're all energized by your opening statement. Ariane, so the war of AI superpowers is on. Who are the players, what are the pieces in the game, what should Europe do to have its favorite knight or horse in the lead of the race? Exactly, we'll be moving from policy to geopolitics. So maybe to illustrate a bit more about this, I can try to share a diagram I drew. Share screen, okay, just with you, okay, let's see if it works. At least one person to do some... Can the technical team please
help Ariane share her screen? In the meantime maybe we can begin. An awkward pause; nobody likes an awkward pause in a panel. Let's see if I can share now. If not, I'll help you. Yeah, okay. [Music] I don't think Zoom is allowing me to share. I will share in a second. Please begin and I will share your slide. Okay. So, yeah, typically we have now entered an era of so-called foundation models. We also call them large language models, because most of them indeed have something to do with language, even if they're generating images, like DALL-E or Stable Diffusion; they're related to language, they've been trained on language. And so, yeah, exactly, that's the slide. So these large language models, as has been said, are huge, and they cost a lot to train, and they need a lot of data to be trained. Okay, so this is changing the whole ecosystem. It's changing the ecosystem for the industry but also for research. What's happening now is that you have just a few guys that are big enough, that have enough money, to build these foundation models. So it started with BERT back in 2018 or 19, and then it moved on to larger models like T5, GPT-3 of course, on which ChatGPT is based, but you also have DALL-E and others. And it's not finished; I mean, the list is just becoming longer, but it's only provided or trained by a few players, typically Google, Meta (Facebook), OpenAI, backed by Azure, Microsoft, which has put a lot of money into it and which is also backing OpenAI with their huge farms of GPUs. So okay, and now foundation models are setting the new state of the art. So what's happening? Well, a simple scenario is that you have, on the other side of the diagram, startups and cloud companies that are starting to use these huge models and using them for their needs, and what they do is that they're fine-tuning
the model. So now we have two ways of really using the models. Either you do a real fine-tuning, okay, so you add training, a transfer learning, on top of the foundation models with your own data, and so you fine-tune the model on your data, on your task. Or you're doing some prompt engineering, or content engineering, as has been mentioned. That's what everyone can do. Yeah, everyone can do that, but of course the price to pay is that you have to share your data. Okay, so maybe it's sensitive data, maybe it's confidential data, maybe it's GDPR-related data. And I knew that there would be a catch. Yeah, exactly. And there's something else also: the fact that if you want to have access to the models and if you want to fine-tune these models, you now mainly have to go to some platforms, and there are also just a few platforms that are actually equipped with the ability to offer you access to these models and to let you do the fine-tuning. Typically Hugging Face is now becoming a very important platform for that. The good thing about Hugging Face is that they're sharing a lot of things, they're allowing people to share a lot of things, a lot of models, and it started to be like the one-stop shop for downloading open-source models, which is very good. But it's just a start. What Hugging Face is really about now, and how they plan to make money, to make revenue, is to provide you with all the tools that allow you to do your fine-tuning on their platform and also that allow you to deploy the models, because deploying those models, having them run in production, is not an easy task. And so Hugging Face has worked very hard on it. OVH, the European, French cloud provider, has also been working quite hard on it, but a large part of the OVH AI team, the French AI team, has actually been hired by Hugging Face, which was founded by French guys as well but which is a US startup really. So it
must be a big blow for them, but they've not renounced it. So you have a few platforms who are trying to give you access to the models, or trying to give you the equipment, but you'll still have to pay the price and share the data with those platforms. And then, what are the alternatives? One kind of alternative, and I think that Nicholas and Gema will talk about it, is the researchers' initiatives, organizational initiatives, government initiatives, which try to build their own models and make them totally open and transparent, like BLOOM, which is a GPT-3-like model, although not as powerful or not as effective, but it is the same kind of model, and it's been trained on the French Jean Zay supercomputer, which has hundreds or, I think, thousands of A100 NVIDIA GPUs. So it's really great to have access to that kind of computing power, and it allowed them to train this BLOOM model. So that's the BigScience project, and then you have OpenGPT-X, which Nicholas and Gema will be talking about, I guess. So that's one of the alternatives, but then it doesn't solve the problem of how you deploy that. Okay, so other alternatives are coming from expert players, like my company ViaDialog; we're not the only ones, of course. So companies that are pure players and who are interested in mastering these models and being able to deploy them on their own private cloud or on-prem. Same thing for large companies, very large companies who are very firm on having everything on-prem, like big banks, insurers, et cetera, telecoms. So those guys also are looking at how they can deploy things on their premises. And then there's another, it's not drawn here in this diagram, but there's also another war going on, on how you deploy those models, and there are interesting initiatives like the ones from Microsoft and from Nvidia, who
are actually sharing standards, sharing frameworks for how to compress models, how to run the inference on the models, how to schedule these large models for inference. It's very interesting, for instance, what NVIDIA and Microsoft are doing in this area. They're not the only ones, but it's also very interesting to see how they're working together but also competing with each other, so it's pretty interesting to see which version is compatible with which one. So that's also something going on, and I think as of now it's very important, if we want to keep some sovereignty in Europe, to be in the game: to master these frameworks, to understand how they work, to also support alternative frameworks if others emerge, and to be able to offer a safe harbor where the data can reside and is not forced to go to U.S. clouds. And when you're using these models, you should always ask yourself these questions: what data are you using, what model are you using, where do you train, where do you adapt a model, where do you deploy, what are you inheriting by using this model, what foundation models everything is based on, etc. So that's the global ecosystem I wanted to share.

So if I understand correctly, the French talent and brains got drained to a United States startup, and OpenGPT-X is our only hope, our Obi-Wan Kenobi?

Not exactly, not exactly. There are other initiatives that are being launched at the moment. In France we have an AI coordinator who reports to the Prime Minister, and he's now launching new initiatives, so we'll see what comes out of it. But, well, I think all the European countries are pretty much aware that there's something
going on here and that, if you want to play, or to have some part in this geopolitics, in this game, you have to take a position right now.

Perfect, thank you, Ariane. From this place of global competition and building our technological autobahns, let's get down to the builders. Gema, you're up. Tell us a little bit about what it will take. Maybe describe your project in a short way, and what it will take to make it bigger and have more uptake and more penetration into the industry.

Maybe we can stop the sharing, right? Yes, yes, I forgot. Now I can use my screen. Gema, you're on mute.

Yeah, typically me. From France to Spain, let's go. Okay. Almost a couple of years ago we wrote a project with bdd and we got it funded by the European Union. It is called High Performance Language Technologies. What is the ambition? We are trying to make this infrastructure, part of the infrastructure that Johan was talking about, available to everyone. First, we have high-performance computing centers all around Europe, and they are very good at sequencing DNA, but they are not NLP-aware. We want to solve this problem. This is the first problem we want to solve: even before being able to train our models, we need to get there. We have those supercomputing centers available; let's make good use of them. Then, another ambition: we are aiming at deriving 100 monolingual and bilingual data sets (for bilingual, multiply those 100 by 99) from 12 petabytes of data from the Internet Archive and from Common Crawl. We want to highly curate this data to make it available, to share it with everyone that can use it or wants to use it in an open way, so with open-source licenses applied to data that can be shared. For that we need to filter, anonymize and do whatever we need, but the data needs to be there, and for a hundred languages: not only English, not only the top 13 languages. We are aiming at 100.
This poses a challenge because of those large language models. Just to give you an example, my colleagues at the University of Turku, who just released the GPT for Finnish last week, were saying that with less than 100 billion tokens there's not much we can do with large language model training. So this is the ambition: we need to get as much open data as possible, we need to get HPC centers NLP-aware, and from all of this we are aiming at building models that are sustainable and efficient. We are going the opposite way to what you have been hearing, like adding more and more and making this bigger and bigger. We want to take care of the languages and build models that represent the languages and respect the languages and do not suffer from what we call in research the curse of multilinguality, where you add one more language and your per-language performance drops. We definitely don't want this to be like that. As a European response to what is going on, we are deconcentrating power, and we are going to make all those models open source, opening opportunities not only for research but also for companies like mine. I want to be able to operate these models, I want to be able to fine-tune, I want to be able to know exactly what is in them. Only with open-source licenses for data and models can we do that and keep going. So this is the ambition of the HPLT project. It's here for three years. You may think, from Marco's talk, that this is too long a time, and I was scared, like saying, oh wow, is it even worse if there's something being cooked by Elon Musk right now? Do we have to pay attention to this? Yes. But this is not going so fast. Machine translation models still have a lot to say. Large language models just started, and we are just seeing a little bit of their power. But we have already been here, and we have seen that with machine translation it was not such a long way, but we needed some time to be able to see
what we could do with client data. We need to explore that. Also, we need to see how we integrate these without our costs exploding every week. We need this to work in real time. I don't want to wait five seconds to get a translation, come on. We are getting the same or higher quality from direct models right now for a fraction of the time and of the cost. So let's be a little bit, not conservative, we're seeing the whole potential, but hey, let's calm down, because there's a lot of work ahead.

What can the people here do? Is there something that the language industry can contribute to this kind of project?

Of course. Evaluation is one of the keys. Then, what do we need from these big models for them to work in our workloads and environments? In the previous panels they were talking about killing CAT tools; we need to be aware of what this means for the industry. So the wish list is still to be written, and contributions too. Who's going to evaluate each language model? It is the language industry. Who's going to evaluate for biases? Who's going to evaluate for cultural adaptation? Who's going to evaluate for, I don't know, so many things? Of course, we need to act as a community.

And I understand you say that the Finns have built, there's the Swedish GPT, there's the Finnish GPT, but the problem is that they didn't find enough data in the country, and so they will turn to your project, which salvages data from the Internet Archive?

This, and convincing, for once, European administrations that we need to access the invaluable data that they have hidden in their local repositories, at once, and make use of it to train the models. If we cannot open this data, that's fine, but at least, if this has been paid for by public money, let's make good use of it to make the best European connections. I don't want to take more of your time; I want to hear what Nico has to say.

We would love to hear more from
you, but first a complete round of all the panelists. Thank you. I am sure everyone in the chat just loved your delivery and the idea of taking the data from the big guys and distributing it to everyone, including the Chinese and the Americans. It's great. Nicholas, tell us, what is OpenGPT-X? How can we use it? When is it coming out? Are you going to save us from the attack of the clones, or, I mean, the foreign models?

Yeah. As I said, I wish we would now act really, really fast in Europe, and when I hear what the previous speakers have said, there's not much to add from my side. Really, let's do it: rattle on the desks of the EU administration and get things started. I fully agree: there's compute power flying around, there are high-quality texts flying around, and if we don't get them to marry, or, like a cooking recipe, if we don't get it together, we will not have this kind of model. So if anyone here is listening and has access to large amounts of text, and I think the fun starts, depending on the quality of the text, at a few billion tokens upwards, a few dozen billion tokens upwards, really, we should do it. We cannot miss the chance. The previous talk outlined it so well: the international corporations are on it, and they have a good reason to be on it, because it's also about technological supremacy on their side. And we have a lot of languages, and language for me is also culture. We have a lot of things to preserve. With regard to the question about the project: yes, we will be heading into three-digit numbers this year with regard to the model size, but, as has been outlined before, we also need good texts for this. We are exploring multilingual models already, but I think we will just start doing
this. And what we also really need is more filtering of crawled data, because that's the biggest resource as the models scale up. And why is scaling up so important? There has been a lot of talk about smaller-sized models like Chinchilla or UL2, which are like 20 billion or 70 billion parameters. They are fine, they are doing really, really well, but there's also a very interesting paper showing that some qualities emerge only from a certain size onwards, only from a certain amount of text onwards. So I wish that a lot of projects will now start soon, better yesterday than tomorrow. That's all.

What does 'open' stand for in OpenGPT-X?

Sorry?

What does 'open' stand for in OpenGPT-X? Is the model public?

It's open source. We will publish the models.

So anyone can download your model and fine-tune it, adapt it however they wish?

Yes, if they have a million dollars for it. If that's still allowed after the EU AI Act, we will do it. If I would go to jail, probably I won't.

But we don't want you to go to jail.

No, no, I don't either. So that's the plan.

So you're training the German model for multiple languages, right, and it's going to be open source, and this is cooking right now. You need the data, the processing power, better techniques, but you are leading this project to create the models. How can industry agents collaborate with you, for example?

The project is publicly funded by the German Ministry of Economics, and as such it's a very use-case-driven project. The models are currently employed in three domains. One is the domain of mobility, where we collaborate with BMW. Another is the insurance domain, where a company that makes AI for large insurers is using the models. And the other one is a public use case, where a broadcaster, WDR, uses the model to do various NLP tasks as
well, and the results so far in these use cases have been really, really good. What we also need: we have been talking a lot about the foundations, and we also have to really invest a lot of brain power and money in use cases and business scenarios. I think Johan has already outlined it. And we cannot expect this to run from our university servers anymore, so we also need to empower compute centers, maybe commercial ones, that will really host these models. So we are now experiencing all the interesting topics and challenges that someone who actually employs the model in the future might also experience, hopefully not. We are focusing on specific tasks such as conversational AI, and we are also focusing on document analytics and on simple language, which is similar to translation. I don't know if the English word is properly used here, but it's basically taking a complex text that not everyone may be able to understand and translating it into simple German.

The audience is probably well familiar with controlled languages. Thank you for your opening statements. Now that we have made a round, I would like to invite the audience to post questions and to vote on them. And while ladies and gentlemen in the chat are supplying you with food for thought and making their queries, let me ask you a simple but complex first question. Ideal-case scenario for Europe: in two or three years, what should we see? I mean, money is being disbursed; at the end of the month Europe is awarding, I don't know, 200 million in grants to language technologies. It's probably a pittance compared to what is being eaten up every day and every month, but it's still something, and I'm sure more is to come now that language is in the spotlight. What is our ultimate goal, in as short a sentence as possible? Nicholas, maybe we'll start with you, and we go around, and
everyone gives their one-sentence response to that.

Sorry, I didn't hear the last sentence.

So what is the ideal-case scenario for Europe? Where should we go, what's our goal? Is it a European organization like ChatGPT, supported by the government but being a commercial company, or open source? Is it a myriad of small organizations, everyone training their own transformer? Is it an academic model? Is it a model like eTranslation, something done by the European Commission to support all the European institutions? Is there a particular outcome that you favor?

In this specific case, I see it somewhere in the middle. One big corporation might be too big to even start walking; a lot of small organizations might not be able to have the proper power, because at the end of the day training these models is very expensive money-wise, and we also have to think about the environment. And we really have to have it end to end, so research and product have to move together in, I would say, fairly medium-sized organizations. I think LEAM, which Johan mentioned earlier on, could be one of those endeavors, and maybe five, six or seven of those could be good for Europe, maybe with specific focuses: one is maybe more oriented towards proteins, for biochemistry, and so on. So this is how I envision it: don't make them too small, don't make them too big, and bring research and application as close to each other as possible.

The marriage of academia and industry, perfect. Johan, what do you think of our end goal? You're muted. Hey, Ariane, you're also muted.

So, we shouldn't pour out money with the watering can; that won't show any effect. The European Union has tried this for decades. There needs to be a serious investment, I'm not talking millions but billions, into infrastructure, and then there should be a huge fund which enables companies to raise money,
so not funding and grants but rather access to capital, so that many, many smaller companies can then build on top of this infrastructure, on top of the resources, and get money not as a gift from the government but as an investment, because that has great leverage, a multiplier. And there is enough money in Horizon; there would be more than enough to set up such an industry. So let a thousand flowers bloom.

That's a cool visualization. Ariane, what's your view?

Well, I think we have to envision it in two ways. What the advent of foundation models changed is that it kind of decorrelates, on one hand, the people or organizations that build the foundation models and, on the other hand, the groups and companies that are building the real-life applications, the laser-focused applications for their domain, for the real-life needs of their customers, etc., be it very specific applications of conversational AI, very specific cases for content management, whatever. So these two worlds can now have a slight decorrelation, very slight, because of course you need this cooperation between academia and companies to really tackle those models and make them efficient in your applications. But the thing is that you can have a two-way answer. On the foundation model side, we just need alternatives, and it's okay if we just have one, two or three alternatives. That's good. Even just one would be nice, not having just the U.S. ones.

Competitive alternatives, right?

Exactly, competitive alternatives, and from different countries, not all from the U.S. or from China. Even if there's one alternative large language model in Germany, one alternative large language model for French, or for multilingual, so
that would be good and that would be enough. And then on the business side, well, like you said, let the thousand flowers bloom. It's okay to have a very active ecosystem with a lot of companies, etc. And the nice thing would be for startups to stop this reflex of just relying on the U.S. clouds. It's so easy: you can build your product in days if you just rely on Azure Cognitive Services or on Google Cloud or now on the OpenAI cloud, which is really Microsoft's one. So it would be nice to see startups starting to rely on other kinds of models. And regarding the companies that are SMEs or mid-size firms, which have a lot of experience and a lot of data, it is of course very important to see how they will position themselves, and I think those are the companies that are actually key in promoting alternative models and not relying directly on the...

Awesome. I think "competitive alternatives" is an easy tagline or slogan for the European Union. Gema, what's your point of view? I have a suspicion that you're going to speak about the ecosystem.

I'm going to speak about democratization.

Democratization, better, yes.

Democratization of the knowledge of how to build these super large language models; democratization in accessing them, making them more efficient. Tinier models may work better than those super big models. Make them truly multilingual; we are a truly multilingual community. European scientists can build very nice ground models that can then be distilled and that everyone can take and build business around. Data is also a big thing to democratize. Let people have access to the data, at least to know that the models we are building are transparent and reproducible, and that if there's something wrong with them, there's something or someone to blame, the data, and
be able to build upon this. Let's take the recipes to build these models and democratize them too. Let's make the HPC centers in Europe able to run these models, or to let people fine-tune them, companies fine-tune them. It's fine if we have to pay, but we want this infrastructure, and we don't want to build it one by one. With that, everything else will evolve.

You're the Robin Hood. It's terrific. Now, I see we have five minutes, maybe four, and not too many questions in the chat, so let's address the important ones quickly. I would start with the elephant in the room, which is GDPR. Can GDPR be an obstacle to model training? Should it be changed in some way to prepare the countries for what's going on? Ariane, I think you're the best expert to address this.

I don't know about being the best expert, but I think that GDPR has, up to now, proven to be quite effective. I know that in the industry we were all very cautious, and when we saw GDPR coming we said, wow, will it have a bad impact? Will it be an obstacle to our growth? Will it prevent us from building good solutions, from good technological progress, etc.? I think that in the end it didn't really hinder us in the industry from building solutions, and it did really change the game in the competition with U.S. cloud platforms. Now all the big companies have at least to think about the GDPR subject and what is at stake, and of course that's something that will always be on their minds when they turn to the large U.S. offerings. Sometimes they will still, why not, take this offering for some use cases, but they will always have to answer the question: is it GDPR compliant, what about GDPR, etc. So I think it had a really good impact. It made people aware of: okay, what's the data, where is the data going, etc. And even if GDPR is
really, basically, about private data, personal data, I think it has a role in making people aware of data movement.

So GDPR is good, basically?

Yeah, it's good.

No need to do anything about it?

Okay, no need to do anything about it.

There is one thing we should do about it: a law about digital only makes sense if it comes in the form of an API. Otherwise it's kind of just paper.

Excellent. Two remaining questions. Fact-checking: how is the research community approaching it? There's data being made available, and Gema is taking the data from the silos and giving it to everyone like Robin Hood, but are we actually checking whether the data has factual substance? I think, Nicholas, this is a good question for you. You're on mute.

Okay. Yeah, it's one of the most challenging things we are working on, and it's also a very scientific problem. To bring this together and bring it into production fast, that's the biggest challenge.

So it's difficult to do. In a nutshell, can we communicate to the wide audience what's being done about it? How do we fact-check?

The thing is that the model just learns patterns in word sequences and all of that. It doesn't have an implicit kind of knowledge.

So how do we know that the sequence contains a falsehood? How do we know that the data is lying?

Yeah, that will be the fact-checking bit, and that will be very, very challenging. At the end of the day, if you talk to a human, I had an experience with a lawyer, and what this person was saying was also not entirely correct. It was 80% correct, but that did the job for me. What we are doing at Fraunhofer is working more on the model side: how can we actually train the model in a way that facts can be included and
the model not only learns a sequence of words but also learns facts, basically. I think that's a topic for a separate discussion, and we hope that in this way the hallucinations will go down and the factual correctness will go up. Fact-checking is really, really difficult. It starts with very simple things: is the Moon revolving around the Earth? Probably yes. But then, also on a philosophical level, it gets really murky. What we control right now is the model training, and we approach it from that perspective.

I think it's a very exciting area, and if I were to start my career today, I would go into learning and discovering how AI can separate the truth from a falsehood. I think everyone who is studying AI now should consider this as a path for applying themselves in their lives. Last question. Gema, what about the environmental impact? You mentioned green AI. How do you make AI green, in a very short way? How do we not burn the forest every time I'm asking GPT to write a song for me, or a poem about machine translation, or something like that?

By making systems that can run in a browser, to start with. Already at inference time, because this will touch everyone's budgets, we need to focus on making tinier models, even if we lose a little bit of the quality we could get from them. This is already working for machine translation: you distill tinier, efficient models from big, solid, robust, high-quality models, and at the end of the day you don't run those anymore; you fine-tune and run tinier models that run on CPU and not on GPU. So we need to look at distillation. Like with spirits, we can also distill a model. And we have seen that with 60 billion parameters we can get results similar to those with 175, so we need to go in that
direction also.

I'm getting lots of messages from the team that we've run over time. Johan, you get the final word, or a final laugh. What's takeaway number one for the people in the room? What should they do?

What they should do is think about the human-in-the-loop model, because what we are hearing is that this technology is imperfect and will probably continue to be imperfect for a very long time. We need to teach it. And one thing we understand in this industry, one thing LSPs understand, is how to work with humans in the loop, with many people, different languages, distributed all over the globe. That's a skill and a knowledge which the industry can leverage to develop new business models.

Amazing, thank you very much. It's been a terrific panel and I enjoyed it thoroughly. I hope you've all learned something, and that you exchanged contacts and insights and something will come out of this. As always, events bring people together and ideas get sparked here. We'll now take a three-minute break for the moderators and everyone to drink some tea, and we'll come back for the final panel of the day, with the buy side, led by Anastasia listener. So bear with us for a small break, and in three or four minutes we resume. The technical team should be ready with the poll. Thank you, everyone, bye-bye.
All right, I think we can resume, in the interest of time. I hope everyone could pour themselves another cup of tea, take a deep breath and prepare for the final panel of the day, but not the final event, because we have a surprise guest coming. It is now my pleasure to introduce Anastasia listener, my colleague at Custom.MT, who will lead the panel dedicated to the insights of the buy side. Stasia, I can see you're already on the stage. Let's run the poll for the panel while you introduce the panelists.

Perfect, let me finally steal your spotlight. Welcome, everybody, so nice to see you here today with us. Before I introduce everybody, let me do a very quick technical introduction. For those of you who are joining us, if you have any problems, please make sure that you're using the Zoom client on Google Chrome, and just try logging in and out of the session once again; hopefully that's going to fix it for you. Otherwise let us know in the chat. Also, we love your questions, so I encourage everybody to type their questions into the little chat box and also vote for the questions that you would like to be answered first, because we're going to prioritize them for you, and we're going to try to take as many interesting questions as we possibly can today about GPT and the underlying technology. We have an amazing lineup of experts today. We're going
to give you insights into how they've already tried the technology or what they would like to do with it, and we're also going to share all the personal insights and feelings, such as, you know, am I an impostor for trying this new, scary, crazy technology that's taking over the world? I hope you enjoy it. Again, very nice to see all of you here, and let's start with a quick introduction from everybody you see on the panel today. beer, I'm going to give the floor to you; let's start with the introductions.

Sure, thank you, Stacy. I'm robertasco and I am currently leading localization efforts at Trendyol, a Turkish e-commerce company, the largest Turkish e-commerce company and one of the most valuable private internet companies in the EMEA region. I have been working in localization and the language services industry for over 15 years, and like everyone here I'm very intrigued and excited by this technology. We've already tried it, and I am ready to share some new potential use cases and our experience with language generation at Trendyol.

Thank you so much. Jose, next is you. Tell us about what you do and how you are interested in GPT technology.

Thank you. So, hello everyone, I'm Jose. I was actually tempted, I played with this, I was going to just read from a bio created with ChatGPT, but I think I will just post it in the comments for everyone to enjoy. That's one of the things that I play with this technology for. So I'm Jose, I work for a software company called Cupra that makes a financial management suite of products, and I've been in localization for more than I care to admit, very long. I've been a translator, I've been on the vendor side for many, many years, and I am now on the buyer side, and from all the perspectives that I have had on the journey, I'm interested in ChatGPT as the new disruptive thing to learn about. Yeah, I have plenty of emotions about it, as Constantine would say.

Beautiful, thank you. Anna, now it's up to you. Please
do a little introduction so everybody knows what amazing things you've done and what you're working on.

Yes, hi everybody, great to be here. I'm Anna golodiva, representing the IKEA brand, and as you probably know, it is one brand but many different companies under it, so I'm representing the franchisor area with Inter IKEA Systems, based in the Netherlands. I've been working with localization or translation for quite some time within IKEA, in different kinds of constellations or roles or areas: it was sales and marketing, training and e-learning, and also, mostly, internal communication. Connected to this new piece of technology, of course there are a lot of lively discussions happening everywhere, so I would be happy to share what our thoughts are within the company, but also personally. So, very excited to be here and to learn and share.

Beautiful, thank you so much. Well, after the introductions, I think it's a good idea to share the results of our poll right now, because one of the things we're going to be talking about today is how we structure the use of ChatGPT in our organizations, and I'm pleased to see that most of you actually don't have a structure but are very interested in testing it. That's a beautiful place to be at this panel, because that's exactly what we're going to be tackling today: how to structure it, how do we structure it, who's responsible. So hopefully we can brainstorm together and you'll get some answers. I'm also very curious about this T percent who said that they do have designated experts and production blueprints. If you could share something in the chat, let us know a little bit more, because honestly we're curious to learn how you managed to do this. This is very impressive; kudos to you if you've actually done that. But yeah, thank you to all the panelists, that's an amazing lineup. beer, I'm just going to give you two minutes: can you please tell us about the dog in the background, because
I think everybody's going to want to know. It's a fun fact, and then I promise we'll start with the real conversation. Yes, of course. I had to use this background, my favorite Zoom background, provided by my colleagues. This is Rocky the dog, an internal mascot that represents some of the values of the company. When Trendyol started operations in Turkey, it was the smallest e-commerce company and it was competing against really big giants in the space, so they considered themselves the underdog, and they have kept that philosophy of always striving to be bigger and better. I happen to like dogs, I have two dogs of my own, so I think this is really cute. And it's holding the globalization symbol, so it's perfect for localization. Thank you, it's a perfect mascot, and I saw so many little heart emoji reactions, so I love it. Thank you, nice to see you. Beautiful. Let me give everybody a little bit of structure for what we're doing today. We're starting the panel by discussing our use cases, everything we think might be interesting for you from the buyer's perspective right now, and we're going to spend about 25 minutes on this. After that we'll tackle all the questions that we see in the chat. Again, please vote for the questions that are the most interesting for you, so we answer them first, and we're going to get into all the juicy bits. Let's begin with our opening statements and our use cases. Jose, I think that's for you to begin with: tell us how you tried ChatGPT, if you've tried it, what use cases you see, and how you feel about it. Oh, okay, sorry, I thought I had been talking enough already. So thank you, Stacy. I want to disclose that I actually joined the event with the biggest case of imposter syndrome, in the sense that I feel like I don't know enough to really be representing a large audience, but throughout the day
and reading through the comments, I'm getting a stronger feeling that maybe I do have something to add, because if nothing else I have a big curiosity about it. In prepping for this, Konstantin and I were going through what we have to say, and Konstantin summarized it really well: "Well, Jose has emotions about it." So I would like to talk about those emotions and why I'm here. I have two big emotions about ChatGPT. One of them is excitement: I'm super excited that there's a new technology that has the potential to help in many different ways, and that there are applications to localization as well as to corporate life and every other department. I think anyone who's writing has the potential to benefit from ChatGPT. The other really strong emotion is that I'm worried. I'm worried about how we can misuse it, how we can misunderstand it, how it could distract us from all the technologies like MT that we've been curating and getting better and better at, and now this might come as a new fad and actually take our focus off what is really going to help us. So to try to see through these things, I did a little bit of practicing, especially in the last three months, at my organization, and we've tried to see where this fits. The first place was just me trying to do stupid things with it. I'm going to post one that I did while listening to the others; I'm going to post it in the comments, if that's okay. It's a joke it wrote for us. I hope it's not too long, and please don't read it right now, but just in case you need something created with ChatGPT in today's sessions, that's one thing: a joke about translators and ChatGPT, so hopefully it's relatable. So I started using it for that, and at work we thought
that, well, this is going to be about writing. We tried to use it for technical writing. Our technical writing teams write a lot of content; we have a very large suite, with documentation running into the many millions at the moment, and the pace at which it continues to grow is very fast, so everybody's challenged for time. So can this technology help us write that content? We got some mixed results and mixed feelings; again, feelings and emotions getting in the way as well. Some people didn't even want to try much. They did a first exercise of trying to write up a feature, or write some paragraphs about a feature, maybe with some general knowledge about something related to compliance, which is important to financial software. Some people were like, "Oh, this helps," and some were like, "Oh, this doesn't help at all and I don't want to try ever again." So that's one use case where it hasn't been great. But then, with the same group, we tried to use it for simplification and summarization of text, and in that aspect it clicked really well. Both groups, the opposers and the play-alongers, agreed that there was a practical use: when you're writing something super long, there is a chance to use ChatGPT and just say, "Hey, can you simplify that for me?" so that what I put in front of the customer is actually better. That is something we're still doing as an experiment. We're just playing with it, we don't have a program for it yet, but at least we've started to see some potential use. Same thing with taking that text and using it for spelling correction: ChatGPT can spell check things for you. And sometimes we notice, in the process of taking an excerpt that you need to summarize, that some of the errors
that we had introduced in writing actually go away, because it writes better English than some of our people, some of our content contributors who might not be native English speakers, so their English might not always be perfect. Not technical writers, but content contributors in the bigger picture: anyone in the organization who can edit or contribute some seed content to our documentation, for instance. So those are some that have gone well. There are some other areas where it hasn't worked that great, and stop me anytime, please, if we're going over time. We had people trying to use it to write RFPs, and that is also another bag of mixed results. It does help write about things that are general, and anyone who has ever worked on an RFP will know that it's actually a lot of repetitive text. Sometimes you're just trying to rephrase something or copy someone else's style, and ChatGPT seems to do a great job at that. But then we had all the privacy concerns and the fact that everything needs to be double-checked, and that is the message that we've been sharing within the organization anytime someone knocks on our door. We're not designated users in our organization, but because we're the language experts, people will check with us and say, "Hey, how does this look?" And we've been very cautious in saying that everything needs to be double-checked; this is not something that you should be using unchecked for anything. To me that's a very nice parallel with what we've always done with machine translation, the same way that we had to control the use of Google Translate, especially when it involves private information or customer information. You know,
not everyone is super aware of what the risks are to our relationship with customers around data privacy policies, so we've been trying to advocate in that regard, I guess: advise caution, and hopefully over time we will be doing more things. The one area where it's really good, and I use it myself: I found myself recently having to write a lot of content for the internal evangelization of localization, and I could never find the time. Then I decided, you know, I need something more verbose than I can come up with in just 30 minutes, and I tried to use ChatGPT to support me with that in two ways. One is writing about things like general localization and what caveats to look for; it knows a lot about that and it writes really well. The other has been just giving us ideas, and this is something I've been encouraging my team to do in one-on-ones as well. Whenever you have to write a presentation, whenever you need to write about a topic and you don't know where to start, just use it for ideation: get five bullet points about a topic, review them, dig deeper into one of them if you think it's a good one, or ask for another five, and another five, and another five. With the amount of information and ideas that are thrown at you, it's great for overcoming writer's block, I would say. I love that we're only a couple of minutes into the panel and we already have a bunch of use cases. That was brilliant, thank you. I'm so glad to hear that everybody found common ground, that it works very well, at least in your company, for summaries and for trying to simplify and synthesize all the information. That's amazing. And thank you for the very creative advice for all the writers on the team: if anybody needs to overcome writer's block, it's right there for you. We are going to talk a little bit more later
about how to govern the whole structure. Everybody here shares this concern that we don't really know yet how to make it safer to use, more professional to use, how to use it at the enterprise level. But don't worry, everybody, we're going to get into that very soon. First I'm going to give the spotlight to Anna. So, Anna, tell us about IKEA: what is your experience, what are your use cases? Thank you, Stacy, and great introduction from Jose as well; I think a lot of potential use cases were already mentioned. I can add that at IKEA we're not using ChatGPT or similar technologies just yet. But being quite a big organization, as I mentioned, there are discussions and interested departments or people popping up everywhere nowadays. So what we focused on, from the overall communication perspective, is creating awareness about what ChatGPT in particular is and what it means to use it: how you should, or maybe should not, use it for business purposes in the open space. That has been quite a good first step, I think, to create the awareness and to share what exactly that means. We talked a lot in the previous panels about data privacy and the ethical part of it, which I think is quite important for companies of different kinds and sizes. I think it's really in that field of data ethics, digital ethics, where we need to focus, especially if we want that technology to adapt to our culture and values and to be aligned with our mission. I was also reflecting, when I heard the previous discussions: we talk a lot, in the retail part of IKEA, and of course we are in the retail business, about how that also transforms, and we talk about this kind of radical retail, which is automated,
which is AI-based and data-driven. So there are a lot of similar developments happening in different areas, and ChatGPT is one of them when it comes to content creation, search for information, and summarizing, as was already mentioned. But at the same time we are also reflecting that it can be quite boring when you go to one of these generated stores where you don't really have a full experience of the products. Especially if we talk about IKEA products, for example, maybe you really need to touch and feel them, and you really need to understand how they fit into your space at home. So I think there is this kind of, in combination with the [Music] Okay, I think you might have, yes. I'm so sorry. We're going to continue for now, hoping that the connection is going to improve for you very soon, but we did get the important information, which is amazing. Also, I saw somebody was joking, sorry, somebody was writing a comment about transliteration, and the first thing that popped into my head when talking about IKEA is those great names of every single object that I can never pronounce and can never read. But I think it's stylish: you go to IKEA, you know you're not going to be able to pronounce what the desk is called, but you're still going to buy it. So it's very interesting to get this perspective as well. Bea, we're going to you. I remember when we spoke the last time you said something very interesting about the compliance team approaching you and asking you about GPT, whether you've tried it and how they can implement it. So would you mind telling us about your use cases and what teams are interested in trying this technology in your company? Sure, thanks. One of the use cases that we have explored: actually, a lot has been discussed today about content generation, and in fact at Trendyol we use [Music]
this is weird, right? We muted for the moment, but the sound keeps coming. We are plagued by technology. All right, so: content generation for our product descriptions. We have millions of products on the platform, and we were facing challenges that machine translation couldn't address, because our first language pairs, Turkish to English and Turkish to German, were not rendering good results, and the input was also not highly editorial or moderated, or perfect, let's say. So we had a go at content generation with highly curated and checked input, and that worked out for us. It was much faster and more cost-efficient than translation. The service that we use for content generation offers GPT-3, and we tested it, and the results were good, but there was an element of uncertainty to it that we simply didn't need. We relied on templates that produce exactly the results we want, so we can control the output the same way we control the input. And this is also where compliance comes in. We had concerns: we have products for children, we sell fashion products, and there are important compliance elements when it comes to the composition of the items. What you say about the product needs to be true, precise, and exact, so we didn't want to introduce an element of risk when we didn't know how it would turn out. But we did test it, and the results were good, and we would love to test the new and improved version of this technology, so of course we're very curious about ChatGPT. We have actually partnered with the Data Protection Officer at our company, who is also an early adopter of this kind of technology. He has to deal with huge amounts of documentation, and summarization is a functionality that is very useful for legal environments. But of course his main concern as Data Protection Officer is: what are employees of Trendyol entering into this
tool, right? In the same way that in the past, as a Turkish-speaking company, many of my colleagues used Google Translate to help them in their day-to-day tasks, and there was potentially risk involved in that, there's the same situation with GPT: anything that you enter in the free version is no longer yours. So that's a problem. Yeah, that's very interesting. You really brought up the most interesting aspects, and I think that's a good time to pivot to governance and how we can actually set it up as a structured case in our companies, because there are perhaps many questions. Who's responsible in the team? Is localization the one that should be testing it first? Should we create dedicated experts? Bea, if you want to take a look at this question, maybe let us know your opinion, and then we're going to go to Jose, if he has an idea, I would say, about how compliance with ChatGPT could work. Bea, for you first of all: why are you at the forefront of testing ChatGPT, how did localization get into it, and do you think it's a good idea that localization seems to be responsible for this technology? I'm not sure we're at the forefront. Several teams have actually been testing this. The SEO team, for example, for content creation as well; last year they were testing GPT-3 along with us. With ChatGPT, we've seen posts on the public Slack channels about how it can be used, and, like I mentioned, as a Turkish-speaking company one of the really interesting uses for my colleagues is to help them write in English when it's doubly difficult for them to do so, because a lot of them are still learning. Even though we are turning into an English-first company, to be truly global, it's still not as easy for many non-native speakers to communicate at a native level in English. So I wouldn't say that we are leading the way, but
we are early adopters. We are already partnering with other teams that are interested and that are compiling information about these tools. We also compile information about other AI tools, for example AI video creation, subtitling, voice-over, all sorts of different tools, and we're looking at the potential risks and coming up with a plan to introduce some guidelines about using this tool and some risks that everyone needs to be aware of. Absolutely. Jose, I'm going to go to you again, because I know that at your company, Coupa, you had a couple of ideas. How did you start dealing with the compliance side of things, and maybe how would you structure it in the future? Yeah, so in our case, I think that with something like ChatGPT, and again I keep drawing parallels to machine translation and Google Translate, it is so hard to try to control it. So the next best thing that we can do is really evangelize. We can definitely do the same thing that we've done about using Google Translate: we can go into our intranet and be the ones who are first writing about the concerns that we have about ChatGPT, about the risk to privacy, about the risk to accuracy, and try to inform people and tell them: don't use it, or use it cautiously. In the case of translation, for instance, and this might be unpopular at a localization event, my initial message is: don't use it for translation. Unless you speak the language and you're going to be able to read it and double-check it, don't use it. So all of these things will slowly but surely become part of our intranet, and this serves two purposes. One is that we write guidelines: what to expect, how to use it, for what use cases, what to consider, and what to look out for. And then it also has a secondary aspect, which is self-promotion, right? As
a small team within a 4,000-person organization, we want to get the visibility of saying, "Hey, since this relates to language, we know better than most." That is why it's so important to do things like joining this type of event. For myself and my team, we need to know more than the general population of our organization, so that under the banner of language, and of doing language right in writing, whether it's English or later on other fronts, we are perceived as the experts. I think that is self-help, and at the same time it's the right thing for the organization, in lieu of an AI-dedicated team that is looking at this. And by doing that, whenever that happens, whenever we have a dedicated team that's looking into how to use this overall for content generation, we'll be there as pioneers and early adopters, so that we can join the conversation. To be honest, in a corporate environment such as ours, that's the best that we can really hope for. Yeah, that's amazing. I'm glad to hear that that's what you're doing, and honestly I love the plan. I understand all the difficulties, but I love the direction you're taking with it: educating and bringing awareness, as Anna mentioned before. My next question is going to be to Anna. I know you already mentioned that you've just started playing with the technology, so you don't really have it set up in the company, but I think it would be interesting for us to know, since IKEA is a huge brand: you definitely cannot just go for it, you need some proof that it's efficient enough. So do you know what metrics or what tests would be sufficient in order to start implementing ChatGPT as a part of the enterprise process? Is there anything that would convince you to go and use it? If that metric existed, what would it be for you? How would
you know that now it's ready to implement? Yeah, thank you for the question, Stacy. I hope my connection is better now, but let's try. I think in our case we would definitely like to work with Innovation and Development a lot; this is the area that will be handling this type of pilots or tests, which are now being selected for this type of technology as well. I think what they are looking into is what use case, or what type of use of this technology, will bring the most value, how it will be aligned with our approach and our digital ethics, and how it fits into our culture, as I mentioned in my previous answer. When I'm thinking from my area, languages and localization, I could imagine a few cases. For now, at least for me, it is a little bit difficult to imagine the trend, how it will impact the translation part as such, but I'm thinking about different areas around it. For example, we are struggling in many cases because we have such a big amount of content, and when we're translating it there are a lot of queries coming up, and then there is quite a manual process to actually get the replies to those queries. You have to consult this person and that person, and it circles around through different departments before you can actually provide the answer, which can take a day or multiple days. Searching through this amount of content and actually providing the reply to the query could be one case, for example. But also, as mentioned, this kind of summarizing, again thinking about the vast amount of content that we have: how can we summarize it, how can we apply our simple IKEA way of communicating
and writing, and creating that language for the many people that we normally talk about? That, I think, is where tools like this can actually support us. And also, as maybe Bea and Jose mentioned, to build on that: we have quite a lot of people whose role is not content creator or writer per se. They might be subject matter experts in their area, but they are not professional writers, so that could actually be a great tool, since maybe we don't have that big a capacity of copywriters that we can hire or outsource to. So I think that's where this tool can really support productivity, and also ideas, some kind of a start for idea generation; that I could see as well. But of course, as I said, we embrace the technology, but we do it in a cautious way, as we did, for example, with the IKEA Place app, where you can place IKEA products in your room. AI is being used in that kind of context, and it is out there, but of course we did quite a lot of research, and quite a lot of development effort went into it. So I think we are on a good path, but we need to understand what that technology brings, and events like today's are very insightful as well. Yeah, absolutely, thank you. You're going straight into it. Exactly, we do need proven use cases; we need to see how it works. I love hearing that during our panel a lot of you are mentioning that ChatGPT is not just about creating content; very often it's about already having an expert and just helping them to express themselves better, which is very interesting. I think the previous panels haven't really touched on that much. So before I throw the first open question into the room, I wanted to remind our audience: please use the Q&A section, please vote for the questions you like the most, and again, we'll
start taking your questions in a minute, straight after I use my privilege of being moderator and go with the first question. I'm absolutely certain that in the next six months you'll be hearing hundreds of GPT pitches for your companies. So what real problems would you like to see addressed in these pitches that you're certainly going to hear? This is an open question, so whoever wants to tackle it first, go for it. Yeah, so actually one of the questions in the Q&A is a great answer to this. We could potentially use these large language models, or ChatGPT, to restructure, to remove ambiguity, to improve the source content, so that if we are using, for example, a custom machine translation model, once we have amassed a good translation memory and maybe good training data, we make sure that the source content is top-notch and we obtain great output with fewer resources, more efficiently, with less training. This is, I guess, one of the potentials that makes me most excited about ChatGPT: that you can potentially train it with less data than previous technology. Yeah, thank you, that was very smooth, answering the question in the chat as well. Thank you so much, definitely a very interesting perspective. I can tell that a lot of people want to know about using ChatGPT in localization and translation. We're talking about generation of content a lot, that's true, but what about using ChatGPT in localization and translation? Jose, would you like to take this one? No? Okay, you wouldn't want to. No, no, I can say it. Again, I'll just say it out loud: just don't use it. That's my take on that. It's not ready. We have machine translation; we got it to a point where it's so evolved and it's producing such good results. Don't get distracted by the fact that ChatGPT is just easier to use. Yes, it's easier to use; they've done a
great job of giving us an easy interface. Now there will be a vendor that constantly tells you, "Hey, you can bring it into your TMS," and then you will think, "Oh, maybe I should use it for that." But the technology behind it is not yet as good, and in the conversations today it's been mentioned here and there: it's not yet as good as what we have at the moment. I tried it myself. I even tried generating content in a couple of languages that I can at least write in, and there were very basic mistakes that the English counterpart doesn't make. So I would say don't use it, just don't use it yet. Maybe we'll use it in the future, but do what I always recommend, what I try to do in my professional practice all the time: wait for the experts to tell us that this is working great, and then you do it. Before that, just stick to the things that work, and right now that's not translation. If people want to use it just to get something done quickly, you take your chances, you take the risk. We'll continue to evangelize against it. But is it useful? Yes, and it's very easy to use, and you can have it on your phone, and there is no apparent limit to the number of words that you can get. But can you trust it? Not for now. Okay, I can see that it was a very polarizing question already, but that's good. Bea, Anna, do you have a different opinion, or do you more or less agree and would like to give it more time before you start using it for translation? I agree with Jose that results haven't been tested. There hasn't been time to test results at a large scale, as there has been with machine translation models. But what I would say is: well, test it for your particular case. If you have a machine learning scientist or a data scientist on your team who is keen on fine-tuning GPT for a specific content type that you have, try it and see how it works for you. I would love to do it
myself, so I am just a tiny bit more adventurous than Jose, I guess, in this. Don't get me wrong, I try everything. I just don't want people to use it who don't invest as much time as I will. I will spend tons of hours testing this out, attending this event, reading about it, and asking people who know. But if you're not going to put in that type of research, then my advice is don't do it, just stay away. I have been looking for test results from experts, and there are a few out there, but not enough to be confident that ChatGPT is substantially better than tried-and-true machine translation models. And we face the risk of, again, demonizing it the same way that we did with MT for so long. At some point someone's going to come along like, "Oh, look at the epic fail from using ChatGPT," and then that will set us back in actually trusting a technology that is good for some use cases, because it was used for the wrong use cases. So that's why I was saying that I'm worried about SMEs using it. Yeah, I love it, definitely a very mature position: wait and see where it leads us, more research. I personally love it. Anna, I cannot leave you out of this conversation. Where are you on this polarizing scale? Are you into testing right now, going full speed, or are you also more on the cautious side and would like more research first? Yeah, I would say I'm maybe a bit torn between both. From the company perspective, I agree: we need to test, we need to try it out, and, as I said, we need to see where the best value is that it can bring, so that we can actually develop it further and accept it in a more positive way. And as I said, it's probably difficult to see, for example, the use case that was mentioned where you can do automated post-editing with it. I'm wondering, if you're doing one automation and
then you are building another automation on top of it, what kind of result it will bring. But I think it's worth doing the tests and seeing the results, and possibly, if you can train it on your specific content, the results can be quite good. At the same time, I'm always thinking, now that we talk a lot about content creation, about skipping the whole translation part, where you can actually generate content in the local language already using the tool. So it's a little bit like I just want to wait and see which direction this will go. And, as we said, maybe in English it's quite good, but what about the other languages? Exactly, yeah, thank you. I just saw Konstantin; I think he didn't find the reaction for raising his hand, so he decided to just jump in, and I saw him on camera for a couple of seconds with his hands up. Konstantin, do you have a comment, a question? No, it's four minutes, the timing information. A moderator hallucination, it looked like that for a second. Yes, today I'm the hallucination. That's what I thought, 100 percent. Okay, we do have a question which I think is very interesting, and it's more of a localization question than a GPT one at this point: what is the role of the linguist in the loop in this new environment, with this new technology? Should translators stop calling themselves translators, and if yes, what should they call themselves, and could the new label bring a redefinition of the pricing model? So I guess that's the question: are we content creators, are we now rewriting something? What are your thoughts on having this linguist in the loop in this new environment? My take is that it's the additional creativity and style, the human aspect. Once you've used ChatGPT for a little while, you start seeing how very monolithic
and stiff it is in its voice and in its approach to answering questions. It's always the same structure, it's always the same thinking. It's based on a data set that's very biased to one language and one way of writing, and I guess it's also being moderated with certain guidelines. So if you want your content in a particular style, to be recognized, to be unique, then that is, I think, still the labor of humans.

Thank you, that was a very good answer. I guess we do have a couple more minutes, so I'd like to ask you: what are your thoughts on entrepreneurs and established TMS systems? Where should they focus their energy in the future? Because again, we're entering this new era of this new technology, and I guess the people who will be at the forefront are these entrepreneurs and, potentially, TMS systems. So where do you think they should focus their attention and energy to succeed?

Cool. So are we thinking about people who are in tech, from a technology entrepreneurship side, or from a language entrepreneurship side? Which is it?

Actually, that's a good question. Let's talk about both, because both are definitely involved and both are affected by this technology.

Yeah. So I think that, based on the conversation from today, from the language side of things there is definitely an opportunity in going for that market of prompt engineering, or the so much better named content engineering. That is going to be a concept that leverages this technology, and it's going to be super relevant. So for anyone who can see a business model, this is a great time to start your content engineering agency or service. That could be an individual, that could be a few people, but there's a lot of opportunity. And again, hype opens up a lot of doors, so it might also be an opportunity for investors and to get something nice going.
But then, when it comes to companies, I think that... or at least my hope, this is not what I think is going to happen, but my hope is that we're going to see applied cases of how to use this for content you already have. I really hope that I could have my own documentation from my own organization talk to me in the same way that I can talk to ChatGPT, and that it gives me insights about how to use my own suite of products. Anyone who can support that, anyone who is making strides on that, that's a conversation worth having. I think there's a lot of potential. And again, as someone said earlier, I think it was Olga: better to be a pioneer in this than to follow after.

Yeah, I agree. Does anyone have any other opinions? Okay. Konstantin, you reappeared again. You know, a moderator's worst glitching nightmare. Is it time to give you the floor? Okay, perfect. Thank you, everybody. It was a very interesting discussion. I loved all the use cases that we brought up. I loved all this conversation about how we can approach this from the governance point of view, how we can have some guidelines. That was amazing, very insightful. Definitely looking forward to exploring this technology a lot more. Thank you, everybody, and I'm going to give the floor to Konstantin, because I believe he has a surprise for everybody.

Yes, well, thank you very much. There's a surprise guest for today, for those who have endured through the three and a half hours of this event, showing incredible stamina, resilience, and perseverance in search of more knowledge and insights. I salute you and I congratulate you. This is a person who is always there when something new is cooking up, whenever the language industry is on the verge of another revolution or another discovery, whether it's a country, whether it's a piece of technology, a practice, or just new, exciting people who come up and stir things up. There's always a person who is
there, who is always there at the beginning of the process, at the beginning of all things. So I would like to welcome on stage my old friend and mentor, Renato Beninatto. Renato, throughout your career you've always been on the cusp, on the verge, on the brink of new things about to happen. And I think today, with the number of people that we've seen joining, and the incredible insights from all the panels, from Marco, and the things that we've learned today, I think something is starting. I know there is conservatism, there is some resistance to the idea, there is some anxiety, but still, the level of awareness about language technology is unprecedented. So what is going to happen to us now? What is the real takeaway from this for the people who have stayed with us all this time? What do they carry home? What is their learning from today? Is it possible to summarize? You're on mute.

Okay. Thank you so much for inviting me and for organizing this amazing, really great session. You took the initiative and ran with it, which is fantastic. And thank you for the kind words. I reciprocate: you're the same, you're always at the right spot at the right time with the right questions. I would say, as an introduction, that I have resisted talking about ChatGPT because everybody's talking about it, and I thought that it was a moment for me to shut up and listen instead of preaching from the beginning. But I think that we learned a lot: that there are a lot of things that are changing and a lot of things that are staying the same. I took some notes and I will run through them really quickly. I noticed that we have a tendency to speak in terms of absolutes when we talk about the impact of new technologies on our industry. I also felt that there was a sense of fear and uncertainty, or, as Jose said, worry. He's worried about what is
going to happen. Some of us are excited, some of us are scared, but it is clear that we're all impressed with what ChatGPT has done. Everybody in the world, not just us in the language industry: it's something that is impacting every aspect of business life. So the question that I ask myself, and I still don't have an answer, is this: it is definitely an innovation, but is it an incremental innovation, something that is going to improve the way we do things today, or is it a disruptive innovation, something that changes the way we work completely? I think it's going to be a bit of both, because one of the things that I've learned in observing our industry over the years is that we embrace everything new that comes into the industry, and these new technologies, new ideas, new processes, and new methods coexist for a long time with the ones that were in place before. So there is an element of how much time it is going to take. Marco said 100 million users in two months, a record in technology adoption, but it's not the same for everybody, and there are people who have not tried it yet. So there is this element. There is also the classic Danny Kahneman quote, that he's not too concerned about AI, he's more concerned about human stupidity. There is this element of human stupidity that we don't take into consideration when we're thinking about technology adoption, right?

So I took note of a couple of words that appeared in the comments and in the conversation, and this is a trend that I think we need to keep in mind as we adopt large language models. Ariel Baldo said that human intervention will be paramount. Jochen talked about the human in the loop. Marco said "We believe in humans," which is the tagline of Translated. So what I take from this, based on all the discussions, conversations, and impostor syndromes, as Jose and Diego mentioned, is that we as
professionals and humans in this space are going to adapt to work with it. As for whether this disrupts the model, and when we talk about a disruptive model we mean something that changes the way we do things: is it going to change how we're paid? Is it going to change how we deliver? Is it going to change how we think about the product that we make? I don't know yet, right? I think that the passionate defense that Jochen and the research panel made of developing alternative large language models was important to hear, so that the Europeans don't get overwhelmed and overtaken by the Chinese and the Americans, who lead in the research. I liked the analogy with the Autobahn, and Ariane and Marco also shared lists of alternatives to GPT, but it's clear that they're not easy to use and they're not as widely available as ChatGPT. So the thing about ChatGPT that is really transformational is that it's free and it's easy to use. Anybody can use it, anybody can access it. It's a little bit like search, right? Those of us who remember Lycos and the crawlers and eventually Yahoo and AltaVista and so on, we were amazed at how searches became essentially free, right?

And this was a thread that came through all the presentations: the issues about governance, security, privacy. These are important things. My take on this is that those are the classic "will the noise of the trains spoil the milk of the cows?" These are the topics that we talk about with every new technology that comes into the industry, and it's from a scared perspective. I liked the way Diego addressed this, and I think that the people who really need to think about these things are going to think about them and are going to solve them for us, for the world at large. Don't worry about it. It's not our problem, it's
somebody else's problem. I'm talking about this more as a translator and as an LSP, right? The clients are going to have these issues, they're going to have these constraints, and they're going to express them when they're buying services from the industry as providers. I liked very much Marco's point about the economic model of the internet being at risk. I agree with him, but this is something that is not going to happen overnight, because it makes me think of... what was the name of that thing? I forgot, my memory is disappearing. The thing that would replace walking. Yeah, yes, the original one. I forgot the name, and the thing disappeared. It didn't revolutionize; it had a lot of promise but it didn't materialize. Segway, the Segway, thank you. The last CEO of Segway died falling off a cliff on a Segway. So there are some technologies that can be very disruptive, are very innovative, are very important, but they don't get adoption. And we might find out that, with a lot of competition... I always have the poster here, the Nimdzi language landscape, to remind people that when you have 800 technologies in the translation industry, you need people to decide, to moderate, to evaluate, and nobody has a single answer to all the problems. I think it was someone who put in a comment "wait three months," and this is pretty much what I've been saying: wait, because in the next three months there is going to be a plethora of competitors, and we're going to be in the same dazed and confused state that we are in when we have to make a choice on a new language technology.

And two more points to finish. I think that it's important to keep in mind Amara's law; it's always relevant: we tend to overestimate the effect of a technology in the short run and underestimate it in the long run. So whatever our predictions are, they're probably
wrong. And last but not least, I think it was a conversation that you and I, Konstantin, had about the fact that people shouldn't be afraid to try and to start new experiments, right?

Exactly. That's why we organized the whole event: to make the ChatGPT thingy more familiar to people, so that they feel that this is the perfect storm right now to start trying it, and not give it away to the IT people.

And don't get stuck with it, right? And why not? Exactly, don't get stuck waiting for something else to happen. Just try playing, it's very simple. And this is just an anecdote: I love the fact that I can write code with ChatGPT. I asked ChatGPT to write code for me in different programming languages for stupid tasks, and it writes the code. But I'm not an IT person; I don't know where to run that code. What do I do with that? So there's no point. You can ask it to do a very professional task for you, but if you don't know what to do with it, it has zero value, right? And as for what Jose said at the end, when you're thinking about comparing it to machine translation: just don't do it. It's not what it is designed for. Try to find ways to use these large language models to improve your productivity, to take bureaucratic tasks away from you. The way that you're doing translation today, using CAT tools with machine translation integrated, is more productive at this stage than ChatGPT is. Marco mentioned something like five seconds, whereas in Google Translate you have it instantly, right? And finally, the last thing: I want to congratulate Jose and Diego Cresceri, because both of them win the award for the most interesting glasses in the whole presentation. And those were my highlights.

Perfect, Renato. So there we have it. We finally have a summary of this whole event in a short form. I expected you to summarize it in
one sentence, but it took a little more than that. But I think you covered every person speaking and highlighted...

I paid attention, yes.

While you were paying attention, I went out and I bought the domain custom gpt.com. So this is my takeaway from this event: we're not going to be just doing machine translation now, but customizing the larger Transformer models and starting projects with those buyers that have the audacity to try it right now. And not just ChatGPT: there's of course the Jurassic model from Israel, there are a number of European models, OpenGPT-X, the Finnish model, the Spanish models, the Swedish models, you name it. I think it's about time we started exploring. Today Microsoft published a paper where they have large language models evaluating the quality of human translation, so it will be not post-editing but vice versa: revised by the machine. Today OpenAI released ChatGPT 3.5 Turbo, the new model. The thing is moving really, really fast, and I think the speed of this innovation is tremendous, right? We did the right thing with this conference, and we went ahead. People were telling us, "Let's do it one month from now, we need to prepare, to get the audience." No, two weeks, let's go, let's go. There's no time.

And I think you are absolutely right in this spirit.

Yes. Where's the wave? We must ride it. And I hope that through this event we'll meet more audacious people who want to ride the wave. Anyway, Renato, thank you so much. Any final statement for today?

For being insightful and timely in organizing something that is very important, that is top of mind for everybody in the industry: kudos and congratulations to you. This is your merit. And I would like to thank you, I would like to thank all the panelists and the organizer. Without Kate, this wouldn't have been possible. I would like to thank the AP Portugal team, Gonzalo, Paulo, Mario, Casia. Thank you so much for hosting it
in a very professional way. We had no time to prepare, but you made it all work really, really fine. Thank you, and till we meet again. Bye-bye. [Music]
Info
Channel: Custom.MT
Views: 8,250
Keywords: #chatgpt, #l10n, #localization, #t9n
Id: jM91CdLXuog
Length: 229min 40sec (13780 seconds)
Published: Wed Mar 01 2023