What is BERT? - Whiteboard Friday

Captions
Hey, Moz fans, welcome to another edition of Whiteboard Friday. Today we're talking about all things BERT, and I'm super excited to attempt to really break this down for everyone. I don't claim to be a BERT expert; I've just done lots and lots of research and have been able to interview some experts in the field, and my goal is to be a catalyst for this information to be a little bit easier to understand.

There's a ton of commotion going on right now in the industry about how you can't optimize for BERT. While that is absolutely true, you cannot, you just need to be writing really good content for your users, I still think many of us got into this space because we're curious by nature. So if you're curious to learn a little bit more about BERT, to be able to explain it a little bit better to clients, or to have better conversations around the context of BERT, then I hope you enjoy this video. If not, and this isn't for you, that's fine too.

The first thing I want to mention is that I was able to sit down with Allyson Ettinger, a natural language processing researcher and professor at the University of Chicago. When I spoke with her, the main takeaway was that it's very, very important not to over-hype BERT. There's a lot of commotion going on right now, but BERT is still far away from understanding language and context in the same way that we humans can. So it's important to keep in mind that we don't over-emphasize what this model can do, but it's still really exciting and a pretty monumental moment in NLP and machine learning. Without further ado, let's jump right in.

I wanted to give everyone the wider context of where BERT came from and where it's going. A lot of the time these announcements are bombs dropped on the industry: it's essentially a single still frame from a movie, and we don't get the full before-and-after, we just get that one frame. We got the BERT announcement, but let's go back in time a little bit.

Traditionally, computers have had an impossible time understanding language. They can store text, we can enter text, but understanding language has always been incredibly difficult for them. Along came natural language processing (NLP), the field in which researchers developed specific models to solve for various types of language understanding. A couple of examples are named entity recognition, classification, sentiment, and question answering. All of these things have traditionally been solved by individual NLP models, so it looks a little bit like your kitchen: the individual models are like the utensils you use, each with a very specific task that it does very well. Then along came BERT, and it was sort of the be-all and end-all of kitchen utensils: the one utensil that does ten-plus natural language processing tasks really, really well after it's fine-tuned. That's a really exciting differentiation in the space, and that's why people got so excited about it: no longer do they need all of these one-off models, they can use BERT to solve for all of this stuff, which makes sense given that Google would incorporate it into their algorithm. Super, super exciting.

Where is this heading? Allyson said, "I think we'll be heading on the same trajectory for a while, building bigger and better variants of BERT that are stronger in the ways that BERT is strong, and probably with the same fundamental limitations."
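To make the "one utensil, many tasks" idea above a little more concrete, here is a minimal sketch of reusing pre-trained transformer models for several NLP tasks. It assumes the Hugging Face transformers library; the task pipelines download default fine-tuned checkpoints, and none of this is the exact setup Google uses in Search.

```python
# Minimal sketch: one pre-trained language-model family, many NLP tasks.
# Assumes the Hugging Face "transformers" library is installed
# (pip install transformers); default checkpoints download on first use.
from transformers import pipeline

sentiment = pipeline("sentiment-analysis")   # classification / sentiment
ner = pipeline("ner")                        # named entity recognition
qa = pipeline("question-answering")          # question answering

print(sentiment("Whiteboard Friday makes BERT much easier to understand."))
print(ner("Allyson Ettinger is a professor at the University of Chicago."))
print(qa(
    question="Where does Allyson Ettinger teach?",
    context="Allyson Ettinger is an NLP researcher at the University of Chicago.",
))
```

Each of these pipelines follows roughly the same recipe: a pre-trained language representation plus a small task-specific head fine-tuned on labeled data, which is the differentiation described above.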
There are already tons of different versions of BERT out there, we're going to continue to see more and more of them, and it will be interesting to see where this space is headed.

All right, so how did BERT get so smart? I find this stuff fascinating; it's quite amazing that Google was able to do this. Google took Wikipedia text and a lot of money's worth of computational power, TPUs put together in a v3 pod (a huge computer system that can power these models), and used an unsupervised neural network. What's interesting about how it learns and gets smarter is that it takes any arbitrary length of text, which is good because language is quite arbitrary in the way we speak and in its length, and encodes it into a vector: a fixed string of numbers that helps translate the text for the machine. This happens in a really wild n-dimensional space that we can't even really imagine, but it places context and different parts of our language close together in the same areas. Similar to Word2Vec, it uses a trick called masking: it takes the sentences it's training on, masks a word, and uses its bidirectional model to look at the words before and after the mask to predict what the masked word is. It does this over and over and over again until it's extremely powerful, and then it can be fine-tuned to do all of these natural language processing tasks. Really, really exciting, and a fun time to be in this space.

In a nutshell, BERT is the first deeply bidirectional (all that means is that it looks at the words before and after entities and context), unsupervised language representation, pre-trained on Wikipedia. It's this really beautiful pre-trained model that can be used in all sorts of ways.

So what are some things BERT cannot do? Allyson Ettinger wrote a really great research paper on what BERT can't do (there is a Bitly link you can use to go directly to it), and the most surprising takeaway from her research was the area of negation diagnostics, meaning that BERT isn't very good at understanding negation. For example, when given "A robin is a ___," it predicted "bird," which is right, that's great. But when given "A robin is not a ___," it also predicted "bird." So in cases where BERT hasn't seen negation examples or context, it will still have a hard time understanding that. There are a ton more really interesting takeaways; I highly suggest you check it out, really good stuff.
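As a rough illustration of the masked-word prediction and the robin example described above, here is a minimal sketch assuming the Hugging Face transformers library and the public bert-base-uncased checkpoint; the exact completions and scores will vary by model and version.

```python
# Minimal sketch of BERT-style masked-word prediction ("fill-mask"),
# assuming the Hugging Face "transformers" library and the public
# "bert-base-uncased" checkpoint.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# BERT looks at the words before and after [MASK] to predict the missing word.
for prompt in ["a robin is a [MASK].", "a robin is not a [MASK]."]:
    top = fill_mask(prompt)[0]  # highest-scoring prediction
    print(f"{prompt!r} -> {top['token_str']} (score {top['score']:.3f})")

# Both prompts tend to surface "bird"-like completions, which is the
# negation weakness highlighted in Ettinger's diagnostics.
```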
Now, finally, how do you optimize for BERT? Again, you can't. The only way to improve your website with this update is to write really great content for your users and fulfill the intent they're seeking. But one thing I have to mention, because I honestly cannot get it out of my head, is a keynote by Jeff Dean (we'll link to it) where he speaks about BERT and goes into natural questions and natural question understanding. The big takeaway for me was this example: say someone asks the question, "Can you make and receive calls in airplane mode?" The block of text in which Google's natural language understanding layer is trying to find the answer is a ton of words, very technical and hard to understand. With layers leveraging things like BERT, they were able to just answer "no" out of all of this very complex, long, confusing language. That's really, really powerful in our space. Consider things like featured snippets, consider general SERP features; this can start to have a huge impact, so I think it's important to have a pulse on where it's all heading and what's going on in this field.

I really hope you enjoyed this version of Whiteboard Friday. Please let me know if you have any questions or comments down below, and I look forward to seeing you all again next time. Thanks so much.
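For readers who want to experiment with the kind of question answering described in the airplane-mode example, here is a minimal sketch assuming the Hugging Face transformers library and a publicly available BERT checkpoint fine-tuned on SQuAD 2.0. The checkpoint name and context passage are assumptions for illustration, this is not the system Google uses in Search, and an extractive model returns a supporting span rather than a literal yes/no answer.

```python
# Minimal sketch of extractive question answering over a block of text,
# assuming the Hugging Face "transformers" library. The checkpoint name is
# an assumption (a public BERT model fine-tuned on SQuAD 2.0), not the
# system described in the keynote.
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/bert-base-cased-squad2")

context = (
    "Airplane mode turns off the phone's cellular radio. While it is "
    "enabled, voice calls cannot be made or received over the cellular "
    "network, although some devices allow Wi-Fi calling if Wi-Fi is "
    "re-enabled."
)

result = qa(
    question="Can you make and receive calls in airplane mode?",
    context=context,
)

# Extractive QA returns a supporting span plus a confidence score rather
# than a literal yes/no answer.
print(result["answer"], round(result["score"], 3))
```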
Info
Channel: Moz
Views: 2,412
Rating: 4.3777776 out of 5
Id: Owo36iI6hLA
Length: 10min 0sec (600 seconds)
Published: Thu Dec 17 2020