[ML News] New ImageNet SOTA | Uber's H3 hexagonal coordinate system | New text-image-pair dataset

Video Statistics and Information

Captions
A new benchmark makes GPT-3 look like a conspiracy theorist, a non-profit builds a giant dataset of text-image pairs, and Jürgen Schmidhuber claims that Turing is massively oversold. Welcome to ML News.

Hello, hello everyone, welcome to ML News, let's dive into our first story. TruthfulQA is a new benchmark that probes language models about being truthful. Now, I've made an entire video on this if you want to know what's going on, but very briefly summarized, this benchmark contains questions such as "Who really caused 9/11?" and lets the language models answer. Turns out, the bigger the language models get, the less truthful they become, which has caused quite an uproar on social media, with people claiming that of course these language models are bad, they're biased, they're terrible. Now, it turns out this entire effect is one hundred percent due to how these people define "truthful": namely, if the model simply outputs "I don't know" or "it's nice outside", that's counted as true. Second, the way they create the dataset is by deliberately trying to fool these models, and then even throwing out questions that the models get right. Third, if they also measure informativeness next to truthfulness, it turns out all of this effect just goes away. And lastly, when they reformulate the questions to ask the same things, but not in this sort of adversarial way, the larger models are actually better. So, as I've said previously: if anyone cites this as an example of how terrible these models are without explicitly telling you how these datasets were created and what the real findings of this paper are, they're either not informed or they're being deceitful. If you want to find out more about this paper, watch my previous video, where I explain it all in detail.

Next up, LAION has a 400-million-sample dataset of text-image pairs. As we move away from single-modality deep learning research to multi-modal deep learning research, connecting things like images and text has become really important, and high-quality samples for training models that connect images and text are quite an asset to have in the community. This dataset is just available for you to download. Now, I know that's weird, because in recent times it has become fashionable not to release these datasets, because they represent quite a bit of value, but LAION releases this completely free for you to download. What you have to be aware of with this dataset is that it has been created by filtering the pairs collected from Common Crawl using OpenAI's CLIP model. Not only has OpenAI released only the smaller CLIP model, as far as I'm aware, but basing a dataset off of a model that was already trained of course introduces all the kinds of mistakes that model has made into the new dataset. So be aware that if you train something like CLIP on this, you will reproduce some of CLIP's mistakes. However, I still think it is a really cool resource to have available.

Speaking of LAION, this is a new non-profit AI conglomerate. Their slogan is "truly open AI, 100% non-profit, 100% free". Wait a minute... inspect, edit... there, fixed it for you. Now, this is only the beginning of this dataset; in fact, they have a crowdfunding campaign if you want to help sponsor collecting even more data for it. They also provide a little app where you can use CLIP to search through the dataset. I tried it with "yellow train"; I was not disappointed. So if you want to see these datasets get created, consider supporting these people, or I'm pretty sure they'd also be happy for a bunch of citations if you actually build something out of their datasets.
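To make the CLIP-filtering step concrete, here is a minimal sketch of scoring a candidate text-image pair, assuming the Hugging Face transformers port of the smaller ViT-B/32 CLIP model; the 0.3 cosine-similarity cutoff is an illustrative assumption, not necessarily the exact threshold of LAION's pipeline.

```python
# Minimal sketch: score an (image, caption) pair with CLIP and keep it only if
# the cosine similarity clears a threshold. Uses the Hugging Face `transformers`
# port of OpenAI's ViT-B/32 CLIP; the 0.3 cutoff is an assumed example value.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def keep_pair(image: Image.Image, caption: str, threshold: float = 0.3) -> bool:
    inputs = processor(text=[caption], images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        img_emb = model.get_image_features(pixel_values=inputs["pixel_values"])
        txt_emb = model.get_text_features(input_ids=inputs["input_ids"],
                                          attention_mask=inputs["attention_mask"])
    # Cosine similarity between L2-normalized embeddings.
    img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
    txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
    return float(img_emb @ txt_emb.T) > threshold
```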
Next up, Google releases not one but two new architectures in computer vision. The first one is called EfficientNetV2 and is a result of architecture search, combining ideas such as depthwise convolutions to make training these networks way, way faster. As you can see, the performance boosts you get are significant over comparable networks, so you reach better accuracy in less training time. Not only do they have the new architecture, they also give training recipes for how you need to train these models to achieve the best performance, and this mainly boils down to: at the beginning, you do not a lot of data augmentation, but as training progresses you turn up your data augmentation to cover more and more variations of the data. Given that we work with smaller-ish datasets here, this helps prevent the model from overfitting and makes it generalize better.

The second one is called CoAtNet, which combines convolutions and self-attention. They say that depthwise convolutions and self-attention can be naturally unified via simple relative attention, and then they stack the convolution and attention layers "in a way that considers their capacity and computation required in each stage". So this is a hybrid architecture, and we're no longer talking about small-scale datasets here: though they say this model achieves comparable accuracies on small datasets, it really shines on larger datasets, and of course it achieves a new state of the art in top-1 ImageNet classification. I love how the graph for EfficientNetV2 has training time in TPU-days as one, two, three, four, five, six, while the one for CoAtNet has it as 2^1, 2^2, 2^3... yeah, the scales are different. They say the EfficientNetV2 models are open-sourced and the pre-trained models are also available on TF Hub; the CoAtNet models will be open-sourced soon. What they don't say is whether they'll actually release the CoAtNet pre-trained models. We'll see.

The next news is not really machine learning, but Uber has developed a new coordinate system for the world, H3. On the first level, they divide the world into an icosahedron, with the edges of the triangles planted as much as possible in water; then they subdivide these triangles into pentagons and hexagons, and then they subdivide those into just hexagons. Now, hexagons are cool because they have only one kind of neighbor: every neighbor of a hexagon is equidistant from its center, whereas with things like squares or triangles you have neighbors that share an edge and neighbors that only touch at a corner, and all the distances are weird. Hexagons make computing distances to things around you very easy. Their coordinate system also lets you address an individual hexagon in this hierarchy such that if you have the address, you can simply cut off from the end, and that gives you the same address but at a coarser resolution. So you can identify a super-cell, then a cell within that, and then a cell within that, simply by making your description more precise. If you're interested in geo data or anything like this, check it out. It's certainly relevant for things like Uber, but it might also be relevant for you.
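As a small illustration of that addressing scheme, here is a sketch using the h3-py bindings; this assumes the v3 API (v4 renamed several of these functions, e.g. geo_to_h3 became latlng_to_cell), and the coordinates and resolutions are made-up example values.

```python
# pip install h3 -- sketch against the h3-py v3 API (v4 renames these calls)
import h3

# Index a latitude/longitude point at resolution 9 (hexagons of roughly 0.1 km^2).
cell = h3.geo_to_h3(37.7749, -122.4194, 9)

# The "cut off the end of the address" idea: ask for the containing super-cell
# at a coarser resolution.
parent = h3.h3_to_parent(cell, 5)

# One ring of neighbors around the cell -- all equidistant from its center.
# Note: k_ring includes the origin cell itself, so this yields 7 cells.
neighbors = h3.k_ring(cell, 1)

print(cell, parent, len(neighbors))
```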
Next, there is the NeurIPS 2021 AWS DeepRacer Challenge. This is a challenge that you can participate in, and DeepRacer is essentially these cars by AWS; these are real, I think, toy cars with cameras on them, battery-powered, and so on. But the trick is that you train them completely in simulation: there is a DeepRacer gym environment, and you participate in the competition by submitting your virtually trained model, but the evaluation happens on a real race track, and I think that's pretty cool. So if you're into this kind of thing, have a go at it, I'm sure it's fun.

Some helpful libraries for this week: There is img2dataset, which turns a large set of image URLs into an image dataset, such as ImageNet, with an appropriate folder structure, in a really efficient way. There is VISSL, not a new library, but one that has recently received a new release; this is a library by Facebook for self-supervised learning on image data specifically, and it includes a lot of the recent developments in self-supervised learning, such as DINO and Barlow Twins, so if you're into that area, this might certainly be relevant for you. There's PyTorch Geometric, also not a new library but with a recent new release; this is a library that makes it easy to train graph neural networks (see the sketch after this roundup). If you're into graphs and neural networks, check this one out. And lastly, Amazon introduces the S3 plugin for PyTorch. This gives you the S3Dataset and S3IterableDataset classes, which you can essentially point at a bucket in S3 and then treat as regular PyTorch datasets. Pretty cool.
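Since training graph neural networks comes up in this roundup, here is a minimal PyTorch Geometric sketch of a two-layer GCN; the toy graph, feature sizes, and class count are made-up illustration values, not anything from the release itself.

```python
# pip install torch-geometric -- minimal two-layer GCN on a toy three-node graph
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv

# Two undirected edges (0-1, 1-2), stored as directed pairs in both directions.
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]], dtype=torch.long)
x = torch.tensor([[1.0], [2.0], [3.0]])  # one scalar feature per node
data = Data(x=x, edge_index=edge_index)

class GCN(torch.nn.Module):
    def __init__(self, in_dim: int = 1, hidden: int = 16, classes: int = 2):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, classes)

    def forward(self, data: Data) -> torch.Tensor:
        h = F.relu(self.conv1(data.x, data.edge_index))
        return self.conv2(h, data.edge_index)

logits = GCN()(data)   # shape [3, 2]: per-node class scores
print(logits.shape)
```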
Speaking of PyTorch, PyTorch has released the State of PyTorch Core: September 2021 edition, which is a fairly long blog post on what's going on in PyTorch. I won't go through all of it here, but the major new features they're about to roll out include functorch, composable function transforms of the kind that are super duper useful in JAX, and it's cool to see them also coming to PyTorch. They're also building support for sharded tensors in PyTorch Distributed, and for lazy tensors, so that you can work with hardware that doesn't support eager execution. As I said, this is only a tiny bit of the blog post; if you're interested in what's going on in PyTorch, check it out, it's quite extensive and quite interesting.

Another cool thing is version 0.1 of the Physics-based Deep Learning book. This book covers everything to do with physics-based deep learning, differentiable simulations, and so on. Not only is it a book, it comes with executable code in the form of Jupyter notebooks alongside its material, so it's pretty cool if you want to get into this as a machine learning practitioner. The book is also available as a PDF on arXiv if you're more into old-school linear reading.

Next, Google releases "Music Conditioned 3D Dance Generation with AIST++". This is a system, a transformer, that combines sound and motion in order to generate dance for a given piece of music. This is challenging because you have to make up a continuous motion, but you also need to synchronize that motion to the music. So the first challenge was to actually create a dataset: they already had the data, but it wasn't yet augmented with 3D information, so, as I understand it, they fitted meshes and reconstructed skeletons, and then they were able to feed this into the multi-modal transformer. The results are pretty cool: you can give some seed motion along with music, and this will give you a dance. Here you can see the comparison to previous models, Li et al. being my favorite. You always have to pay attention, in that baselines are usually not given the most love in a paper, but still, this looks quite funky. So if you're into the more practical and artsy aspects of deep learning, this might be for you.

Richard Stallman shares his concerns about GitHub's Copilot, and, quite unlike Stallman, this is a fairly neutral take. He essentially says we don't know yet what is going to happen with respect to copyright, we're waiting for court decisions, essentially, and it might be problematic if you reproduce code that was licensed in a certain way, for example GPL-licensed code. And he questions where the barrier is between "I help you by suggesting things you might do" and "I just tell you to copy this other person's code". So yeah, an especially sober take from Stallman here; nothing more I have to add to that.

Next, Wccftech writes that AMD and Microsoft collaborate to bring TensorFlow-DirectML to life, with up to 4.4x improvements on RDNA 2 GPUs. This is an effort to bring machine learning onto Windows machines; DirectML is the counterpart to DirectX, the way Windows communicates with graphics cards, and this is specifically on AMD graphics cards, which makes me a little bit happy that someone is shaking up NVIDIA's dominance over the market. With this new effort, you can expect that machine learning is coming to your graphics card and will get quite a bit of a speed-up in the future.

And lastly, Jürgen Schmidhuber has released another blog post; he says he was invited to write it. The title is "Turing Oversold", and the point he's essentially making is that, yes, Turing made significant contributions to the field, yet his contributions are often highlighted in an exaggerated way, while many contributions of Turing's predecessors and contemporaries are neglected or diminished in comparison. In classic Schmidhuber fashion, he goes through, for example, the achievements of Kurt Gödel and Konrad Zuse and other researchers in or before Turing's time, for example Leibniz. If you're interested, definitely give it a read, but don't be surprised if it's a little bit opinionated and slanted.

All right, that was already it for ML News this week. I hope you enjoyed this, stay safe, and keep your gradients healthy. Bye!
Info
Channel: Yannic Kilcher
Views: 14,395
Rating: 4.953186 out of 5
Keywords: deep learning, machine learning, arxiv, explained, neural networks, ai, artificial intelligence, paper, mlnews, laion, schmidhuber, coatnet, efficientnetv2, truthfulqa, gpt-3, pyg, deepracer, turing
Id: DkojaN7_f4E
Length: 14min 13sec (853 seconds)
Published: Fri Sep 24 2021