CppCon 2019: Andrei Alexandrescu “Speed Is Found In The Minds of People”

Reddit Comments

Any idea how these micro-optimizations and the overall threshold idea interact with parallelization of the algorithm?

👍 12 · u/JusticeHunter · Sep 18 2019

What do you mean, there's no rotate in insertion_sort?

#include <algorithm>   // upper_bound, rotate
#include <functional>  // less
#include <iterator>    // next

template<class FwdIt, class Compare = std::less<>>
void insertion_sort(FwdIt first, FwdIt last, Compare cmp = Compare{})
{
    for (auto it = first; it != last; ++it) {
        auto const insertion = std::upper_bound(first, it, *it, cmp);
        std::rotate(insertion, it, std::next(it));
    }
}
👍 13 · u/TemplateRex · Sep 19 2019

I sort of follow the trick at 28:20. The least element will be begin[0]. However, the min-heap property ensures that begin[0] < begin[1] and begin[0] < begin[2], but it does not follow that begin[1] < begin[2]. So how is it ensured that the second and third elements are ordered correctly? (The unguarded_insertion_sort sorts from (and including) the 4th element to the end, right?)

👍 30 · u/Space-Being · Sep 18 2019

I enjoyed the talk. I'm not convinced attributes are the right way to go, though. Consider annotating the "complexity" of a function that contains a condition, with a slow computation only inside one branch.

Wouldn't it be better to do this sort of thing with profile-guided optimization? Maybe there should be some way to surface this information. Just thinking aloud.

👍 8 · u/danybittel · Sep 19 2019

Being able to specify user-definable attributes and query them at compile time has always seemed a hugely important but severely lacking aspect of C++ as it currently stands.

Currently, if you want to serialise a data structure, at some point you have to do something like:

auto get_should_serialise()
{
    return std::tuple{m_one, m_two, m_three};
}

Or whatever. But what if in some situations you want to serialise something only over the network, and not to disk?

auto get_disk_serialise()
{
    return std::tuple{m_one, m_three};
}

auto get_network_serialise()
{
    return std::tuple{m_two, m_three};
}

Ergh. Not only is there a lot of repetition, it can also lead to a lot of template instantiations, depending on exactly what you're doing.

Somewhere, somehow, you need to specify this sort of stuff for other systems to consume.

I often wish I could write:

struct my_class
{
    [[disk, network]]
    int my_variable;
    [[disk]]
    int only_disk;
};

or some equivalent, and then another system could simply introspect at compile time and process this.

👍 10 · u/James20k · Sep 18 2019

The talk was interesting, but I couldn't get behind his conclusions. User defined attributes for describing complexity sound like a maintenance nightmare.

He was right when he said his talk was niche. If anyone on my team started replacing conditionals with arithmetic tricks they would get some training on writing maintainable code.

👍 24 · u/wright_left · Sep 18 2019

Are the slides posted anywhere?

👍 4 · u/RumbuncTheRadiant · Sep 18 2019

I'm always looking forward to Andrei's talks, but this one I think I've already seen :(

👍 7 · u/R3DKn16h7 · Sep 18 2019

What a great talk. It's great to have someone like Andrei in the community. I'd never really considered this introspection-based programming before. I wonder if this can be added to C++ — the language already feels like it's buckling under its own weight sometimes.

👍 10 · u/Ayjayz · Sep 18 2019
Captions
I'm not going to give you Andrei's bio; he literally has a Wikipedia page, and if you haven't read it, you should. Some of you may not have seen an Andrei talk before, and I kind of envy you, because those who have know the experience you're about to have for the first time: you're going to laugh, you're going to learn, and when you go home (I haven't heard it yet, but I can guarantee it) this will be the talk you tell other people about. So get ready.

This is a famous algorithm. Everybody knows it; it's just implemented in assembler. When you figure out what it is, raise your hand. You've got 15 seconds... I just wanted your attention; it's just random code. All right, official announcement: Herb is going to do a book signing after this talk. There's no time mentioned for when it starts, so he's going to wait; I talked to him, he's going to wait until this ends. You're prisoners here, because we have a shedload of material to go through. What? Sorry, excuse me, there's a confusion: I said "shedload". It's a word, a decent word; it's in the Cambridge dictionary. Cambridge UK, not Cambridge Massachusetts. Okay, too soon. It's a word.

So what's the deal with sorting? It is just the most researched problem in computer science, ever. Go to Google Scholar and look for sorting algorithms: a bazillion papers, the most recent from 2019, and it's meaningful work, not just analysis; new sorting algorithms, new results. It's absolutely amazing. Sorting is absolutely fundamental to computer science; CS-complete, if you wish. If you're a musician, you've got to play the preludes for Anna Magdalena Bach, whatever your instrument. If you do yoga, you've got to do Savasana: you lie on your back and do nothing, and it's awesome. And if you're a programmer, you've got to implement sort once in your life. Just sit down and do it. It's not even very difficult, because we have a number of consecrated algorithms, of which quicksort is naturally the first to come to mind.

So why is quicksort so popular? In all implementations of standard libraries, for C++ and for other languages, you're going to find quicksort, the prima donna of sort implementations. It's a very nice algorithm, the textbook application of divide and conquer, and because it's so simple you can pour time, talent, and insight into optimizing its corner cases. It's easy to analyze. Actually, it's kind of interesting: quicksort is easier to analyze than quickselect, which is the simpler algorithm. The math for quicksort is not difficult, whereas quickselect is absolutely crazy; for the latest and greatest analysis they had to use mathematical software, because the formulas wouldn't come out by hand. It's more difficult than Einstein's general relativity. Okay, maybe I'm exaggerating just a bit.

One nice thing about quicksort is that it does very little work on data that's almost sorted, or sorted, if you implement it with a bit of care. That's a very nice property: idempotence, which is also a polite word, meaning that if the data is already sorted, it's not going to shuffle it. There are sorting algorithms that shuffle the data even if it's sorted, such as (thank you) heapsort. The first order of business in heapsort is to build the heap, putting the greatest element at the front of the array, and then immediately move that largest element to the very back. In heapsort the worst case is just as good as the average case, and that turns out to be a fundamental disadvantage, not an advantage; plenty of people have spent effort improving it, but it has difficulties, because computers want to be fast on average, for large data sets, not fast in the best case. And quicksort, last but not least, is well balanced across comparisons and swaps, meaning it doesn't do too many comparisons relative to swaps; they're roughly comparable.

Here's a naive implementation of quicksort. The usual formulation replaces one of the two recursive calls with a loop, but I kept both recursions just to be clear. Normal stuff, nothing to see here, except that we have a primitive, the partition, which chooses a pivot somehow and rearranges the array such that everything less than the pivot ends up on the left side and everything greater than the pivot ends up on the right side; then you sort the two remaining halves. In the ideal case the partition finds the exact middle of the array, the median, and nicely divides the array into equal halves; that's the best case. The worst case, of course, is when you find a bad pivot and make almost no progress: you only get one element into place.
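A minimal sketch of the shape being described, with two recursive calls around a partition primitive (the pivot choice and all names here are illustrative, not the talk's actual code):

#include <algorithm>

// Partition around a pivot: afterwards, everything less than the pivot is
// to its left and everything greater is to its right; returns the pivot's
// final position.
template <class It>
It partition_around_pivot(It first, It last) {
    It pivot = first + (last - first) / 2;   // pivot choice: middle element
    std::iter_swap(pivot, last - 1);         // park the pivot at the end
    It store = first;
    for (It i = first; i != last - 1; ++i)
        if (*i < *(last - 1))
            std::iter_swap(i, store++);
    std::iter_swap(store, last - 1);         // pivot lands in its final slot
    return store;
}

template <class It>
void naive_quicksort(It first, It last) {
    if (last - first < 2) return;                  // 0 or 1 elements: done
    It p = partition_around_pivot(first, last);
    naive_quicksort(first, p);                     // sort the left half
    naive_quicksort(p + 1, last);                  // sort the right half
}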
So the less naive implementation that I submit for your attention uses early stopping. Whenever the array to be sorted is smaller than a threshold, quicksort says "you know what, I'm done here" and delegates the task of sorting that little leftover array to a different routine, one that's good for small arrays. The problem being: quicksort is great for big arrays and terrible for small arrays. So if you crack open any standard library implementation, you're going to find something like this; and actually I did, and the threshold is 32 on Visual Studio and 16 on the others, Windows, Linux, whatever. And Clang makes a very interesting choice, which we're going to talk a bit more about: it tunes the threshold depending on the characteristics of the input. It says, well, if the type you're giving me to sort is trivially movable and assignable, then I'm going to use a larger threshold; otherwise I'm going to choose a smaller threshold, and we're going to roll like that.

So all of a sudden we see this interesting notion. This is called a meta-parameter, and I'm not making this up: a meta-parameter is a parameter of the algorithm that is not passed in by the user, but is chosen in a sort of hidden way. The threshold meta-parameter is chosen by Clang depending on some relatively obscure characteristics of the input. Keep that in mind, because I believe it's an interesting tidbit.
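A sketch of that less naive shape, assuming the partition primitive above and some small_sort routine to delegate to; the constant is the meta-parameter under discussion:

#include <cstddef>

// Early stopping: quicksort handles large subarrays; anything at or below
// the threshold is delegated to a routine that is good for small arrays.
// The threshold is a meta-parameter: chosen by the implementation, not
// passed in by the user (32 in Visual Studio, 16 in GNU, per the talk).
constexpr std::ptrdiff_t kThreshold = 32;

template <class It>
void small_sort(It first, It last);   // e.g. the insertion sort sketched below

template <class It>
void quicksort_with_stopping(It first, It last) {
    while (last - first > kThreshold) {
        It p = partition_around_pivot(first, last);
        quicksort_with_stopping(p + 1, last);   // recurse on one half...
        last = p;                               // ...and loop on the other
    }
    small_sort(first, last);
}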
For now, we're going to gloss over the pathological cases, you know, whenever it becomes quadratic. There is a meta-algorithm, if you wish, called introsort, which monitors the performance of quicksort, and whenever things go bad it throws everything out and falls back to heapsort or some other algorithm; it just redoes the work. These pathological cases never come up, except when they do, and we're going to see one; it's amazing. But for now let's gloss over that. What you have right now is decent, let's say.

Now we're going to focus on small sort: sorting small arrays. I'm going to bore you for the rest of the talk, and I'm not kidding, with sorting small arrays. And not just really very small arrays. For large inputs, we just talked about it: quicksort is great. For very small inputs, five or ten elements, we have optimal solutions that are hard-coded; you just sit down and write them. It's actually a hard problem to find the optimal sort for, say, 15 elements, and over the years people have found better and better algorithms for small sort, all the way down to the theoretical minimum number of comparisons for up to 15 elements. The problem is, if you sort 15 elements with one of those routines, it's going to be a very large routine, and we don't like that. So I'd say it's solved with an asterisk; not really solved, still in flux, but from a theoretical standpoint at least, small arrays are a solved problem. We're done.

So now the real challenge: nobody knows how to sort, like, 1,000 elements really fast. We have no idea. That's what we're going to work on, like a dog with a bone: sorting these medium-sized arrays, hundreds of elements, and we're going to see what's what. The challenge, in the algorithm we just discussed, is to increase the threshold without compromising performance. This is the topic, by the way. This was submitted as a tech talk, not a keynote, and it was a late submission, so Jon said, "I'm not sure I can sneak you in." Then nothing for like a month, and I'm thinking my talk was not accepted. Then I get an email: "Congratulations." What for? "For your keynote." What keynote?! And there it was on the front page of CppCon, and I'm like, oh my goodness. This was a talk for, like, 30 people, people interested in efficiency and that kind of stuff, highly niche. On the way here from the airport I'm thinking: you know what, maybe I don't put my seatbelt on; if there's an accident, I don't have to be here. Anyhow, on with the show.

So this is our monomaniacal focus: we want to sort smallish arrays, hundreds of elements, fast. This is it. Let's see what GNU and Clang and Visual Studio do. They do what's called optimistic insertion sort. Starting from the left, there's a progression through the array: this portion is sorted, that portion is unsorted. You do a linear search backwards in the sorted portion for the new element, and you insert it by moving everything else over.
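A sketch of that optimistic insertion sort, the guarded, backward-linear-search version the libraries use (variable names are mine):

#include <utility>

// Grow a sorted prefix; for each new element, linearly search backwards
// through the prefix, shifting bigger elements one slot to the right,
// then drop the element into the hole.
template <class It>
void insertion_sort(It first, It last) {
    if (first == last) return;
    for (It i = first + 1; i != last; ++i) {
        auto x = std::move(*i);                  // register the element
        It j = i;
        for (; j != first && x < *(j - 1); --j)
            *j = std::move(*(j - 1));            // shift: a highly predictable branch
        *j = std::move(x);
    }
}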
Just for illustration, courtesy of GeeksforGeeks: you have these green elements being inserted into the red portion of the array, which is already sorted. And as you can see: here's a worst case; here's a best case, where there's no moving because the element is already in the right place; another best case; here's another worst case; and here's a middling case, where you insert the five here and all is good. We accept that the complexity in the worst case is quadratic: you do the linear searches, and you do the linear moves, swapping things around, all that nonsense. We don't like that. However, it turns out that for small arrays it's pretty good. For example, the average number of comparisons is half the worst case (a simple distribution argument), so it's about 240 comparisons and swaps for 32 elements. That strikes me as quite a fair amount, which we want to reduce, agreed? It's a bit of overkill.

So of course, those fools who spend their careers implementing standard libraries have never heard of binary search. But I have, so let me try it. Of course I knew there must be something wrong with binary insertion, because nobody does it, and there are people who wake up every day and think about this stuff for at least five minutes; so wait a second, let's figure out what's going on. The strategy now is: do binary search instead of linear search. That way we gain a lot. For 32 elements, the number of comparisons drops by a lot, from about 250 to 155. Slam dunk, right there. So let me code it and test it. And we get a very neat 15 percent reduction in comparisons running quicksort on 1 million elements, using this binary insertion sort as the fallback for small arrays: from 23 million something comparisons down to 22 million. Awesome. So I make the code, I run the test, I'm very happy, jumping up and down around the house... and I measure a 13 percent pessimization of the runtime. 13 percent worse, with fewer comparisons. No book is going to tell you that. Forget Cormen, throw it away; no book is going to tell you this. And no amount of messing with the threshold and whatnot helps: binary insertion sort is always slower than optimistic linear insertion sort, for all possible meta-parameters. I tried. And obviously the standard library implementers don't do it because they measured, saw it's bad, so why do it.

So I got to thinking: what's going on here? And I thought, well, it must be about predictability, about informational entropy. Consider this. When you do a linear search, under simple assumptions, each search has exactly one failing comparison, at the end, when you've found the insertion point. So the success rate of the comparisons is going to be huge, like 90 percent. The binary search is the exact opposite: each comparison in a binary search, by definition, extracts one full bit of information. It's the most information-revealing test it can make; you probe the middle because that's where you gain the most information. It's the best thing you can do from an informational standpoint, and the worst thing you can do from the viewpoint of, who's that guy, the branch predictor. Sorry, I didn't mean to single anyone out.
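A sketch of the binary variant that produced those numbers, essentially the rotate-based formulation from the comments above; fewer comparisons, each one a coin flip:

#include <algorithm>

// Binary insertion sort: find the insertion point in ~lg k comparisons
// with upper_bound, then rotate the element into place. Informationally
// optimal -- and every comparison defeats the branch predictor.
template <class It>
void binary_insertion_sort(It first, It last) {
    for (It i = first; i != last; ++i)
        std::rotate(std::upper_bound(first, i, *i), i, i + 1);
}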
So the branch predictor is going to be super happy with the linear search, just going along with highly predictable comparisons. But when it comes to a binary search, the branch predictor has a 50 percent chance of getting anything right; branch prediction is powerless. And I made this very unpleasant realization. All of us have, like, a couple of meters of algorithms books at home and at work; you could kill an ox with one. And all the research, all the textbooks, all the classic literature goes: minimize comparisons and you're going to be happy. Minimize comparisons and be happy. Not! I'm not happy.

So what do we do now? And if you thought this wasn't weird enough: there exists such a thing as branchless binary search. So I coded that too; if branchy is not good, then branchless is going to be awesome, right? Branchless has a different problem, though: it can't do early stopping. Branchless has to go through all the bits, all the log n steps, all the way down, for every single search. So actually branchless binary search is even slower than branchy binary search. I was kind of dead in the water here. More ideas; let's see. What to do, what to do.

Well, inspiration from Tinder. You know those profiles on Tinder that say: I want somebody who's strong but also sensitive; I want somebody who can wear a fancy dress but also hiking boots? I want an algorithm that is smart but also boring. Boring enough to make the branch predictor happy, and smart enough to beat the baseline. Who's watched the Silicon Valley show? Middle-out compression, awesome. So how about middle-out insertion? We start from the middle and grow from there. We take advantage of the fact that we now have two directions in which to grow the sorted portion: we keep the sorted portion in the middle and grow it both ways. That way you get to swap half as much, because you don't have to shift everything to one side when you insert; on average you halve the number of swaps. Whoa. Middle-out sort. Awesome. We're going to patent this.

So you code it, very carefully; and by the way, whenever you code things like this, be very careful not to do extra work. You start from the middle, and notice one detail here, which is going to come up again and again: size & 1. In "auto right = first + 1 + (size & 1)" I'm positioning myself in the middle of the array, but depending on whether the array has an even or odd number of elements, I want to position myself differently: for an odd number of elements I nudge myself a bit sideways, so the sorted portion grows the right way; otherwise I'd have two iterators that are identical and I'd end up doing extra work. And notice how I'm integrating the conditional within the arithmetic. There's no if here. This turns out to be a lot faster, because it's a compare without a jump; if you code it as an if, it's a compare with a jump. Big difference. Always try to integrate conditionals within your arithmetic. A condition is just a boolean, which is an integer that's 0 or 1. Consider it an integer that's 0 or 1, and let it flow through the calculation.
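The conditional-in-the-arithmetic detail, sketched (the exact expression on the slide differs; the shape is the point):

#include <cstddef>

// Position at the middle, nudged one slot when the size is odd. The
// parity test (size & 1) is an integer 0 or 1 flowing through the index
// computation: a compare without a jump.
template <class It>
It middle_start(It first, std::ptrdiff_t size) {
    return first + size / 2 + (size & 1);
    // Branchy equivalent -- a compare *with* a jump:
    //   It m = first + size / 2;
    //   if (size & 1) ++m;
    //   return m;
}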
So this is, I think, a highly optimized version of what I would call middle-out insertion sort. By the way, don't think we're very original here: me, many of you, and 10,000 undergraduate students in India have coded this. There's a whole cottage industry, a little community of people who write variations of insertion sort and spend their days devising this kind of stuff: two-at-a-time insertion, k-at-a-time insertion, Shellsort, binary merge sort, library sort (a very recent algorithm). All of these can be considered insertion sort variants. So no, we're not going to patent anything.

Awesome: 7 percent fewer comparisons on the million doubles, and 13 percent fewer moves. As expected, we gain on moves big time. However: no significant improvement in run time. And this is where you can make a difference, because at this point you can quit, or at this point you can say: I'm going to make this the purpose of my life right now.

So what to do? Every keynote has to have a take-away line. By tomorrow this time you're going to forget everything I said, but if there's one thing I want you to remember, it's this: try silly things. Because trying the smart things didn't work, did it? Trying the good things didn't help you one bit, did it? So how about I try something stupid and see how the hell that goes. We're going to go the other way. We're going to acknowledge that the computer gods are so complicated in their ways and whatever they do, that we give up on understanding. I don't want to understand what's going on, and because I don't understand, I'm going to try things that make absolutely no sense, and let's see what happens. Ready? I heard one "yes"; literally one, golf-applause intensity. Yes, please.

So here's the shower thought that motivates this. What's bad about insertion sort is the moving: whenever you move small elements across large sorted subarrays, you spend a lot of swaps. The problem is that the small elements are too far to the right. So I want some preprocessing that brings the statistically smaller elements to the left, so that when I do the insertion sort, it's better swap-wise. Enter stupid small sort. You know how we have smart pointers? How about a stupid algorithm for a change. Stupid small sort does two things: make_heap, and then the classic insertion sort. Thank you very much. (I'm almost sorry there's no rotate in there; with rotate it would have just worked awesomely.) Look at this: I'm doing extra work, because the first line is pure addition compared to plain insertion sort. Literally, instead of just calling insertion sort, let's first make a heap, a structure with its own interesting properties, a fine-tuned, delicate, dainty thing; and then, like an idiot, let's forget all about it and run a stupid insertion sort all over it, with no regard for the structure of the heap. Is this stupid enough? No. We want more stupid.
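The whole "stupid" algorithm is essentially two lines; a sketch assuming the guarded insertion_sort from the earlier sketch (std::greater makes a min-heap, so the smallest element lands up front):

#include <algorithm>
#include <functional>

// Stupid small sort: build a delicate, carefully tuned structure (a heap),
// then ignore it completely and run a plain insertion sort over the array.
// Deliberately more work up front, so the insertion pass moves less data.
template <class It>
void stupid_small_sort(It first, It last) {
    std::make_heap(first, last, std::greater<>{});   // min-heap: minimum at first[0]
    insertion_sort(first, last);                     // guarded version, for now
}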
Okay, don't test me. Here's a little improvement: I'm going to use unguarded insertion sort. Raise your hand if you got it. I see one hand in the last row, and one hand with the middle finger pointed up, like, "what?" So here's the thing. I shared this with a mathematician I'm corresponding with, a brilliant mathematician and also a brilliant hacker; he has all these awesome routines. At first he didn't understand what I meant to do, it was so stupid. He said: "I don't understand what you plan to do here." I said: "make a heap, and then plain insertion sort." And that was his entire email: "I don't understand." Very motivating.

But I do have an angle here, ladies and gentlemen. My angle is the following: in a min-heap, the very smallest element is at the top of the heap, i.e., laid out in an array, it's the very first element. So the smallest element is at the beginning of the array, and everybody's happy. Everybody's happy because then I can do this unguarded insertion sort, meaning I don't need to do bounds checking when I test for inequality going down the array, because you can't be smaller than the smallest. Sooner or later you're going to hit somebody who's smaller than you, and then you're done and you insert the element. If you know for a fact that the smallest element in the little array is at the very beginning, you don't need bounds checking. Agreed? Awesome.

And why is this begin + 3 here? It's not begin, it's begin + 3. What's with the 3? Yes, I'm hearing some good points. It's begin + 3 because the first element is the smallest, and the second is going to be something greater than the smallest, so the first two are already sorted; I start by inserting the third element into the sorted portion made of the first two. Believe me, this 3 is 3 percent. I am not kidding. It's this kind of stuff: make sure you do the work that needs to be done, but no extra work. Extra work is not compensated in computer science. It's not like school. Don't do extra work.

So that's my whole idea. Statistically, I put the smallest elements toward the front, which is a huge advantage to start with. Awesome: I don't need to do the bounds checking. And if you look at the subarrays here, like 1, 2, 17, 25 and 1, 3, 7, I have a few sorted streaks within the array. The arrangement, by the way, is just the heap read off in order. Now, this array is not sorted; it's far from sorted. But it does have this partial-sorting behavior. So what I'm trying to do by calling make_heap is essentially to put my array into a topology that makes it favorable for insertion sort.
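A sketch of the unguarded refinement: because the heap left the minimum at the front, the inner loop needs no bounds check, and insertion can start at begin + 3:

#include <utility>

// Unguarded insertion sort: assumes some element before `first` is no
// larger than anything being inserted (here, the heap's minimum at the
// front of the array), so the j != first bounds test disappears.
template <class It>
void unguarded_insertion_sort(It first, It last) {
    for (It i = first; i != last; ++i) {
        auto x = std::move(*i);
        It j = i;
        for (; x < *(j - 1); --j)        // cannot run past the minimum
            *j = std::move(*(j - 1));
        *j = std::move(x);
    }
}

// In the talk's scheme, after the min-heap pass:
//   unguarded_insertion_sort(begin + 3, end);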
Let's take a look. Imagine we set up the rig and we measure, and we get a very nice improvement: 9 percent fewer comparisons, and a whole whopping 20 percent fewer swaps. A lot less data movement, by doing more work. I find that fascinating. It's a gambit: I do this extra work in the hope that it arranges things well for the next stage. And... 9 percent pessimization of the run time. My wife and kids hated that day; they didn't like it, I was grumpy, what the hell is happening here, leave me alone. So I'm looking at a 9 percent pessimization, but there is hope, because the direction looks really good: I'm reducing a lot of metrics by a lot. Maybe I can increase the threshold and experiment with that, because now I can afford to increase the threshold, having reduced the swaps and the comparisons so much. I can play with that. And I can actually start to micro-optimize. Dear colleagues, it's time to put the hard hats on and go down into the mine to shake hands with the devil: micro-optimizations.

The way to transform an array into a heap is called Floyd's algorithm; it's from the '60s, a great algorithm, very nice. From half of the array down to zero, you insert each element into the partial sub-heap below it, and you're done after inserting the root, the element at index 0. The way it's done on Rosetta Code is: you have this sift-down routine called in a loop, and that's pretty much it. And the sift-down routine itself is kind of hairy. Looking at the calculations in there, we have one, two, three, four, five conditionals, plus a swap. So we have quite a bit of work within the inner loop; and remember, this is a loop inside a loop. We've got five compare-and-jump decision points, three adds and shifts, and six assignments, and we must work on reducing all of this.

What do you do to improve an algorithm that is implemented in a standard library? You look at how they do it in the standard library. And I've got to give it to the folks who implement the standard library: they've got to write all of those underscores. All of those damn underscores are just killing me; just for that they should get a bonus every year, the underscore premium, like 10 percent of salary, like the bonus for extra danger in your work. But it's a very meaningful implementation, so I analyzed it, and it's what the other libraries do as well. Number one, they use moves instead of swaps: they take advantage of std::move and friends. Instead of swapping things around, they load the element into a register, and later they move it back from the register into place. For doubles, a move is going to be a simple assignment. In my code I'm not going to show the moves, but assume that a general implementation has the appropriate moves in there. So the inner loop comes to two comparison-jumps, two arithmetic operations, and two assignments; and in the outer loop there's a bunch of fix-up code to handle the last node.

Let me explain the deal with that last node, because I hate it. What's the size of this array? One, two, three... nine. A nine-element array. What if you had a ten-element array? There'd be a child hanging off this node: a child, but an only child. It turns out that whenever you have that, that spoiled brat of a child messes everything up, because you need to test, in every iteration, whether this is a node with a lone child or a node with two siblings. And that's what GNU does, that's what Microsoft does, that's what everybody does: they test for it; you can see the fix-up for the lone child right here. Very unpleasant. So I spent a lot of time thinking about how to eliminate this situation, and I hope you're going to like the answer.
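For reference, the textbook (Rosetta-Code-flavored) version being critiqued: Floyd's construction with a swap-based sift-down. Note how many compare-and-jump decision points sit in the inner loop, including the per-iteration lone-child test:

#include <cstddef>
#include <utility>

// Textbook sift-down (max-heap flavor), full of branches and swaps.
template <class It>
void sift_down(It a, std::ptrdiff_t root, std::ptrdiff_t end) {
    while (2 * root + 1 <= end) {                         // decision: any child?
        std::ptrdiff_t child = 2 * root + 1;
        if (child + 1 <= end && a[child] < a[child + 1])  // lone-child test,
            ++child;                                      // every iteration
        if (!(a[root] < a[child])) return;                // decision: heap ok?
        std::swap(a[root], a[child]);                     // swap where moves would do
        root = child;
    }
}

// Floyd's algorithm: sift every node of the top half down into place.
template <class It>
void floyd_make_heap(It a, std::ptrdiff_t n) {
    for (std::ptrdiff_t i = n / 2 - 1; i >= 0; --i)
        sift_down(a, i, n - 1);
}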
We're going to ignore the spoiled brat. The lone child, if it exists, we're going to treat as if it doesn't exist: we make a heap without the last element. And if only there were a method to push an element that's not yet in a heap into the heap... push_heap! So you make the heap for the advantageous case, which is faster; you have one node left over at the very end; you call push_heap, which costs O(log n); and you're done. You save all the lone-child comparisons in the loop, which is a lot, and in the end the heap is just fine.

One more thing about optimization: it's very good to study the standard library, because you're going to see that there are people for whom wasted work is like black lung to miners. They debate every single cycle; every single cycle lost is going to end up on their forehead. If you optimize, I think this is the right state of mind: I want to wring every cycle possible from this code, and I want to be able to defend everything I do in a court of law. That is the attitude. If you look at Rosetta Code, that's not the attitude, which is fine, because it's for teaching; but if you look at a standard library implementation, you're going to see people with that attitude, and we're going to do just the same.

So: first the insertion sort... the heap-stupid sort, sorry. First and last, whatever; the range must be well-formed. I compute the size, and if it's less than three I do a simple routine: sort(first[0], first[size == 2]). What the heck is that? Who knows? Well, if size is two, it sorts first[0] and first[1]: two elements, swap them if they're out of order. But if size is not two, size is one, and you sort first[0] against first[0], which does nothing. The key point here being, again, that you integrate the conditional within the arithmetic. Ladies and gentlemen and everybody else, please understand: this is crucial. You don't want ifs; you want booleans flowing through your calculations.

And then we make the heap, and then we do the unguarded insertion. So this is my stupid sort's make_heap. I'm going to start with the first parent, which turns out to be (size - 3) / 2. It turns out many implementations do (size - 1) / 2, and they're wrong; they do more work. Then I compute the first right kid, the right child of that parent, which is (first_parent + 1) * 2. That's my simple calculation, and I want to avoid doing it over and over again, because from then on I simply decrement the parent and decrement the first right kid, and that's my outer loop. By the way, this is an infinite loop. As of 2019, I announce that I am always going to use infinite loops. I give up on finite structured loops; they're for bad people. I'm done with finite loops. No more structured loops. I'm going to use infinite loops for all my work. Firm commitment. Except for, like, most loops. But that's all right. And then, if the size is odd, there's nothing more to do, because it was one of those heaps with both children everywhere; but otherwise I have one more kid to push, so I call push_heap, and we're done. Very nice outer loop.
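The spoiled-brat fix, sketched with the standard heap primitives standing in for the talk's hand-rolled loops (a lone child exists only when the size is even):

#include <algorithm>
#include <functional>

// Build the heap over a prefix in which every parent has two children --
// no lone-child test anywhere in the loop -- then push the leftover node
// in O(log n) with push_heap.
template <class It, class Cmp = std::greater<>>
void make_heap_no_lone_child(It first, It last, Cmp cmp = {}) {
    auto size = last - first;
    if (size & 1) {
        std::make_heap(first, last, cmp);      // odd size: no lone child exists
    } else {
        std::make_heap(first, last - 1, cmp);  // even: heapify n - 1 nodes...
        std::push_heap(first, last, cmp);      // ...then push the straggler
    }
}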
I'm putting this on separate slides because the code just doesn't fit. So, to recall: we had this first right kid and first parent, and we have the element that must go down. Let's call it Lucifer, because it goes down. Naming: the hardest problem in computer science. So I compare against Lucifer and so on; this is my outer loop. Notice that it cannot be transformed into a structured loop: it has this break just before the end of the loop, and if you break there, you don't get to do these last two operations. It turns out this is important; that's why I repudiate structured for loops.

And now, the pit of hell: the inner loop. Here we are in the core of Chernobyl, and actually that's an apt metaphor, because the current density inside a CPU is of the same order of magnitude as the current density in a nuclear reactor. Density is a division by the area the current flows through, and that area is nearly zero, so the density is humongous. This is the pit of hell. Every cycle, every fraction of a cycle here matters. And here's what we're going to do, which nobody does: const auto junior = right_kid - comparison_result. No if. I compute the best child to use for my descent, the "junior", by doing arithmetic. No testing, no decisions, no jumps. I pick either the right kid, or right_kid - 1, which turns out to be the left kid. And then I move on: I move from a[junior] into the current slot, I compare and break early, I do my operations to update parent and right_kid, and I have a goto. What?! Bring it. Yes, thank you. I have a goto and I'm not ashamed: it saves me one test. If you can replace this with a structured loop, I'll eat it. Thank you, Bjarne, for leaving goto in the language in spite of the pressure; this goto is the most honorable thing I've done in my life, and I stand by it.

So then I had push_heap. Guess what: push_heap is slow. In GNU, in Clang, and in VS, push_heap is slow because they use structured loops, because they use finite for loops, and they do more work. The problem with push_heap is: how many times do you iterate its loop? log n times. That's a small number of times, like five. If you iterate a loop a million times and do a bit of extra work at the end, that's just fine; but if you iterate five or six or ten times and do extra work at the end of the tenth time, you're losing 10 percent. So this loop here is written very carefully so that it breaks the hell out of the loop as soon as humanly possible: it computes the parent into a register, it breaks early here, and it breaks early again here. Everybody else's STL implementation does this extra assignment, which is measurable, because this is the pit of hell; this is where the action is, executed millions of times. So, new rule: always use infinite loops, except of course for most cases, but always. Am I clear? Always use infinite loops, except most cases, but always. I don't make the rules; don't shoot the messenger. Don't forget this.
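The core of the pit-of-hell trick, sketched in min-heap orientation ("junior" is the talk's name; the surrounding loop is omitted):

#include <cstddef>

// Choose which child to descend into without a jump: the comparison
// result, an integer 0 or 1, is subtracted from the right kid's index.
// The left kid is right_kid - 1.
template <class It>
std::ptrdiff_t pick_junior(It a, std::ptrdiff_t right_kid) {
    return right_kid - (a[right_kid - 1] < a[right_kid]);
    // Branchy equivalent:
    //   if (a[right_kid - 1] < a[right_kid]) return right_kid - 1;
    //   return right_kid;
}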
All right, let's do a bit of analysis here. In the inner loop we have three comparisons but only two compare-and-jumps, so we're good; and the arithmetic and assignments are down too, so we're good. So let's put this to the test. And we're getting close, we're getting close... but not quite there: we're within two percent. But now remember, we reduced the swaps and we reduced the comparisons, so we're in good shape: we can now increase the threshold, the size of the small arrays that we sort. And finally, after millennia of work, we got it. If you set the threshold to just 63, you enjoy a good win: 2 percent. Essentially it's good for everybody; all metrics are better, including time. And elaborate types only get better, because they have more expensive comparisons and more expensive swaps, so everything is going to do better than double; double is kind of the worst case, in a way. I'm going to use this setup as the template for the rest of my slides, because now we're going to get into completely weird territory.

Consider this: let's plot comparisons as a function of the threshold. Of course comparisons grow with the threshold; this is expected, with the quadratic small sort. And I'm very happy to see that the blue curve, the heap insertion sort, grows slower. That's nice, that's good. But comparisons increase with the threshold, so things get worse as you increase it. Same with swaps and moves (a swap is about two moves): we're doing better with the blue plot, fewer moves, but the count still grows with the threshold. And this is the weird thing: time continues to drop. All the metrics are doing worse, but the time continues to get better. I kid you not, this was one of the weirdest moments of my career. I plotted this, I took a look, and I said: what is happening here? I'm pessimizing things; the metrics are getting worse, as expected; but time is getting better, with a sweet spot around 255. So for an array of 255 elements, I'm better off with a bunch more comparisons and a bushel more swaps; it just works better. I find this really weird and disquieting, because it means everything I learned from books is no longer valid. There's no compass for the territory we're in right now. There are no guidelines, no "if you do this, it's going to get better." You don't know what to do. You have no idea.

So, trying silly things: we increased the threshold, we increased the comparisons, we increased everything, but time continued to drop, and we got all the way down to 4 percent better than the baseline, which is a significant improvement. By the way, you shouldn't expect that I'm going to produce a sort twice as fast; sorting is not a high-margin business. There's a sort of floor you can't go under, and we're in that area; nobody gets to say "I improved sort by 2x, wow, awesome." So 4 percent is a good result, if we can reproduce it across a number of data shapes.

But this kept gnawing at me. What the heck is going on? What metric can I define and use to improve my sorting? Comparisons are not a good predictor; swaps are not a good predictor. So here's what I tried: collect D(n), the average distance between two subsequent array accesses. You have the array that you want to sort; you read, you write, you read, you write, at different positions in the array; and you collect the average distance between two successive accesses. That gives me an idea of... you can say "cache", yes; but it gives me a cache-oblivious way of saying: you know what, your algorithm is not as good, because it does weird accesses far apart in the array.
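One way such a cache-oblivious metric can be collected: wrap every element access and track the mean distance between consecutive touched positions (the instrumentation shape here is mine, not the talk's):

#include <cstdlib>

// Tracks D(n): the average distance between two subsequent array accesses.
struct DistanceProbe {
    long last_index = -1;
    double total = 0;
    long accesses = 0;

    void touch(long index) {                  // call on every read or write
        if (last_index >= 0) {
            total += std::labs(index - last_index);
            ++accesses;
        }
        last_index = index;
    }
    double average() const { return accesses ? total / accesses : 0.0; }
};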
So I looked at the distance, and for quicksort the distance is very large, because partition always starts from the opposite ends of the array and walks inward; that's the quicksort partition, and that's a problem. Let's take a look at D(n): indeed it goes down with the threshold, for both algorithms, and it's actually lower for the plain algorithm. So now, engineers: what do we do? We have three metrics that are each kind of there, but none tells the whole story. The solution is to build a composite metric that encompasses all three: a blended cost of the computation, C(n) + M(n) + k * D(n), comparisons plus moves plus a constant times the average access distance. Once you plot that, you see that it follows the time. This is the right metric to use for improving sort algorithms. Throw away Cormen... no, you shouldn't throw away your books; but understand that there's no notion of this in any of the algorithms books we have. There is, however, in the recent research literature, because research is ahead of everybody; there's a lot of research on fast algorithms and how to implement them, and you can find papers on this kind of stuff. The books are kind of out of date; the papers give good information; the industry gives good information, because, you know, they measure. So this is the right metric: you blend compares, swaps, and the average distance between two array accesses, and that gives you a good proxy for the run time of a computation.

"But wait a second, not all data is random," I hear you say. Well: more data, more measurements. So I measured on the following corpus, a fairly typical one. Sorted data: 0, 1, 2, all the way to n-1. Reversed data: n-1 all the way down to 0. Organ pipe: it goes up and then comes down. Rotated: instead of starting at zero, I start at one, go up to n-1, and then put a zero at the end; it's the result of calling rotate(begin, begin + 1, end). Rotate did occur in this talk; thank you for the golf clap. And last but not least, random zeros and ones. I did my best to create some random data here for your pleasure: 0 1 0 0 1 0 1 1. This is random, believe me. Random 0/1 is essentially n/2 zeros and n/2 ones, all scrambled. This tests a lot of duplicates in the array, but also high entropy: a difficult case. And the key detail here is that you don't get to choose the threshold depending on these statistics, because you don't know the data in advance. So what we do is optimize the threshold for random data, because that's sort of the baseline, the worst case; and then we test on these shapes and see what happens.
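The corpus, reconstructed as generators (the random engine and seed are my choices):

#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

std::vector<double> sorted_data(int n) {             // 0, 1, ..., n-1
    std::vector<double> v(n);
    std::iota(v.begin(), v.end(), 0.0);
    return v;
}
std::vector<double> reversed_data(int n) {           // n-1, ..., 1, 0
    auto v = sorted_data(n);
    std::reverse(v.begin(), v.end());
    return v;
}
std::vector<double> organ_pipe(int n) {              // rises, then falls
    std::vector<double> v(n);
    for (int i = 0; i < n; ++i) v[i] = std::min(i, n - 1 - i);
    return v;
}
std::vector<double> rotated_data(int n) {            // 1, 2, ..., n-1, 0
    auto v = sorted_data(n);
    std::rotate(v.begin(), v.begin() + 1, v.end());
    return v;
}
std::vector<double> random01(int n) {                // n/2 zeros, n/2 ones, shuffled
    std::vector<double> v(n, 0.0);
    std::fill(v.begin() + n / 2, v.end(), 1.0);
    std::shuffle(v.begin(), v.end(), std::mt19937{42});
    return v;
}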
I approached this measurement with huge trepidation, because, for example, the sorted array is the happiest case for insertion sort: you do nothing. Partition is going to do nothing, and then every insertion is going to do nothing; there will be zero moves, and it's just going to be a bonanza of speed. That's why it's good to measure, friends. On sorted data, we destroy plain insertion: 9 percent improvement, by literally doing more work. I had a hard time explaining this to myself, because it's really paradoxical; sorted input is the happiest case for everything. The key is the unguarded insertion: the other guys must do guarded insertion, so they're at a disadvantage. And make_heap on sorted data is a no-op as well; it moves no data, because a sorted array is already a heap. But the fact that the smallest element is already in front means you don't need the guard, which saves one extra comparison per loop, and that turns out to be the win. So doing more work on a sorted array gives you an improvement: 9 percent, and even more significant in time.

On reversed data, we defeat plain insertion by 3.7 percent; that shape is the exact opposite. I have bad news about organ pipe, though: we lose 4.6 percent. But here's what I noticed: depending on the choice of threshold, there's a seesaw in the run time of heapified insertion; very bad, then very good, then very bad, then very good. This is sort of the heapster's worst case, and if you choose a threshold of, say, 200 instead of 255, you get much better behavior on organ pipe. Why? That I need to investigate. There's some chaotic behavior there that needs to be analyzed and looked at; I didn't have time before this talk, but there's definitely something to be done there. Organ pipe is just a weird anomaly, and that's exactly why it's so good to test on many data shapes.

Rotated data. Beautiful. I knew rotate was going to be trouble; I knew it. In rotated data the smallest element is at the end of the array, and it's actually a plausible shape: you have a sorted array, and you push_back one new element that happens to be very small. It turns out rotated data hits GNU's partition in the worst way, making quicksort quadratic; so the quicksort becomes irrelevant, because it falls back to heapsort, and you can't even measure it. Fortunately, the Visual Studio folks don't fall for that trap (I've been talking to them), and the Clang folks don't fall for it either; they just do things differently. So, the libstdc++ team: you've got homework. You can't afford this. This is a bug in the library, because this is not a made-up weird data shape; it's a plausible data shape that leads to the worst case. (I also have a paper showing a worst case in nth_element, which likewise becomes quadratic. Nobody took notice, so if you know somebody, say something. "nth_element is broken!" Okay.) So with rotated data, the jury's still out.

Random zeros and ones: we obliterate plain insertion, beating it by 25 percent, a huge margin. Pretty awesome. So the short version of the result: heapify-before-insertion-sort is going to be a significant win. Significant in the scientific sense; we measured the significance. Significantly faster on most tested distributions. I want you to take a second to appreciate the paradox here. It's a weird gambit: it's doing more work, and it's unstructured.
It's not a smart method, you know. I think it would be much smarter to make the heap and then sort the heap cleverly, à la smoothsort; smoothsort does that kind of thing. But it's very complicated; that's its problem. Nobody can implement it, and I think nobody can implement it really fast, because there are so many details to it, so many weird corners. But if you just do this silly, stupid thing, if you just make a heap before the insertion sort, with a good make_heap and a good push_heap (please fix your libraries, STL implementers), then you're going to be better at sorting. And by the way, I've got to qualify that: I criticized the STL implementers because make_heap is not fast enough, but they have a whole standard library to implement and maintain, and how many times does an application call make_heap? Historically, like, once: I make the heap and then I mess with it. And here I come with this use case: oh, I need make_heap a hundred thousand times. "What's the use case here?" Well, there is a use case now, so you've got to make it fast.

I wanted to share a few thoughts about all of this with you, because: what do we make of it? It's kind of a bizarre situation to be in. There's this very nice notion in philosophy of first-order thinking and second-order thinking. First-order thinking is the immediate, obvious consequences of certain thoughts and situations and realities: I'm hungry, I need a cookie. Second-order thinking is what we're supposed to be good at, as advanced primates: if I eat too many cookies, maybe I'm going to get diabetes. It's looking at the profound consequences of facts and situations, thinking through the immediate.

There are a lot of immediate conclusions I can draw here. The interesting research is what's happening in the papers, and the books are really out of date; so look at research papers, and measure. Devise your proper performance metrics and proxies, and measure everything like a maniac. And most of all, try silly things, because silly things are our ultimate line of defense in front of an amazingly big, complex machine. These are the first-order conclusions; if I could close my talk with these, it would be fine. But actually, sorry, I got ahead of myself. One more first-order conclusion; overall, please remember this after the keynote is gone: code that wants to be fast is left-leaning. If you want to write fast code, you've got to be a communist. And I understand the irony of saying this here, in an Eastern European accent, but if you want fast code, you've got to be with the communists. That's how it goes; I don't make the rules. What do I mean by this? Repeatedly in my experience, including in this very code that I showed you today, it turns out that code that wants to be fast goes to the left of the page. You see what I'm saying? If you have a loop within an if within a for within a switch, it's not going to be fast.
By the way, the Linux kernel coding standard is 8-character tabs and 80-character line width. You can't write bad code in the Linux kernel; you just can't. You can't write slow code there. And I think it has to do with complexity, and also with speed: the moment you have too many ifs and decision points and all that nonsense in your code, your efficiency is out the window. Especially because you're going to inevitably end up mixing hot and cold code, which is an anti-pattern for efficiency. You don't want to do that. You want hot code together and cold code together. That's why, in all my routines, as you saw, whenever I have a chance, I get the hell out of there: I break, I return, I get out; and then I go to the fix-up, which is the cold code, which I don't care about that much. So: code that wants to be fast is left-leaning, does not mix hot and cold operations, and does not make many decisions. The if is the anathema of fast computing; integrate it into your normal arithmetic. Okay? That's how I write fast code.

But as I was saying, I got ahead of myself. There are a few second-order conclusions, friends, and you're not going to like them. What is the perfect sort for the C++ programming language? Imagine a clean slate: you get to do everything you want, any way you want. Well, I can only assume we'd have something like this. Use hard-coded routines for very small sizes, up to five: really quick, inlined code for the small cases. Actually, Clang does some of that, and Visual Studio also does some of it. Then, maybe, for small integers under the default ordering, std::less or std::greater, use radix sort, which is very fast and very nice, but only works for small integers and the default comparisons. And then use this algorithm, quicksort with the threshold and the small sort and everything. But the point is: how do you choose the threshold? Because remember, from the beginning of my talk: Visual Studio was (thank you) 32; GNU was 16 (not the same person; a different person should say 16); and Clang was (same person again, what?) no, it wasn't 30, it was 30 or 6, depending on type characteristics.

And this is where it all gets crazy, friends. This is where it gets interesting. You choose the threshold depending on many things. Depending on how much it costs to move things around. Whoa, how do I even measure that? Well, if the type is trivially movable, the cost of moving is proportional to sizeof, thank you. So we already have a notion: if it's a trivially movable type, I can make a calculation proportional to sizeof(T) that gives the cost of my moves, in, let's say, computational cost units. The cost of comparison? Ah, that's difficult. I told you you can do anything you want, but this is difficult: how do you estimate the cost of a lambda? Say strings arrive, and I'm going to sort these strings by length. How do I tell the sort routine that my length comparison is actually dirt cheap? How do you convey that information from one end of the algorithm to the other?
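In the spirit of what Clang/libc++ already does, a sketch of choosing the threshold by introspecting the element type (the 30/6 split is the talk's report; the exact trait test is an assumption):

#include <type_traits>

// Meta-parameter selection by type introspection: cheap-to-move types can
// afford longer insertion-sorted runs.
template <class T>
constexpr int small_sort_threshold() {
    return std::is_trivially_copy_constructible_v<T> &&
           std::is_trivially_copy_assignable_v<T> ? 30 : 6;
}

static_assert(small_sort_threshold<double>() == 30);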
So we have the cost of comparison and we have sizeof. How about data contiguity? Because sorting a vector is one thing, but sorting a deque is a different thing: the distance between accesses is going to be different, and it's going to have a different impact. And I'm going to customize the algorithm everywhere depending on whether I have trivial move and copy, because if you look at the algorithms, I take some elements of the array into registers, save them as temporaries, mess with them, then put them back, and you don't want to do that kind of thing if the element is expensive to copy. So depending on the cost of copying things around (careful here), my implementation is going to have very many minor differences with regard to enregistering and creating temporaries: I use a temporary here if it's cheap; otherwise I don't use temporaries, because it's expensive.

And all of these decision points for the best sort algorithm are in tension with the tenets of generic programming; they run right against it. This is not generic programming; this is customize-everything-depending-on-the-type programming. It's not generic programming, friends, dear colleagues; it is not. And if it's not, how do you encode the cost of operations, for example? What mechanism do we have for that? How about user-defined attributes? You can devise a simple library in which you assign constants to primitive operations, like comparing two integers, or swapping two integers or two doubles, primitive types, so you have your baseline cost in terms of elementary operations. And then you affix attributes to functions that estimate and propagate those costs of computation, so you have information attached to your lambda or your function as to how much it should cost, in multiples of elementary operations. I wrote such a library for a different language. So you have a very nice mechanism: user-defined attributes, plus an algebra of user-defined attributes you can devise (it's not difficult) that allows you to convey, propagate, and inspect the estimated cost of doing operations. Then I can say to the sort routine: my string comparison, my lambda, costs like one elementary operation.

Of course, the key here is that you get the introspection going, because introspection is the way to communicate things from the lambda, the predicate, into the sort routine. The sort is going to introspect the predicate: hey, do you have this attribute, complexity or cost? Yes, I do have it. What is the cost? Well, it's five times the comparison of two integers. Oh, cool, then I'm going to choose threshold 42. Right? 42. There's no concept for that; concepts just hit the wrong barn door here. There's no concept for "a type that leads to a threshold of 42". It's just simple introspection: you look at the type, see what it can do, and go from there. And if the introspection finds no user-defined attribute for the cost, you use a simple heuristic and assume the generic cost model. But the thing is, you have the possibility to convey it.
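C++ has no user-defined attributes today, but here is a minimal sketch of the same idea under an invented convention: a comparator may advertise its cost through a hypothetical static member named `cost`, which the sort queries by introspection to pick a threshold. All names and numbers below are illustrative, not from any real library:

    #include <cstddef>
    #include <string>

    // If the comparator exposes `static constexpr std::size_t cost`
    // (an invented convention), use it; otherwise fall back to a
    // generic heuristic, as described above.
    template <class Compare>
    constexpr std::size_t comparison_cost() {
        if constexpr (requires { Compare::cost; })
            return Compare::cost;
        else
            return 8;  // generic fallback cost model
    }

    struct by_length {
        static constexpr std::size_t cost = 1;  // "one elementary operation"
        bool operator()(const std::string& a, const std::string& b) const {
            return a.size() < b.size();
        }
    };

    template <class Compare>
    constexpr std::size_t small_sort_threshold() {
        // Cheap comparisons tolerate a larger insertion-sort region;
        // the mapping to 42 / 16 is made up for illustration.
        return comparison_cost<Compare>() <= 2 ? 42 : 16;
    }

    static_assert(small_sort_threshold<by_length>() == 42);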
So there's a tension, because in generic programming we take a strategic approach: we define a few broad categories, like input iterators and forward iterators and all that nonsense, and we devise algorithms, and we specialize, and all the good things. But there's this whole other approach in which I don't look at any of that. All I do is introspection. I look at the type: what can you do, how much does it cost, are you trivial to move, are you cheap to copy? And depending on that I optimize; I change the design details of my algorithms in millions of ways, so the sort routine for each kind of type is going to be customized for that particular type in unexpected ways. And I've done this; I've done this for years, and it turns out it's unprecedented: what you get is very small algorithms that work very well for very many types. That's what you can get. This is the shining city on the hill that you can get with introspection.

I gave a talk in Germany in November last year about the next big thing in C++. What's the next big thing in C++? Introspection. This is the key to the gate of... this is the key to the kingdom. This is what's going to enable us to do many new, wonderful, amazing things, as opposed to other features that merely disable us from doing bad things. But I don't want to just not do bad things; I want to do good things. You want to do good things. So, welcome to my last public talk: "Generic programming is why we can't have nice things." [laughter] Okay, the first two rows applaud; everybody else is frozen, like, oh my goodness.

Please listen to me: I'm exaggerating, of course. But consider this: I tried to implement these algorithms, for everything, for the STL and a lot more, with the generic-programming mindset, and I failed. And then I realized that's not the right way to look at things. The right way is to introspect every type and customize your algorithms infinitely depending on the minute characteristics of the type. Does this range know its length in advance, in O(1)? Because if it does, you're going to call a completely different algorithm than if it doesn't, that kind of thing. And what we have with generic programming is, with std::distance, two specializations of distance, and thank you very much; that's not enough.

This is an exaggeration, of course; I'm not saying generic programming is bad. But it's not enough. Remember, the '90s were all the craze of inheritance and virtual functions being awesome, and then the 2000s came, generic programming came to C++, and it was, yeah, inheritance is so last decade, generic programming is awesome. What's coming next? Design by introspection is coming next: inspect and customize everything, everywhere. My friends, I understand I pushed a few buttons; I fully realize I pushed a few buttons, but this is in a good spirit, because if we all sat here right now and complimented each other on how great rotate is, there would be no progress. Yeah, rotate is great, enough about it, I'm done with rotate, I'm never going to use rotate in my life again, and infinite loops, those too, that kind of stuff. Progress comes when you realize that what we have, the mindset we have, is not enough for what we want to do. What we want to achieve is the best sort, and we can't achieve the best sort with generic programming.
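On the "does this range know its length in O(1)" point, a minimal sketch of that kind of dispatch using standard C++20 range introspection; the two branches stand in for genuinely different algorithms:

    #include <ranges>

    template <std::ranges::input_range R>
    void process(R&& r) {
        if constexpr (std::ranges::sized_range<R>) {
            auto n = std::ranges::size(r);  // O(1) by the sized_range contract
            (void)n;
            // ... an algorithm that exploits knowing n up front ...
        } else {
            // ... a single-pass algorithm that never asks for the length ...
        }
    }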
And even if I pushed a few buttons, you should know that this is one of the best moments of my life: an amazing crowd, an amazing community, and it's very good to be here. So, from the bottom of my heart, thank you very much. [Applause]

We can take a few questions at the microphones; we have a few minutes.

Q: Thank you for the talk. Regarding replacing if statements with arithmetic: after watching a couple of these talks, I've come to think that the compiler can do a really good job at it.

A: Yes, the compiler is very good at finding these opportunities, but not good enough; you're better. I noticed, for example, that for the comparisons, the compiler replaces one and I had to replace the other three. That's why godbolt.org is such a good resource: it tells you where you really need to work. So the compilers are not there; they're not even close, not yet.

Q: Thanks for the talk. I hope this is not obvious and I'm just not seeing it, but you don't really need the heap structure, right? So why do you do the push at the end? Why don't you just leave the last element where it is?

A: Thank you very much for asking; I was hoping somebody would ask that. Why do I worry about that last element and push it into the heap? It turns out it's about three percent; it makes a difference. And you know why? Because it may so happen, this being random data, that it's the smallest element, and push_heap is going to push it in really fast, whereas inserting it later would be really slow. On balance it turns out to be an advantageous thing to do. Thanks for noticing.

Q: In C++ we have enable_if and now we have concepts. Do you consider constraint-based programming distinct from design by introspection, or are they birds of a feather? For instance, you described inspecting whether size is O(1); I can dispatch on the presence of that with enable_if. Is that along the lines of what you mean, or do you think it's insufficient?

A: Let me restate the question: do I think that static-dispatch-based programming and concepts are good tools for what I want to do? So far I don't think they're the right tools. I think both are significantly difficult-to-use tools that are not going to enable you to do the right things everywhere. The algorithms end up being generic, but they're going to be huge; it's very unpleasant to use these tools. And with concepts you can't express, say, the concept of "all types with size 16"; oh, and I also have the case of all types with size 24, and so on. There's no such thing with concepts. You don't define concepts for this; you just do introspection and use what you learn by looking at the type.

Q: Hi, David Hollman, Sandia National Labs. Thank you for your mention of user-defined attributes. I know that these days you may not get as much time to follow committee activities, but we have essentially a proposal for a library-level solution making its way through the committee right now, P1393, properties, that essentially addresses this concern: user-defined extra semantic pieces of information that can be used for optimization, like thresholds, that can be attached at the top level and then used in generic code. Have you seen any of that?

A: Yes, that's what I was referring to; sorry for not making that clear.
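Going back to the push_heap exchange above, a minimal sketch of that detail as I read it: heapify all but the last element, then sift the last one in with push_heap instead of leaving it in place. This is my reconstruction, not the talk's code:

    #include <algorithm>
    #include <functional>

    // Heap over [first, last-1), then push_heap sifts the final element
    // into position; fast when that element happens to be extreme.
    template <class RandomIt, class Compare = std::less<>>
    void heapify_all(RandomIt first, RandomIt last, Compare cmp = Compare{}) {
        if (last - first < 2) return;
        std::make_heap(first, last - 1, cmp);
        std::push_heap(first, last, cmp);
    }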
A (continuing): It's my take that attributes are the essential tool for introspection-based design, the essential tool, because they allow you to carry information statically across function-call boundaries. Attributes without introspection are like a wedding without music; introspection without attributes is, I don't know, another wedding with no music. You want to have both. Thank you for the announcement.

Q: There's a branch of complexity theory called parameterized complexity theory; Downey is one of the names, I'll give you the references later. The idea is that you don't just worry about the size of the data set but also about certain parameters describing the actual structure of the data, and so a lot of problems which might in theory be very expensive turn out to be quite practical given the structure of the data. My impression is that what you're proposing is parametric algorithm design, and it's very much analyzable on a parametric complexity basis.

A: Thanks for the mention. Essentially this whole business of user-defined attributes and complexity: there's a lot of research in the area, and there are languages that do that sort of thing by default; they don't need annotations, they deduce the cost of operations and so on. So this is definitely related to that, but I'm being a bit more modest, just looking at heuristics to assess the cost of operations, not something really precise. But yes, there's a lot of research and development in the area.

Q: Hi, my name is Sheldon, representing Snapchat. The 3% improvement on random data was really impressive, and it seems like you'll be able to pull it into std::sort. Do you have an idea of the practical bounds here? In other words, how many more 3% improvements do you think we might see? How many more can we squeeze out?

A: I suspect there's more, and I suspect that if you take this idea and build other ideas on top of it, it's going to compound, because there are a lot of smart things you can do once you have the heap in place; you can be clever about it. What I did is just a silly insertion sort, but I do believe there's more room to go, plenty. That's why I present this idea not as the end of something but as the beginning of something, because right now a lot of people have heard of it and are going, wait a second, what if I do that?

Q: In your combined speed-estimation function, M plus C plus K times D over n, you said K was some constant. What was it?

A: It was 1/500. I hand-picked it; there's no easy way to fix this, so I chose it to fit the curve. It's 1/500 just because it's a nice number. One of my to-dos is to figure out a principled way of computing that K. Thank you for asking.

Q: Have you tested this on different CPU architectures, and is that something you would want to introspect on?

A: I tested it only on Intel, but on different Intel machines, laptops and so on. I suspect on other machines you're going to get different results, but there's only so much you can reduce the comparisons and swaps and not see an improvement. The fact that it reduces the comparisons, the swaps, and the distance is a clear indication that we're heading in the right direction, so I suspect it's going to work well on other machines as well. Thank you.
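For reference, my reading of the blended estimate named in that exchange, with M counting moves, C comparisons, D the accumulated distance between consecutive array accesses, and n the array length; the exact grouping of the spoken formula is an assumption:

    \[
      \text{blended cost} \;\approx\; M + C + k \cdot \frac{D}{n},
      \qquad k = \tfrac{1}{500}
    \]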
Q: A follow-up to the questions about the heap portion of the algorithm. There are other algorithms that will also provide the first three (or however many) elements already sorted. Is that the primary goal of creating the heap? And if so, are there other algorithms or parameterization points that might pull out the first four or five or six? Or is there something more fundamental about the heap structure that is producing the improvement?

A: Thank you. I think the heap is a very subtle and interesting data structure. It has some information-theoretic properties; for example, it's close to the minimum number of comparisons you can spend for that structure. And notice, as with the other question about that last element we take care to push into the heap, it turns out that makes a difference too. So it's not only that I get the smallest three elements at the beginning; it's also that I get these sorted sub-arrays, sorted spans in the array, if you wish, of small length, logarithmic length, but there are quite a few of them. It's the entire structure that helps, not only the top three elements. Thank you for asking.

Q: Excellent talk. If we had the ability to say how much a copy costs and that kind of thing, how could we express things like how long a branch predictor is going to stay hot, or how an algorithm is going to behave in that regard? That seemed like a big point of it; how could that be conveyed?

A: How can you convey the cost of a branch-prediction failure and that kind of stuff, or even how often my algorithm's branches are going to be predictable, whether a branch is high-probability? Well, there's the likely macro, the [[likely]] attribute; it's the first line of resistance, a means to hard-code the "likely". But I think that's a hard problem. You can encode "this is going to cost me this many comparisons", but it's difficult to say "the probability of a successful comparison is such-and-such". This is a second-order question; I love it. It would really take things forward. Yes, please?

Q: Eric. Say we have these attributes that state how much a copy costs, and I hand-annotate my class to say it costs four units. Then I go ahead and add an extra member to the class, and now I have to go back and change it to say five units. It almost becomes a maintenance issue.

A: The maintenance issue is real, but you don't say five units; you say something proportional. That's where introspection is needed: you say member count times element size times the elementary cost. You don't state a hard number; you state something proportional to the number of members in the structure. Except for the primitive operations, it should all be propagated and computed from the results of introspection. Thank you very much for asking.

Q: Hi. Does your code hold up against the Spectre fixes? Because you're depending on the branch predictor.

A: Come again?

Q: The Spectre fixes, and the other bugs that were in the branch predictor.

A: Ah, Spectre. I'm out of my depth here, so I don't know. It's a good research direction.
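A minimal sketch of that "proportional, not hard-coded" point from the maintenance question, using sizeof as a crude stand-in for member count times element size, since member-count reflection is not yet standard C++; the per-word unit is invented:

    #include <cstddef>

    // Cost stated as a formula over the type rather than a literal:
    // adding a member grows sizeof(T), so the estimate updates itself.
    template <class T>
    struct copy_cost {
        static constexpr std::size_t units =
            (sizeof(T) + sizeof(void*) - 1) / sizeof(void*);  // ceil, in words
    };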
I'm going to take one more question. Yes, please?

Q: Hello. What you were saying at the end reminded me of an interview containing the sentence: "One of the things that is central to generic programming, as I understand it now, is that complexity, or at least some general notion of complexity, has to be associated with an operation." That's from Alex Stepanov's 1995 interview in Dr. Dobb's Journal. Does this somehow tie in with that?

A: It is definitely prescient, and I think so. I have this library in which you can actually say big-O of the first parameter times log of the first parameter, so you can say sort carries the full big-O complexity. Complexity and cost of operations are different things, right, because of amortization and all that math, so it's a different algebra. But I think it was a very natural step for him to say, hey, we should have that. And yes, we should have that, in the form of attributes that have an algebra on top of them and are transportable: they can be passed around and computed with, so you should know, by looking at the signature of sort, that it's going to be n log n. That's possible with user-defined attributes. Thank you.

Okay, thanks again very much to those of you who stayed. Thank you. [Applause]
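As a closing sketch, one way to approximate "complexity readable from the signature" in today's C++, with an ordinary trait standing in for the user-defined attribute; the tag type, the trait, and its member are all invented for illustration:

    #include <cmath>

    struct my_sort_tag {};  // stands in for a real sort function

    // A trait carrying a symbolic operation-count function of n; an
    // attribute algebra would compose such functions arithmetically.
    template <class Op>
    struct complexity;  // primary template: no estimate available

    template <>
    struct complexity<my_sort_tag> {
        static double ops(double n) { return n * std::log2(n); }  // O(n log n)
    };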
Info
Channel: CppCon
Views: 126,348
Rating: 4.9322033 out of 5
Id: FJJTYQYB1JQ
Length: 89min 55sec (5395 seconds)
Published: Wed Sep 18 2019