Realtime Data Pipelines with Elixir GenStage - Peter Hastie

Video Statistics and Information

Captions
[Applause] Thank you. Yes, so I'm going to be talking today about Elixir's GenStage module and putting it in the context of data processing. I'm Peter Hastie; I'm sillypog on Twitter and GitHub and everywhere else. I've been working at Bleacher Report for about four years in a variety of roles: I started as a front-end developer, and after about a year I transitioned to the back end, originally on the data engineering team and more recently on consumer services. You can read a lot about my exploits on our company blog, which has a lot of great insight from a lot of developers on the team. I have really only been working with Elixir since last year, which I'm sure has people wondering why I'm up here giving a talk, but the module I'm going to be talking about has also only been in the language since last year, so there are no experts here, except James, who may or may not be in the room; he actually is an expert on it.

GenStage is a new Elixir-specific OTP behaviour. Unlike something like GenServer, which comes from the Erlang world, this one was built purely in Elixir, which is a nice win for the language. It was created by José Valim and James Fish, they currently maintain it, and anyone can add it to their project through the mix file as a dependency. That was a little awkward when I started writing this presentation: GenStage was expected to become part of the main language, but it grew so big that they decided to keep it out as a separate package, so I am presenting on someone else's library, one that I have not contributed to, yet. The good side of this is the implication that anybody can put an OTP behaviour up on Hex, and I don't know how everyone feels about that; exciting times.

GenStage was designed to allow you to exchange events with back-pressure between Elixir processes. That's a phrase I've lifted straight from the Announcing GenStage blog post that came out last year, which is a really fascinating read. It gives a lot of great insight into why they created this module, and even though a lot of that post is now the documentation for the module, I think it's still worth reading; it's got a great video at the end where they build a data pipeline, so if this talk isn't very good you can watch a good version of someone doing the same thing.

But I want to unpick that phrase a little bit: what is back-pressure, and why do you need it? Normally we don't have to worry about this. When you're working on data collections in Elixir, a lot of the time you'll be working with the Enum module or the Stream module, and these both make use of synchronous functions: as data moves through the pipeline you're building by piping functions together, each piece of data is fully processed before the next one starts, so the back-pressure is given to you, built into the system.
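For illustration, here is a trivial pipeline written both ways, eagerly with Enum and lazily with Stream; the numbers and steps are made up, but the point is that the work is synchronous either way, so back-pressure comes for free:

    # With Enum each step runs over the whole collection before the next step starts;
    # with Stream each element is pulled through every step in turn, one at a time.
    # Either way nothing runs ahead of the code consuming the results.
    1..10 |> Enum.map(&(&1 * 2)) |> Enum.filter(&(rem(&1, 3) == 0)) |> Enum.sum()
    1..10 |> Stream.map(&(&1 * 2)) |> Stream.filter(&(rem(&1, 3) == 0)) |> Enum.sum()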
However, once you switch over to GenStage, which is built on top of processes, you're in the asynchronous world and you do have to start thinking about back-pressure. If you don't have it, what can happen is that a very active process produces messages and sends them to a second process much faster than that second process is able to deal with them; for example, one process is reading files off disk and the other has to process them, decrypt them, something like that. As those messages come in they fill up the consumer's mailbox, and there isn't really a bound on that. Prior to Erlang 19 the consumer's mailbox could grow so large that it would take up all of the memory on the machine and bring down the entire BEAM, which is probably not what you had in mind when you were planning to "let it crash". That has been alleviated a little now: there is a process flag called max_heap_size, and as processes hit that limit you can kill them, but if that happens you're going to lose all of the messages that were in the mailbox, so there's still a penalty.

In contrast, the model GenStage uses for back-pressure is to have the consumer make a request, a demand, to the producer: it says "I can handle 5,000 messages right now, send them over to me". The producer sends as many of those messages as it can, but no more than that number, so they all fit in the mailbox; the consumer works through them, and when it's done it asks for more, making more demand. This means a system like GenStage can be really helpful any time you have control over the rate of production: if you are pulling messages off a queue, or reading files from disk, it can help. If you have something like a TCP server, where you can't control the rate at which messages arrive, you'll have to look for a different solution.

This is the example pipeline from the documentation, and before I go into it, I had meant to ask: has anyone in the room worked with GenStage and used it in a project at this point? OK, not many people, so I'll go into this part in a good level of detail, or I'll try anyway. The way it works is that you start to design your applications in stages: you have a producer stage that generates data, a consumer stage that does the final transform on that data, and any number of producer-consumer stages in the middle. In the example pipeline from the documentation, the producer just generates numbers, an infinite supply of them, in response to demand from the consumer. The producer-consumer stage receives those numbers and manipulates them in some way, in this case just doubling them, before handing them on to the consumer, and the consumer has a sleep delay built into it to simulate some kind of long-running processing. Because the consumer is the rate-limiting step in the pipeline, if you want to increase the throughput of the overall pipeline you have to increase the number of consumers available, so you would just spawn more consumer processes. This is how they define the setup in the application code, where they are starting four consumer processes to make sure the throughput is pretty good: you start one producer, which is A, one producer-consumer, which is B, and then four consumers, the Cs; each of the consumers subscribes to the upstream B process, and the B process subscribes to the producer. One thing that jumped out at me here was the amount of boilerplate: as you add stages you have to write more code, which, being very lazy, didn't seem great to me.
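For anyone who hasn't seen it, this is roughly the shape of that documentation example, a paraphrase rather than the official code, with a single consumer wired up by hand to show the kind of boilerplate being described:

    defmodule A do
      use GenStage

      def init(counter), do: {:producer, counter}

      # The callback that makes GenStage special: fulfil however much demand arrived.
      def handle_demand(demand, counter) when demand > 0 do
        events = Enum.to_list(counter..(counter + demand - 1))
        {:noreply, events, counter + demand}
      end
    end

    defmodule B do
      use GenStage

      def init(:ok), do: {:producer_consumer, :ok}

      # Transform each event on its way downstream; here, just double it.
      def handle_events(events, _from, state) do
        {:noreply, Enum.map(events, &(&1 * 2)), state}
      end
    end

    defmodule C do
      use GenStage

      def init(:ok), do: {:consumer, :ok}

      # Sleep to simulate a long-running final step.
      def handle_events(events, _from, state) do
        Process.sleep(1_000)
        IO.inspect(events)
        {:noreply, [], state}
      end
    end

    # Wiring it up by hand, the way the docs do; the boilerplate grows with every
    # extra consumer you add.
    {:ok, _a} = GenStage.start_link(A, 0, name: A)
    {:ok, b} = GenStage.start_link(B, :ok, name: B)
    {:ok, c} = GenStage.start_link(C, :ok)
    GenStage.sync_subscribe(c, to: B)
    GenStage.sync_subscribe(b, to: A)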
So I created an extension of that example project to explore some of the beginning ideas of working with GenStage; it's up on my GitHub profile as the supervised GenStage app, and I'm going to attempt to run it now. OK, I should probably have made the font bigger; is that big enough for people to read? Cool. Here is my version of the application. It's almost exactly the same, with one small difference: I don't have very much in my application code at all. I'm just setting up the supervision tree here, and I don't have any subscription happening between the stages.

I'm actually going to start at the end of the pipeline, with the consumer, because everything really starts when the consumer makes demand, and that happens down here. This is the example C module, being initialized as a consumer. This is the other way you can set up your stages: you set up the consumer, set up any state it needs, in this case I'm just keeping the pipeline name for logging and the delay to simulate a long-running process, and then the key part is that you subscribe to your upstream process and set the demand there. In this case I believe the demand will just be one, so it's asking for one message at a time. That demand, created as this initializes, gets sent because the consumer subscribes to the producer-consumer and the producer-consumer subscribes to the producer. The producer receives the demand in a very OTP-like manner: where you might be used to handle_call and handle_cast, here we have handle_demand, which is the new thing that makes GenStage special. It receives that demand, tries to fulfil it, and then asynchronously sends events, an attempt at fulfilling that demand, back down the pipeline.

That self-subscribing approach was really the key to getting past all the boilerplate, but it also meant that, now that I didn't have much happening in the application file and the modules were subscribing themselves, I could start exploring what happens when a module dies: now that it can self-subscribe, can we bring it back up in a supervision tree? The great thing is that there is nothing different about the supervision trees; you supervise these GenStage processes exactly the same way you would supervise anything. So here I have my supervisor for the downstream stages: basically the producer-consumer and the consumer being supervised. The key to getting this working was that, because the consumer is going to look for the producer-consumer by name, I needed a discoverable name that was also unique. So I pass the pipeline name into the producer-consumer and it builds its process name by joining the pipeline name and the stage, "slow_pipeline" plus "b", say, and builds a new atom from that when it launches. I know building atoms at runtime can be a no-no sometimes, but here we know how many we're going to have, so it shouldn't be a problem.
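To give a flavour of that, here is a minimal sketch of the idea; the module names, delays, and child specs are illustrative rather than lifted from the repo, and the producer from the earlier sketch is assumed to be running and registered as A. Each downstream stage names itself from the pipeline name and subscribes to its upstream stage in init/1, capping demand with max_demand, and the whole thing sits under an ordinary Supervisor:

    defmodule ExampleB do
      use GenStage

      def start_link(pipeline) do
        GenStage.start_link(__MODULE__, pipeline, name: :"#{pipeline}_b")
      end

      def init(pipeline) do
        # Subscribe ourselves to the shared producer; no wiring needed in the application file.
        {:producer_consumer, pipeline, subscribe_to: [{A, max_demand: 10}]}
      end

      def handle_events(events, _from, pipeline) do
        {:noreply, Enum.map(events, &(&1 * 2)), pipeline}
      end
    end

    defmodule ExampleC do
      use GenStage

      def start_link({pipeline, delay}) do
        GenStage.start_link(__MODULE__, {pipeline, delay}, name: :"#{pipeline}_c")
      end

      def init({pipeline, delay}) do
        # Look the upstream stage up by the name it built from the pipeline name,
        # and only ask for one event at a time.
        {:consumer, {pipeline, delay}, subscribe_to: [{:"#{pipeline}_b", max_demand: 1}]}
      end

      def handle_events(events, _from, {pipeline, delay} = state) do
        Process.sleep(delay)
        IO.puts("#{pipeline} #{inspect(self())}: #{inspect(events)}")
        {:noreply, [], state}
      end
    end

    # Supervising the downstream stages is nothing special, just plain child specs.
    # If a stage dies it is restarted and re-subscribes itself.
    children = [
      %{id: :slow_b, start: {ExampleB, :start_link, ["slow_pipeline"]}},
      %{id: :slow_c, start: {ExampleC, :start_link, [{"slow_pipeline", 5_000}]}}
    ]
    Supervisor.start_link(children, strategy: :one_for_one)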
So I'll try and run this, and make this bigger too. OK: this has one producer-consumer and one consumer, asking for one message at a time, and it chugs along. Every third or fourth line you see "first pipeline" with the pid, and then it prints out an even number, because that's the result of doubling something. Not very exciting, but that's GenStage in action. Oh, actually, I have a second thing I can show you with that. Sorry, I was wrong: I forgot to comment out a line of code, so we actually do have two pipelines. There's one producer, but there are two sets of consumers reading from it. One is a slow pipeline that takes five seconds to process a message but asks for three messages at once; the other is the fast pipeline, which only takes a second and only asks for one at a time. So you can see the effect of asking for different amounts of demand: every time the first pipeline writes out, it writes just one message, and, just about to scroll off the top, every so often there's a slow pipeline line with three messages associated with it. You ask for three events, you get up to three events.

Then, just to show it works: I've started Observer, you can't really see it, but because everything is supervised now, if I take down one of the pipelines and kill that process, we get our little error message and everything chugs along. If it weren't in a supervision tree it wouldn't survive that. What's interesting is that the last number the first pipeline printed was 144; somewhere along the line we lost 146, because that was in the mailbox when the process went down, but it came back with the next message. So you still lose a message, but this time you're only losing one, which is a lot better. The other thing that's interesting is that the first pipeline's consumer comes back with a new pid, even though I killed the B process, which wasn't directly the parent of the consumer; killing B took down the consumer as well and brought it back with a new pid. So subscribing the processes together is also linking them, which wasn't immediately obvious to me. This is all up on GitHub, and the way I have the repo structured is with a branch for each step, so you can go back to step one and see the world's simplest GenStage pipeline, which is an easier way of exploring it yourself than trying to follow along during the talk.

I'm going to change tack now and talk about some of the data engineering challenges I was working on at Bleacher Report over the last few years that brought me to the point of thinking about GenStage. When I joined that team we were supporting a number of legacy applications that had been designed to store specific pieces of data, data that is just user actions in the app: if someone reads an article or subscribes to a feed, we receive those events and the business analytics team can make strategic decisions from them. The big problems were that we had too many different places where data was stored and an unclear flow of how data moved from one location to another, and the business team really wanted a single data warehouse they could use to make decisions from. So we designed a new pipeline that should handle a hundred million events a day, although our current pipeline is actually doing 160 million a day, so that wouldn't have worked. I wanted to separate the capture, processing, and retrieval of events into separate applications, to keep it very clear what state data was in at any point, and to stop data retrieval from affecting data capture, which you'd think wouldn't happen, but: for one of the applications we were storing data in Amazon's Redshift database.
If a query got stuck in Redshift it could be stuck for several hours, and then we were no longer able to copy tables out of DynamoDB into it, so tables would start to back up in DynamoDB and eventually we might exceed the amount of capacity we're allowed there, which costs us money and stops us from being able to create new tables. So I wanted to make sure those steps were more isolated. We also wanted to store events in their original format rather than only in normalized databases that answer specific questions, so that we could come back with new questions for the old data and still answer them, and to expose it all through a SQL interface for the analytics team.

This is what we came up with, and the key thing is that it's very linear, so it's very easy to see how data moves through the pipeline. Data comes in from clients and we collect it in a Node.js application; the reason we used Node was that we had used it before and could reuse a lot of code, but Elixir would have been a really great choice here. We write that to Amazon's Kinesis, which is a queue a bit like Kafka, and that lets us batch up all these disparate events and write them in batches out to S3. The key thing, what drives events through the pipeline from that point, is that S3 fires an event. Our Ruby application, called Pallet, retrieves the file the event refers to and processes it so that it's ready to be imported into Redshift; it can't write directly to Redshift, so it writes the converted file back to S3, the same process happens again, and the final application, called Forklift, does the bulk upload of the converted file into Redshift.

The key really is those events coming out of the bucket. S3 can emit events on any change, but the only change we care about is an object being created, and we only care about new raw data coming in, because when we write the processed data we put it back into the same bucket (there's a limit on how many buckets you can have), so we also restrict by file name. Those events can be sent through SQS, SNS, or Lambda. The reason we chose SQS was that we expect a pretty continuous turnover of events, so the overhead of spinning up Lambda functions is probably excessive. We also expect that our application will go down occasionally; maybe if I'd built it in Elixir it wouldn't have, but we get nine fives on Ruby rather than five nines, and when it's down, the Simple Notification Service just broadcasts messages into the ether and expects listeners to be available for them, so having them on a queue is better. SQS also guarantees at-least-once delivery, so we won't have any files left behind.

The other key thing is that if we want to increase throughput we just spin up more instances of Pallet, and we wanted to make sure two instances never process the same message at the same time. That's also the reason we need the events in the first place: if we were just polling the bucket looking for new files, two instances doing the same polling are going to end up retrieving the same things. The way SQS guarantees two instances won't be chewing on the same message is the visibility timeout: when one instance of Pallet asks for a message it will get it, and in the ideal world it will finish processing it and then go back and delete it. But if it crashes for some reason, or just gets stuck, it never sends the delete, and since the message was still on the queue all along, just hidden, after the timeout it becomes visible again and another instance can pick it up.
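As a concrete sketch of that cycle, in Elixir terms and using the ExAws.SQS functions that come up later in the talk; the queue name, timeout, and response shape here are my assumptions, not details from the talk:

    # Receive -> process -> delete. If processing crashes before delete_message is
    # sent, the message is never removed; once the visibility timeout expires SQS
    # makes it visible again and another instance can pick it up. That is the
    # at-least-once guarantee in practice.
    queue = "pallet-incoming"                                       # hypothetical queue name
    process_file = fn body -> IO.puts("processing #{body}") end     # stand-in for the real work

    {:ok, %{body: %{messages: messages}}} =
      ExAws.SQS.receive_message(queue,
        max_number_of_messages: 1,
        visibility_timeout: 300   # seconds this message stays hidden from other consumers
      )
      |> ExAws.request()

    for message <- messages do
      process_file.(message.body)
      # Only after the work succeeds does the message disappear for good.
      ExAws.SQS.delete_message(queue, message.receipt_handle) |> ExAws.request()
    end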
So I'm going to dive in a little bit to the Pallet application. I originally wrote it in Ruby, and it really is just a loop: it uses an SQS polling loop to keep firing message requests up to SQS, and as soon as it gets a message back it blocks, does all the processing for that message, and once the file has been fully processed and uploaded to the next location the loop continues and starts asking for new messages. There's no aspect of the application that runs anything concurrently. The way we were going to scale this, and it actually scaled really nicely on my laptop because I develop everything in Docker containers, so I can just tell Docker Compose to give me four instances, is the same process we use in production: we run everything on Elastic Beanstalk, so if we want more of something we just add more instances. It's not very efficient, but it works.

That's really the one problem this application had: because the SQS polling loop is built into each instance, as we launch more of them we poll SQS more and more frequently, but there are only going to be the same number of messages, and requesting a message takes no time compared with processing the whole S3 file. It would be much better to bundle the requests for messages together, and since Amazon charges per request this can get kind of expensive. Amazon did really well out of this, so: you're welcome, Amazon.

So I thought, is there a better way of doing this? I was just getting into Elixir at the time, and GenStage came out literally the day I realized this was a problem, so I imagined it would do the job and decided to investigate. I built this application, the SQS-S3 GenStage app, and it's the same deal: you can check it out and step through it, the first branch is a very simple version and it builds up from there. The structure is based on a GenStage pipeline where we have a producer that is responsible for fulfilling the demand from the consumers by actually requesting messages from SQS; it hands those messages to the producer-consumer, which downloads the appropriate file from S3; and then the consumer, in this case, just counts the number of lines in the file, but obviously you can do whatever you want with it at that point. The key is that we can bring up multiple downstream producer-consumer/consumer pipelines using the supervised approach I showed with the first example. And because we have multiple consumers making demand, we don't want the producer to block while it's looking for messages; we want it to still be available to receive more demand. So the actual job of polling SQS is done by a separate module, which just has a loop function that runs as a task, and the producer turns that loop on and off as demand for messages comes in.
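A rough sketch of that division of labour, a simplification of the idea rather than the code from the repo, with Queue and QueuePoller as stand-in module names (the real version also kills any loop that is still running before starting a new one):

    defmodule Queue do
      @moduledoc "Producer stage: tracks outstanding demand; the poller pushes messages in."
      use GenStage

      def start_link(_), do: GenStage.start_link(__MODULE__, 0, name: __MODULE__)

      # An explicit API for the poller instead of a bare send/2, to keep the
      # relationship between the modules clear.
      def messages_found(messages), do: GenStage.cast(__MODULE__, {:messages, messages})

      def init(pending), do: {:producer, pending}

      # Consumers ask for events; never block here. Just note the demand and kick
      # the poller off to go looking for that many messages.
      def handle_demand(demand, pending) do
        QueuePoller.poll(pending + demand)
        {:noreply, [], pending + demand}
      end

      # The poller delivers whatever it found; emit it downstream.
      def handle_cast({:messages, messages}, pending) do
        {:noreply, messages, max(pending - length(messages), 0)}
      end
    end

    defmodule QueuePoller do
      @moduledoc "Runs the SQS polling loop in a Task so the producer stays responsive."

      def poll(count) do
        Task.start(fn -> loop(count) end)
      end

      defp loop(count) when count > 0 do
        messages = fetch_messages(count)
        unless messages == [], do: Queue.messages_found(messages)
        remaining = count - length(messages)

        if remaining > 0 do
          Process.sleep(5_000)   # nothing (or not enough) found: wait, then look again
          loop(remaining)
        end
      end

      defp loop(_count), do: :ok

      # Stubbed out here; the real thing calls SQS (see the ExAws sketch below).
      defp fetch_messages(_count), do: []
    end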
I don't want to spend too much time going through the code on this, because it got complicated by the end, but in general this is the basic structure of the application, pretty much what I described in that diagram. The way the server works is that this loop function does the work of receiving messages. I'm using the ExAws library to interact with SQS; the messages come in and I use SweetXml to parse them and get the body, and the body of each message is the event that came from S3 with the location of a file, which I have to use Poison to parse because it's JSON. So thank you, Amazon, for your consistency of serialization formats there. If it receives messages, it sends them back to the producer, and it does that through a call to a function I've defined on the producer. Originally I was just using send, treating it as a plain process, because it is a process, but I feel that if you're going to go with OTP and use GenServers and GenStages you should go all in, define specific APIs, and make the relationships between your modules really clear. When the number of messages that were asked for has been found, the loop exits; if it still has more to find, it waits another five seconds and looks again. That's really the key part: the producer is controlling it through the poll function here, and if there are any servers still looping when more demand comes in, it kills those first before starting a new set.
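As a rough illustration of one pass of that loop, my approximation rather than the repo's code; the queue name is invented, and I'm assuming the response body arrives as raw XML, as it did with the ExAws of that era, rather than the pre-parsed maps newer versions can return:

    defmodule SQSFetcher do
      import SweetXml

      @queue "events-queue"   # hypothetical queue name

      # Long-poll SQS, pull the S3 event out of each message body, and return the
      # bucket/key pairs ready to hand back to the producer. Receipt handles, and
      # deleting the messages after processing, are omitted for brevity.
      def fetch_events(count) do
        {:ok, %{body: xml}} =
          ExAws.SQS.receive_message(@queue,
            max_number_of_messages: min(count, 10),   # SQS caps a single receive at 10
            wait_time_seconds: 20                     # long polling keeps request counts down
          )
          |> ExAws.request()

        xml
        |> xpath(~x"//Message/Body/text()"sl)   # SweetXml: the raw message bodies out of the XML
        |> Enum.map(&Poison.decode!/1)          # each body is an S3 event serialized as JSON
        |> Enum.flat_map(fn event ->
          for record <- event["Records"] do
            {record["s3"]["bucket"]["name"], record["s3"]["object"]["key"]}
          end
        end)
      end
    end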
I have a video of this in action. It's not the most fascinating thing, but here we go: I have no files in my output folder yet, nothing up my sleeves at this point, and I run the application. It sets up all of the stages; there's a lot of logging in here because I was trying to see the order things happen in. Now the server is starting to loop: it's looking for three events, because it has three downstream consumers, and it will keep doing this every five seconds. There are no events right now, I don't have anything in the S3 bucket, but I'll upload a bunch of files, each with 200 events in it (I probably should have varied the number to make the demo more interesting). There we go, the files are uploaded, and a second later they've all come through: a bunch of messages arrived in the application and it processed them very quickly, because they were very small files. This is just proof that they really are up on S3, and now it continues looping, looking for more. And now I check what's in my output directory: I have an analyzed version of each of those files, and each one should say the number of lines that were in the original S3 file, which is 200, the right number. I can't believe it worked. From there you just go ahead and fill in all the details; I'll leave that as an exercise for the audience.

I feel like this next part is the key point of my talk. I used a couple of libraries to build this. When I built the application in Ruby, I used the official AWS SDK, which provides low-level functions for interacting with SQS, but because it's been around for a while it's very mature and also has higher-level abstractions like the QueuePoller, and it makes a good argument for using them: long polling to keep your costs down, already built for you. So I thought it made sense to use it, and it had a very profound impact on the architecture of my whole application, and led to the scaling issue I just mentioned. In contrast, when I built it in Elixir I used the ExAws library, which doesn't provide any higher-level abstractions for working with SQS, so I was forced to use the functions directly, and in doing so I really had to understand how SQS works; I didn't know anything about the visibility timeout before I started working with it. I also had to understand how OTP works, how the messages are being passed around, starting tasks, dealing with asynchronous processes, in much more depth than I would have otherwise. I notice now that the ExAws S3 module, which I don't think had a higher-level abstraction when I started this project, does have one, and at one point it was actually based on GenStage. I think that's an important thing to think about when you're building an application: the libraries you choose are going to have an impact on how you design things. I'd also say that working at that low level was one of the most satisfying programming challenges I've had in a long time, and I swore a lot during the process, which I think is a really good sign that learning is happening.

Bleacher Report is a super fast-moving organization and we're already on to a new pipeline. Instead of managing it ourselves we now work with vendors, which means I have time to come to conferences and I don't have to respond to alerts any more. We're partnering with mParticle to handle the client events; they fan the traffic out to all the different analytics providers, Google Analytics, Flurry, a few other things, but the crucial part for me is that they also write the events to S3. The other partner is Interana, who, if anyone hasn't worked with them, are a joy to work with, really great: they provide an analytics API that lets us query the data, and a really lovely web visualization that lets the analytics team play around and explore the data. Before we could commit to these vendors I had to do a lot of technical due diligence with them myself to make sure they could fulfil our requirements. One thing I had to do was count events, to make sure we were getting the same number of events in Interana that we thought we were getting on S3. In the files that come from mParticle, each line is a complete JSON object, but within that there are multiple events, each with different timestamps, and I haven't exactly worked out what the relationship between them all is, which makes it somewhat challenging to get a count of those events. The way I chose to do this, since we're talking about a pretty large amount of data, was the streaming mode on Elastic MapReduce: I don't know Java, and streaming mode lets me work with languages I do know. The local-machine equivalent of streaming mode is that you just cat the directory of events, pipe the output into your mapper, sort that output, and pipe it to your reducer, and eventually you get your result. I'm only including the Ruby code to show that it is very simple; there's not a lot of room for error here.
The mapper just extracts timestamps, converts them to days, and emits each one with a count of one; the reducer then takes all those counts of one and adds them up for each day. It's a very, very short file. Everything works great locally, and then up on Elastic MapReduce, after 12 hours, it errors with no explanation. So I'm thinking there must be a better way to do this: I'm streaming, I'm mapping and reducing, and that all seems like stuff Elixir is really good at.

The first attempt would be to lazily load the data using streams. This is almost the canonical example of how you would stream a file and run a MapReduce-style algorithm over it: it's very short, it's very simple, and it works nicely. But we have better options than Enum and Stream now. Where GenStage brought us into the asynchronous world, there is now a module built on top of it called Flow: while Stream gives you lazy evaluation over the Enum module, Flow gives you concurrent evaluation, so you can take advantage of all the cores on your machine to process files. The way this works is that you stream a file into the Flow.from_enumerable function, then pipe that into Flow.flat_map, and Flow will decide how many mapper stages it wants to pass that through, and similarly how many reducer stages to pipe it into. You don't really have to think about this, the machine knows what to do, and these roughly map to the stages we know from GenStage. The great thing is you don't have to change your code very much to take advantage of Flow: the only real difference is that I'm calling the Flow module instead of the Stream module, and the initial step takes the stream and calls from_enumerable to turn it into a flow, but from that point you're just mapping and reducing again and the contents of those functions are the same.

And this is great, because it was about twice as fast as the original Stream version. Unfortunately it was also twice as wrong, and every time I ran it I got different numbers: always wrong, but wrong in different ways. Interestingly, they were only wrong for the larger counts; the smaller counts were correct, which maybe gives a little insight into the batch sizes Flow works with as it moves things through the pipeline. The key was that I had missed one line from the documentation, which was to partition the flow. What that does is set up new stages for the reduction and manage bringing all of the state back together at the end. But Flow.partition is where Flow turns from a super cool, easy-to-use alternative to Stream into a John Steinbeck novel of documentation and depth: it's amazing how much thought has gone into it, but there is a lot to digest. Using partition made the data correct again, but we lost a little bit of performance, and that was entirely because of the structure of my data. That's another key takeaway: whenever you're setting up these pipelines you've got to know your data, because one size is not going to fit all.
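To make that concrete, here is roughly the shape of the Flow version with the line I had missed put back in; extract_days/1 is a hypothetical stand-in for the timestamp parsing, and the field names are invented:

    defmodule DailyCounts do
      # Count events per day across a file of JSON lines.
      def count(path) do
        path
        |> File.stream!()
        |> Flow.from_enumerable()
        |> Flow.flat_map(&extract_days/1)       # mapper stages: emit one day string per event
        |> Flow.partition()                     # the missing line: route by key to reducer stages
        |> Flow.reduce(fn -> %{} end, fn day, acc ->
          Map.update(acc, day, 1, &(&1 + 1))    # reducer stages: tally per day
        end)
        |> Enum.to_list()
      end

      # Hypothetical parsing: decode the line and truncate each timestamp to a date.
      defp extract_days(line) do
        line
        |> Poison.decode!()
        |> Map.get("events", [])
        |> Enum.map(fn event -> String.slice(event["timestamp"], 0, 10) end)
      end
    end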
What was happening for me was that I would take a file for a certain day and send it to multiple mappers, and Flow was setting up multiple reducers as well, but because the key for 99% of the data from a given day is the timestamp of that one day, almost all of the data goes through one reducer and we're not taking advantage of the other processes that were set up. It was the equivalent of saying "please partition this over one stage", which is basically saying "don't make this concurrent". But there are a lot of other functions available in Flow, and I was able to get around the problem by using group_by_key, since we're grouping by the timestamp key, and then using Flow.emit to just dump out the state of each of the mapper stages. What I get back now hasn't been put together for me, but since it's just a handful of timestamps for the different dates with counts against them, putting it together myself sounds like extra work but is computationally very cheap, because it's a tiny amount of data at that point. And we're back down to the good timing we had before.

We can still do better. In the previous version I have my two files, I pass them into the function one at a time, and I manage reducing over them myself: there's my global reducer, and the final step keeps accumulating it for each of the files. But there's a better way, which is to use Flow.from_enumerables: you take all of your files, set them up as multiple streams at the beginning, pass all of those streams into Flow.from_enumerables, and my global counter is now just a single map that I don't have to manage any more. I think it's awesome that José and the team have thought about how people are going to use this and work with it; from_enumerables was always fast, and it seems like you should never not use it. Of all the different Flow variations I tried, that was the best. All of the code examples from that last section are in the repo if you want to take some time to digest them, because I know I kind of flew through it.
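And, under the same assumptions, roughly the shape of the variation I ended up with: all the files fed in through Flow.from_enumerables, grouping on the mapper stages with group_by_key, dumping each stage's state with Flow.emit (newer Flow releases spell this Flow.on_trigger), and doing the tiny final merge in plain Enum code:

    defmodule DailyCounts2 do
      def count(paths) do
        paths
        |> Enum.map(&File.stream!/1)              # one lazy stream per file...
        |> Flow.from_enumerables()                # ...all handed to the flow at once
        |> Flow.flat_map(&extract_day_pairs/1)    # emit {day, 1} tuples
        |> Flow.group_by_key()                    # each stage accumulates %{day => [1, 1, ...]}
        |> Flow.emit(:state)                      # dump the per-stage maps instead of events
        |> Enum.reduce(%{}, fn stage_map, acc ->  # only a handful of small maps to merge
          stage_counts = Map.new(stage_map, fn {day, ones} -> {day, length(ones)} end)
          Map.merge(acc, stage_counts, fn _day, a, b -> a + b end)
        end)
      end

      defp extract_day_pairs(line) do
        line
        |> Poison.decode!()
        |> Map.get("events", [])
        |> Enum.map(fn event -> {String.slice(event["timestamp"], 0, 10), 1} end)
      end
    end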
The very last thing I want to talk about is how the ExAws library was using Flow back in the day, so you can see how a real professional programmer would use it; I think it looks similar enough to my version that I'm not ashamed. It's basically doing the same thing: it's downloading files from S3, but it's chunking them, putting each of the chunks through different pipelines, and putting it all back together at the end. But even though this only came out in 2016, it's already no longer the hottest, newest way of doing it: Elixir 1.4 introduced Task.async_stream, which is really perfect for use cases like chunking up multiple downloads, so that's already changed. And again, if you're working with a library you don't necessarily see this unless you're digging through it for a presentation or something.

So, in conclusion: you can use GenStage any time you want to introduce parallel steps into your programs; it will help you regulate the flow of data between your processes; you can, and probably should, use it in supervision trees, since that doesn't seem to take any extra work; and it is becoming the basis of new abstractions, so if you don't act quickly you may never get to interact with it yourself and all the fun will be done for you. Thank you very much for listening. My slides will be online at that location, and there are the links to the repos I mentioned. [Applause]

[Audience question, inaudible] That's actually a very interesting question. Different companies optimize for different things, and Bleacher Report does not optimize for cost. What I always found a little disappointing when I was on the data engineering team is that it became apparent to me through the process that a lot of the really hard problems in moving data around have been solved, and it's become a cost-management problem: there are so many solutions out there that the right solution for you is the one you can afford that gets the work done. I never got to see what our bill was for anything, and no one ever pushed back to say we should find a more efficient way of doing this. I think that's why we've gone with vendors now: if we've got the money to build a great solution ourselves, we can pay someone else to make an even better one. At the moment the SQS bill is actually zero, because we've moved away from that part of the pipeline.

That is an excellent question. The question was whether I could read the messages directly from Kinesis and bypass S3 and SQS. Kinesis is a newer Amazon product and its API evolves a lot; I know Firehose capabilities became available while we were working on the project, which did allow you to extract the data in different ways. But one of the key things is that we really wanted the data to be in S3; that was almost the main goal in some ways, because having S3 as the source of truth for all of the events means they're there in a nice, human-readable JSON format and they're easy to reprocess if you want to. In our previous version, where we were writing data from DynamoDB directly into Redshift, any time we wanted to run MapReduce jobs on that data we had the annoying step of extracting it from Redshift into S3 before we could run the EMR job. So a lot of this really comes down to what Amazon's API capabilities are at any point in time, and for us we weren't able to read directly from Kinesis to get what we wanted. There is also something I didn't want to get into too much here: when we were writing data into Redshift we were putting it in as a star schema, which isn't necessarily the right way to work with Redshift, but we absolutely had to do a lot of processing of the data before we could get it in, so having it move through those different applications for processing was a key thing there.
Info
Channel: Erlang Solutions
Views: 6,379
Rating: 4.9619045 out of 5
Keywords:
Id: trpueWn8DIM
Length: 45min 38sec (2738 seconds)
Published: Fri Mar 24 2017