ElixirConf 2021 - Pablo Meier - A Tale of Two ORMs: Comparisons of SQLAlchemy and Ecto

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
[Music] [Music] okay everybody hi thanks for coming by uh my name is pablo meyer um i was one of the founding engineers of ramp so one of the sponsors of the event uh come talk to us after the fact we're kind of new in making ourselves known uh but we're very happy to be here at elixir conf very happy to be sponsoring and presenting uh things we talk about um yeah you can read write it right at more pablo.com or uh some things on twitter it's in your pablo so the the title of my talk is a tale of two orms design decisions between sql alchemy and ecto uh just to get a show of hands how many of you written python before wow a lot of you how many of you written sql alchemy before still more than i expected awesome uh have you written nectar all right cool uh me too uh so before we create an apple pie from scratch we must first invent the universe so we'll talk a bit about relational dbs as to like the theory behind them and why it can be hard to write norm i'm going to come in hot i'm going to come in and say that in the entire field of computing go back to world war ii alan turing making machines to break codes all the way to whatever the pinnacle of computing is now in 2021 i'm going to say kubernetes because it scares people um but like from this entire spirit of time i think that relational dbs are one of the crown jewels of the field they are just one of the best ideas implementations so much work has gone into them just relational db's absolutely rule and they've survived every revolution we've had in computing they've survived moore's law for 30 years moore's law dying and going distributed uh they've survived solid state drives like we cannot get enough sequel no matter what we do um we owe kind of the idea of sequel in relations to this guy ted codd who um not to get all ken burns jazzy or ken's burns documentary where it's like the whole world worth one way then one visionary changed it all like this was a lot of work going on at a time but he's the one most credited with this um he came up with something called kind of the two relational algebra um basically it's this sense of like structure and then you've got these greek characters because it's very math-like where you have like the projection the selection the rename operator and credit to um someone else for making this great graphic but it is kind of intuitive what we understand it's equal to b he did this without sql to reference he was building something that would become sql but it is this notion of relations that you can do operations to that compose very well to get what you need this is a bit of a different world there's really only one data type for all these things it's relations even when you do an aggregation like a sum you are returning a relation of one element one tuple so relations come in relations come out sometimes you take two of them like in a join sometimes you take one like in a projection but this is a world of relations the relations themselves contain tuples which is the compound data that we work with those tuple elements look like primitives as far as most databases are concerned and the world is expressed declaratively which is a big change from how we actually write our applications that's a bit of a problem we don't usually write our applications as declarative structures unless some of you happen to write prologue who hears written prologue who here writes prologue for their job yeah hands down right like this is not true this is some prologue that's gonna solve sql by kind of saying what you want pro sql to or sudoku rather to look like we don't write prologue for our job we uh by the way and if you add recursion to sql you can do some fun things um it is ability to like you have the ability to write kind of anything you want this query if you put it in postgres will print the mandelbrot set which is really neat um yeah find me after the talk if you want to see this it takes a long time to run but it does work so we could write these things declaratively but we're not going to do that anytime soon we kind of live in a world with object oriented paradigms in some places we live in a world of distributed systems and implementation details so it gets a little wild uh so often we will get into fights the python people and the elixir people about like the right way to do things we'll say imperative we'll say functional say imperative we'll say functional but then sql shows up and it's kind of like hi and it's just going to blow us both back it's a declarative world of tuples they come to me and they say like enlightenment reaching you and i'm like what are you saying so writing an orm is challenging okay let's talk a bit about orms uh what is an rm it's going to take these relations and these tuples that we have so here's just some sample data of a blog post i could write and when i say orm which stands for object relational mapper it's going to convert these tuples into a more rich compound data structure so in an object-oriented language this would be an object this is an elixir struct of post and author here but we want it to do that kind of manual work of convertes these things do you really need an rm uh if you ask a lot of people who have blogs no no and you shouldn't you dirty like like no like this is a terrible idea it's harmful it's bad and they're all kind of saying the same thing they're saying what i was just pointing out that it's actually kind of hard to do right and there's a lot of magic and maybe if you rely on that magic you don't see on it then it's a bad idea so no all right thanks for coming to my talk no no um the truth is uh every major application platform ships with an orm right so if they did sync you these would not be popular we would not build companies on these but i've been at like six companies and all six companies have had these kinds of um platforms behind them so it clearly is not going to sink you i view it as kind of a reality you will usually use a feature full orm so it's good to think critically again why it is hard and what choices those orms made in order to get you to where you need to go to answer the high level question what is useful to a software developer when working with a relational database so i'm going to tell you some things about sql alchemy a lot of hands came up that was really exciting to see so maybe this will be kind of old hat for some of you is it an orm or is it a layer over sequel are you writing python looking sql are you writing objects who knows the answer it's both [Music] both uh it's both uh if you look at the reference documentation they have sql alchemy orm and sql alchemy core so you can write code that uses the object model of python you can see that this author inherits from model and post inherits from model you give it some metadata to tell you what table you're talking to you declare your fields and then you can write queries that look like this which is like give me a query on author filter by the name pablo pull it out and then here we're going to join where the author is equal to pablo on the post so you can write this you can also write something like this which i'm not showing what author table is or post table is but these are other label expressions but these do not require you to use python's object system you don't have to use the extends model or anything like that you can just kind of write sql looking things and return tuples which is kind of nice another thing about orms often means relationships this is the documentation for how you specify relationships in an orm and python it is one function with a primary argument and this many uh keyword arguments and there are defaults so that's a lot of options and the last thing i'm going to say about sql alchemy before kind of moving on the high level stuff is it's got multiple back ends you can write it for postgres you can write for mysql you can use it for sql lite you can use it for sql server you can use it for oracle database these are the first party supported ones you have an infinite list of un like non third party ones all this to say that's equal alchemy follows that very famous python dictum there's more than one way to do it this is a joke uh if you're old enough if you remember this is something python does not believe they position themselves against pearl by saying like pearl said there was more than one way to do it python came in and like the mid-odds and says no there should be one way to do it but see what alchemy kind of proves otherwise and this is something we can learn from our orms i'm not trying to knock on sql alchemy is that the realities of our industries are people need a lot of ways to do a lot of things when you become useful enough kind of on that note another fun thing another reality i've learned about working with sql alchemy is what i call the tyranny of integrations um i've worked on four companies with this stack you have flask small web development you don't need too much sql alchemy it's an rrm uh does realization and pi test for your test and all these things do one thing and do one thing well so what do we do we fill it in with a lot of other integrations someone conveniently writes you a wrapper for these things marshmallows equal alchemy but also marshmallow flask and flash forest plus um flat sequel alchemy factory boy like all these things have things so if a bug happens who's oh no was that too loud oh yeah uh thank you um so with all these things if there's a bug where do you look kind of any of these um stack overflow we have a winner uh the fun thing about this i'm not gonna tell you which ones but um all of these provide also look for and conditionally import different json libraries so if you're passing json between these layers you get different things so i call it the tyranny of integrations because these are all useful it's better to use these than to not use these but at many of your companies you'll be like oh man um you can enter the world of pain so let's talk about ecto a little bit um is it an oram layer or sql layer i feel like i have more answer to this it's both yes is the other answer it's both you can write ecto queries that look like this where there's not like a rich object here you're just saying give me a table name give me another table name and select a map this could have been a tuple just as easily but i could also write something that allows me to use the closest thing we have to an object which is a schema and then query it like this um but there is a funny turning of integrations happening here too but it's a little different one and i like this one because it tells me something about the elixir language itself um we know that schemas right now have this great macro and it defines a struct that has all this cool database functionality enabled who here has used type struct does anyone use type struct all right this is a really fun elixir library which looks at the fact that many deaf structs you'd like to do have a few different properties more than the raw destruct you can enforce certain keys and say like okay this must always be true and you also sometimes want to put a type spec and say like i'm going to define a type t that is my module so type struct lets you with this macro put all that in one definition it's kind of nice who here has used x constructor anybody okay x constructor is another cute little macro that says you want to be able to handle string field names or atom field names and be able to create a version of this what is the problem with these three if you want to use all three yes you have to yourself manage them together the answer is you have to integrate them together you can't just declare all three because you're allowed one structure in every module in elixir and all three of these are macros around defstruct so you can't really easily say i want a schema that i also create a type t for that also has these uh very convenient methods added to it so in a way the intersection of these is an exclusion rather than sequel alchemy where it's just a big or of all these other spaces in between and hard to figure out what to do next so i call this like an interesting fact of working in this one okay now we're going to talk about the topics we're going to talk about exactly how they do things a little differently so topic one how do we touch the db differently in both of these orms uh let me introduce you to the session um sql alchemy as a whole likes to do things as like you're working with instances so my examples are post author comment you can imagine a world where i have some post objects some author objects and some comment objects and they're all like working happily it's just like working in object oriented python very cool but i see that there's a thread between them and if i walk toward that thread oh god it's the minotaur session is kind of the big thing that you need to learn if you want to use sql algorithm effectively what is this it is a large global object mutable that handles all your db connections and it is referenced by every one of your entities so it does a lot of things for you it manages the db connections all instances of orm models if you have posts comments blogs authors managed by the session it will run queries as needed and update things as needed so you might ask how do i get the session like how do i get a session that works it's just these session python is great because it's allowing one thing to happen at a time if you've ever used flask sometimes you want to get the web request and you didn't pass it in like you would pass a plug or a con so how do you get the request you import it at the top level you get the request because at any point the execution there's only one request happening it's python and it's the same thing going on with the db session there's only one session alive at any given point because it's python so you need the session you get the session and you speak to it from an api perspective though the session tries to handle as much for you as possible so what does this look like i'm going to run this thing that says db query on a blog post and i'm going to get the post id and i'm going to pull one this executes it and now this is a post object so this post said comments a funny thing is by default if you don't specify a different join strategy it will do a lazy one so when you call for comment in post comments if you didn't specify differently there is a sequel query happening here that you didn't see and then we do things with those comments and mark them as red who sees where this is going [Music] what if i want to do this what i want to say get all the posts and then find spam and all andy comments we're now doing end calls to the db with the default relationship loading strategy so sometimes queries can happen when you didn't mean to in fact literally three days before this talk i was looking at my friend's python code and i saw this which was we're going to create this query object and then we're going to get like the title for each post in the query but you'll notice i didn't execute this query but the query object knows when it's in a comprehension like this and will execute it at this point you never said actually run it but it knew that's what you wanted to do right so we have another sql query happening here as it makes this happen ecto is a little bit different it is repo or nothing so you run this macro to create this query object and then you must call dot all repo dot one or report out something else and if you didn't touch the repo uh module it just never happened so this is a design difference on one hand sql alchemy is like kind of try and figure out what you want to do and often try and make a default of if you didn't need those comments you never fetch them and if you did need them you fetched them at the time yes that was the punch line for the whole talk um okay uh if you're on youtube you didn't hear that um but that's absolutely true this is an explicit call out um i'll think of another punchline yes uh topics as relationships um how do we specify relationships i showed you that really funny function signature but the truth is relationships are the heart of why orms are hard no matter what you do because that declarative composable model so it's going to be messy in whatever you do i have to look up put change and whatever it is on ecto every time i use it so relationships are hard sql alchemy lets you define your relationship with the model this is some of that's equal alchemy code i showed you earlier and here in the relationship i'm giving it the join strategy that i want so the default one when you don't specify was what i said earlier it's this one select whoops where's my little laser i think i'm dead it's this one select and select will um just call it when you need it as you access it um whereas lazy joined we'll say that we'll join at the moment the query happens uh lazy raise is my favorite this means that if you try and access this it just throws an error it's a bit like preload which we'll get to in a moment um so yeah what you can do then is that at query time you can override these you can say um options lazy load post dot comments and what this means is that before postdoc comments got described as please join at the time of the query but in this specific query i want you to be lazy about it and do it after the fact so you have the ability to override a query can this go wrong yes of course it can because we're in a computer conference um what can go wrong you can specify that you want things to always join at the model level and then you can pit an a calls a b always join b calls a c always join d c uh joins a d always call and then if you query a few a's you might get surprised to find a massive tree especially many of those are one to many of everything you could ever want again you can but you don't have to specify what you want at the query ecto is a little bit different ecto does let you say belongs to and has many but it does not specify a joint strategy it does give you joint things that you need if you need to pass through for a many-to-many table but it doesn't tell you now load it every time it really only lets you call it here at this point so if i call this which is say give me a post and then pull out the author first name we all know what this looks like that's a repo that's a pre-load error we now have to tell it explicitly at query time either in the query definition or in the repo call that we need to uh pull it together and how we join so it's explicit every time as pointed out uh so it's always per query uh another topic how do we express a change in these languages um as i mentioned before sql alchemy uh has the session and the instances and so it tries to make this as natural as possible um for some context python first came out in 1995 object orientation was all the rage that's how we were going to scale our software and do everything right so the idea of looking like classical object-oriented software development was like a good idea so in order to do this if you want to change the title of this post you say you just set it with a normal mutation assignment and then the session tracks that and when you tell the session to commit later it has all its track changes and pushes them up again that minotaur you know how it works you can kind of predict what it does ecto we know is a little bit different ecto has a set of change sets and so i'm going to get this post and then i'm going to wrap it in a change set and this is no longer a post this is no longer a first party entity and i can't use it naturally like a post i now have a change set which carries with it the modifications like the title better living through mechanical keyboards um graphically um sql alchemy has the instance wrapping its own changes and ecto has its changes wrapping around the instance so the instance is still there you can fish it out if you need to but this is how we express changes most idiomatically and from here you can do validations different ways we know about validations on ecto or many of us do where you can have the module define validations on the change set itself and you can compare against them in your test cases so you don't have to hit the database whenever you kind of mock out your test sql alchemy does also have validations that it allows but you can um also raises an exception on these assertions it's a high level view on how they look at changes lastly migrations and the data definition language who knows how sql alchemy does migrations there are zero ways to do it um this is really tongue-in-cheek actually actually it's kind of true but there is no way to do it technically you need to use another project called alembic and alembic it looks like it got like folded in but it has a very different vocabulary a very different like set of config files and ways to talk to it but alembic according to its documentation is a lightweight database migration tool for usage with sql alchemy data database toolkit it's also in the sql alchemy umbrella project so at this point it is so standardized it might as well be a part of sql alchemy though if you look at the special thanks in front matter it says it's developed by this person and loosely associated with the sql outcome project so it seems to have began life in a different way and then kind of merge its way in but these migrations and data definition have a different set of primitives languages and things you import before you run them ecto does give you the full data definition language i've opened this page on devhins.io hundreds of times every time and it tells me what i need to know so it gives you a data definition language but it also lets you pull in the repo so you're able to look at this and say like i can do arbitrary code from elixir this has led to a few foot guns of its own who was here at adam's talk yesterday on pattern matching anybody yeah this happened to me when i was working on the pleromo open source project suppose you are building something and you have this point in time you create a migration x with a version of one of your structs a person version and you pull in the repo version and then you run it if you're already at this point the database this migration will pass and if the present looks like this you continue developing your application you work in work work you add more migrations and incrementally production was here when you moved it and so were you you increment and now person looks different i had to clone this git repo and run it entirely but it died here because person was being pulled from this point in time when this migration was written expecting person to be from this point of time like adam's talk we got hit by the representation of this so i find it useful that you can pull in all your data definitions and things like that but there is a foot gun to be aware of when you do [Music] this so to recap a few of the major different ways of doing things um ecto prefers to be explicit on a lot of things relationship loading um and like join strategies uh when you call the database explicitly has to happen a certain way um it tends to uh give you the data definition language whereas the other doesn't and generally building orms is a bit challenging but i gave the version of this talk and um to my partner she was like why are you telling this talk and i was like because i think it's interesting i was working on both stacks and it seemed really interesting she goes now you got to take a stand you got to have a point of view like people don't care about this stuff for its own reasons i was like i do um [Music] but like i get what they're saying investment and personality matters so now is like the point for just my like opinion man um uh it's not hard to guess also because i'm speaking at elixir conf so i do have opinions about this and i probably i'll say on the whole i prefer ecto um and i'll talk about why but first i want to disclaimer to anyone watching on youtube or anything else like that i love sql alchemy too i think it's a feat of engineering i think the people who worked on sql alchemy have done more to create value for other developers than i probably will in my lifetime i think it is awesome awesome awesome so i love sql alchemy but i prefer ecto um and i wrote a blog post about why i brought elixir to ramp and what i thought about it and i think the something i wrote there is something kind of applies to the ecto way of looking at things i said a bit of a hot take i think most dynamic languages are a poor fit for startups who have intentions of being long-term businesses you've created an environment that's optimized for your founding engineers to build something quickly in the first seven months and you set up a set of recurring obstacles that everyone else pays down for the next seven years um i've been at stage series b c h i've been a lot of places and the first employee and i've found that a lot of my job in the later stages is working past things i can't understand people fight about static and dynamic types but i think it's not controversial to say invariants help you reason about software and most dynamic languages don't give you many invariants so with that in mind when i think about the explicit nature of where database queries happen or explicit nature of how things join it seems a little less convenient at first because it's not going to just derive what you want but over those seven years where queries get thrown in and out all the time i find this to be very valuable and so i like the design decisions of echo because normally i know what is happening i went faster than my normal runs on this so i just want to say big thanks [Applause] um the question is uh did we put policies in place to avoid uh the migration truck tissue not formal ones um we're still kind of small and quick enough and we don't pull in the dynamic definitions as often most of our migrations are just pure ddl at this point but will be something to keep in mind yes uh question is where do you put the validations in the ecto modules especially for like seven year time frames um our elixir is pretty new because the company itself is two years old so we haven't really had an issue with that i've i have really loved and found value in change sets themselves and being able to create different sets of validations for different cases as you go into the same struct i found that cool um and i don't i imagine the context is also a fine place depending on where you would go but i think the module uh itself has been fine for us so far yes that's awesome i didn't know about that uh warth yeah um he's describing the mix task to dump your current state of the db so you have to run migration sequentially but keep them in the repo that's awesome i didn't know that uh yeah no no i haven't worked directly with the django rm um i understand there's a lot of like string of parsing it does to figure out what you meant when you pass in a string uh i love stringly typed code as much as anybody um but that scares me uh if that's correct but i haven't worked with it on interesting interesting cool uh yes oh that's a great question what do i miss in sql alchemy that's not in acto um sorry i'm getting some chuckles for that because i don't have an immediate answer um i don't know i mean i think actually ecto is great documentation but i also like the sql alchemy documentation very strongly and it's got like it's there's that like that that view of like the four types of documentation tutorials references like guides or whatever there is just so much material out there receivable alchemy that if you want to no matter what your preferred way in i love elixir documentation culture but it just hasn't had so many hundreds of bodies trample over like every road you could want to go on so that's one thing that i missed when i wanted to yes [Laughter] too real uh yeah for anyone watching the video uh we you might miss the global interpreter lock that was good uh anyone else awesome thanks so much [Applause]
Info
Channel: ElixirConf
Views: 709
Rating: undefined out of 5
Keywords: elixir
Id: 5Ep9CDXUHNU
Channel Id: undefined
Length: 27min 53sec (1673 seconds)
Published: Sat Oct 23 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.