How to Create Dialogue Audio like in Celeste and Animal Crossing using Unity

Captions
hey everyone my name is trevor and in this video we're going to add some sound to this dialogue system that plays over the text that's typing out in a similar sounding way to games like celeste and animal crossing we'll start by creating a simple and flexible system for playing the audio using some generic beep sounds where we can tweak parameters like the frequency pitch range and the number of sound clips to create some really unique and interesting dialogue audio we'll also cover how to go about using different audio configurations for different speakers or in other words having different voices for different characters and last just for fun we'll do a loose replication of the dialogue audio for both celeste and animal crossing using the systems we create in this video just to note we'll be starting with a dialog system created in some previous tutorials i've done but you don't have to have seen those at all to understand what we're going to do in this one that said the implementation we do for handling different audio for different characters towards the end of this video will be somewhat specific to that dialog system so for that part you'll need to translate the approach we take to how your own dialog system works and one last thing to know we'll be using unity's built-in audio source component to play the audio so if you're not familiar with audio sources in unity i recommend reading up on those first and then coming back to this tutorial afterwards and of course like with any of my tutorial videos everything that we're going to do can be found on the github project which will be linked in the description of this video with all of that said let's jump into how this is going to work we'll start with a really short sound clip which can be anything but in our case we're going to start by using this quick beep sound here the dialog system we have in place already uses a typing text effect where each letter types out one at a time so to start we'll play that beep 
sound every time a letter types out using an audio source however depending on how fast the text types out and the length of the sound that's being used this can blend together a little too much or we might just want the sound to happen a bit less frequently as a preference so in that case we'll add a variable called frequency level to control whether we should play the sound every single character every other character every three characters or so on next each time we play the sound clip we'll also add an option to change the pitch of that sound between a minimum and maximum value which will make the sound come across as a bit different each time it plays even though we're using the same sound clip and last we'll also add the option to select from multiple sound clips each time a sound is played so that we can combine different sound clips to make the audio feel more dynamic as for how we choose the pitch value and sound clips each time a sound plays we're going to start by selecting those randomly this means that for the same line of dialogue the audio will be randomized for each and hence different each time that line plays after we get that in place though we'll use the hash code associated with the character that we're on when the sound plays and then plug that into an algorithm to decide the pitch and sound clip that we should use for that specific character and with that approach if we play the same line of dialogue twice the audio will be exactly the same each time i think this is a nice touch because it kind of adds the illusion like the audio is another language or something and i'll explain more about this approach when we reach that section of the video and last as for managing different audio for different npcs we're going to use scriptable object assets to store different audio configurations like the frequency level the pitch range and the sound clips each with a unique id associated with them and from there we'll pass some metadata from the dialog 
itself to inform our system which audio configuration it should be using for any given line of dialogue since the dialog system in these videos is using something called ink we'll be using ink tags to accomplish passing that metadata but i'll go into more detail about that when we reach that section of the video as well and that's how it's going to work the first thing we need to do is create or find a sound clip or sound clips that we can use for the dialog sound like i mentioned previously we're going to start with this quick beep sound here but there are a few things that i wanted to point out so that if you're coming up with your own sounds you'll be creating them in a way that works well for this technique and for those unfamiliar the program i'm using here is called audacity and it's completely free to download and use i won't be going into detail on how to use audacity but i will put a link to it in the description of this video so first a lot of sounds you'll find online or even if you record your own sound you'll likely have a bit of silence at the beginning and end because we'll be playing this sound clip over and over again at a really fast pace as the dialogue types out having a bit of silence at the beginning will cause a delay in playing the sound and screw things up so we'll want to delete the silence at the beginning so that the sound starts right away when we play it and likewise it's also worth trimming off any silence at the end just in case there's unwanted noise so that our file is exactly the sound we want and nothing more the next thing i want to mention is that the technique in this tutorial works best with very short sounds i think around the 0.1 to 0.2 seconds mark is a good place to start but of course this entirely depends on how frequently you want to play the sound and it's something you'll want to play around with a bit yourself and the last thing i want to talk about is the format in which we should save the sound file the mp3 format 
encodes and saves the sound data in fixed chunks which unfortunately may add silence at the beginning and end of the clip which is something we really want to avoid for the sound so i wouldn't recommend saving your sound in an mp3 format ogg files are actually superior to mp3 files in terms of size and quality and they won't add that silence to the beginning or end so this is a really good choice and the last one i'll touch on is dot wav files which will give you the best quality since they're uncompressed but they have a much larger file size if you're not sure which one to go with i recommend going with the wave format since that'll give you the best quality and then you can always convert those files to ogg files later on if you need to optimize the size of those files now before we jump into the implementation for this i want to give a quick overview of the relevant parts of the unity project that we're starting with we have a really simple scene with the player and three different npcs that the player can talk to without getting into the finer details of this dialog system the most important part for this tutorial is that the dialog itself is managed and displayed by a singleton class called the dialog manager which gets triggered when the player talks to one of the npcs the dialog manager is where we're going to implement most of the code for this tutorial to show how this all works but of course you'll need to decide what makes sense for your own project for example you may want something like an audio manager for handling your audio sources but again we're just going to do everything in the dialog manager for this video and last there's also a sounds folder that includes all of the various sounds that we'll be using and for this tutorial they're just imported with unity's default settings for audio files so the first thing that we're going to do is get the dialog to play a sound clip each time a letter types out in the dialog manager class we'll add a 
serialize field private audio clip variable called dialog typing sound clip and to be able to play that sound clip we'll also add a private audio source variable called audio source then in the awake method we'll create that audio source component through the code by assigning that audio source variable to this.gameObject.AddComponent<AudioSource>() the dialog manager class handles displaying each line of dialog using this coroutine here called display line by setting the text of the text mesh pro object in the dialog panel to the new line that we want to display then setting the max visible characters property of the text mesh pro object to 0 and then incrementing that max visible characters property to show a single character before waiting a fraction of a second after which it'll loop and then increment again to show another character and it does that for the entire line of dialogue what we want to do here is play the sound every time we add on a character to the line so we can call audioSource.PlayOneShot and then pass in the audio clip variable which will play the audio clip a single time every time this code executes back in the unity editor be sure to drag in whatever sound clip you're using into the dialog typing sound clip variable for that script and then we can go into play mode to see how this sounds so far and we can see that when we talk to an npc and dialog plays we'll hear that sound play for every single character that types out as expected [Music] you'll notice though that the beep sound overlaps with the one after it which may or may not be what you want one way we could go about cleaning this up a bit is by making the sound itself a bit shorter but another way we could go about it would be to actually stop the sound from playing before we play the next one so we'll add a configuration to do just that at the top of the dialog manager class we'll add a serialized field private boolean called stop audio source and then in the display line method 
right before we play the sound clip we'll add a conditional statement and then stop the audio source with this line here if that variable is true then back in unity we can check that box to make sure that variable is true and then play it and then we'll see that the sound is stopped before the next one plays so they don't overlap [Music] one important thing to note here though is that because we're stopping the sound abruptly like this it can potentially cause some clicking or popping sounds depending on the sound clip and where it stops at if this is happening to you i unfortunately wasn't able to find a great fix for it but what i recommend is either shortening your sound and not using the stop option that we just added or you could try playing around with the sound itself by doing things like lowering the amplitude of the sound fading out the end of the sound and so on or of course we can make the sound happen less frequently which might be desired regardless depending on what you're going for as mentioned earlier in the video one way we can go about this is by playing the sound every other character every three characters or so on depending on how frequently we want the sound to play back in the dialog manager we're going to create a new method called play dialog sound which takes in an integer for the number of characters that are currently displayed by the typing text effect then we'll use modular division where we take that integer and mod it against the number two and then we'll check that to be equal against zero which is basically just checking if the current displayed character count is divisible by 2 and therefore will be true every other character likewise if we change this to 3 instead it would play the sound every 3 characters then we'll cut and paste this chunk of code up here where we were initially playing the sound and then put it into the play dialogue sound method here and finally where we were previously playing the sound we'll want to call 
our new method passing in the max visible characters property which tells us how many characters are displayed and we want the sound to play right at the beginning and right now it wouldn't because the first time this goes through the max visible characters property will be 1 rather than 0 so we'll actually need to move this to be before that increments so that way it's zero the first time around which means that the sound will always play right at the start and if we go into play mode to test this out we'll see that the sound is playing every other character instead of every character resulting in a half frequency [Music] so next let's make this a parameter so that we can play around with the frequency level later on at the top of the dialog manager script we'll add a private serialize field integer called frequency level and default it to 2 we'll also add this range annotation here which will give us a slider in the inspector that goes from one to five since that seems like a pretty reasonable range and then we'll switch out the number two for our frequency level when we do that modular division and back in play mode we can play around with this slider to make the sound more or less frequent [Music] we'll stick with a frequency level of two for right now next we're going to add the functionality to randomize the pitch back at the top of the dialog manager we'll add two new float variables one for the minimum pitch and one for the maximum pitch and we'll give these a range between negative three and three since that's what the pitch slider for an audio source component is restricted to in the unity inspector then in the play dialog sound method we created earlier we'll set the pitch to a random number between those values inclusively with this line here before we play the sound just to note though if you bring the pitch too low or high for a given sound the sound may disappear so i recommend that for whatever sound clips you're using you play around with the pitch 
value to figure out where an appropriate minimum and maximum is for that specific sound clip so for this beep sound we'll do a minimum of 0.5 and a maximum of two and if we enter play mode to try this out we'll hear that the pitch is randomizing like so next we're going to add the functionality to randomly select from multiple sound clips instead of just one back in the dialog manager we'll change the audio clip to an audio clip array and then also add an s on the end of the variable name so it's plural just for readability purposes and then in the play dialogue sound method we'll select a random integer between 0 inclusive and the array length exclusive then pick the sound clip to be played from that random index and of course switch out the sound clip to be played here for the randomly selected sound clip now back in unity we can add three different slots for three different types of beep sounds where the first one is going to be the original one we've been using and the other two sound just a bit different and with that in place we can go into play mode and we'll see that the sound clips are being randomly selected just like we wanted [Music] and that really does it for the bare bones of this system but next we're going to use hash codes to make this a bit less random so right now with how things are configured we have a sound playing with a random sound clip and a random pitch every other character there are a lot of different ways we could go about making the audio predictable for a given line of dialogue rather than completely random we could do something like create a map of english characters to different sounds that we want to play for those characters which is probably closer to how games like animal crossing go about it but that ends up being pretty tedious especially if you factor in things like localization where you might want to support different languages with entirely different sets of characters so with that in mind instead of doing something like a 
character map what we're going to do for this tutorial is take the character that we're playing the sound for and then generate a hash code for that character and then use that hash code to select a sound clip and pitch using modular division let's run through a quick example of this let's say we have 5 sound clips a min pitch of 1 a max pitch of 1.5 and we want to play a sound for the character a let's say that the character a will always have a hashcode of 6357089 so for selecting the sound clip we do modular division between that and the number of sound clips that we have which equals 4. so in that case we'd select the sound clip in our array with index 4 whenever we come across the letter a for selecting the pitch since we're dealing with a range between two floats the math is just a bit more complicated first we'll need to convert the float values to integer values for the max and min pitch by multiplying them by 100 then we can subtract off the min pitch value from the max pitch value so that we're working with a single integer that represents our range and from there we can use modular division between the hash code and that integer just like we did with the sound clip and then we can convert that back by adding on the minimum pitch integer and then dividing by 100 which gives us a pitch value that will always correspond to the character a within the range that we've specified the main downside here compared to doing something like a character to sound map is that the sound clips for a character will still be selected in a somewhat random way using the hashcode meaning you can't easily match certain characters with certain sounds unless you manually pair up the algorithm results for a hash code with the desired sound but i do think this approach has a huge upside of being much easier to implement and maintain especially if you're going to localize your game but of course this all depends on what you're going for as for how we generate the hash code for a 
character we can do that in a lot of different ways but for this video we're just going to use the get hashcode method that's built into c sharp but just as a quick disclaimer about this method which i didn't realize myself until after already putting the rest of this video together and that is that the hashcodes generated through this method are not meant to be permanent or in other words the hashcodes generated can be different depending on the platform .net version and overall environment that your game is running in meaning that the dialog audio will still be consistent for the same line of dialogue just like we want however if the runtime environment changes then the hash codes will be different and thus the audio will be as well if that's not consistent enough for your liking i recommend generating your own hash code instead of using the get hashcode method which can be done in a lot of ways and if you want to go that route and this is all confusing to you i recommend reading up on the computer science topic called hashing but again for this video we're just going to use the get hashcode method and with all of that said let's jump into implementing this at the top of the dialog manager we'll add a boolean variable for whether or not we should go with the hash code approach or with the randomized approach and we'll call this make predictable then in the play dialog sound method we'll add a parameter for the current character we're on that will be used to choose the sound clip and pitch next we'll add that make predictable conditional statement here cut and paste the randomized configuration into the else statement of that block and then finally declare the sound clip variable above that if else block so that it has the appropriate scope to be referenced down here now if we want to make the selection predictable first we'll get the hash code for our current character with this line here then we'll use modular division against how many sound clips we have to get the 
index in which to choose a sound clip and then of course set our sound clip appropriately with this line here for the pitch we'll multiply our min and max pitch values by 100 and then convert them to integers like we talked about earlier and then subtract off the min pitch integer now for this next part if for any reason the pitch range int variable is zero we'll get a divide by zero error so we need to check if that's the case and if it is that means our max and min pitch values are the same so we can just choose one of them to go with if it's not though we'll use modular division to select the pitch and then add back on the minimum pitch value and last we'll convert that back to a float by dividing by 100 and then set that as the pitch value for the audio source and it's really easy to miss but this f here is important since it indicates that we're dividing against the float which will make the result of that division a float instead of an int which is what we want and last of course when we call the play dialogue sound method we'll want to pass in the current character which we can get by calling dialog text.text and then treating that text like an array where the index will be the max visible character's property and because arrays start at 0 the first time around this will actually get the first character just like we want it to so having this before we increment the max visible characters property actually makes this all line up perfectly and with that in place we can check the box so that dialog audio is predictable for any given line and then we can play this to try it out we'll see that we get the same predictable dialogue audio for the same line of dialogue whereas before it would have been randomized so that pretty much does it for the audio itself but the next question you're likely asking is how to take this technique and have different audio configurations for different speakers what we need to do here is take the audio configuration which we 
implemented in the dialog manager and abstract that out in a way where we can support multiple configurations there are a lot of ways we can do this but i think this is a pretty good use case for scriptable objects so that's what we're going to do in this video if you're not familiar with scriptable objects i'll put a resource in the description of this video that i think is a great introduction to them however we're not going to do anything overly complicated so don't worry too much if it's a new concept to you so for each configuration we need we'll create a scriptable object asset that stores those configuration details along with a unique id we can use to reference that scriptable object asset with that setup we'll just need a way to tell our dialog manager which scriptable object asset we should be using at any given time to do so we can create a dictionary that maps that unique id to the configuration data then this next part is going to be a bit specific to how your dialog system works so you'll have to translate the rest of this approach to fit your own system's needs but with that said the dialog system in this video uses a narrative scripting language called ink where we can use metadata tags to inform our system of whatever information we choose to so anytime we want to change the audio being used we can use a tag that looks like this in our ink dialog where the key is the word audio and then the value is the unique id that maps to the scriptable object asset and then we can use that metadata tag value to look up the scriptable object audio configuration we should use for that line of dialogue using that dictionary so first let's set up the system to use a scriptable object for the audio configuration in the project we'll just create a new folder under scripts called audio and then in there we'll create a new c sharp script called dialog audio info so and then double click it to open it up we can remove these placeholder methods as well as make it extend 
from scriptable object instead of mono behavior and right above that we'll add the line create asset menu giving it a default file name called dialog audio info adding it to the menu as scriptable object slash dialog audio info so and we'll just give it an order of one then we'll give it a public string id which is how we're going to identify the asset and back in the dialog manager we can take whatever parameters we'd want to be able to change for different npcs and of course it's up to you what you want to keep at the system level versus an npc specific level but for this tutorial i'm going to grab everything except for the make predictable boolean which we'll keep at the system level and then we'll change all of these to be public so they're properly accessible now back in the dialog manager we'll add a serialized field private dialog audio info so and call it default audio info which will represent the default configuration if we haven't set another one through our metadata we'll also create a private dialog audio info so that we'll call current audio info then in the awake method we'll set the current audio info to our default audio info and finally in the play dialogue sound method up at the top we'll set all of the values we need based on whatever the current audio info is then back in unity we'll create a new scriptable object asset from that scriptable object script by right-clicking it going to create and then going to scriptable object dialog audio info so and then we'll rename this one to default since it's going to be the default configuration we use if our dialog metadata hasn't informed us otherwise and we'll just give it an id of default and configure it with the configuration we've been using up to this point with the three beep sounds then in the dialog manager component in the unity inspector be sure to drag in that asset to the default audio info slot and just to make sure things are still working we'll play it and we can see that the scriptable 
object asset is being used for the dialog audio configuration when we talk to the npcs the next step is going to be supporting multiple configurations and being able to change them from our dialog in the dialog manager we'll add a serialized field dialog audio info array called audio infos which is going to hold the other configurations that aren't our default and we'll also create a private dictionary to keep track of all of the configurations then to initialize that dictionary we'll create a new private method called initialize audio info dictionary in there we'll set the dictionary to be a new dictionary and then add the default configuration where the id for that configuration is the key and the dialog audio infoscriptable object is the value and then we'll loop through all of the other configurations and add them as well then we'll call that method in our start method so that the dictionary is initialized when the game starts up next we need a method to set the current audio info given an id so we'll create another private method called set current audio info that takes in a string for the id and in there we'll use the try get value method on the dictionary to try to get the dialog audio info so object but if that ends up being null that means there was no matching id in the dictionary so we'll log a warning to let ourselves know and if we were able to find it we'll simply set the current audio info to the one that we pulled out of the dictionary and as mentioned earlier in our ink dialog file the tag we're going to use to change our audio configuration is going to look like this where the key is the word audio and the value is the unique id that maps to the scriptable object asset so we'll set that up real quick by adding a constant string audio tag which equals audio in all lowercase then in this handle tags method which in this dialog system is where we're handling metadata for any given line of dialogue we'll add on a switch case for that tag and then call 
the set current audio info method that we created passing in the tag value next this is completely optional but at the end of the exit dialog mode method i'm going to make it go back to our default audio configuration with this line here just so that we don't have configuration data set from previous instances of dialog and if you followed the previous tutorials for this dialog system in particular one other change we'll want to do is make sure that we're handling the audio tag metadata before it actually starts displaying the line otherwise the audio may not switch right away when the line starts we can do that here pretty easily by creating a string to capture the text for the new line calling handle tags to handle the tags for that line and then displaying that line after with the display line coroutine and with all of that in place back in unity we'll now have an audio info section in our dialog manager where we'll want to drag in all of the audio info assets that we have i created two more scriptable object assets just like we did for the default one where one has an id of beep 1 and is using the beep 1 sound that i have and another has an id of beep 3 and uses the beep 3 sound that i have and last i've already added the audio tag to the dialog files to change the configuration between the beep 1 and beep 3 configurations when the speaker changes so if we go into play mode we'll see that the audio configuration is changing based on our metadata which happens each time the npc speaker changes just like we wanted [Music] and of course if we talk to one of the other npcs over here which we're not passing any metadata for our system just defaults to our default configuration and that's it for the system itself next just for fun i'm going to show how i configured this system to very loosely replicate the dialog audio for the games celeste and animal crossing just a quick disclaimer though i'm not a very skilled audio engineer myself and the sounds i'll be using are 
quite simple in comparison to the actual sounds being used in those games the purpose of this demonstration is just to show a couple of ways of configuring the system created in this video it's not to try to replicate those other games exactly as that would require a wealth of experience in audio engineering that i simply don't have with that said if you're looking to take your dialogue audio a step further i'll put some useful resources for reproducing both celestes and animal crossings dialogue audio in the description of this video and with that said here's how i went about these loose replications for doing a replication of the dialogue audio in celeste i used several synth sounds each with somewhat of a vocalization to them which a friend of mine provided for this video from there i altered them slightly by speeding them up changing the pitch a bit and playing around with some other random effects like wah which resulted in clips that sounded like this for the audio configurations i did two variations for two different speakers where the first is just a bunch of those sound clips a frequency level of four and a small range for the pitch for the other configuration i used a slight modification of those sound clips where the pitch on the clips themselves is lower and a bit distorted but otherwise the other configurations are the same so here's how the dialog audio in celeste sounds [Music] and then here's how the replication sounds you can really tell that a lot of care was put into the dialogue audio for celeste and i think if i were to try and get closer to it i'd need to experiment around with more synth sounds and maybe try to add some more raspiness to the sound as well there's actually a really great video out there where the sound designer for celeste talks through what he did for all kinds of audio in the game he used something called fmod which integrates nicely with unity and it's something that i actually plan on doing some tutorials on in the future 
myself and the entire fmod project for celeste along with all of the sounds are available for download for educational purposes next for doing a replication of the dialogue audio in animal crossing i recorded myself saying vowel sounds and then increased the speed and pitch of those by quite a bit for this one i actually created three different configurations one with a high pitch a second with a low pitch and a third with a medium pitch each with a frequency level of two and a small range for the pitch and here's how some dialogue in animal crossing sounds [Music] and here's how my replication sounds [Music] i'm actually pretty happy with how this one turned out given such simple sounds but i think adding more sound clips with different vocalizations would be a really good improvement if i were to expand on this further there are a couple of other really great videos i came across when researching this topic myself and i'll be putting those in the description of this video if you're interested in seeing some other approaches to reproducing the dialogue audio for animal crossing and that's it for this video thank you so much for watching if you liked the video please give it a thumbs up so more people see it and if you want to see more from me be sure to subscribe as well doing so really helps out this channel and i really appreciate it you're also welcome to come by my discord server which is a great place to hang out whether you're just learning about game development working on a passion project or just want to share what you created with this video which i'd love to see you can also follow me on twitter instagram or tiktok where i mostly post about the game that i'm creating anyways thanks again for watching and i hope this was helpful [Music]
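
The typing loop and frequency check described in the captions can be sketched roughly as follows. This is a minimal sketch, not the code shown in the video: the class name `DialogueTypingAudioSketch`, the field names, and the `typingSpeed` value are assumptions made for illustration, and rich text tags in the line are not handled.

```csharp
using System.Collections;
using TMPro;
using UnityEngine;

// Hypothetical, trimmed-down stand-in for the dialog manager in the video.
public class DialogueTypingAudioSketch : MonoBehaviour
{
    [SerializeField] private TextMeshProUGUI dialogueText;
    [SerializeField] private AudioClip dialogueTypingSoundClip;
    [SerializeField] private float typingSpeed = 0.04f; // assumed delay per character
    [Range(1, 5)]
    [SerializeField] private int frequencyLevel = 2;    // play every Nth character
    [SerializeField] private bool stopAudioSource;      // cut off the previous beep first

    private AudioSource audioSource;

    private void Awake()
    {
        // Create the audio source through code, as described in the captions.
        audioSource = gameObject.AddComponent<AudioSource>();
    }

    // Started elsewhere with StartCoroutine(DisplayLine(line)).
    private IEnumerator DisplayLine(string line)
    {
        dialogueText.text = line;
        dialogueText.maxVisibleCharacters = 0;

        for (int i = 0; i < line.Length; i++)
        {
            // Called before the increment so the count is 0 on the first
            // character and the sound plays right at the start of the line.
            PlayDialogueSound(dialogueText.maxVisibleCharacters);
            dialogueText.maxVisibleCharacters++;
            yield return new WaitForSeconds(typingSpeed);
        }
    }

    private void PlayDialogueSound(int currentDisplayedCharacterCount)
    {
        // Only play when the displayed count is divisible by the frequency level.
        if (currentDisplayedCharacterCount % frequencyLevel != 0)
        {
            return;
        }
        if (stopAudioSource)
        {
            audioSource.Stop();
        }
        audioSource.PlayOneShot(dialogueTypingSoundClip);
    }
}
```

With `frequencyLevel` set to 1 the sound plays on every character, and raising it spaces the beeps further apart.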
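
The hash code arithmetic walked through in the captions (five clips, a pitch range of 1 to 1.5, and a hash code of 6357089) can be written as plain C# with no Unity dependency. This is a sketch under assumptions: the class and method names are hypothetical, rounding is used instead of truncation when converting pitches to integers, and the non-negative remainder helper is an addition not mentioned in the video, included because hash codes in general can be negative and a negative remainder would produce an invalid array index.

```csharp
using System;

// Hypothetical helper; names are illustrative, not from the video.
public static class PredictableDialogueAudio
{
    // Remainder that is always in [0, modulus), even for negative values.
    private static int NonNegativeMod(int value, int modulus)
    {
        int remainder = value % modulus;
        return remainder < 0 ? remainder + modulus : remainder;
    }

    // Picks which clip to play for a given hash code.
    public static int SelectClipIndex(int hashCode, int clipCount)
    {
        if (clipCount <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(clipCount));
        }
        return NonNegativeMod(hashCode, clipCount);
    }

    // Picks a pitch within [minPitch, maxPitch) for a given hash code.
    public static float SelectPitch(int hashCode, float minPitch, float maxPitch)
    {
        // Work in hundredths so the float range becomes an integer range.
        int minPitchInt = (int)Math.Round(minPitch * 100f);
        int maxPitchInt = (int)Math.Round(maxPitch * 100f);
        int pitchRangeInt = maxPitchInt - minPitchInt;

        // A zero or inverted range would break the modulo, so fall back to minPitch.
        if (pitchRangeInt <= 0)
        {
            return minPitch;
        }

        int predictablePitchInt = NonNegativeMod(hashCode, pitchRangeInt) + minPitchInt;
        // The f suffix keeps this a float division rather than an integer one.
        return predictablePitchInt / 100f;
    }
}
```

For the example in the captions, `SelectClipIndex(6357089, 5)` gives 4, and `SelectPitch(6357089, 1f, 1.5f)` gives 1.39, since 6357089 mod 50 is 39. If the runtime dependence of `GetHashCode` is a concern, one simple option is to pass `(int)currentCharacter` as the hash code, since the character's numeric value does not change between environments.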
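
The per-speaker configuration described in the captions, a scriptable object asset with an id plus a dictionary lookup driven by an `audio` tag, might look something like the sketch below. Several details are assumptions rather than anything confirmed by the captions: the exact class and field names, the `key:value` tag format split on a colon, and the `HandleTags` signature taking a list of strings. In a real Unity project each class would live in its own file named after the class.

```csharp
using System.Collections.Generic;
using UnityEngine;

// File: DialogueAudioInfoSO.cs (assumed name)
[CreateAssetMenu(fileName = "DialogueAudioInfo", menuName = "ScriptableObjects/DialogueAudioInfoSO", order = 1)]
public class DialogueAudioInfoSO : ScriptableObject
{
    public string id;
    public AudioClip[] dialogueTypingSoundClips;
    [Range(1, 5)] public int frequencyLevel = 2;
    [Range(-3, 3)] public float minPitch = 0.5f;
    [Range(-3, 3)] public float maxPitch = 3f;
    public bool stopAudioSource;
}

// File: DialogueAudioLookupSketch.cs (hypothetical stand-in for the dialog manager)
public class DialogueAudioLookupSketch : MonoBehaviour
{
    private const string AUDIO_TAG = "audio";

    [SerializeField] private DialogueAudioInfoSO defaultAudioInfo;
    [SerializeField] private DialogueAudioInfoSO[] audioInfos;

    private DialogueAudioInfoSO currentAudioInfo;
    private Dictionary<string, DialogueAudioInfoSO> audioInfoDictionary;

    private void Awake()
    {
        currentAudioInfo = defaultAudioInfo;
    }

    private void Start()
    {
        InitializeAudioInfoDictionary();
    }

    private void InitializeAudioInfoDictionary()
    {
        // Note: Add throws if two assets share the same id.
        audioInfoDictionary = new Dictionary<string, DialogueAudioInfoSO>();
        audioInfoDictionary.Add(defaultAudioInfo.id, defaultAudioInfo);
        foreach (DialogueAudioInfoSO audioInfo in audioInfos)
        {
            audioInfoDictionary.Add(audioInfo.id, audioInfo);
        }
    }

    private void SetCurrentAudioInfo(string id)
    {
        if (audioInfoDictionary.TryGetValue(id, out DialogueAudioInfoSO audioInfo))
        {
            currentAudioInfo = audioInfo;
        }
        else
        {
            Debug.LogWarning("Failed to find audio info for id: " + id);
        }
    }

    // Assumes each tag arrives as a "key:value" string, e.g. "audio:some_id".
    private void HandleTags(List<string> currentTags)
    {
        foreach (string tag in currentTags)
        {
            string[] splitTag = tag.Split(':');
            if (splitTag.Length != 2)
            {
                Debug.LogWarning("Tag could not be parsed: " + tag);
                continue;
            }
            if (splitTag[0].Trim() == AUDIO_TAG)
            {
                SetCurrentAudioInfo(splitTag[1].Trim());
            }
        }
    }
}
```

Handling the tags for a line before starting to display it, as the captions suggest, ensures the new configuration is already active when the first character's sound plays.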
Info
Channel: Shaped by Rain Studios
Views: 15,240
Keywords: trevermock, trevormock, trevor, trever, mock, celeste, animal crossing, gibberish dialogue, dialogue audio, dialog audio, dialogue sound effect, dialog sound effect, dialogue sound, dialog sound, animal crossing dialogue, animal crossing dialog, celeste dialogue, celeste dialog, unity, unity2d, unity3d, unity dialogue system tutorial, unity dialog system tutorial, how to create dialog audio, how to create dialogue audio, animal crossing dialogue audio, unity tutorial
Id: P3FcXHEai_E
Length: 29min 38sec (1778 seconds)
Published: Mon Aug 22 2022