How Speech Synthesizers Work

Reddit Comments

Knew it was 8 bit guy love that guy

👍 1 👤 u/itsmejak78 📅 Sep 11 2020

i had a speak n spell! it was a BIG deal! my dad negotiated with the salesperson like you would when you buy a car!

👍 1 👤 u/VolupVeVa 📅 Sep 11 2020

My sister had Speak N Spell and I had Speak N Math. I'm still terrible with math!!

👍 1 👤 u/WattsAGigawatt 📅 Sep 11 2020

Speak n spell ftw.

👍 1 👤 u/Automatic-Pie 📅 Sep 18 2020
Captions
Open the pod bay doors, HAL. I'm sorry Dave, I'm afraid I can't do that. Back during the 1960s, 70s, and 80s we were mesmerized by computerized voices in our movies and television shows. Negative, a close copy. No alternative possible, Master. I'm sorry Michael, but I can't do that. No good, I've got three. Warp energy has increased 14 percent. There's only one problem. They were all a lie. None of these were actual computer voices. Even the movie WarGames, with its very convincing sounding computer, wasn't actually real either. Shall we play a game? If you are wondering why it sounds so artificial, what they did was have the actor read the words in reverse. So, for example, instead of saying "Would you like to play a game of chess?" they had the actor read it like this, "Chess of game a play to like you would?" I'll use Audacity to illustrate what they did. They cut the individual words out like this, and re-arranged them in the correct order. Would you like to play a game of chess? I think they also changed the pitch some, so we'll do that too. Would you like to play a game of chess? And then, they probably added some sort of effect, I'll play with one here and see what I can get. Would you like to play a game of chess? So, anyway, that's roughly how it was done. Now, why they didn't just use a real speech synthesizer, I have no idea. I mean, speech synthesizers did exist at that time. Being the movie was made in 1983 however, it was probably in pre-production for a year or two before that, so speech synthesizers, while they did exist, they weren't terribly common at the time, so who knows?
Of course, there have been devices such as talking dolls since Edison's first model that came out in 1890. These essentially had a miniature phonograph inside of them such as this one shown in Get Smart back in 1969. My name is Mary Lou. Or better yet, the miniature Yogurt doll from Spaceballs. May the Schwartz be with you.
These actually worked very similar to another device, the popular See N Say by Mattel. The cow says moooo. The original model used a type of internal phonograph as well. The design of these is quite interesting. I've taken this one apart so you can see how it works. Everything fits inside of this mechanism here. Let me show you what we're actually looking at there. This is the actual record, which is made of plastic. And this part here is the tone arm, and you can see the stylus, which is very dirty at the moment, is slightly lifted from the record. This part here is a rudimentary speaker. And it amplifies vibrations by staying in contact with the tone arm. So, let's show it in action. The cow says moooo. Now, you may notice that each time a track is played, the stylus travels all the way across the record. So you may be wondering how there is room for multiple sounds. Well, here's how that works. Normally, you think of a vinyl album and it has different songs and you can essentially see the divisions between those songs. But the See N Say works very different. The tracks are all wound together like this. If you look at the outer edge of the record closely, you'll be able to see different entry point grooves. Each one of these is the start of a specific track. And so, by pointing the arrow at the sound you want to hear, it will align it with the entry groove of that track. Neat, huh? Oh, and by the way, you can actually play sounds on this thing with just these two parts, but it takes practice.
So what about those talking cars of the 1980s? Don't forget your keys. Well, you might be tempted to think these are computerized synthesizers, but they aren't. Murilee Martin from Autoweek recently took one of the speech boxes apart and showed that they were actually little phonographs that work extremely similar to a See N Say. Parking brake is on. The only real difference is that the amplifier is electronic instead of mechanical.
They even have the same little selection of entry grooves to pick which sound it needs to play. So, these aren't real speech synthesizers either.
So, what about these early computer games that incorporated speech into them? Ghostbusters! These games were extremely impressive in the early 1980s. And as cool as they were, they were not real speech synthesizers either. These games are simply using digitized recordings of speech. I mean, these sounds could just as easily have been dogs barking or cats meowing, or any other sounds. So in essence, these games used the digital equivalent of a See N Say. Hence, they'd never be able to say anything other than what was pre-recorded for them.
The Speak and Spell was one of the first consumer devices that started to cross the line into actual speech synthesis. E A R N. That is correct. Now spell one as in one word. But I hesitate to call the Speak and Spell a true speech synthesizer, because it really only knows a little over 200 words and those words are all pre-recorded. So, in essence it's like a See N Say that just happens to know 200 some odd words. In fact, if you wanted to add more words to your Speak and Spell, it was necessary to buy new vocabulary cartridges, which had additional recorded sounds on ROM. These could be inserted through the battery compartment. Nevertheless, having appeared on the market in 1978, it was one of the first talking electronic devices to reach the consumer market.
However, the Speak and Math starts to blur the lines. The reason is, it can pronounce any number imaginable because it has recorded all of the sounds that make up numbers. That's correct! Now try forty five thousand eight hundred three. So, to say a word like 45 thousand 8 hundred 3, there are 6 separate sounds that have been recorded to make this phrase. And so, if you wanted to change it to 46 thousand 8 hundred and 3, then you just replace one sound. This is very similar to how the Radio Shack VoxClock works.
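As a rough illustration of the Speak and Math idea described above, here is a minimal Python sketch that breaks a number into the sequence of recorded clips needed to say it. The clip names and the function names (`number_to_clips`, `below_thousand`) are assumptions made up for this example; they are not taken from the actual device.

```python
# Hypothetical clip names standing in for the pre-recorded number sounds.
ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def below_thousand(n):
    """Return the clip names for a value from 0 to 999 (empty list for 0)."""
    clips = []
    if n >= 100:
        clips += [ONES[n // 100], "hundred"]
        n %= 100
    if n >= 20:
        clips.append(TENS[n // 10])
        n %= 10
    if n > 0:
        clips.append(ONES[n])
    return clips


def number_to_clips(n):
    """Return the ordered clip names needed to say n (0 to 999,999)."""
    if not 0 <= n < 1_000_000:
        raise ValueError("this sketch only handles 0 to 999,999")
    if n == 0:
        return ["zero"]
    clips = []
    if n >= 1000:
        clips += below_thousand(n // 1000) + ["thousand"]
    clips += below_thousand(n % 1000)
    return clips


print(number_to_clips(45803))
print(number_to_clips(46803))
```

Under these assumptions, 45803 needs six clips, and changing it to 46803 swaps only one of them. The same mix-and-match idea applies to a talking clock such as the Radio Shack VoxClock.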
It only has a few dozen pre-recorded sounds and it mixes and matches them to produce the time. It's five seventeen PM. Also similar is the Tel Star answering system from around the same time period. The time is eleven forty two AM, March nineteen.
Next, I want to show you the Commodore Magic Voice speech cartridge. On the side it has an audio in and audio out. It just plugs into your C64 like so, and all you would need to do is run an audio cable out to an amplifier, or in my case the television I'm using. Of course, if you wanted to still hear the internal sounds from the Commodore 64, then you would run that into the audio input like so. Then, you just fire up the C64. And you can use the say command to type something like this. Commodore! Or this. Computer! But it's kind of limited and you can only say one word at a time. So, to say something more complex, you could write it as a BASIC program. Commodore Computer. But, what if I try something like this. As you can see, it doesn't work. Believe it or not, the Magic Voice cartridge is not a true speech synthesizer. It has a list of 234 pre-recorded words that it can say. In fact, if you give it a number it will say the word that corresponds with that number. Control! In fact, I'll demonstrate this further by writing a little BASIC program that says all words between 100 and 200. Find! Get! Have! Hear! Help! Is! Know! Like! Presents! So, the Magic Voice cartridge is also just a digital equivalent of a See N Say, that just happens to know 234 words. The rooster says, "cockadoodledoo!" Fortunately, other software can add new phrases to it. For example, the cartridge game GORF adds additional phrases that are used with the game. I'll just plug it into the little passthrough connector here, and let's check it out. Commodore presents Gorf, a Bally/Midway game. As you play the game, the enemy will taunt you with insults, among other information. Ha! Ha! Ha! Gorfian robots attack! attack!
It's a neat gimmick, but doesn't really add much to the game in my opinion. In fact, very few games support this.
Mattel also introduced a very similar device for its Intellivision gaming console around the same time. It's called the Intellivoice. One side plugs into the game console, and the other end is where you put the game cartridge. On the front is a volume control for the voice. There were only a total of 5 games that ever supported it, and I have 3 of them right here. Now, these games will work without the Intellivoice, but they just won't have any speech. So, let's try this thing out. The first game I'll try is Bomb Squad. Let's power it on. Mattel Electronics presents Bomb Squad! They'll never do it in time! The code! The code! Figure out the code! It won't be easy. Replace this first, this third, this second. Mattel Electronics presents Tron. OK, let's try out Tron Solar Sailer instead. 7 4 7 8 2 Energy High. Again, the speech is a nice gimmick, but isn't really all that useful. It's not surprising that the product was considered a flop.
So, up to this point, everything I have shown you has been devices that, while they can speak, are really only playing back select pre-recorded sounds. So, they are pretty limited in the things that they can say. Now I want to show you some true speech synthesizers. These devices can actually create words out of allophones, which are basically the fundamental building blocks of speech such as vowels, and consonants and some of the other sounds that we make when we talk. The first one I want to show you is the Currah Speech 64 cartridge for the Commodore 64. This was also marketed under the name of Voice Messenger. Now you may notice this DIN cable hanging out the side. Let me show you how this works. The cartridge plugs in like any other, but then this part plugs into the monitor port on the C64. It's actually making use of the seldom-used audio-input line on the Commodore 64.
This allows audio to pass through the sound chip and back out, at the same time mixing the sound with the internal sound. Of course, most Commodore users back then were actually using a television for a display, so this was actually a pretty elegant design. It was supposed to come with a breakout cable if you were using the cartridge with a monitor. Mine didn't come with it, so I will make my own so that I'll be able to get some clear recordings of it. OK, so when you power on your C64, you'll need to type INIT. Return! And at this point, it will literally tell you every key you are pressing on the keyboard, A B C D E F G Return!, which is sort of annoying. However, you can tell it to say a word. Return, hello! If you get tired of hearing every key press, you can type KOFF to turn the keyboard speech off. K O F F Return. Now, here's where things get interesting. Hello. Not only can I say a single word. But I can type literally anything inside these quotes and it will say it. Hello there, how are you doing? Of course, it works off of English spelling rules, which to say the least aren't very consistent. So it isn't perfect. Let me give you an example. Harry Potter. OK, it gets pretty close on that. But let's try Hermione Granger. Hermione Granger. Yeah, it totally fails on that one. You can also change to different voices by putting a 0 or a 1 in front of the sentence to be spoken. So here's voice zero. This is voice zero. And you've already heard voice 1, which is the default. This is voice one.
Next, I want to show you the speech sound program pack for the Tandy Color Computer. This cartridge contains not only a speech synthesizer, but also a slightly better sound chip for the CoCo. So, let's pop this in there. So, on boot up the computer doesn't really do anything different. There are no SAY commands in BASIC or anything like that. Fortunately, mine has the user's manual with it.
It looks like if I want to test the speech, I'll have to type this little program in. Oh the joys of typing in BASIC programs from a book. OK, all done, now let's test it out. Test. It works! This is a Tandy. So, as you can see this is a true speech synthesizer that can say anything you type. Of course, just like the others, certain words will throw it for a loop. Hermione Granger. However, you can always get around this by typing in the correct sounds, like this. Hermione Granger.
OK, I have another type of speech synthesizer to show you. This is called SAM. It's a software-only speech program that was made for the Commodore 64, Atari, and Apple II computers. This particular disk is for the Commodore 64, so I'll put it in and load it up. So, first you have to load the actual speech part into RAM, and then you load a small interface program. This was done so that you could use SAM with other programs if you wanted. OK, let's see what it sounds like. This is a test. Let's try some other things. I love the Commodore 64. SAM is a true speech synthesizer as it can say anything you throw at it. I can say anything you want. Of course, with certain limitations. Harry Potter. Again, you can't expect a computer with 64K of RAM to have a database of every possible English word, so it has to make some assumptions. Hermione Granger. But, again, you can get around this by tweaking the spelling of the words so that you get the pronunciation that you want. Hermione Granger. SAM was also very configurable. You could change all sorts of aspects of the voice. So, here's a higher pitch. I love the Commodore 64. And here's me changing the mouth variable. I love the Commodore 64.
So, you might ask, if you could do true speech synthesis completely with software, then why did these cartridges exist? Well, one thing you might notice is that every time you tell SAM to say something, it causes the screen to blank because it requires every cycle of the CPU to produce the sound.
So there's no time for the CPU to do anything else. In fact, even when you look at games that used speech, typically the entire game comes to a halt while the speech occurs. On the other hand, when you have a speech synthesizer cartridge, it can handle the work of producing sound, while the computer can keep doing other things. Another enemy ship destroyed. Ha! Ha! Ha! By the way, it's worth mentioning that SAM has been reverse engineered and reprogrammed as a website you can use now. So it's really easy to try it out. Yes, I sound just like the Commodore 64 version.
So what were the practical uses for speech synthesis? Well, when we were 10-year-old kids, probably the favorite thing that we liked to do with them was to make the computer say all kinds of filthy curse words. And yes, programs like SAM, they could absolutely say anything you wanted them to say. You know, when we were 10 years old, that alone could provide hours' worth of entertainment for us, but I think the second most popular use for it was for making prank telephone calls. So, for example back in those days we had phones like these, and no caller ID. So we didn't know who was calling until we answered the phone and talked to somebody. So, it was hilarious to type out some insulting message like this. And then we'd just dial somebody's phone number and put the handset up to the television like this and wait for them to answer. Hello? Hey there Techmoan! I just have to tell you that your YouTube channel is total crap! Flippin' idiot!
But seriously, speech synthesis has found numerous uses over the years, such as being the voice of Stephen Hawking. The first question they asked it was, is there a God? And even automated telephone services like these. Hello and welcome to Moviefone. If you know the name of the movie you'd like to see, press 1. And it has continued to improve over the years with things like Siri. Hey Siri, what is speech synthesis?
Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech computer or speech synthesizer. In fact, I'm using an online speech service to narrate this section of the video. Pretty neat, huh?
If you are interested in the early development of speech synthesizers, I recommend that you check out the VODER, which was one of the first of its kind. It came out back in the 1930s. It was completely analog. Say "She saw me" with no expression. She saw me. Now say it in answer to these questions. Who saw you? She saw me. Since it had no CPU, of course, it required a human to actually like play the different sounds almost like playing a piano.
And, so that about wraps it up for this episode. So, as always, thanks for watching!
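To make the difference between playback and true synthesis a little more concrete, here is a minimal Python sketch of the letter-to-sound idea behind the synthesizers shown in the video: spell out text, look up each letter or letter pair in a rule table, and emit a sequence of allophones. The rule tables, the allophone labels, and the function name `to_allophones` are all invented for illustration and are far smaller than anything a real product would use; the actual rules inside SAM or the speech cartridges are not shown in the video.

```python
# Hypothetical rule tables; the labels are made-up allophone names.
PAIRS = {"sh": "SH", "ch": "CH", "th": "TH", "ee": "EE", "oo": "OO"}
SINGLES = {
    "a": "AE", "b": "B", "c": "K", "d": "D", "e": "EH", "f": "F", "g": "G",
    "h": "H", "i": "IH", "j": "J", "k": "K", "l": "L", "m": "M", "n": "N",
    "o": "AO", "p": "P", "q": "K", "r": "R", "s": "S", "t": "T", "u": "AH",
    "v": "V", "w": "W", "x": "KS", "y": "Y", "z": "Z",
}


def to_allophones(text):
    """Convert text to a list of allophone labels using fixed spelling rules."""
    text = text.lower()
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in PAIRS:              # two-letter rules take priority
            out.append(PAIRS[pair])
            i += 2
        elif text[i] in SINGLES:       # fall back to one sound per letter
            out.append(SINGLES[text[i]])
            i += 1
        elif text[i] == " ":           # a space becomes a short pause
            out.append("PAUSE")
            i += 1
        else:                          # ignore punctuation and digits
            i += 1
    return out


# Any typed text produces some output, unlike a fixed word list.
print(to_allophones("hello there"))

# Fixed rules guess badly on unusual names; respelling changes the result.
print(to_allophones("hermione"))
print(to_allophones("her my oh nee"))
```

Because the rules only see spelling, an unusual name gets whatever the table happens to produce, which is why respelling a word by sound gives a different result.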
Info
Channel: The 8-Bit Guy
Views: 1,955,448
Keywords: speech, voice, synthesizer, messenger, commodore, apple, atari, VODER, VOCODER, speak and spell, mattel, intellivision, tandy, color computer, BASIC, allophones, phonemes, robot, video game, android, star trek
Id: XsMRxNSDccc
Length: 18min 14sec (1094 seconds)
Published: Sat Mar 09 2019