What's The Longest Word You Can Write With Seven-Segment Displays?

Video Statistics and Information

Reddit Comments

I definitely write a 4 that way. (Not the way it's typed here.)

👍︎︎ 59 👤︎︎ u/cphcider 📅︎︎ Oct 08 2018 🗫︎ replies

I like how he goes into the ideas behind his coding choices. "I'm going to use node because it's easier to show you what's going on" and "This could be more efficient, but we're just running one script, so performance isn't a factor in this situation" and especially "This is what the boilerplate does. I totally had to look this up because programming isn't about memorisation, it's about breaking a problem down and solving it with code"

This is the type of thing I would show someone who thinks they want to get into programming. Great look into the thought process behind a project like this.

👍︎︎ 50 👤︎︎ u/thurstylark 📅︎︎ Oct 08 2018 🗫︎ replies

Can’t believe he never showed the final word in seven segments...

👍︎︎ 30 👤︎︎ u/LaszloK 📅︎︎ Oct 08 2018 🗫︎ replies

I love the way he explained coding. I totally agree with what he said. It's normal to have to regularly look up coding syntax and not know how to do things like reading in files right away.

I also appreciate his point on code legibility. Lots of programmers like to be fancy pants in situations where they don't have to be. For most applications (not large-scale things like Youtube, like he mentioned), speed is second to readability and sustainability.

👍︎︎ 15 👤︎︎ u/pxan 📅︎︎ Oct 08 2018 🗫︎ replies

SPOILERS: Here's the word as displayed on the screens https://imgur.com/Ng22qYR

👍︎︎ 10 👤︎︎ u/arielmanticore 📅︎︎ Oct 09 2018 🗫︎ replies

Gotta love Tom

👍︎︎ 9 👤︎︎ u/alazar221 📅︎︎ Oct 08 2018 🗫︎ replies

Ruling out I and O doesn't feel right. I, maybe, because it looks like a 1, but O can be written lowercase and it won't look like a 0. I'd include G (6 without the center line), Q (lowercase), and Z (mirrored S) as well.

K, M, V, W, X are definitely out of the picture though.

👍︎︎ 16 👤︎︎ u/futlapperl 📅︎︎ Oct 08 2018 🗫︎ replies

Loves me some Tom. Really respect how few takes were needed to make his commentary as well.

👍︎︎ 5 👤︎︎ u/whatsaphoto 📅︎︎ Oct 08 2018 🗫︎ replies

Really loved this video. So well done, so well-explained, too. You can tell he's super passionate about the topic, too. Can't believe this is the first time I'm seeing Tom's channel just now

👍︎︎ 4 👤︎︎ u/JMhere 📅︎︎ Oct 08 2018 🗫︎ replies

Captions

Seven-segment displays are great. We don't see them as much in electronics these days because screens are a lot cheaper than they used to be, but these used to be everywhere from clocks to supermarket checkouts to calculators to slot machines. Here, they're being used as part of the Megaprocessor at the Centre for Computing History in Cambridge. This is the clock speed that the whole system is running at. Seven-segment displays are a really clever bit of design. Seven is the minimum number of segments required to show every number using straight lines. It doesn't matter that the 4 isn't actually how most people write a number 4, it's close enough that we've all just got used to it. And seven segments plus a decimal point means that there are eight lights here to turn on or off, which is convenient: computers like working with eights, there are eight bits in a byte, so you can store the state of those lights in just one byte of memory. And you can use these for some letters, too. You could show the word "Error", or even base-16 numbers. But there's no way you can write an M or a W on this display. So here is a code question for you: what is the longest English word that you can write on a seven-segment display? I love questions like this because there are so many ways to approach them. I'm going to give one solution here because that's the way I want to tell the story, but I can think of a couple of others just off the top of my head, and there will be a dozen more that I haven't thought of, or maybe couldn't even think of. To start off, we need a dictionary, and fortunately here the work has been done for us. There is a public-domain list of English words available, I've put the link in the description. Now, this is not a perfect list: right at the top there are two different spellings of 'aarrgh' and I'm really not sure either of those should count, but the list is good enough for our purposes. We can manually filter it later if there are any strange results. 
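The one-byte point made above can be sketched in JavaScript, the language used later in the video. This is only an illustration: the bit order (segment a in bit 0 through g in bit 6, decimal point in bit 7) is an assumption, real hardware may wire the segments differently, and the names `SEGMENT_BITS`, `encodeSegments`, `DIGIT_PATTERNS` and `litSegments` are hypothetical.

```javascript
// Assumed bit layout (not from the video):
// a = bit 0, b = bit 1, c = bit 2, d = bit 3, e = bit 4, f = bit 5, g = bit 6, dp = bit 7.
const SEGMENT_BITS = {
  a: 0x01, b: 0x02, c: 0x04, d: 0x08,
  e: 0x10, f: 0x20, g: 0x40, dp: 0x80,
};

// Build one byte from a string of segment names, e.g. "bc" for the digit 1.
function encodeSegments(segments, withDecimalPoint = false) {
  let byte = withDecimalPoint ? SEGMENT_BITS.dp : 0;
  for (const name of segments) {
    byte |= SEGMENT_BITS[name];
  }
  return byte;
}

// Conventional segment patterns for the digits 0 to 9 under the layout above.
const DIGIT_PATTERNS = [
  "abcdef", "bc", "abdeg", "abcdg", "bcfg",
  "acdfg", "acdefg", "abc", "abcdefg", "abcdfg",
].map((segments) => encodeSegments(segments));

// List which segments are lit in a given byte.
function litSegments(byte) {
  return Object.keys(SEGMENT_BITS).filter((name) => (byte & SEGMENT_BITS[name]) !== 0);
}

console.log(DIGIT_PATTERNS[4].toString(2).padStart(8, "0"));
console.log(litSegments(DIGIT_PATTERNS[4]));
```

Each pattern fits in the range 0 to 255, so the whole state of one display, decimal point included, fits in a single byte.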
Now, I'm going to code this in JavaScript using Node. Not the best language, but it's not bad for beginners, and importantly it's easy for me to explain. First, we're going to add a bit of boilerplate code, just stock stuff to get it working. Did I know all that code off by heart? No, of course not, I Googled 'load array from file in node' and I adapted some of the results. This is basically how everyone codes stuff like that. Don't ever be afraid that you're not a real programmer because you still look stuff up. The important part about programming is not remembering exact words or syntax: it is breaking down a problem, working out how to solve it, and then fixing all the inevitable bugs in your solution. It's about holding lots of complicated connections in your head, not the exact magic words that you need this one time. I still forget which order to put basic stuff in sometimes. Anyway, the first line loads in the bits of Node that deal with reading and writing files, and the next line loads the entire dictionary into a single long string called "words". The next line converts that single string into an array, a long list of smaller strings, based on where the new-line characters are. And now we have an array of all the words in the English language: basically just a long list of strings. Let's see how long that array is by telling the console to output the array's length. Okay, more than 370,000 words, each one in a separate item in that array called “words”. Next problem: we need to filter that list and remove any words that use letters that we can't display with seven segments. Which gives us an interesting design problem: which letters can't we display? Now, I'm going to use fancy graphics here rather than actual seven-segment displays, but for extra credit, you can try and figure out the After Effects expressions that I used to make these. Letters A through F are easy, there's almost a standard for those. But G is difficult. 
We can't use the obvious pattern, because that's a 6, or a 9 if it’s lowercase. And if we use an alternate pattern for it, it's... not really a G? It's a C with aspirations. But to be fair, if any of those patterns appear at the start of a word like 'GOAL', no-one's going to look at it and say 'oh, six-OAL'. But I'm going to make the call that we don't allow it. I think I is all right, though. Like, that's clearly an I, it's not like the half-assed G which sorta looked kinda like a C? It's clear. I mean, I don't care that I'm not applying strict rules here, I'm just going on what feels right, and if you disagree, you can fix it in your version. Other letters that I'm ruling out: K. Just can't be done, requires a diagonal. M: I've seen it displayed like this before, but: no. Not having it. N is borderline, but I'm going to allow it because there's nothing else it could be. Q is out: that is just a 9. R, I'll allow if it's lowercase. S is OK, same reasons as I. But there's no way to do V, or W, or X. And as for Z... no. It needs the diagonals. Not having it. Here's our alphabet, then. Eighteen letters left, eight disallowed. That's actually more than I'd expected left in there. And I'm going to cheat. I know this is called the Basics, but doing this the long way would be really dull, so I'm going to put those disallowed letters into something called a regular expression, or a regex. Or "reg-ex", whichever. Those slashes indicate that it’s a regex, and whatever's inside those slashes is like a test that a string can match against. So this regex would match any word with an X in it, whether that X is at the start or middle or end. As long as there's an X somewhere in the string, it passes that test. If we put all our disallowed letters in, then surround them with square brackets so they're treated as a class, this regex will match any string, any word, with any of those letters anywhere in it. If a string matches this, we cannot use it. 
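A minimal sketch of the character-class test described above, using the eight letters ruled out at this point in the captions. The name `badLetters` appears in the captions; the `i` flag (to ignore case) and the helper `hasBadLetters` are assumptions added for illustration. `RegExp.prototype.test` is used here because it returns a plain true or false; the string method `match` would work as well.

```javascript
// Letters ruled out so far: G, K, M, Q, V, W, X, Z.
// The square brackets make a character class: any one of these letters matches,
// wherever it sits in the string.
// The "i" flag is an assumption so that upper-case input is caught too.
const badLetters = /[gkmqvwxz]/i;

// Hypothetical helper: true if the word cannot be shown on the display.
function hasBadLetters(word) {
  return badLetters.test(word);
}

console.log(hasBadLetters("error"));  // false: no ruled-out letters
console.log(hasBadLetters("matrix")); // true: contains M and X
```

Because the regex has no `g` flag, calling `test` repeatedly carries no state between calls, so it is safe to reuse the same object for every word.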
Regular expressions are a heck of a lot more complicated than this, and they can boggle even experienced programmers' minds, but using one here will save me about five minutes of really dull script later on. The good news is: we can now just use the function 'match' to test a string against this regular expression, which I'm calling badLetters, and it'll tell us whether there are any bad letters in there. So how do we filter the array? One of the important trade-offs here is between code that is fast and code that everyone can understand. This is not going to be an efficient and fast approach. But because we're running it on a modern PC, at the command line, and we don't mind waiting a fraction of a second after telling it to go, that's no big deal. It's more important that I can show the code and explain it, and look at it again later and understand it. But imagine if something like this had to be run on the scale of YouTube or Google, running millions or billions of times a day. Every minor improvement you could make would be worth it. At some point, I should do a video about Big O notation, but now is not the time. Here, where we're just writing code to find the answer to one simple question, don't worry about it. Sure, there are more elegant solutions. But this is easy to explain. I'm going to declare an empty string. longestAcceptableWord. Then I'm going to tell the code to start testing every word in the array. This line will run the code in between those brackets once for each word, and on each run through, this variable, testWord, will be the next word in the list. First question: is the word we're looking at shorter, or the same length, as the current longest acceptable word? If it is, then we can just ignore it, we know it’s not longer, it's of no use to us, and we can just say 'continue'. That 'continue' skips the rest of the loop, and kicks us back to the start with the next word. 
Some people hate 'continue', they think it shouldn't be used anywhere because it can cause confusion. And, yeah, it can, but I reckon for things like this it's fine. Anyway, this'll save a lot of processing time, because after the first few words, we're not even going to bother to analyse the short ones, we'll just ignore them. So let's say we've got to this point in the code, it's a new, longer possible word: is it acceptable? Here we can use our regular expression from earlier. Does the word match our test for bad letters? If it does, it's not acceptable, ignore it, ignore the rest of the loop, continue on to the next word in the list. But if it’s passed both these tests, if we’ve got to this point, then we know that our word is longer than anything that we've accepted before and it has no bad letters in it. So we change the longest acceptable word to be our new word and we start over again with the next one in the array. And when we're done, longestAcceptableWord will be the longest acceptable word. We tell the code to write that out, and the result is... Huh. That's actually the longest word in the list anyway. You know what? I'm going to rule out I and O. They are just numbers with aspirations. Fortunately, we've made the code easy to edit, so we can just make that change, and... Sure, fine, that'll do. The lesson here, I guess, is that sometimes you might write code to find something out and the answer is really unsatisfying. Of course, there is one thing we're missing here. Our test was only checking for words longer than the current acceptable one. What happens if there was another acceptable word of the same length? We'd have ignored it. There could be multiple correct answers. But I'll leave checking that up to you. Thank you very much to the Centre for Computing History in Cambridge, for lending me their space and their Megaprocessor. Thank you also to all my proofreading team who made sure I got the script right.
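The approach walked through in the captions can be sketched roughly as follows. This is a reconstruction under stated assumptions, not the code shown in the video: the file name `words.txt`, the helper names `parseWords`, `findLongestAcceptableWord` and `findAllLongestAcceptableWords`, the `i` flag, and splitting on `\r?\n` are all hypothetical choices. The names `badLetters`, `longestAcceptableWord` and `testWord` come from the captions. The second function is one possible answer to the closing question about ties.

```javascript
const fs = require("fs");

// The eight letters ruled out first, plus I and O after the final change.
const badLetters = /[gkmqvwxzio]/i;

// Split the dictionary text into one word per array item.
// Splitting on /\r?\n/ (an assumption) also copes with Windows line endings;
// empty items, such as one after a trailing newline, are dropped.
function parseWords(text) {
  return text.split(/\r?\n/).filter((word) => word.length > 0);
}

// The loop from the captions: keep a word only if it is strictly longer
// than the current best and contains no bad letters.
function findLongestAcceptableWord(words) {
  let longestAcceptableWord = "";
  for (const testWord of words) {
    if (testWord.length <= longestAcceptableWord.length) continue;
    if (testWord.match(badLetters)) continue;
    longestAcceptableWord = testWord;
  }
  return longestAcceptableWord;
}

// Variant for the open question at the end: collect every acceptable word
// that ties for the longest length, in dictionary order.
function findAllLongestAcceptableWords(words) {
  let longest = [];
  for (const testWord of words) {
    if (testWord.length === 0 || testWord.match(badLetters)) continue;
    const bestLength = longest.length > 0 ? longest[0].length : 0;
    if (testWord.length > bestLength) {
      longest = [testWord];
    } else if (testWord.length === bestLength) {
      longest.push(testWord);
    }
  }
  return longest;
}

// Hypothetical file name: point this at wherever the word list was saved.
const DICTIONARY_PATH = "words.txt";
if (fs.existsSync(DICTIONARY_PATH)) {
  const words = parseWords(fs.readFileSync(DICTIONARY_PATH, "utf8"));
  console.log(words.length);
  console.log(findLongestAcceptableWord(words));
  console.log(findAllLongestAcceptableWords(words));
}
```

Changing which letters are allowed only means editing the character class in `badLetters`, which matches the "easy to edit" point made in the captions.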

Info

Channel: Tom Scott

Views: 2,999,954

Rating: 4.9224186 out of 5

Keywords: tom scott, tomscott, the basics, computer science, seven segment displays, calculator spelling, javascript, regular expressions, regex, node

Id: zp4BMR88260

Length: 8min 56sec (536 seconds)

Published: Mon Oct 08 2018