This video is being brought to you with the
support of the channel sponsor PCBWay - I'm actually using a PCB that I had assembled
by them in this project - I'll be making some more PCBs in future videos as I
want to learn a bit more about KiCad. They also do 3D printing and CNC work. Check
out the link to PCBWay in the description. We're back with a bit of audio -
we've played back audio in plenty of previous videos - check out the
audio projects playlist I've linked to in the description - so you might
be asking: what's new in this video? Well, in previous videos we've been
playing back uncompressed audio - either raw samples or WAV files. This time
we're playing back an MP3 audio file. What's so great about playing back MP3 files?
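The short answer is storage space. As a rough sketch of the arithmetic, assuming plain PCM audio and a constant MP3 bit rate, the sizes can be estimated like this. The helper names `pcmBytes`, `mp3Bytes` and `pcmSecondsThatFit` are made up for illustration, and the MP3 estimate ignores tags and variable bit rates.

```cpp
#include <cstdint>

// Bytes needed to store uncompressed PCM audio.
constexpr uint64_t pcmBytes(uint32_t sampleRateHz, uint32_t channels,
                            uint32_t bitsPerSample, uint32_t seconds) {
    return static_cast<uint64_t>(sampleRateHz) * channels *
           (bitsPerSample / 8) * seconds;
}

// Approximate bytes for an MP3 at a constant bit rate (kilobits per second).
// Assumption: no ID3 tags and no variable bit rate.
constexpr uint64_t mp3Bytes(uint32_t kbps, uint32_t seconds) {
    return static_cast<uint64_t>(kbps) * 1000 / 8 * seconds;
}

// Whole seconds of uncompressed audio that fit in a given number of bytes.
constexpr uint64_t pcmSecondsThatFit(uint64_t bytes, uint32_t sampleRateHz,
                                     uint32_t channels, uint32_t bitsPerSample) {
    return bytes / pcmBytes(sampleRateHz, channels, bitsPerSample, 1);
}
```

For example, `pcmBytes(44100, 2, 16, 1)` gives 176,400 bytes for one second of stereo audio, and `pcmSecondsThatFit(1024 * 1024, 44100, 2, 16)` gives 5 whole seconds in 1MB.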
Well, when we're dealing with the ESP32 size does matter - size of audio files that is. Typically on the ESP32 we're limited by the size
of the flash storage - on most ESP32 devices this is about 4MBytes and you need to reserve
some of that space for your actual firmware. With a small app partition and over the air
support enabled you can get almost 2MB of SPIFFS storage. If you don't need over the air updates
then you can get closer to 3MB for SPIFFS. But as your app grows in size you'll need to decrease
the amount of space you've allocated to storage. If you're dealing with audio data it quickly takes
up a lot of space. If we have stereo audio sampled at 44.1kHz with 16-bit samples then every second
of audio data takes up about 172KBytes. If we've got around 1MB spare on our flash for audio data
we can only store 5 or 6 seconds worth of audio. A normal single is around 3.5 minutes
long and would require about 35MBytes to store uncompressed. Obviously, this is
not going to fit in a SPIFFS partition. There are of course some shorter songs - the
Guinness World Record for the shortest song ever published is 1.3 seconds - I've put a link to
it in the description for you to have a listen. Assuming you want to listen
to something that lasts more than 1.3 seconds we'll need to compress the audio. MP3 is a popular compression technique for audio
files and can reduce the size of the audio by 75-95%. It became popular back in the mid
to late 90s with services like Napster taking off. I've run a fairly short song through various
different bit rates to see how well it performs. The song is only 2 minutes 41 seconds long so
only takes up around 28MBytes in WAV format. As we decrease the bit rate the audio is encoded
at, the file size drops dramatically. Even on the very low setting of
45-85kbps, the quality is very good and we are down to less than a megabyte in size. With MP3 decoding we can fit a song into
SPIFFS reasonably easily. Obviously, if you wanted more than one song then you'd
probably need to switch to an SD Card for storage, but even for short audio samples encoding
them as MP3 would give you a massive saving in space. So, how do we decode the MP3 data? I've found a very nice standalone MP3 decoder that is
self-contained in a single header file. We just need to feed it data from the MP3 file
and it will give us 16-bit audio samples to play. The decoder decodes one frame of data at a time
from the buffer and tells us how much data it has consumed. We shift the data down by this number
of bytes and top the buffer up from the file. We keep running this until we
have no more data in the file and all the buffered data has been used up. So, how do we play back the audio? I've added two different options for
you - we can play the data using the ESP32's built-in digital
to analogue converter and some headphones or we can output I2S data straight to a digital amplifier. Let's cover the headphone option first. Wiring
up headphones is pretty straightforward - you can either get a little breakout board
with the headphone socket as I have, or you can hack up an old set of headphones
and connect header pins to the cable. Headphone jacks generally come
in a couple of variations. You have some that come with a microphone and
some that just have headphones. The tip of the socket is connected to the left earpiece, the
first ring is connected to the right earpiece. Most headphones have the ground connection next
followed by the microphone on the final connector. Be aware that there are some manufacturers
such as Nokia who used a different standard and have the microphone on
ring 2 and ground on the sleeve. Our DAC outputs the audio signal with a DC bias
so we need to put a DC blocking capacitor between the signal and the headphone connector. This will
block the DC element and only allow through the AC signal. The size of the capacitor will determine
how well the system responds to low-frequency signals. I'm just using a couple of 47 microfarad
capacitors here that I had lying around. I'm not trying to build a high quality audio system. I found the ESP32's DAC output quite
capable of driving the headphones, but to make sure I don't damage the ESP32 by
drawing too much current I've put a resistor in series. I found that a value of 500Ω to 2kΩ seems
to work well and still gives a reasonable volume. But you might want to play with this value yourself. The DAC output is quite noisy - we
can see this on the oscilloscope, and even with the volume turned down to zero we have
an audible noise on the headphones. My initial thought was that
this could be power supply noise so I tried it with a battery power
supply and the noise was still present. So, your mileage may vary, but I found it a little bit noisy. But actually, it's quite listenable. I've recorded the audio as
best I can from the headphones. It's annoying that not many computers actually
have a line-in port nowadays, so I've had to use my headphones and strap them round my phone. Let's have a listen. MUSIC It's actually pretty good! I'm quite pleased with the audio output. The other way to get audio out of
the ESP32 is to use the I2S interface connected directly to an amplifier
such as the MAX98357 - I've got a whole video on using this device -
there's a link in the description. Wiring up is very straightforward - we just need
three pins on the ESP32 - one for the LR clock, one for the serial clock and one for the serial
data. If we want to have stereo sound then we need two MAX98357 boards - one
for the left channel and one for the right channel. If you're using the breakout
board from Adafruit then you select the channel for the board by connecting a
resistor from the SD pin to the power supply. Calculating this value is a bit complicated -
the board has a resistor divider connected to the pin already - so you need to take this into
account when calculating your pull up resistor - I actually got it wrong in the previous video.
A value that should work is around 390kΩ. I've got my own stereo breakout board based around
the same chip that outputs stereo to two speakers and cuts down on the wiring. I've linked to the
design video for that board in the description. The MAX98357 board has a selectable gain. At maximum gain, each amplifier will require
over 1 amp. That means that if you are using two amplifiers for stereo sound then
you'll need over 2 amps in total. Drawing this much current can pull down
the voltage coming from the power supply and can cause the ESP32 to
brownout if it gets pulled too low. I've connected the voltage coming into
the ESP32 dev board from the power supply to my oscilloscope and we can see the effect here. The voltage
is being pulled down by almost 2 volts. Eventually, we have a sustained loud piece of music and the ESP32 cuts out due
to a brownout and we get a reset. To mitigate this we can add a bunch of capacitors
to the power pins of the amplifier. I've added 3x1000 microfarad capacitors to mine and now we
can get through the whole song at maximum volume. The voltage is still being pulled
down, but not as badly as before. Ideally, you'd probably want to use
a slightly more powerful power supply along with some reservoir
capacitors near the amplifiers. MUSIC As a bonus, I've added a volume control
using a potentiometer connected between 0V and the 3.3V line. We measure this
using the Analog to Digital Converter on the ESP32 and use the measured value to set the volume. There's an interesting thing that's quite
important about volume and how we perceive loudness. We don't perceive loudness on a linear
scale, so if you are using a linear potentiometer like I am to set the volume you should make
sure to amplify the sound in a non-linear way. You can do this simply by
squaring or cubing the volume. This gives a much more pleasing volume control. So that's it for this project - it would
be quite easy to hook up an SD Card and an ESP32 with a display and you've got
yourself a very rudimentary MP3 player! As always the code is on GitHub.
Let me know how you get on!
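The decode loop described earlier (decode one frame, shift the buffer down by the bytes consumed, top it up from the file) can be sketched roughly as follows. The decoder, file reader and audio output are passed in as hypothetical callbacks (`DecodeFn`, `ReadFn` and `SinkFn` are names invented for this sketch), so it only illustrates the buffer handling and is not the API of any particular MP3 library or the code from the project.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical decoder result: bytes used from the input, samples produced.
struct FrameResult { size_t bytesConsumed; size_t samples; };
using DecodeFn = std::function<FrameResult(const uint8_t*, size_t, int16_t*)>;
using ReadFn   = std::function<size_t(uint8_t*, size_t)>;  // returns 0 at end of file
using SinkFn   = std::function<void(const int16_t*, size_t)>;

// Runs until the file is exhausted and the buffered data is used up.
// Returns the total number of samples sent to the output.
inline size_t playStream(ReadFn readMore, DecodeFn decodeFrame, SinkFn output,
                         size_t bufferSize = 4096,
                         size_t maxSamplesPerFrame = 2304) {  // 1152 samples x 2 channels
    std::vector<uint8_t> buffer(bufferSize);
    std::vector<int16_t> pcm(maxSamplesPerFrame);
    size_t buffered = 0, total = 0;
    bool eof = false;
    for (;;) {
        // Top the buffer up from the file.
        if (!eof && buffered < bufferSize) {
            size_t n = readMore(buffer.data() + buffered, bufferSize - buffered);
            if (n == 0) eof = true;
            buffered += n;
        }
        if (buffered == 0) break;  // nothing left to decode
        FrameResult r = decodeFrame(buffer.data(), buffered, pcm.data());
        size_t used = std::min(r.bytesConsumed, buffered);
        if (used == 0) {
            // The decoder could not make progress with what is buffered.
            if (eof || buffered == bufferSize) break;
            continue;  // try again after reading more data
        }
        if (r.samples > 0) {
            output(pcm.data(), r.samples);
            total += r.samples;
        }
        // Shift the remaining data down by the number of bytes consumed.
        buffered -= used;
        std::memmove(buffer.data(), buffer.data() + used, buffered);
    }
    return total;
}
```

Passing the decoder in as a callback keeps the sketch independent of whichever decoder is used, at the cost of a little `std::function` overhead that a real ESP32 build might prefer to avoid.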