Broadcasting Your Voice with ESP32-S3 & INMP441

Captions
Hey guys. What's up? Right now my voice is being transferred through the INMP441 MEMS microphone on the ESP32-S3. How is the quality? It currently has a sampling rate of 16000 Hz. That rate is used in most VoIP, so I believe it's not that bad. Of course, you can increase the sampling rate up to 44100 Hz, but this will add a lot of delay while sending data to the server.

The goal of this project is to be able to play the audio source from the ESP32 to all connected clients in real time. It will be fun, as you can connect to it from any device with a web browser. Let's get into it.

As you can see here, the hardware configuration is very simple. I simply connected the INMP441 to the ESP32-S3 DevKit-C. Its role is to continuously transfer audio sampling data to the connected server. That's it.

First, here is the source code for the ESP32. You can download this project from my GitHub page. If the code on screen is hard to read, please look at the code yourself. The basic program flow is very simple: it connects to WiFi and tries to connect to the WebSocket server, and once the connection is ready, audio sampling data is transferred to the server. That's what this program is all about.

The key point of this program is the I2S settings. Let's take a quick look at I2S sampling. In order to use I2S, we need to have the correct settings. The sampling rate determines sound quality. It means the number of audio samples we can receive in one second. The bits per sample is the number of bits used for each sample. More bits equal better quality. So the higher the resolution and sample rate, the larger the size of the data.

DMA buffer settings are important. By using DMA, peripheral devices can directly access memory without using the CPU. After getting the sampling data from the DMA buffer, we can actually use this data.

Now we need to think about DMA buffer size. A buffer that can hold more samples means less work for the CPU.
This is because the number of times the CPU is interrupted depends on the number of samples per buffer. With a buffer length of 128 samples, the screen updates faster than with 1024. Each interrupt wakes the CPU, and the screen is updated at that moment. With 1024 samples the updates are much slower, meaning the CPU spends less time handling interrupts and can do more other things. It's better to have a larger buffer length.

Finally, there is the buffer count. I currently set it to 10. Since each DMA buffer holds samples, the size of the actual buffer is 20 kilobytes in my case. Because there are cases where the buffer cannot be emptied in time, making the DMA buffer as large as possible can cover the worst cases. But this DMA buffer is allocated in SRAM, and in the case of the ESP32-S3, the size of SRAM is 512 KB. So the size of the DMA buffer must be carefully controlled.

This is a transmission test from the ESP32 to the server. At 44.1 kHz, 16-bit, mono, the amount of data that needs to be transferred to the server in one second is 88 kilobytes. If this is not met, there will be problems playing audio on other clients. It's currently receiving 86 sample data chunks of length 1024 from the server in 1 second. This isn't bad. However, this is assuming there is no network delay. WebSockets also run over TCP, which has a higher network traffic load compared to UDP. Unfortunately, UDP-based WebSockets are not supported, so if you need a better system, I recommend creating a WebRTC-based service. There is WebRTC's RTCDataChannel API for UDP-like communication.

This system is server-centric. All clients connect to the server, receive the audio data sent by the ESP32 to the server, and play it in each web browser. Sampling data obtained from the INMP441 can be played directly using a PCM player. The reason why WebSocket is used even though it is based on TCP is that it makes it very simple to create a system that sends and receives data from client to client.
If you don't like this, try writing a server based on UDP or WebRTC as I mentioned before. I have a plan for that too. Let's see how I can build it.

This is the server code, based on Node.js. This server is simply responsible for sending data to connected WebSocket clients. On the right is the audio_client HTML file, which you can connect to from your web browser. The server prepares for HTTP and WebSocket connections. While the server is running, enter the IP address, port number and audio path in the web browser to open audio_client.html. What you need to do here is update the IP address of the WebSocket server. Since the current server runs locally, it will be a local IP address. You can easily get the IP address with the ifconfig command on Mac and the ipconfig command on Windows.

This PCM player plays the audio samples obtained from the INMP441. Since bits per sample is 16, the format is Integer 16. It's mono, so it has 1 channel. The sample rate is 44100. After setup, audio data can be output to the speaker through the player's feed function. As you can see here, Plotly is used to draw the audio graphs. It is recommended to remove this if your system is slowing down.

You can download this source code from my GitHub page. Please give it a try and let me know if you have any problems. After downloading the source code, install the packages with npm. Then start the server. That's it. The server is running now. Now connect power to the ESP32 or reset it so that it can connect to the server. Enter the address and connect to the server. This is the first client. You can connect in the same way from other devices.

It seems difficult to continuously stream 44100 Hz audio sampling in a TCP environment without interruption. That's because network latency can always occur. If possible, please change it to 16000 Hz in my code and test it. You can see that it works much more smoothly.
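The numbers quoted in the video (20 kilobytes of DMA buffer, 88 kilobytes per second, 86 chunks per second) can be checked with a little arithmetic. The following is a minimal Python sketch, not code from the project; the function names are made up for illustration. It assumes one interrupt per filled DMA buffer, that the buffer length of 1024 counts samples, and that the "length 1024" in the transmission test counts bytes, since that is the reading under which 86 chunks per second matches 44.1 kHz, 16-bit mono.

```python
def bytes_per_second(sample_rate_hz, bits_per_sample, channels=1):
    """Raw PCM data rate that has to reach the server every second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels


def dma_buffer_bytes(buffer_count, samples_per_buffer, bits_per_sample, channels=1):
    """Total SRAM taken by all DMA buffers together."""
    return buffer_count * samples_per_buffer * (bits_per_sample // 8) * channels


def buffers_per_second(sample_rate_hz, samples_per_buffer):
    """How often a buffer fills up, assuming one interrupt per filled buffer."""
    return sample_rate_hz / samples_per_buffer


if __name__ == "__main__":
    print(bytes_per_second(44100, 16))          # 88200 bytes, the "88 kilobytes"
    print(bytes_per_second(16000, 16))          # 32000 bytes at the VoIP-style rate
    print(bytes_per_second(44100, 16) // 1024)  # 86 full chunks of 1024 bytes
    print(dma_buffer_bytes(10, 1024, 16))       # 20480 bytes, the "20 kilobytes"
    print(buffers_per_second(16000, 128))       # 125.0 interrupts per second
    print(buffers_per_second(16000, 1024))      # 15.625 interrupts per second
```

Dropping from 44100 Hz to 16000 Hz cuts the required throughput to roughly 36 percent, which fits the smoother playback described above. Note that the 20 kilobyte figure counts 1024 bytes per kilobyte, while the 88 kilobyte figure counts 1000.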
In my case, I do a lot of projects that set up a central server, receive data from multiple devices, and analyze and process it. That's why I shared a project that can do this kind of test. It's not that easy to continuously transmit streaming data over the network. Many things must be considered. What I've done is a minimal version. I hope this will be a stepping stone for your project.

That's it for today. Thank you for watching. See you on the next project.
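The player settings given in the video (Integer 16, 1 channel) mean that every two bytes arriving over the WebSocket form one signed sample. As a rough illustration of what a PCM player has to do with such data, here is a small Python sketch that maps the raw bytes to floats between -1.0 and 1.0. It assumes little-endian byte order, and `int16_pcm_to_floats` is a hypothetical helper, not the actual player's implementation.

```python
import struct


def int16_pcm_to_floats(pcm_bytes):
    """Decode little-endian signed 16-bit mono PCM into floats in [-1.0, 1.0)."""
    usable = len(pcm_bytes) - (len(pcm_bytes) % 2)  # ignore a trailing odd byte
    count = usable // 2
    samples = struct.unpack("<%dh" % count, pcm_bytes[:usable])
    return [s / 32768.0 for s in samples]


if __name__ == "__main__":
    # Three hand-made samples: silence, half scale, and full negative scale.
    chunk = struct.pack("<3h", 0, 16384, -32768)
    print(int16_pcm_to_floats(chunk))  # [0.0, 0.5, -1.0]
```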
Info
Channel: That Project
Views: 20,207
Keywords: That Project, Arduino Project, ESP32-S3, INMP441, I2S interface, Audio sampling, DMA buffer, Real-time audio, Client-server communication, Audio streaming, Voice communication, Internet of Things (IoT), Embedded systems, Sensor integration, Sound capture, Audio playback, Data streaming, Networked audio distribution, IoT applications, Microphone integration, Real-time data streaming, Audio visualization, Firmware development, Real-time audio processing, Websockets
Id: qq2FRv0lCPw
Length: 8min 13sec (493 seconds)
Published: Wed May 17 2023