Broadcasting Your Voice with ESP32-S3 & INMP441

Captions

Hey guys. What's up? Right now my voice is being transferred through the INMP441 MEMS microphone on the ESP32-S3. How is the quality? It currently has a sampling rate of 16000 Hz. That rate is used in most VoIP, so I believe it's not that bad. Of course, you can increase the sampling rate up to 44100 Hz, but this adds a lot of delay while sending data to the server. The goal of this project is to play the audio source from the ESP32 on all connected clients in real time. It will be fun, as you can connect to it from any device with a web browser. Let's get into it. As you can see here, the hardware configuration is very simple. I simply connected the INMP441 to the ESP32-S3 DevKit-C. Its role is to continuously transfer audio sample data to the connected server. That's it. First, here is the source code for the ESP32. You can download this project from my GitHub page. If readability is poor, please look at the code yourself. The basic program flow is very simple. It connects to WiFi and tries to connect to the WebSocket server; once the connection is ready, audio sample data is transferred to the server. That's what this program is all about. The key point of this program is the I2S settings. Let's take a quick look at I2S sampling. In order to use I2S, we need to have the correct settings. The sampling rate determines sound quality. It is the number of audio samples we receive in one second. The bits per sample is the number of bits used in each sample. More bits equal better quality. So the higher the resolution and sample rate, the larger the size of the data. DMA buffer settings are important. By using DMA, peripheral devices can directly access memory without using the CPU. After getting the sample data from the DMA buffer, we can actually use this data. Now we need to think about DMA buffer size. A buffer that holds more samples means less work for the CPU. This is because the number of times the CPU is interrupted depends on the number of samples in each buffer.
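To put rough numbers on the interrupt point, here is a minimal Python sketch. It assumes the CPU is interrupted once each time a DMA buffer fills up, and the function name `interrupts_per_second` is made up for this illustration; it is not taken from the project's source code.

```python
def interrupts_per_second(sample_rate_hz, buffer_len_samples):
    """Estimate how often a DMA buffer fills up.

    Assumption: one CPU interrupt per filled buffer.
    """
    return sample_rate_hz / buffer_len_samples


# Compare the two buffer lengths mentioned in the video at 44100 Hz.
for buffer_len in (128, 1024):
    rate = interrupts_per_second(44100, buffer_len)
    print(f"{buffer_len:>4} samples per buffer: about {rate:.1f} interrupts per second")
```

Under that assumption, going from 128 to 1024 samples per buffer cuts the interrupt rate by a factor of eight, which leaves the CPU more time for other work.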
With a buffer length of 128 samples, the screen updates faster than with 1024. The CPU is woken up by the interrupt, and the screen is updated at that time. With a buffer length of 1024, the updates are much slower, meaning the CPU is interrupted less often. So the CPU can do more other things. It is better to have a larger buffer length. Finally, there is the buffer count. I currently set it to 10. Since each DMA buffer is measured in samples, the actual buffer size is 20 kilobytes in my case (10 buffers of 1024 16-bit samples). Because there are cases where the buffer cannot be emptied in time, making the DMA buffer as large as possible can cover the worst cases. But this DMA buffer is allocated in SRAM, and in the case of the ESP32-S3, the size of SRAM is 512 KB. So the size of the DMA buffer must be carefully controlled. This is a transmission test from the ESP32 to the server. At 44.1 kHz, 16-bit, mono, the amount of data that needs to be transferred to the server in one second is 88 kilobytes. If this is not met, there will be problems playing audio on other clients. It's currently receiving 86 chunks of sample data, each of length 1024, from the server in 1 second. This isn't bad. However, this is assuming there is no network delay. WebSockets work over TCP, which has a higher network traffic load compared to UDP. Unfortunately, UDP-based WebSockets are not supported, so if you need a better system, I recommend creating a WebRTC-based service. There is WebRTC's RTCDataChannel API for UDP-like communication. This system is server-centric. All clients connect to the server, receive the audio data sent by the ESP32 to the server, and play it in each web browser. Sample data obtained from the INMP441 can be played directly using a PCM player. The reason WebSocket is used even though it is based on TCP is that it makes it very simple to create a system that sends and receives data from client to client. If you don't like this, try writing a server based on UDP or WebRTC as I mentioned before. I have a plan for that too.
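The buffer and bandwidth figures quoted above can be checked with a few lines of arithmetic. This is only a sketch in Python: the function names are hypothetical, and it assumes the chunks of "length 1024" are 1024 bytes each and that each DMA buffer holds 1024 samples.

```python
def bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Raw PCM data rate without any network or protocol overhead."""
    return sample_rate_hz * (bits_per_sample // 8) * channels


def dma_total_bytes(buffer_count, buffer_len_samples, bits_per_sample):
    """Total memory taken by all DMA buffers for a mono stream."""
    return buffer_count * buffer_len_samples * (bits_per_sample // 8)


rate_44k = bytes_per_second(44100, 16, 1)   # 88200 bytes, about 88 kilobytes
rate_16k = bytes_per_second(16000, 16, 1)   # 32000 bytes
chunks_per_second = rate_44k / 1024         # about 86 chunks (assumed 1024 bytes each)
dma_bytes = dma_total_bytes(10, 1024, 16)   # 20480 bytes, i.e. 20 kilobytes

print(rate_44k, rate_16k, round(chunks_per_second, 2), dma_bytes)
```

The 16000 Hz setting needs well under half the data rate of 44100 Hz, which fits the observation that the lower rate streams more smoothly over TCP.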
Let's see how I can build it. This is the server code, based on Node.js. This server is simply responsible for sending data to connected WebSocket clients. On the right is the audio_client HTML file, which you can open from your web browser. The server prepares for HTTP and WebSocket connections. While the server is running, enter the IP address, port number and audio path in the web browser to open audio_client.html. What you need to do here is update the IP address of the WebSocket server. Since the current server runs locally, it will be a local IP address. You can easily get the IP address with the ifconfig command on Mac and the ipconfig command on Windows. This PCM player plays audio samples obtained from the INMP441. Since the bits per sample is 16, the format is 16-bit integer. It's mono, so it has 1 channel. The sample rate is 44100. After setup, audio data can be output to the speaker through the player's feed function. As you can see here, Plotly is used to draw audio graphs. It is recommended to remove this if your system is slowing down. You can download this source code from my GitHub page. Please give it a try and let me know if you have any problems. After downloading the source code, install the packages with npm. Then start the server. That's it; the server is running now. Now connect power to the ESP32 or reset it so that it can connect to the server. Enter the address and connect to the server. This is the first client. You can connect in the same way from other devices. It seems difficult to continuously stream 44100 Hz audio samples in a TCP environment without interruption. That's because network latency can always occur. If possible, please change it to 16000 Hz in my code and test it. You can see that it works much more smoothly. In my case, I do a lot of projects where a central server receives data from multiple devices and analyzes and processes it. That's why I shared a project that can do this kind of test.
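As an illustration of what "Integer 16, 1 channel" means for the raw data, here is a small Python sketch that decodes one chunk of 16-bit mono PCM into floating-point samples, roughly the step a PCM player has to perform before audio can be output. The helper names are hypothetical, little-endian byte order is an assumption (it matches the ESP32's native byte order), and this is not the code of the PCM player used in the video.

```python
import struct


def pcm16_to_floats(chunk):
    """Decode little-endian signed 16-bit mono PCM into floats in [-1.0, 1.0)."""
    count = len(chunk) // 2                      # two bytes per sample
    samples = struct.unpack(f"<{count}h", chunk[:count * 2])
    return [s / 32768.0 for s in samples]


def chunk_duration_seconds(chunk, sample_rate_hz):
    """How much playback time one chunk of 16-bit mono audio covers."""
    return (len(chunk) // 2) / sample_rate_hz


# Three hand-made samples: silence, half scale, and the most negative value.
demo_chunk = struct.pack("<3h", 0, 16384, -32768)
print(pcm16_to_floats(demo_chunk))
print(chunk_duration_seconds(bytes(1024), 16000))
```

A 1024-byte chunk covers 32 ms of audio at 16000 Hz but only about 11.6 ms at 44100 Hz, so at the higher rate chunks have to arrive almost three times as often to avoid gaps.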
It's not that easy to continuously transmit streaming data over the network. Many things must be considered. What I've done is a minimal version. I hope this will be a stepping stone for your project. That's it for today. Thank you for watching. See you on the next project.

Info

Channel: That Project

Views: 20,207

Keywords: That Project, Arduino Project, ESP32-S3, INMP441, I2S interface, Audio sampling, DMA buffer, Real-time audio, Client-server communication, Audio streaming, Voice communication, Internet of Things (IoT), Embedded systems, Sensor integration, Sound capture, Audio playback, Data streaming, Networked audio distribution, IoT applications, Microphone integration, Real-time data streaming, Audio visualization, Firmware development, Real-time audio processing, Websockets

Id: qq2FRv0lCPw

Length: 8min 13sec (493 seconds)

Published: Wed May 17 2023