Webradios - ESP32 + MAX98357 (I2S)

22/04/2025

The MAX98357 mono amplifier is very popular in the Arduino community. It is a reliable module that connects easily to the I2S (Inter-IC Sound) audio interface, which is available on, for example, the ESP32-WROOM-32, ESP32-S, ESP32-S3, and others. Of course, to play sound, we also need a speaker; I have had good experience with 2 W and 3 W speakers. Besides playing individual audio tracks, for example from an SD card or the SPIFFS (LittleFS) file system, a popular use is internet radio (webradio).

Thanks to this, you can listen to your favorite radio station over the Internet, as well as practically any station in the world that is streamed. Internet radio streams are most often compressed as .mp3, .ogg (Vorbis), or .aac (LC - Low Complexity, HE - High Efficiency, HEv2 - High Efficiency v2). In our case, we will use an .mp3 stream, which is the most widespread format. On the microcontroller side, this means the stream has to be decoded in real time.

For the ESP32, the most widely used library for this purpose is ESP32-audioI2S (header file Audio.h), which requires a dual-core ESP32 and is therefore not compatible with the single-core S2 or C3 versions of the microcontroller. From a hardware perspective, the microcontroller also needs additional RAM (PSRAM), because stream decoding is memory intensive, especially when the ESP32 also runs WiFi or other memory-hungry sensors and sub-applications with bursty memory usage.

If you use an ESP32 without PSRAM, the microcontroller may run out of memory at various times and reboot due to an error, which interrupts the stream playback. Depending on the program, this can be a short pause if the ESP32 reconnects to WiFi immediately after the reboot and resumes downloading a stream, or a longer outage if the ESP32 does not play anything until the user manually enters a stream. If you still want to play an .mp3 stream on such a board, it is recommended to use a 64 kbps or 32 kbps stream rather than 128, 256, or even 320 kbps.
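To get a feel for why lower bitrates are gentler on a board without PSRAM, the arithmetic can be sketched as follows. The helper names and the 16 000-byte buffer used in the note are illustrative assumptions, not values taken from the ESP32-audioI2S library.

```cpp
#include <cstdint>

// Bytes of compressed audio that arrive per second for a given stream bitrate.
// kbps must be non-zero when the result is used as a divisor below.
constexpr std::uint32_t bytesPerSecond(std::uint32_t kbps) {
    return kbps * 1000u / 8u;
}

// Seconds of audio that a buffer of bufferBytes can hold at the given bitrate.
constexpr double bufferedSeconds(std::uint32_t bufferBytes, std::uint32_t kbps) {
    return static_cast<double>(bufferBytes) / bytesPerSecond(kbps);
}
```

With a hypothetical 16 000-byte buffer, bufferedSeconds(16000, 128) gives 1 second of audio, while bufferedSeconds(16000, 32) gives 4 seconds, so the same RAM bridges a four times longer network hiccup at the lower bitrate.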

Swiss developer yellobyte has also designed their own devkit specifically for use with the ESP32-audioI2S library. It integrates an ESP32-S3 module with 2 MB PSRAM and 8 MB Flash, an SD card slot, and a MAX98357 amplifier. In fact, it carries not one amplifier but two, which gives you stereo sound with left and right channels, since a single amplifier is only mono. For stereo sound, it is also important to use the SD pin of the MAX98357 amplifier, whose voltage, set by a suitable resistor, selects the channel the amplifier plays. The devkit also has terminal blocks for connecting speakers. The price of the YB-ESP32-S3-AMP devkit is around 12-15€. It is currently at revision 3, so the design is quite mature.

If you only have regular ESP32 devkits at hand, most of them do not have PSRAM. You can, however, find PSRAM in the popular ESP32-CAM module with ESP32-S from AI-Thinker, which has up to 4 MB of PSRAM on the PCB, normally used by the OV2640 camera. Such PSRAM will reliably handle an audio stream even in high quality. Note, however, that the ESP32-CAM does not break out dedicated I2S pins, so the I2S signals have to be assigned to available GPIOs. Avoid strapping / flash pins whose state could prevent the ESP32 from booting.

However, with the ESP32-CAM you have to upload the firmware through an external USB-UART converter, for example an FT232RL, and GPIO0 must be pulled to GND before the microcontroller is restarted, so that the ESP32-CAM enters download mode, in which the firmware can be uploaded. After uploading, disconnect GPIO0 from GND and restart the microcontroller with the reset (EN) button. Since the ESP32-CAM has significant power consumption that the FT232RL may not cover, I recommend an external power supply. However, this means that you will not have a free GND pin for the converter.

For this reason, you need to solder a wire to some other point where GND is available, e.g. the SD card slot. Alternatively, you can split the ground wire coming from another point if you plan to use the USB-UART converter as well. Once the application is finished, this is no longer necessary. With the Minimal SPIFFS partition table, it is also possible to update the firmware over the air (OTA) in the future, e.g. using BasicOTA with a virtual network COM port. In this case, I powered the ESP32-CAM from an external ZK-4KX power supply at 5 V.

I used the following pin configuration for the MAX98357 amplifier and the ESP32-CAM. These pins are shared with the SD card, so you cannot use the SD card for reading and writing at the same time:

#define I2S_DOUT 12
#define I2S_BCLK 13
#define I2S_LRC 15

The GAIN pin is not connected.
The consumption of the running web radio with the ESP32-CAM was roughly between 60 and 150 mA @ 5 V, with the volume set to 8 out of a maximum of 21. Although GPIO12 and GPIO15 are strapping pins on the ESP32, in this wiring they did not block the boot process of the microcontroller, so it can start without disconnecting the cables.

The sample stream "Radio Slovakia" was taken from https://icecast.stv.livebox.sk/slovensko_128.mp3, so the bitrate is 128 kbps (FM quality) and the sampling frequency is 44.1 kHz. The output can also be amplified using the GAIN pin. In our case, with the GAIN pin not connected, the gain is 9 dB (typical, range 8.4 - 9.6 dB); with the pin tied directly to GND it is 12 dB, and with a 100K pulldown resistor to GND it is 15 dB.
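To make the three gain settings easier to compare, here is a small sketch that converts them from decibels to a linear voltage ratio using ratio = 10^(dB / 20). The enum and function names are my own illustration; only the three dB values come from the text.

```cpp
#include <cmath>

// GAIN pin wiring options described in the text.
enum class GainPin { Unconnected, TiedToGnd, Pulldown100k };

// Typical gain in dB for each wiring option (values from the text).
inline double gainDb(GainPin wiring) {
    switch (wiring) {
        case GainPin::TiedToGnd:    return 12.0;
        case GainPin::Pulldown100k: return 15.0;
        case GainPin::Unconnected:  return 9.0;
    }
    return 9.0; // not reached for valid enum values
}

// Convert a gain in dB to a linear voltage ratio.
inline double voltageRatio(double db) {
    return std::pow(10.0, db / 20.0);
}
```

For example, voltageRatio(gainDb(GainPin::Unconnected)) is about 2.8, while the 15 dB setting gives about 5.6, i.e. roughly twice the output voltage for the same digital input.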

When an .mp3 stream is decoded, it is converted on the fly to PCM (pulse-code modulation), which is suitable as input for the DAC. PCM can be thought of as the raw digital signal obtained by decoding the .mp3 stream. Most often it has a sampling frequency of 44.1 kHz, which is considered the "CD quality" standard, with one sample roughly every 23 µs. Some internet radios downsample to half that frequency, which can reduce the quality of the played content, especially music; with a monotonous voice, the downsampling may not be as audible.
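The numbers above can be checked with a few lines of arithmetic. This is only a sketch: the function names are mine, and the 16-bit stereo format in the note is the CD audio format, used here as an assumption about what the decoder outputs.

```cpp
#include <cstdint>

// Time between two consecutive samples, in microseconds.
constexpr double samplePeriodUs(std::uint32_t sampleRateHz) {
    return 1000000.0 / sampleRateHz;
}

// Raw (uncompressed) PCM data rate in kbit/s.
constexpr double pcmKbps(std::uint32_t sampleRateHz,
                         std::uint32_t bitsPerSample,
                         std::uint32_t channels) {
    return static_cast<double>(sampleRateHz) * bitsPerSample * channels / 1000.0;
}
```

samplePeriodUs(44100) comes out at about 22.7 µs, and pcmKbps(44100, 16, 2) at 1411.2 kbps, which is roughly 11 times the 128 kbps of the compressed .mp3 stream. At half the sampling frequency, samplePeriodUs(22050) is about 45.4 µs.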

The DAC is the MAX98357 amplifier itself, which converts the streamed digital signal to analog; its output goes directly to the speaker, which plays the sound. Through the ESP32 and the ESP32-audioI2S library, it is also possible to control the volume in the range from 0 to 21. At a volume of 0 there is, of course, no sound, which works as a "mute". For the program implementation, I used the example from https://circuitdigest.com/microcontroller-projects/esp32-based-internet-radio-using-max98357a-i2s-amplifier-board, which I extended with my own reader for the serial interface that reads a line after a command is entered, up to the terminating character \n. The basic internet radio program takes up 1,4XX XXX bytes, which exceeds the standard 1.2 MB application partition. Therefore, you must also change the partition scheme, e.g. to Huge App.

The original example contained a hard-coded URL of an internet radio HTTP stream. This radio starts by default, e.g. after a microcontroller restart, even if a different radio was playing in the meantime. The reader itself supported 3 commands. Two of them were the "+" and "-" characters for adjusting the volume, where the implementation also respects the limits, i.e. the maximum and minimum volume settings, so that it does not go above 21 or below 0. The third was the URL of a stream. Even a 2 W speaker is loud enough and will certainly not need full volume.
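The volume limits can be sketched as a small helper. The function and constant names are hypothetical; only the "+" / "-" commands and the 0 to 21 range come from the text.

```cpp
#include <algorithm>

constexpr int kMinVolume = 0;
constexpr int kMaxVolume = 21;

// Apply a '+' or '-' command to the current volume and keep the result
// within 0..21. Any other character leaves the volume unchanged.
inline int adjustVolume(int current, char command) {
    if (command == '+') {
        ++current;
    } else if (command == '-') {
        --current;
    }
    return std::max(kMinVolume, std::min(current, kMaxVolume));
}
```

The returned value is what would then be handed to the library's volume setter, so pressing "+" at volume 21 or "-" at volume 0 simply has no effect.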

Thanks to the functions of the String class in the Arduino Core, it was easy to verify whether the entered string begins with the characters "http", which is valid for both HTTP and HTTPS streams. The check fails if "http" appears somewhere in the string but not at its start. In that case, the ESP32 writes to the serial interface that it does not support this command and prompts the user to enter one of the supported input formats.
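The command check can be sketched as follows. To keep the example runnable off the ESP32, it uses std::string as a stand-in for the Arduino String class (where the equivalent test is startsWith("http")); the enum and function names are my own.

```cpp
#include <string>

enum class Command { VolumeUp, VolumeDown, StreamUrl, Unknown };

// Decide what a line read from the serial interface means.
inline Command classify(const std::string& line) {
    if (line == "+") return Command::VolumeUp;
    if (line == "-") return Command::VolumeDown;
    // rfind(prefix, 0) == 0 is true only when the line starts with the prefix,
    // so "http" appearing later in the line does not count.
    if (line.rfind("http", 0) == 0) return Command::StreamUrl;
    return Command::Unknown;
}
```

A line such as "https://stream.tlis.sk/tlis.mp3" is classified as a stream URL, while "play http://example" is reported as unknown, which is the case where the user gets the hint about supported input formats.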

To change the currently playing stream, the program first has to stop the current one. This means calling audio.stopSong() and then connecting to the new host with audio.connecttohost(input). Since connecttohost() takes a char array rather than a String, the String must be converted with .c_str().
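The order of the two calls can be sketched with a small template, so that the same function compiles both against the library's audio object and against a stand-in on a PC. switchStream and RecordingPlayer are hypothetical names; only stopSong() and connecttohost() come from the text, and std::string replaces the Arduino String here.

```cpp
#include <string>
#include <vector>

// Works with any player type that offers stopSong() and
// connecttohost(const char*), the two calls named in the text.
template <typename Player>
void switchStream(Player& player, const std::string& url) {
    player.stopSong();                  // end the current stream first
    player.connecttohost(url.c_str());  // then connect to the new one
}

// Desktop stand-in that only records which calls were made.
struct RecordingPlayer {
    std::vector<std::string> calls;
    void stopSong() { calls.push_back("stopSong"); }
    void connecttohost(const char* host) {
        calls.push_back(std::string("connecttohost ") + host);
    }
};
```

Calling switchStream(player, "https://stream.tlis.sk/tlis.mp3") on a RecordingPlayer leaves two entries in calls, the stop followed by the connect, which is the sequence the sketch on the ESP32 has to follow as well.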

I found the largest database of Slovak radio stations at https://fmstream.org/index.php?c=SVK ("The Radio Stream Directory"). What appealed to me about the interface was that you can copy the link immediately when selecting a specific stream, which is how I switched the active radio. Some websites listing internet radio stations only had a play button, and getting to the stream link took a lot of clicking.

In total, 233 Slovak radios are listed here. Many of them are genre-specific channels, so certain radios have XY subradios. Admittedly, several of them were inactive. In the list, you can also find the dormitory (student) radio TLIS (Mladosť dormitories, STUBA): https://stream.tlis.sk/tlis.mp3, or Rádio X (Veľký Diel dormitories, UNIZA): https://stream.radiox.sk:8443/alternative.mp3, which also has several styles, e.g. DNB, Oldies, Chillout. Although it was not on the list, I would definitely mention Radio 9 at the Jedlíková dormitory, which broadcasts for TUKE students: https://stream.radio9.sk/high.mp3

And while we're on the subject of this database, foreign radio stations are also worth mentioning, for example:

NPR News and Talk: https://pd.npr.org/anon.npr-mp3/npr/news/newscast.mp3
WNYC - New York Public Radio: https://njpr.wnyc.org/wnycfm
Radio ZET: https://zt02.cdn.eurozet.pl/ZETSWI.mp3
AFN Tokyo: https://playerservices.streamtheworld.com/api/livestream-redirect/AFNP_TKO.mp3
Fox News Radio: https://playerservices.streamtheworld.com/api/livestream-redirect/KTELAM.mp3
CNN International (TV Audio): https://tunein.cdnstream1.com/3519_96.aac

By selecting your own radio streams, you can build a custom web radio application with your favorite stations and add your own way of navigating between them. You can use, for example, a rotary encoder or a push button that cycles through stations with an index of 0 to 15. You can add an OLED display that shows the logo (bitmap) of the currently playing / selected radio, or its imaginary FM / AM frequency. Station control can also be handled directly through a touch surface, if the display supports it. Similarly, with suitable graphics on the display, you can achieve an old-school radio look, for example by imitating the tuning of a radio to an AM or FM frequency, including a scale showing the frequency on which the station normally broadcasts over the air.
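Cycling through the station indices with wrap-around can be sketched like this. The names are hypothetical; the count of 16 corresponds to the indices 0 to 15 mentioned above. A button would pass a step of +1, a rotary encoder +1 or -1 depending on the direction.

```cpp
constexpr int kStationCount = 16; // indices 0..15

// Move the station index by `steps` (positive or negative), wrapping around
// at both ends. The double modulo keeps the result non-negative.
constexpr int nextStation(int current, int steps) {
    return ((current + steps) % kStationCount + kStationCount) % kStationCount;
}
```

Stepping forward from station 15 lands on station 0, and stepping back from station 0 lands on station 15, so the list behaves like an endless dial.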

The station selection can be saved permanently, so that the last selected station starts playing even after a restart or a power cycle. The choice must be saved to the ESP32 flash immediately after the station is selected, preferably through the Preferences library, which is built into the Arduino Core for ESP32 and replaces the earlier EEPROM library. That library emulated EEPROM memory in software on a designated sector of the ESP32 flash, up to a maximum size of 4 kB. Preferences has wear-leveling, so even frequent rewriting is not a problem. It will not shorten the lifespan of the web radio during normal, even daily, use and switching between stations.
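Saving the selection can be sketched as below. The function is a template so that it can be tried on a PC with a stand-in store; on the ESP32 the store would be a Preferences object that has already been opened with begin(). The key name "station", the sentinel 255, the FakeStore type, and the idea of skipping the write when nothing changed are my own assumptions, not part of the original program.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Store the station index, but only when it differs from the saved one.
// `Store` is any type with getUChar(key, default) and putUChar(key, value).
// 255 serves as the "nothing saved yet" marker, since stations use 0..15.
template <typename Store>
bool saveStation(Store& store, std::uint8_t index) {
    if (store.getUChar("station", 255) == index) {
        return false; // already saved, no write needed
    }
    store.putUChar("station", index);
    return true;
}

// Desktop stand-in used only to try the function without an ESP32.
struct FakeStore {
    std::map<std::string, std::uint8_t> values;
    int writes = 0;
    std::uint8_t getUChar(const char* key, std::uint8_t fallback) const {
        auto it = values.find(key);
        return it == values.end() ? fallback : it->second;
    }
    void putUChar(const char* key, std::uint8_t value) {
        values[key] = value;
        ++writes;
    }
};
```

After a restart, the same getUChar call with a default of 0 would return the last saved index, or the first station if nothing has been saved yet.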

Switching the radio can be done, for example, through a webserver running on the ESP32. The webserver listens on the standard HTTP port 80, and an mDNS record can be used so that the user does not have to look up the IP address assigned by DHCP, but can reach the device directly through its local mDNS name on the LAN. In this case, the name is audio.local, so the full URL is http://audio.local. The webserver is also reachable via the IP address.

The user can select a preset radio from the stream list in the web interface, or directly enter the URL of any radio in the world they want to listen to. Of course, there is also volume control, in this case through + and - buttons, as well as muting the sound to zero. The program, which is based on the sample implementation, also supports .aac and .ogg streams with no difference in the implementation. However, on microcontrollers without PSRAM, trying to play these formats would cause an immediate error, while .mp3 would run for a certain time, but not indefinitely.

The volume can also be adjusted dynamically, without having to click repeatedly to reach the required volume. This can be done with a slider or a virtual encoder, which work essentially identically and differ only visually. To make the volume change dynamic, we use the JavaScript function addEventListener(), which is attached to an HTML element, in this case a slider. The slider has a total of 21 levels. Every time it changes (moves by +1 or -1), an asynchronous call is made that adjusts the volume through a subpage, so from the user's point of view the HTML page is never fully reloaded.
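On the ESP32 side, the subpage that receives the asynchronous call has to turn the submitted text into a safe volume value. A minimal sketch of that step, assuming the slider position arrives as a decimal string (the function name and the -1 error value are my own choices):

```cpp
#include <cstdlib>
#include <string>

// Parse the slider value sent by the browser and clamp it to 0..21.
// Returns -1 when the argument is empty or not a plain number.
inline int parseVolume(const std::string& arg) {
    if (arg.empty()) return -1;
    char* end = nullptr;
    long value = std::strtol(arg.c_str(), &end, 10);
    if (*end != '\0') return -1; // trailing or non-numeric characters
    if (value < 0) value = 0;
    if (value > 21) value = 21;
    return static_cast<int>(value);
}
```

The handler would apply the volume only when the result is not -1, so a malformed or manually edited request cannot push the volume outside the supported range.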

The buttons were handled in exactly the same way. With a slider, the volume change is immediate. One disadvantage is that if several clients are connected to the website, they do not see the change on their own slider, since the value is only read when the page is opened. For dynamic synchronization between all clients, you would need, for example, a WebSocket for real-time updates, or a periodic asynchronous call that reads the current value of the global volume variable and adjusts the slider. It is also possible to make the asynchronous call only when the slider is released, but then the volume control is not as responsive and you may end up setting the volume too loud before hearing the result, which can be annoying.

Browse more projects: https://your-iot.github.io/
