Audio acquisition and analysis with ARM Cortex-M0 and I2S microphone


#72

Also, when reconsidering the code at lines 44-47:

if (!I2S.begin(I2S_PHILIPS_MODE, SAMPLE_RATE, 32)) {
    Serial.println("Failed to initialize I2S!");
    while (1); // do nothing
}

This busy loop might also be the reason for the stalls, right? And you might miss the error message entirely because you are not calling Serial.flush() there: flush() pauses the program until the transmit buffer has actually been written out to the UART. At least, this is what we learned painfully when attempting to debug under similar conditions.
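
For illustration, the same snippet with the flush added (just a sketch of the idea, using the same Arduino I2S API as above):

if (!I2S.begin(I2S_PHILIPS_MODE, SAMPLE_RATE, 32)) {
    Serial.println("Failed to initialize I2S!");
    Serial.flush(); // block until the message has actually left the UART
    while (1);      // halt, but at least the error is visible now
}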


#73

Not sure how I can edit your Gist, so I edited the file on my GitHub: I added some Serial.println statements. Please have a look to see whether I understood your comment correctly. Running that sketch outputs the following to serial:

Starting
begin()
Before end()
After end()
begin()
Before end()

and then nothing …


#74

Thanks for sharing insights into the runtime behavior of your code; we’ve updated the Gist “Basic audio recorder for ARM Cortex-M0 and ICS43432 I2S microphone” accordingly.


#75

Did you actually read this post (by Andrew J. Fox again), Wouter?


#76

While I perfectly understand this when looking at the problem from an ad hoc perspective, I would like to add some thoughts here. Usually, when putting the MCU into deep sleep, all the peripherals will most probably have been shut down anyway and will have to be initialized again when resuming from hibernation.

So I believe it’s not really appropriate to cycle through I2S.begin() and I2S.end() this quickly just to emulate something you will not have in the final version, as this most probably leads to some bus member hiccup.

So, unless you have objections, I would advise getting rid of this I2S.begin() / I2S.end() obstacle until we have produced a stable version.


#81

I’ll give it a try; let’s see if my earlier observation (I2S standing in the way of LMIC) still holds and whether I can work around it…


#82

In tests I did half a year ago, having I2S running would get in the way of successfully joining the TTN network. My assumption was that the I2S library messed with timing, so that LMIC’s receive window for an OTAA join accept from the gateway was slightly out of sync. This led me to run I2S only for a brief period of time, between LoRa TX and RX events.

Now, with this node (same brand/type of board, different “instance”), it does join TTN even when I2S has already been initialized. What also changed in the meantime is that I have a different gateway which is closer by (so I can now join with SF7 rather than SF11). LoRa is quite sensitive to timing, so although it works now, I do feel a bit uneasy about not understanding this point. Well, it works now … note to self: revisit this if it has difficulties joining again at some point.


#83

Hi Wouter,

what you are saying about the LMIC/I2S interaction actually makes perfect sense to me when looking at the code.

The point is that almost everything is timing-critical here. As LMIC introduces its own scheduling/task system, additional care has to be taken compared to the regular Arduino HAL main()/loop() lifecycle.
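
Just to illustrate what I mean with a minimal sketch (assuming the usual arduino-lmic setup; this is not your actual code): LMIC is a cooperative scheduler, so it only gets CPU time while its run loop is serviced, and any long-running work in loop() directly delays its tightly timed jobs.

#include <lmic.h>
#include <hal/hal.h>

void loop() {
    // LMIC can only honor its scheduled TX/RX jobs while this is called.
    // Lengthy I2S sampling or FFT work placed here delays the scheduler
    // and can shift the receive windows out of tolerance.
    os_runloop_once();
}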

My thinking was to isolate the I2S sensor domain first and get that working. After that, we can dedicate ourselves to the interaction of I2S with LMIC. There would be quite a bit to say about that, especially as the FFT computation in between is obviously not a cheap operation, if I’m getting this right?

If you think reading from I2S works stably now, another phone call to talk about how to bring I2S together with LMIC/TTN would be appropriate.

Cheers,
Andreas.


#84

Just thinking out loud: we do not need I2S and LoRa in parallel. We measure, then we submit data; “listening” in parallel to the audio acquisition is not necessary either. So, in case they conflict, we can try to keep one sleeping / disabled while the other is working.


#85

Hi Clemens, that’s right. We know when the next LoRa transmission is scheduled, so we can do the audio work before that and there shouldn’t be an issue.

When a transmission happens, a window is opened for a few seconds afterwards to listen for downlink messages (such as those needed for MAC commands or application payloads). This window needs to be cleanly timed. I am not sure how much jitter is added by the onI2Sreceive event, but it indeed seems undesirable to have that running.

However, I haven’t found a public method (other than end()) to disable it or put it to sleep.
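
The sequencing I have in mind looks roughly like this (a sketch only; sampleAudioAndBuildPayload() and the payload buffer are placeholders, not code from my sketch):

#include <lmic.h>

static uint8_t payload[12];                          // example payload buffer
void sampleAudioAndBuildPayload(uint8_t* buf, int len); // placeholder: I2S capture + FFT

void do_send(osjob_t* j) {
    // Do all audio work strictly before handing the payload to LMIC, so
    // nothing interrupt-heavy runs during the receive windows that follow.
    sampleAudioAndBuildPayload(payload, sizeof(payload));
    LMIC_setTxData2(1, payload, sizeof(payload), 0); // port 1, unconfirmed
}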


#86

Hi Andreas … thanks; still in the process of verifying that :-)


#87

Dear Clemens,

LoRa and LoRaWAN (via LMIC) are totally different beasts when it comes to scheduling and task lifecycles; please take that into account. It’s not about parallelism (there isn’t actually any), it’s all about getting the timing right.

Cheers,
Andreas.


#88

Thanks for letting us know. I will be happy to support you further once we are through this. Are there any things around reading from I2S I can provide better assistance with? If not: good luck figuring things out, and please don’t hesitate to contact me for even more rubber duck debugging - it really worked well last time, right?


#89

I assume it’s really just that you are calling I2S.begin() right after I2S.end(), which most probably makes things go south when done too quickly. Please get rid of this for the first tests, to achieve stable and deterministic results from this building block; those are absolutely required before assembling things into larger systems.

I do recognize your motivation for doing this, but still think it falls into the premature-optimization category regarding deep sleep mode - which you should really approach differently than by putting I2S.begin() / I2S.end() into the measurement cycle.


#90

Hi Andreas. I continued some tests with the Adafruit_ZeroFFT and Adafruit_ZeroI2S libraries. I was utterly confused for a while, but then it turned out that something was wrong with the hardware (I should perhaps up my ESD-safe working habits), as the microphone output no longer had any relation to the sound coming into it. I had another one (an ICS43434 this time) sitting around, and with that one things made sense again.

This one doesn’t show the jumps in the spectrum seen earlier; it looks promising, though the difference is not yet understood.

https://github.com/wjmb/FeatherM0_BEEP/blob/master/mic_test_adafruit_libs contains the test code, if you’d like to have a look. I’d welcome comments!


#91

Hi @clemens and @Andreas,

A small update: I did extensive testing with the Adafruit_ZeroI2S library and the new ICS43434, and I don’t see any of the issues noticed earlier with the Arduino I2S library:

  • The spectrum looks nice and clean, without the half, quarter, etc. jumps shown earlier in this thread.
  • The spectrum looks the same for code w/ and w/o LMIC.
  • LMIC joins the network nicely, even if I don’t disable the I2S code temporarily.

W.r.t. the latter point: on this particular hardware, the issue of not being able to join TTN with I2S active was not seen anymore, in contrast to an earlier set of hardware with a different I2S microphone and perhaps a different crystal in the MCU. So it’s hard to say whether Adafruit_ZeroI2S.h is better or worse in that respect.

One more comment: Adafruit_ZeroI2S.h doesn’t implement DMA for the SAMD21. I would imagine that for small sample sizes, recorded in an intermittent, non-streaming manner, DMA is not really needed. @Andreas, does that thinking make sense?

Finally, Adafruit_ZeroFFT.h does its magic on 16-bit data, whereas the microphone delivers 24-bit data. Compressing the range of the microphone data into 16 bits is easy enough, so that’s not really an issue. For the moment I am quite happy that it now finally seems to work fine, so I won’t yet start looking into the FFT algorithms provided in arm_math.h that operate on q31_t type data.
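
The compression is essentially just a shift; a sketch of the idea, not literally the code from my repo:

// Fold a 24-bit sample (delivered inside a 32-bit word) into the int16_t
// range Adafruit_ZeroFFT expects. The shift depends on how the library
// justifies the sample: >> 16 for left-justified, >> 8 for right-justified.
int16_t compressTo16bit(int32_t sample24) {
    return (int16_t)(sample24 >> 16); // assuming left-justified data
}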

UPDATE: https://github.com/wjmb/FeatherM0_BEEP is now public. Please note that it is work-in-progress code. Please comment if you feel it could be improved in one way or another!

Best regards,

Wouter.


#92

Dear Diren and others,

@Diren, thanks for the BeABee files! I tested my code quite a bit and feel confident that it works as needed, so yesterday I played “miks boden-04-may-16-00.wav” through a headphone to my node in order to check how the thing would respond to real bee sounds. While looking at the data coming out, I was wondering about signal-to-noise and how to deal with it.

Although I have no way to calibrate the sound pressure level coming out of the headset into the microphone, I do believe I played it at a reasonable volume. I sample the microphone at 6250 Hz and take 1024 samples. On that data I do an FFT. The spectrum looks quite noisy; in fact, too noisy to do the analysis of comparing different bins and get consistent results. Here is a picture, to give you a sense of the signal:

In previous tests, my remedy for this was to take a number of spectra and simply average over them. But how many spectra should one take and average over? The picture below shows averaging over 10 and 100 spectra.

Is there some way to determine when the signal-to-noise ratio is decent enough? Are there other methods than simply taking n spectra and lumping them together - something that doesn’t require a lot of memory?
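
The only constant-memory variant I could come up with myself is an exponential moving average over successive spectra (just a sketch, untested; alpha and the buffer layout are illustrative):

// Exponential moving average over successive spectra: needs only one
// spectrum-sized buffer, no matter how many spectra are folded in.
// Small alpha means more smoothing, large alpha means more responsiveness.
void emaUpdate(float* avgSpec, const float* newSpec, int nBins, float alpha) {
    for (int i = 0; i < nBins; i++) {
        avgSpec[i] = (1.0f - alpha) * avgSpec[i] + alpha * newSpec[i];
    }
}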

Probably this is simple stuff for anyone skilled in the art of dealing with noisy data … I would love to get a few pointers on what to try!

Best regards,

Wouter.


#93

Dear Wouter,

after some trips to other clearings in our ecosystem, I’m just coming back to this. Thank you so much for the effort and spirit you are putting into this, and it’s great to hear you have been able to make that kind of progress.

Unfortunately, I’m not able to go through all the details right now, and I don’t even know whether I would have appropriate answers on the analysis aspects. If you feel there are still infrastructural aspects you might need my assistance with, please let me know.

Thanks for that!

About the other things you were asking:

I hear you. However, I’m not a signal guy either. Would computing the RMS level make sense here? Maybe @weef or @tiz could contribute some suggestions? Thanks already!
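
Something along these lines is what I have in mind (a sketch only, I haven’t run this):

#include <math.h>
#include <stdint.h>

// RMS level of a block of 16-bit samples: one cheap scalar that tells
// whether there is enough signal at all before bothering with the FFT.
float rmsLevel(const int16_t* samples, int n) {
    float acc = 0.0f;
    for (int i = 0; i < n; i++) {
        float s = (float)samples[i];
        acc += s * s;
    }
    return sqrtf(acc / n);
}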

With kind regards,
Andreas.


#94

Hi Wouter and everybody,

thanks for your good work!

In fact, the results of my FFT also look quite noisy (black line).
Especially for plotting, I only save the mean value of 20 Hz wide frequency bands (red dots).


The plot shows the spectrum of the first 10 seconds of “miks boden-04-may-16-00.wav”.
I think we did not use the same audio snippets, but it seems that your approach works well and the results make sense, even though you re-recorded the audio!

I would say this really depends on the method you want to use. As you showed, it is important for the algorithm developed by the people in Kursk.
It also depends on the amount of data that can be sent via LoRa, and on whether you want to run a machine learning algorithm on the node or on some server. Deep learning algorithms should be able to deal with a “bad” (low) signal-to-noise ratio; if they are pretrained, they could even run on the node.
Also, if you just use the spectrum to calculate something like the mean frequency, the signal-to-noise ratio is not that important.
Potentially, the noise even contains some information.

Probably yes, but I guess if your approach works, it works ;)
Not sure how much memory you want to use, but in general possible approaches could be bin smoothing, polynomials, or some sort of splines, e.g. penalised regression splines.
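
Bin smoothing, for example, is just a moving average across neighbouring frequency bins; a sketch (names are mine):

// Each output bin becomes the mean of a small window of input bins
// around it; halfWidth controls the amount of smoothing.
void smoothBins(const float* in, float* out, int nBins, int halfWidth) {
    for (int i = 0; i < nBins; i++) {
        float acc = 0.0f;
        int count = 0;
        for (int j = i - halfWidth; j <= i + halfWidth; j++) {
            if (j >= 0 && j < nBins) {
                acc += in[j];
                count++;
            }
        }
        out[i] = acc / count;
    }
}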

Cheers,
Diren


#95

Hi @Diren and others,

In the end I decided to use simple averaging of spectra. The node records 1024 data points at 6250 Hz. From this audio data the FFT is calculated (in 6.1 Hz wide bins). This is done some number of times (e.g. 100) and these FFTs are averaged. In the figure below this is shown as the “fine spectrogram”, which is kept internal to the node and not sent over LoRaWAN. Instead, several quantities are calculated from that averaged spectrogram, as listed in the graph. Among these is the “coarse spectrogram”, which is obtained by lumping eight “fine” bins together into one “coarse” bin.

[figure: averaged “fine spectrogram” and the quantities derived from it]

Somehow I have pretty high hopes for the s_fmax quantities, which represent the global maximum and four further local maxima of the spectrum. It will be interesting to see how these evolve over time after I insert the node into one of my hives.
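
For the record, the averaging and binning scheme in pseudo-real code (recordAudio() and accumulateSpectrum() are placeholders for the actual routines in my repo):

#include <stdint.h>

#define N_SAMPLES 1024            // samples per recording at 6250 Hz
#define N_BINS    (N_SAMPLES / 2) // 512 FFT bins, each ~6.1 Hz wide
#define N_AVG     100             // number of spectra to average
#define LUMP      8               // fine bins per coarse bin

void recordAudio(int16_t* buf, int n);                    // placeholder: I2S capture
void accumulateSpectrum(const int16_t* buf, float* spec); // placeholder: FFT, add magnitudes

void buildSpectra(float* fineSpec, float* coarseSpec) {
    int16_t samples[N_SAMPLES];
    for (int i = 0; i < N_BINS; i++) fineSpec[i] = 0.0f;

    // Fine spectrogram: average N_AVG magnitude spectra.
    for (int n = 0; n < N_AVG; n++) {
        recordAudio(samples, N_SAMPLES);
        accumulateSpectrum(samples, fineSpec);
    }
    for (int i = 0; i < N_BINS; i++) fineSpec[i] /= N_AVG;

    // Coarse spectrogram: lump eight fine bins into one coarse bin.
    for (int i = 0; i < N_BINS / LUMP; i++) {
        float acc = 0.0f;
        for (int j = 0; j < LUMP; j++) acc += fineSpec[i * LUMP + j];
        coarseSpec[i] = acc / LUMP;
    }
}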

Best regards,

Wouter.