Thanks to @clemens, we got some recordings of hives with and without a queen.
In the following posts, we want to share our results from running the OSBH audio analyzer against these samples. For more information, see also the introduction from @einsiedlerkrebs.
Our classifier was recently updated to include two new states, and we’ve moved from decision-tree algorithms to an approach that yields better results.
When running the audio data through the updated classifier, we get the following results:
Colony with queen
$ audiohealth --audiofile samples/colony-with-queen-gruber.mp3 --analyzer tools/osbh-audioanalyzer/bin/test
==================
Sequence of states
==================
swarm, swarm, swarm,
===================
Compressed timeline
===================
0s - 30s swarm ===
==============
Total duration
==============
30s swarm ===
======
Result
======
The most common events (i.e. the events with the highest total duration) are:
The colony is mostly in »SWARM« state, which is going on for 30 seconds.
Colony without queen
$ audiohealth --audiofile samples/colony-without-queen-gruber.mp3 --analyzer tools/osbh-audioanalyzer/bin/test
==================
Sequence of states
==================
missing_queen, swarm, swarm,
===================
Compressed timeline
===================
0s - 10s missing_queen =
10s - 30s swarm ==
==============
Total duration
==============
20s swarm ==
10s missing_queen =
======
Result
======
The most common events (i.e. the events with the highest total duration) are:
The colony is mostly in »SWARM« state, which is going on for 20 seconds.
Sometimes, the state oscillates to »MISSING_QUEEN«, for 10 seconds in total.
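The "Compressed timeline" and "Total duration" sections follow mechanically from the sequence of states. Here is a minimal Python sketch of that step. It assumes each reported state covers a fixed 10-second window, which is inferred from three states being reported for a 30-second sample. The function names are made up for illustration, and the actual audiohealth implementation may work differently.

```python
from itertools import groupby

# Assumption: three states were reported for a 30 s sample, so 10 s per state.
SEGMENT_SECONDS = 10


def compress_timeline(states, segment_seconds=SEGMENT_SECONDS):
    """Merge runs of identical consecutive states into (start, end, state) spans."""
    timeline = []
    position = 0
    for state, run in groupby(states):
        length = len(list(run)) * segment_seconds
        timeline.append((position, position + length, state))
        position += length
    return timeline


def total_durations(timeline):
    """Sum the seconds spent in each state, longest total first."""
    totals = {}
    for start, end, state in timeline:
        totals[state] = totals.get(state, 0) + (end - start)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Sequence of states from the "Colony without queen" run above.
states = ["missing_queen", "swarm", "swarm"]
timeline = compress_timeline(states)
for start, end, state in timeline:
    print(f"{start}s - {end}s {state}")
for state, seconds in total_durations(timeline):
    print(f"{seconds}s {state}")
```

The first entry of the sorted totals is the state the "Result" section reports as the most common one.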
While “missing_queen” is not the dominant event here, this result is very impressive!
Cheers, Javiers and OSBH!
Here’s a direct comparison between the former strategy called dt-1.0 (decision tree) and the current one called lr-2.0 (logistic regression), both applied to the “Colony with queen” audio sample:
dt-1.0
$ audiohealth --datfile samples/colony-with-queen-gruber.wav.dat --strategy dt-1.0 --analyzer tools/osbh-audioanalyzer/bin/test
File: samples/colony-with-queen-gruber.wav.dat
Strategy: dt-1.0
==================
Sequence of states
==================
active, active, active
===================
Compressed timeline
===================
0s - 30s active ===
==============
Total duration
==============
30s active ===
Here’s a direct comparison between the former strategy called dt-1.0 (decision tree) and the current one called lr-2.0 (logistic regression), both applied to the “Colony without queen” audio sample:
We see improvements (see results below), cool! Thanks a bunch for retraining the classifier on @clemens’ input data. In particular, the analysis of the queenless colony is now spot-on, and the results for the audio sample of a colony with a queen are slightly better.
We labeled the classifier changes as “lr-2.1” in our fork to support selecting the filter and classifier from the command line.
Colony with queen
$ audiohealth analyze --analyzer tools/osbh-audioanalyzer/bin/test --datfile samples/colony-with-queen-gruber.wav.dat --strategy lr-2.1
File: samples/colony-with-queen-gruber.wav.dat
Strategy: lr-2.1
==================
Sequence of states
==================
swarm, swarm, active
===================
Compressed timeline
===================
0s - 20s swarm ==
20s - 30s active =
==============
Total duration
==============
20s swarm ==
10s active =
======
Result
======
The most common events (i.e. the events with the highest total duration) are:
The colony is mostly in »SWARM« state, which is going on for 20 seconds.
Sometimes, the state oscillates to »ACTIVE«, for 10 seconds in total.
Colony without queen
$ audiohealth analyze --analyzer tools/osbh-audioanalyzer/bin/test --datfile samples/colony-without-queen-gruber.wav.dat --strategy lr-2.1
File: samples/colony-without-queen-gruber.wav.dat
Strategy: lr-2.1
==================
Sequence of states
==================
missing_queen, missing_queen, missing_queen
===================
Compressed timeline
===================
0s - 30s missing_queen ===
==============
Total duration
==============
30s missing_queen ===
======
Result
======
The most common events (i.e. the events with the highest total duration) are:
The colony is mostly in »MISSING_QUEEN« state, which is going on for 30 seconds.
Great, @Andreas, thanks very much for the analysis with the updated classifier.
I fear we should not be too euphoric, because the training data and the evaluation data could be the same. ;-) @Tristan_OSBH and @Jabors: Has the latest version of the classifier been trained with my queenless hive data? If so, it would not be too surprising that the matching is much better.
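One simple way to check this concern is to compare the list of recordings used for training with the list used for evaluation. The sketch below is only an illustration: the file names are hypothetical placeholders, and comparing by name is a naive check that would miss renamed or re-encoded copies of the same recording.

```python
def leaked_samples(training_files, evaluation_files):
    """Return the evaluation files that also appear in the training set."""
    return sorted(set(training_files) & set(evaluation_files))


# Hypothetical file names, for illustration only.
training = ["hive-a.mp3", "hive-b.mp3", "hive-c.mp3"]
evaluation = ["hive-c.mp3", "hive-d.mp3"]

overlap = leaked_samples(training, evaluation)
if overlap:
    print("Also used for training:", ", ".join(overlap))
```

Any file reported here should be left out when judging how well the classifier generalizes.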
But it is correct, isn’t it? At least, your file name said “hive6_no-queen_nucleus-3-days-old_2017-06-05_15-40.mp3”…
But you are right: Hive 6 feels pretty healthy, also when looking at the power spectrum (see Sound Visualization - #5 by Andreas) in contrast to the other colonies without a queen.
Yes, the categorization is correct. So it is good that the algorithm has “seen” it, even though a human inspecting the sound visualization would come to a different conclusion.
Btw., I think that “health” or “healthy” is not a good description of what we do here with the sound analysis. A queenless or swarming hive is not sick. Other categories (not yet implemented but on our wishlist), such as little honey or nectar, full supers, and so on, are at most abnormal, which does not mean sick. “Bee health” is a buzzword today and certainly good for marketing electronic bee monitoring devices, but it is not the only selling point for beekeepers.
Nearly all classifications are wrong. All hives have queens, so “missing_queen” is wrong for every hive, and no hive swarms at this time of year in Germany. I don’t know what “collapsed” means: now, next month, next year? ;-) OK, some nuclei are small, especially hive1, but for the rest it is wrong; they are perhaps “weak” but not collapsed.
hive1_queen-nucleus_2017-07-21_12-38-18.mp3
==============
Total duration
==============
30s missing_queen ===
10s collapsed =
Although strategy lr-2.1 reflects the most recent update from OSBH, we found that lr-2.0 sometimes worked better for us, giving more plausible results. However, we believe the training data is not yet sufficient. Let’s also ping @Jabors, @Tristan_OSBH and @Aaron about this; they might be interested.
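To compare strategies on the same recording, the calls shown in this thread can be scripted. The following Python sketch only uses the flags that appear in the commands above (`--analyzer`, `--datfile`, `--strategy`). The `analyze` subcommand appears only in the later commands, so it may need to be dropped for older versions. The helper names `build_command` and `compare_strategies` are hypothetical.

```python
import subprocess

# Analyzer path as used in the commands in this thread.
ANALYZER = "tools/osbh-audioanalyzer/bin/test"


def build_command(datfile, strategy, analyzer=ANALYZER):
    """Assemble an audiohealth call from the flags shown in this thread."""
    return [
        "audiohealth", "analyze",
        "--analyzer", analyzer,
        "--datfile", datfile,
        "--strategy", strategy,
    ]


def compare_strategies(datfile, strategies=("lr-2.0", "lr-2.1")):
    """Run each strategy on the same file and collect the raw text output."""
    outputs = {}
    for strategy in strategies:
        result = subprocess.run(
            build_command(datfile, strategy),
            capture_output=True, text=True, check=False,
        )
        outputs[strategy] = result.stdout
    return outputs
```

Calling `compare_strategies("samples/colony-with-queen-gruber.wav.dat")` would return one text report per strategy, which can then be read side by side.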