Analyze recordings from queen vs. queenless hives using "audiohealth"

Thanks to @clemens, we got some recordings from queen vs. queenless hives:

In the following posts, we want to share our results when running the OSBH audio analyzer against these samples. For more information about that, see also the introduction from @einsiedlerkrebs:

To download two recording samples:

wget -O colony-with-queen-gruber.mp3 https://community.hiveeyes.org/uploads/default/original/1X/6de56aed3b520166d92ce3194ff6a9e6852491b2.mp3
wget -O colony-without-queen-gruber.mp3 https://community.hiveeyes.org/uploads/default/original/1X/c751a4a83e8c1f5f0380eee2c7ee310b0967056a.mp3 

Cool, thanks @clemens!

We tried the audiohealth program on it and are happy to share the results.

Analysis

Colony with queen

$ audiohealth --file samples/colony-with-queen-gruber.mp3 --analyzer tools/osbh-audioanalyzer/bin/test
Duration: 38s
{u'active': 30}

Colony without queen

$ audiohealth --file samples/colony-without-queen-gruber.mp3 --analyzer tools/osbh-audioanalyzer/bin/test
Duration: 38s
{u'swarm': 30}

Resources

…means having sox handle mp3 files - which requires libsox-fmt-mp3; might be added as dependency.

Right. Will be coming with one of the next commits. Thanks!

Aaron Makaruk told us recently:

Our classifier has been recently updated to include two new states, and we’ve moved past decision-tree algorithms to something yielding greater results.

… and he is right:
https://github.com/opensourcebeehives/MachineLearning-Local/commit/a40de504

Thanks, Javier!


When running the audio data through the updated classifier, we get the following results:

Colony with queen

$ audiohealth --audiofile samples/colony-with-queen-gruber.mp3 --analyzer tools/osbh-audioanalyzer/bin/test

==================
Sequence of states
==================
swarm, swarm, swarm,

===================
Compressed timeline
===================
  0s -  30s   swarm           ===

==============
Total duration
==============
        30s   swarm           ===

======
Result
======
The most common events (i.e. the events with the highest total duration) are:

     The colony is mostly in »SWARM« state, which is going on for 30 seconds.

Colony without queen

$ audiohealth --audiofile samples/colony-without-queen-gruber.mp3 --analyzer tools/osbh-audioanalyzer/bin/test

==================
Sequence of states
==================
missing_queen, swarm, swarm,

===================
Compressed timeline
===================
  0s -  10s   missing_queen   =
 10s -  30s   swarm           ==

==============
Total duration
==============
        20s   swarm           ==
        10s   missing_queen   =

======
Result
======
The most common events (i.e. the events with the highest total duration) are:

     The colony is mostly in »SWARM« state, which is going on for 20 seconds.
     Sometimes, the state oscillates to »MISSING_QUEEN«, for 10 seconds in total.

While “missing_queen” is not the most dominant event here, this result is very impressive!
Cheers, Javier and OSBH!
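
For readers wondering how the three report sections relate: the "Compressed timeline" and "Total duration" blocks can be derived from the "Sequence of states" alone. Here is a minimal sketch, assuming each state covers one 10-second chunk (which matches three states for a 30 s timeline). The function names and the alphabetical tie-breaking are made up for illustration; this is not necessarily how audiohealth implements it.

```python
from collections import Counter
from itertools import groupby

CHUNK_SECONDS = 10  # assumption: one classifier state per 10 s of audio


def compress_timeline(states, chunk=CHUNK_SECONDS):
    """Merge runs of identical consecutive states into (start, end, state) spans."""
    timeline = []
    position = 0
    for state, run in groupby(states):
        length = len(list(run)) * chunk
        timeline.append((position, position + length, state))
        position += length
    return timeline


def total_durations(states, chunk=CHUNK_SECONDS):
    """Sum the seconds per state, longest first (ties sorted alphabetically)."""
    totals = Counter()
    for state in states:
        totals[state] += chunk
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Sequence taken from the "Colony without queen" run above.
states = ["missing_queen", "swarm", "swarm"]
for start, end, state in compress_timeline(states):
    print(f"{start:3d}s - {end:3d}s   {state}")
for state, seconds in total_durations(states):
    print(f"{seconds:10d}s   {state}")
```

The first entry of `total_durations` is what the "Result" section reports as the dominant state.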

@clemens, is your colony really in swarm state? Shouldn’t the algorithm declare it as active?

No, it isn’t!

It actually did already using the former decision tree algorithm:

$ audiohealth --file samples/colony-with-queen-gruber.mp3 --analyzer tools/osbh-audioanalyzer/bin/test
Duration: 38s
{u'active': 30}

The result you cited is from the updated analyzer using logistic regression. Judging by this outcome, it might need some improvements or more training.

Colony with queen

Here’s a direct comparison between the former strategy called dt-1.0 (decision tree) and the current one called lr-2.0 (logistic regression), both applied to the “Colony with queen” audio sample:

dt-1.0

$ audiohealth --datfile samples/colony-with-queen-gruber.wav.dat --strategy dt-1.0 --analyzer tools/osbh-audioanalyzer/bin/test 

File:     samples/colony-with-queen-gruber.wav.dat
Strategy: dt-1.0

==================
Sequence of states
==================
active, active, active

===================
Compressed timeline
===================
  0s -  30s   active          ===

==============
Total duration
==============
        30s   active          ===

lr-2.0

$ audiohealth --datfile samples/colony-with-queen-gruber.wav.dat --strategy lr-2.0 --analyzer tools/osbh-audioanalyzer/bin/test 

File:     samples/colony-with-queen-gruber.wav.dat
Strategy: lr-2.0

==================
Sequence of states
==================
swarm, swarm, swarm

===================
Compressed timeline
===================
  0s -  30s   swarm           ===

==============
Total duration
==============
        30s   swarm           ===

We already told OSBH about this minor regression: the state should probably have stayed on “active”.

Colony without queen

Here’s a direct comparison between the former strategy called dt-1.0 (decision tree) and the current one called lr-2.0 (logistic regression), both applied to the “Colony without queen” audio sample:

dt-1.0

$ audiohealth --datfile samples/colony-without-queen-gruber.wav.dat --strategy dt-1.0 --analyzer tools/osbh-audioanalyzer/bin/test 

File:     samples/colony-without-queen-gruber.wav.dat
Strategy: dt-1.0

==================
Sequence of states
==================
swarm, swarm, swarm

===================
Compressed timeline
===================
  0s -  30s   swarm           ===

==============
Total duration
==============
        30s   swarm           ===

lr-2.0

$ audiohealth --datfile samples/colony-without-queen-gruber.wav.dat --strategy lr-2.0 --analyzer tools/osbh-audioanalyzer/bin/test 

File:     samples/colony-without-queen-gruber.wav.dat
Strategy: lr-2.0

==================
Sequence of states
==================
missing_queen, swarm, swarm

===================
Compressed timeline
===================
  0s -  10s   missing_queen   =
 10s -  30s   swarm           ==

==============
Total duration
==============
        20s   swarm           ==
        10s   missing_queen   =

The new lr-2.0 strategy detected the “missing_queen” state once. This looks promising!
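
To make such comparisons less of an eyeballing exercise, the state sequences of two strategies can be compared chunk by chunk. This is a small sketch with the sequences copied from the outputs above. `disagreements` is a hypothetical helper, not part of audiohealth, and the 10 s chunk length is an assumption read off the timelines.

```python
def disagreements(old_states, new_states, chunk=10):
    """Return (start_second, old_state, new_state) for every chunk that differs."""
    return [
        (index * chunk, old, new)
        for index, (old, new) in enumerate(zip(old_states, new_states))
        if old != new
    ]


# "Colony without queen", sequences as printed by the two runs above.
dt_1_0 = ["swarm", "swarm", "swarm"]
lr_2_0 = ["missing_queen", "swarm", "swarm"]

for start, old, new in disagreements(dt_1_0, lr_2_0):
    print(f"{start:3d}s: dt-1.0 says {old}, lr-2.0 says {new}")
```

For the “Colony with queen” sample, the same helper reports all three chunks as different (active vs. swarm).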

Dear @Jabors,

after pulling your recent changes about logistic classifier and filters

https://github.com/opensourcebeehives/MachineLearning-Local/commit/214c60ce

we see improvements (see results below), cool! Thanks a bunch for retraining the classifier on @clemens’ input data. In particular, the analysis of the queenless colony is now spot on, and the results for the audio sample of the colony with a queen are slightly better.

We labeled the classifier changes as “lr-2.1” in our fork to support selecting the filter and classifier from the command line.

Colony with queen

$ audiohealth analyze --analyzer tools/osbh-audioanalyzer/bin/test --datfile samples/colony-with-queen-gruber.wav.dat --strategy lr-2.1

File:     samples/colony-with-queen-gruber.wav.dat
Strategy: lr-2.1

==================
Sequence of states
==================
swarm, swarm, active

===================
Compressed timeline
===================
  0s -  20s   swarm           ==
 20s -  30s   active          =

==============
Total duration
==============
        20s   swarm           ==
        10s   active          =

======
Result
======
The most common events (i.e. the events with the highest total duration) are:

     The colony is mostly in »SWARM« state, which is going on for 20 seconds.
     Sometimes, the state oscillates to »ACTIVE«, for 10 seconds in total.

Colony without queen

$ audiohealth analyze --analyzer tools/osbh-audioanalyzer/bin/test --datfile samples/colony-without-queen-gruber.wav.dat --strategy lr-2.1

File:     samples/colony-without-queen-gruber.wav.dat
Strategy: lr-2.1

==================
Sequence of states
==================
missing_queen, missing_queen, missing_queen

===================
Compressed timeline
===================
  0s -  30s   missing_queen   ===

==============
Total duration
==============
        30s   missing_queen   ===

======
Result
======
The most common events (i.e. the events with the highest total duration) are:

     The colony is mostly in »MISSING_QUEEN« state, which is going on for 30 seconds.

Here are the results for the total number of recording samples from @clemens’ seven hives obtained from:

“audiohealth” is also using the strategy “lr-2.1” as above, i.e. the most recent classifier updates from the OSBH machine learning audio analyzer as of 2017-07-11. Thanks, @Jabors and @clemens!

To download the samples, use:


# Colonies with queen
wget -O hive2_queen_production-hive-small_2brood-boxes_no-super_old-queen_2017-06-05_15-29.mp3 https://community.hiveeyes.org/uploads/default/original/1X/bf02f90aa741db64956ff1e4b8bbf94bf1c636d0.mp3
wget -O hive3_queen_production-hive-big_2brood-boxes_2super_open-and-covered-brood_2017-06-05_15-33.mp3 https://community.hiveeyes.org/uploads/default/original/1X/e76fa0558ccf2b46848059ed3db1755bea27f6a8.mp3
wget -O hive4_queen_production-hive-middle_2brood-boxes_1super_open-and-covered-brood_2017-06-05_15-35.mp3 https://community.hiveeyes.org/uploads/default/original/1X/6d59e3004bc5c5c5aa8edfe59dc146c6f6e47cbe.mp3

# Colonies without queen
wget -O hive1_no-queen_nucleus-10-days-old_with-queen-cell_2017-06-05_15-27.mp3 https://community.hiveeyes.org/uploads/default/original/1X/6e001118cc13c9fe1937258f2e260a32b67f82dd.mp3
wget -O hive5_no-queen_nucleus-10-days-old_with-queen-cell_2017-06-05_15-37.mp3 https://community.hiveeyes.org/uploads/default/original/1X/18ba8685a8391e6faa9cb10b7c813a2524e466c3.mp3
wget -O hive6_no-queen_nucleus-3-days-old_2017-06-05_15-40.mp3 https://community.hiveeyes.org/uploads/default/original/1X/4b3e1997bb18148cc225c1dec031a7d50a3881dd.mp3
wget -O hive7_no-queen_nucleus-3-days-old_2017-06-05_15-42.mp3 https://community.hiveeyes.org/uploads/default/original/1X/2b3e47816a13135c3f220099ce36683ba3012d21.mp3

Colonies with queen

Hive 2

$ audiohealth analyze --analyzer tools/osbh-audioanalyzer/bin/test --audiofile samples/hive2_queen_production-hive-small_2brood-boxes_no-super_old-queen_2017-06-05_15-29.mp3 --strategy lr-2.1

Duration: 48s
Strategy: lr-2.1

==================
Sequence of states
==================
swarm, swarm, active, active

===================
Compressed timeline
===================
  0s -  20s   swarm           ==
 20s -  40s   active          ==

==============
Total duration
==============
        20s   active          ==
        20s   swarm           ==

Hive 3

$ audiohealth analyze --analyzer tools/osbh-audioanalyzer/bin/test --audiofile samples/hive3_queen_production-hive-big_2brood-boxes_2super_open-and-covered-brood_2017-06-05_15-33.mp3 --strategy lr-2.1

Duration: 74s
Strategy: lr-2.1

==================
Sequence of states
==================
swarm, swarm, swarm, swarm, swarm, swarm, swarm

===================
Compressed timeline
===================
  0s -  70s   swarm           =======

==============
Total duration
==============
        70s   swarm           =======

Hive 4

$ audiohealth analyze --analyzer tools/osbh-audioanalyzer/bin/test --audiofile samples/hive4_queen_production-hive-middle_2brood-boxes_1super_open-and-covered-brood_2017-06-05_15-35.mp3 --strategy lr-2.1

Duration: 50s
Strategy: lr-2.1

==================
Sequence of states
==================
active, active, active, active, active

===================
Compressed timeline
===================
  0s -  50s   active          =====

==============
Total duration
==============
        50s   active          =====

Colonies without queen

Hive 1

$ audiohealth analyze --analyzer tools/osbh-audioanalyzer/bin/test --audiofile samples/hive1_no-queen_nucleus-10-days-old_with-queen-cell_2017-06-05_15-27.mp3 --strategy lr-2.1

Duration: 45s
Strategy: lr-2.1

==================
Sequence of states
==================
missing_queen, missing_queen, missing_queen, missing_queen

===================
Compressed timeline
===================
  0s -  40s   missing_queen   ====

==============
Total duration
==============
        40s   missing_queen   ====

Hive 5

$ audiohealth analyze --analyzer tools/osbh-audioanalyzer/bin/test --audiofile samples/hive5_no-queen_nucleus-10-days-old_with-queen-cell_2017-06-05_15-37.mp3 --strategy lr-2.1

Duration: 90s
Strategy: lr-2.1

==================
Sequence of states
==================
missing_queen, missing_queen, missing_queen, missing_queen, missing_queen, active, missing_queen, missing_queen, missing_queen

===================
Compressed timeline
===================
  0s -  50s   missing_queen   =====
 50s -  60s   active          =
 60s -  90s   missing_queen   ===

==============
Total duration
==============
        80s   missing_queen   ========
        10s   active          =

Hive 6

$ audiohealth analyze --analyzer tools/osbh-audioanalyzer/bin/test --audiofile samples/hive6_no-queen_nucleus-3-days-old_2017-06-05_15-40.mp3 --strategy lr-2.1

Duration: 77s
Strategy: lr-2.1

==================
Sequence of states
==================
missing_queen, swarm, swarm, swarm, swarm, missing_queen, missing_queen

===================
Compressed timeline
===================
  0s -  10s   missing_queen   =
 10s -  50s   swarm           ====
 50s -  70s   missing_queen   ==

==============
Total duration
==============
        40s   swarm           ====
        30s   missing_queen   ===

Hive 7

$ audiohealth analyze --analyzer tools/osbh-audioanalyzer/bin/test --audiofile samples/hive7_no-queen_nucleus-3-days-old_2017-06-05_15-42.mp3 --strategy lr-2.1

Duration: 65s
Strategy: lr-2.1

==================
Sequence of states
==================
missing_queen, missing_queen, active, missing_queen, missing_queen, missing_queen

===================
Compressed timeline
===================
  0s -  20s   missing_queen   ==
 20s -  30s   active          =
 30s -  60s   missing_queen   ===

==============
Total duration
==============
        50s   missing_queen   =====
        10s   active          =
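
To put a rough number on these seven runs, one can count how many of the classified seconds match what the file name says. This is only a sketch under a loud assumption: a colony with a queen is expected to be reported as "active" and a colony without a queen as "missing_queen". The durations are copied from the "Total duration" blocks above; the helper names are made up.

```python
# Label from the file name plus seconds per state from the lr-2.1 runs above.
RESULTS = {
    "hive2": ("queen", {"active": 20, "swarm": 20}),
    "hive3": ("queen", {"swarm": 70}),
    "hive4": ("queen", {"active": 50}),
    "hive1": ("no-queen", {"missing_queen": 40}),
    "hive5": ("no-queen", {"missing_queen": 80, "active": 10}),
    "hive6": ("no-queen", {"swarm": 40, "missing_queen": 30}),
    "hive7": ("no-queen", {"missing_queen": 50, "active": 10}),
}

# Assumption: the state we would like to see for each label.
EXPECTED = {"queen": "active", "no-queen": "missing_queen"}


def matching_share(label, durations):
    """Fraction of classified seconds that carry the expected state."""
    return durations.get(EXPECTED[label], 0) / sum(durations.values())


def overall_share(results):
    """Same fraction, pooled over all recordings."""
    matching = sum(d.get(EXPECTED[label], 0) for label, d in results.values())
    total = sum(sum(d.values()) for _, d in results.values())
    return matching / total


for hive, (label, durations) in RESULTS.items():
    print(f"{hive} ({label}): {matching_share(label, durations):.0%} as expected")
print(f"overall: {overall_share(RESULTS):.0%}")
```

With the numbers above, 270 of the 420 classified seconds carry the expected state. Keep in mind that this says nothing about how the classifier behaves on recordings it has not been trained on.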

What I like about open source and community:

A project can even grow while I am on holiday.

Very impressive, all of you. Thanks a bunch!

Great, @Andreas, thanks very much for the analysis with the updated classifier.

I fear we should not be too euphoric, because training data and evaluation data could be the same. ;-) @Tristan_OSBH and @Jabors: Has the latest version of the classifier been trained with my queenless hive data? If so, it would not be too surprising that the matching is much better.

But it’s also interesting that hive 6 with an out of the ordinary visual sound spectrum is now classified as queenless.

But it is correct, isn’t it? At least, your file name said “hive6_no-queen_nucleus-3-days-old_2017-06-05_15-40.mp3”…

But you are right: Hive 6 feels pretty healthy, also when looking at the power spectrum (see Sound Visualization - #5 by Andreas) in contrast to the other colonies without a queen.

Yes, the categorization is correct. So it is good that the algorithm has “seen” it, even though a human inspecting the sound visualization would come to a different conclusion.

Btw., I think that “health” or “healthy” is not a good description of what we do here with the sound analysis. A queenless or swarming hive is not sick. Other categories (not yet implemented but on our wishlist), like less honey or nectar, full supers, and so on, are at most abnormal, which does not mean sick. “Bee health” is a buzzword today and certainly good for marketing, also for electronic bee monitoring devices, but it is not the only selling point for beekeepers.

Acknowledged. I didn’t find a better word in a hurry. I should just have said “active” - my apologies :-).

I’m a bit disappointed by the new analysis results for the sound files I posted under: Sound Samples and Basic Analysis Hive with Queen vs. Queenless

The reported classifier was:
Strategy: lr-2.1

Nearly all classifications are wrong. All hives have queens, so “missing_queen” is wrong for every hive, and no hive is swarming at this time of the season in Germany. I don’t know what “collapsed” is supposed to mean: now, next month, next year? ;-) OK, some nuclei are small, especially hive1, but for the rest it is wrong: they are perhaps “weak”, but not collapsed.

hive1_queen-nucleus_2017-07-21_12-38-18.mp3

==============
Total duration
==============
        30s   missing_queen   ===
        10s   collapsed       =

hive2_queen-nucleus-some-weeks-old_2017-07-21_12-39-36.mp3

==============
Total duration
==============
        20s   active          ==
        10s   collapsed       =

hive3_queen_production-hive-small_2brood-boxes_no-super_old-queen_2017-07-21_12-40-43.mp3

==============
Total duration
==============
        30s   swarm           ===
        20s   collapsed       ==
        10s   active          =

hive4_queen-nucleus_2017-07-21_12-42-21.mp3

==============
Total duration
==============
        50s   collapsed       =====

hive5_queen-nucleus_2017-07-21_12-43-48.mp3

==============
Total duration
==============
        50s   missing_queen   =====

hive6_queen_production-hive-middle_2brood-boxes_no-super_2017-07-21_12-45-08.mp3

==============
Total duration
==============
        30s   swarm           ===
        10s   active          =

hive7_queen_production-hive-big_2brood-boxes_no-super_2017-07-21_12-46-22.mp3

==============
Total duration
==============
        40s   swarm           ====
        10s   active          =

hive8_queen-nucleus-some-weeks-old_2017-07-21_12-47-35.mp3

==============
Total duration
==============
        30s   missing_queen   ===
        10s   active          =
        10s   collapsed       =

hive9_queen-nucleus-some-weeks-old_2017-07-21_12-49-14.mp3

==============
Total duration
==============
        60s   missing_queen   ======

hive10_queen-nucleus_2017-07-21_12-50-42.mp3

==============
Total duration
==============
        30s   collapsed       ===
        20s   missing_queen   ==

hive11_queen-nucleus-some-weeks-old_2017-07-21_12-52-10.mp3

==============
Total duration
==============
        40s   active          ====
        10s   missing_queen   =
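
For anyone who wants to tabulate such reports instead of reading them one by one: the "Total duration" lines are regular enough to parse. This is a minimal sketch that assumes the report format stays exactly as printed above (seconds, state name, a bar of `=` characters). `parse_totals` and `dominant_state` are hypothetical helpers, not part of audiohealth.

```python
import re

# Matches lines like "        30s   missing_queen   ===".
# Heading rules and "0s - 10s ..." timeline lines do not match.
TOTAL_LINE = re.compile(r"^\s*(\d+)s\s+(\w+)\s+=+\s*$")


def parse_totals(report):
    """Extract {state: seconds} from the text of a 'Total duration' block."""
    totals = {}
    for line in report.splitlines():
        match = TOTAL_LINE.match(line)
        if match:
            totals[match.group(2)] = int(match.group(1))
    return totals


def dominant_state(totals):
    """State with the most seconds; on a tie, the one listed first wins."""
    return max(totals, key=totals.get)


# Block copied from the hive8 result above.
report = """
==============
Total duration
==============
        30s   missing_queen   ===
        10s   active          =
        10s   collapsed       =
"""

totals = parse_totals(report)
print(totals, "->", dominant_state(totals))
```

Applied to the eleven blocks above, only hive2 and hive11 come out with "active" as the dominant state.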

Although strategy lr-2.1 reflects the most recent update from OSBH, we found that lr-2.0 sometimes worked better for us, giving more reasonable results. However, we believe the training data is not yet sufficient. Let’s also ping @Jabors, @Tristan_OSBH and @Aaron about this; they might be interested.