Troubleshooting some issues when running the Terkin Datalogger on the LoPy4

I did this and ran make sketch-and-run, but the log still stops after:

WiFi STA: Networking address (IP): ...

and resetting the board leads to a Core 1 panic:

Guru Meditation Error: Core  1 panic'ed (IllegalInstruction). Exception was unhandled.
Memory dump at 0x40200754: bad00bad bad00bad bad00bad

PS: I had never seen this panic before I started using the Pybytes-less Pycom firmware. Could this be related?
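When comparing serial logs across firmware builds, it can help to pull the panic banners out automatically. The following is only a minimal sketch: the function name `find_panics` is made up for illustration, and the pattern is derived solely from the banner quoted above, so other panic formats may not match.

```python
import re

# Pattern modelled on the single banner quoted in this thread (an assumption:
# other ESP32 panic messages may be formatted differently).
PANIC_RE = re.compile(
    r"Guru Meditation Error: Core\s+(?P<core>\d+) panic['’]ed \((?P<reason>[^)]+)\)"
)


def find_panics(log_text):
    """Return a list of (core, reason) tuples, one per panic banner found."""
    return [
        (int(match.group("core")), match.group("reason"))
        for match in PANIC_RE.finditer(log_text)
    ]


sample = (
    "WiFi STA: Networking address (IP): ...\n"
    "Guru Meditation Error: Core  1 panic'ed (IllegalInstruction). Exception was unhandled.\n"
    "Memory dump at 0x40200754: bad00bad bad00bad bad00bad\n"
)
print(find_panics(sample))  # → [(1, 'IllegalInstruction')]
```

Counting these tuples per firmware image would give a rough way to compare builds, nothing more.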

I think it has something to do with the new 1.20.1 firmware. I also saw this with Pybytes activated; see "Troubleshooting the recent Pycom Firmware Release 1.20.1.r1 - #2 by clemens", section "Hiveeyes-Software".

Sorry to hear that this also hits you on a LoPy4. You might alternatively check out https://packages.hiveeyes.org/hiveeyes/foss/pycom/LoPy4-1.20.1.r1-robert.tar.gz, that’s all we can offer right now. This firmware image was built and provided by robert-hh.

If this error does not go away, you might consider going back to the official Pybytes firmware.


Sorry again, we are also just doing trial-and-error here on a regular basis. However, there are things indicating there might be something fishy under the hood.


Other users are also reporting these core panics on the Pycom user forum.

Please also enjoy

So, things like "Soft errors caused by single-event upsets (SEUs)", aka "ECC RAM absolutely matters", might not be unrelated at all. Maybe some device models or revisions are more fragile than others.

That was said half in jest. I strongly believe this issue is more likely related to "Random memory corruption faults on ESP32-WROVER rev.1 and rev.2 when running in dual-core mode".

Coming back from there, we built firmware images for the FiPy and the LoPy4 using

#define CONFIG_FREERTOS_UNICORE 1

I can’t tell for sure whether this makes any difference at all, as the FiPy on my workbench had been running quite stably even before, without that setting.

Please also pull from our latest master, as this will bring you "Improve WiFi robustness on first connection attempt · hiveeyes/terkin-datalogger@d3ae518 · GitHub".


The AttributeError comes up when bus-onewire-0 is disabled in settings.py:

    "id": "bus-onewire-0",
    "family": "onewire",
    "number": 0,
    "enabled": False,
    "pin_data": "P11",

Thanks for letting us know. Are you sure you are running the latest version from master, or did you make some local modifications? At line 329, we only see an empty line.

However, we just added [1] and [2] to handle all cases where a sensor object is accessed while the bus object it is connected to has been disabled through the configuration settings.


  1. Gracefully handle buses without names · hiveeyes/hiveeyes-micropython-firmware@aaae6c2 · GitHub

  2. Improve sensor registration mechanics again · hiveeyes/hiveeyes-micropython-firmware@bcaacf8 · GitHub
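To illustrate the kind of guard this is about, here is a minimal sketch. It is not the actual Terkin code: the helper names, the `bus-i2c-0` entry and the sensor ids are hypothetical, and treating a missing "enabled" flag as disabled is an assumption made for the example.

```python
# Hypothetical settings, modelled on the fragment quoted earlier in the thread.
BUSES = [
    {"id": "bus-onewire-0", "family": "onewire", "number": 0,
     "enabled": False, "pin_data": "P11"},
    {"id": "bus-i2c-0", "family": "i2c", "number": 0, "enabled": True},
]
SENSORS = [
    {"id": "ds18b20-1", "bus": "bus-onewire-0"},
    {"id": "bme280-1", "bus": "bus-i2c-0"},
]


def enabled_buses(bus_settings):
    """Map bus id -> settings for every bus that is explicitly enabled."""
    return {bus["id"]: bus for bus in bus_settings if bus.get("enabled", False)}


def register_sensors(sensor_settings, buses):
    """Register sensors whose bus is available and skip the others.

    Skipping avoids touching attributes of a bus object that was never
    created, which is where an AttributeError would otherwise surface.
    """
    registered, skipped = [], []
    for sensor in sensor_settings:
        if buses.get(sensor.get("bus")) is None:
            skipped.append(sensor["id"])
        else:
            registered.append(sensor["id"])
    return registered, skipped


registered, skipped = register_sensors(SENSORS, enabled_buses(BUSES))
print(registered, skipped)  # → ['bme280-1'] ['ds18b20-1']
```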


I saw these core panics again, this time with the unicore firmware as well. They start after flashing the device about 10 times following a power-on reset (PWRON). After disconnecting the device from power and reconnecting it, the error is gone, probably until the next 10 or so flash procedures.


Thanks for letting us know.

Uploading raw source code through the raw REPL is a rather heavy process that contributes to memory fragmentation on the device. This most probably leads to the subsequent crashes, which in turn are likely caused by "Random memory corruption faults on ESP32-WROVER rev.1 and rev.2 when running in dual-core mode".

If you are getting sick of this, we might want to go for more advanced and efficient methods bringing the source code to the device with less overhead.


I am using the one-stop

make recycle-ng MPY_CROSS=true MPY_TARGET=pycom MPY_VERSION=1.11

these days and would never look back. The reason this is far more efficient is that it uses FTP instead of the raw REPL to transfer the files, and that it compiles the source code to bytecode beforehand, which significantly reduces the overall size of what has to be shipped to the device.

$ du -sch dist-packages terkin
596K	total

$ du -sch lib-mpy-1.11-pycom
352K	total
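As a quick back-of-the-envelope check of the saving, using only the two totals reported by du above (the helper names are made up for this sketch):

```python
def parse_du_size(text):
    """Convert a du-style size such as '596K' to KiB as an integer."""
    units = {"K": 1, "M": 1024, "G": 1024 * 1024}
    number, unit = text[:-1], text[-1].upper()
    return int(float(number) * units[unit])


def reduction_percent(before_kib, after_kib):
    """Relative size reduction in percent."""
    return 100.0 * (before_kib - after_kib) / before_kib


source_kib = parse_du_size("596K")    # dist-packages + terkin, plain source
bytecode_kib = parse_du_size("352K")  # lib-mpy-1.11-pycom, compiled bytecode
saving = reduction_percent(source_kib, bytecode_kib)
print("{} KiB -> {} KiB, about {:.0f}% smaller".format(source_kib, bytecode_kib, saving))
# → 596 KiB -> 352 KiB, about 41% smaller
```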

How to

This currently requires a network connection over WiFi with the IP address of the device stored in .terkin/floatip like

$ cat .terkin/floatip
192.168.178.40

For automating this, I am using

make terkin-agent action=maintain macs=80:7d:3a:c2:de:44 # libero

to find the device’s IP address and

make connect-wifi ssid=GartenNetzwerk password=<redacted>

to actually get the device into the network if required.
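For anyone curious what the transfer boils down to, here is a rough sketch in plain Python. It is not what the Makefile target actually runs. The function names are hypothetical, and the FTP credentials (`micro` / `python`) and passive mode are assumptions based on the usual Pycom defaults; adjust them if your device is configured differently.

```python
from ftplib import FTP
from pathlib import Path


def read_float_ip(path=".terkin/floatip"):
    """Return the device IP address from the first non-empty line of the file."""
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line:
            return line
    raise ValueError("no IP address found in {}".format(path))


def upload_file(host, local_path, remote_path, user="micro", password="python"):
    """Upload a single file to the device over FTP.

    The default credentials are an assumption, see the note above.
    """
    with FTP(host, timeout=10) as ftp:
        ftp.login(user, password)
        ftp.set_pasv(True)
        with open(local_path, "rb") as handle:
            ftp.storbinary("STOR " + remote_path, handle)
```

A call would then look like `upload_file(read_float_ip(), "settings.py", "/flash/settings.py")`, where the remote path is again just an example.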


This revealed that the following modules are missing from the environment:

  • netaddr
  • netifaces
  • scapy

Do you mind adding these to the requirements files?

These are already listed in requirements-terkin-agent.txt. Just run

make setup-terkin-agent

Sorry that we haven’t documented each and every bit of these details yet. You are probably one of the first people running Linux who would like to use that infrastructure.
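To check up front whether the environment has everything the agent needs, something like the following sketch could be used. The module names are the three mentioned above; the function name is made up, and the check only tells you whether a module can be found, not whether its version is suitable.

```python
import importlib.util

# The three modules reported as missing earlier in this thread.
REQUIRED = ["netaddr", "netifaces", "scapy"]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```

Run it with the Python interpreter of the environment you use for the agent; an empty result means the setup step has already done its job.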

We thoroughly followed the instructions in "PSRAM Cache Issue stills exist (IDFGH-31) · Issue #2892 · espressif/esp-idf · GitHub" and just made another wave of releases. If we are lucky, [1] might be even more stable for you.


  1. https://packages.hiveeyes.org/hiveeyes/foss/pycom/vanilla/LoPy4-1.20.1.r1-0.2.0-vanilla-psram-fix.tar.gz

Dear @Thias,

we just released [1]. It includes some additional fixes on the ESP-IDF level and might also add some improvements on stability. For more details, enjoy [2].

With kind regards,
Andreas.

[1] https://packages.hiveeyes.org/hiveeyes/foss/pycom/vanilla/LoPy4-1.20.1.r1-0.5.0-vanilla-butterfly-csfix.tar.gz
[2] Pycom firmware bakery


We’ve just released another bunch of firmware images with hopefully improved robustness, so [1] might make you happy.

[1] https://packages.hiveeyes.org/hiveeyes/foss/pycom/vanilla/LoPy4-1.20.1.r1-0.6.0-vanilla-dragonfly.tar.gz


Hello everyone,
while trying to get the Terkin Datalogger running on my LoPy4 (for now still via Atom + Pymakr), I ran into the following error when uploading release 0.6.0 (2019-08-19) to the LoPy, which @clemens has also mentioned here before:

In the process, I tried various firmware images on the LoPy4, from the latest LoPy4-1.20.2.rc6-0.10.1-vanilla-squirrel.tar.gz down to a 1.20.1.r1, all with the same error.

My first attempts with the sandbox then ended with a connection timeout to the LoPy as soon as I started make sketch-and-run.

Best regards,
Jan

The Squirrel firmware is a good idea to start with.

We will make a new release soon. @Andreas will let you know when it is out.

Have you tried pressing RESET while the sandbox is waiting for the connection? Even then, the transfer usually takes quite a while.

The last "official" release is unfortunately rather old, so I will send you (via PM) a copy of the code from my current LoPy. I built it from Matthias' ttn branch, a "dirty release" so to speak, and it runs with the 1.20.2.rc6-0.10.2-vanilla-squirrel-nosmartconfig firmware … just in case you don't want to wait until the official one is on git.

Unfortunately, the code in the Git repository is not what ends up 1:1 on the LoPy; the "sandbox" machinery assembles quite a bit along the way.


0.7.0 seems to be ready:


So far I have unfortunately not managed to get the Terkin sandbox running with WSL (Ubuntu 18.04 LTS); more precisely, the upload to the LoPy4 fails because of timeouts. I also tried WSL with Ubuntu 20.04 LTS, but there the virtualenv is not created/found by make setup. I believe it is now called venv in newer Python versions (3.8.x), but I am not sure.

Is there any other way to produce a version that can be uploaded to the LoPy, e.g. without a LoPy plugged in? That would be helpful for updating the LoPys deployed in the field with a laptop at the apiary.

Best regards, Jan