Home
M Hightower edited this page May 18, 2023 · 9 revisions
Objective/Goal: To get a better understanding of an ESP8266 crash, in particular the frustrating, inexplicable Hardware WDT Reset and the Software WDT Reset.
1. Collect information about the crash not present in the stack trace.
2. Convert some crash conditions that result in Hardware WDT Reset into `ill`, exccause 0, exception event.
3. Recover the source of some Software WDT Resets.
4. Buffer/capture the last `ets_printf` just before: 2 and 3.
5. Convert unhandled breakpoints to `ill`, exccause 0, exception event.
6. Replace unattended Boot ROM exceptions with the SDK's general exception handler.
   - Without 5 and 6 their default logic would lead to HWDT Resets.
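To make item 4 more concrete, here is a minimal host-side sketch of the "keep the last printed message" idea. It is not this library's implementation: `capture_printf`, `last_captured`, and the 128-byte buffer are hypothetical names and sizes chosen for illustration, and on a real ESP8266 the buffer would have to live in memory that survives the reset.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <cstring>

// Hypothetical holder for the most recent formatted message.
// Assumption: an ordinary static buffer is enough for this illustration;
// a real device would need storage that is still readable after the reset.
static char last_msg[128];

// Hypothetical printf-style wrapper: formats into last_msg, then prints it.
// Each call overwrites the previous message, so only the last one is kept.
int capture_printf(const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    // vsnprintf NUL-terminates when the size is non-zero, truncating if needed.
    int n = vsnprintf(last_msg, sizeof(last_msg), fmt, ap);
    va_end(ap);
    fputs(last_msg, stdout);
    return n;
}

// Returns the most recently captured message (empty string if none yet).
const char *last_captured() {
    return last_msg;
}
```

After a sequence of calls, `last_captured()` returns only the final message, which is the "last gasp" you would want to report alongside a WDT reset.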
Generally speaking, Hardware WDT Reset and Software WDT Reset crashes are often the results of an Infinite Loop. While the sketch looping and failing to let the system run is often the assumed culprit, that is not always the case. Once you know you don't have any Infinite Loops in your code, what is the next move? This library may help there.
- Some are the results of unhandled breakpoints in the SDK or Boot ROM, and others result from unhandled exceptions which fall into the unhandled breakpoint path. These breakpoints can usually be caught with `gdb`. However, nobody runs their sketch all the time with `gdb`, and some failures are so infrequent that `gdb` is not a viable monitoring tool for the crash.
- Another group of causes for WDT Reset crashes is Deliberate Infinite Loops. In "C" code, these are often written as `while(true){/* empty */}`. In asm, this compiles down to `loop: j loop`, which has the byte pattern 0x06, 0xff, 0xff. If you search through SDK v3.0.5, you can find about 95 of these. 94 of these Infinite Loops are preceded by `ets_printf`, which does nothing unless you have a console connected and have debug printing enabled. When you think about it, a Deliberate Infinite Loop is not an unreasonable way to handle an unrecoverable event; however, not providing a clue for the WDT reset is intolerable. And, even if you were to see the "last gasp" error messages, they are not documented. When interrupts are enabled, these Infinite Loops present as Software WDT Resets, which can give you some clue of the location of the crash; however, when interrupts are disabled, you get zero details, only a Hardware WDT Reset.
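The byte pattern above can be checked and searched for with a small host-side sketch. It assumes the usual Xtensa `j` encoding (opcode 0x06 in the low 6 bits, an 18-bit signed offset above it, target = PC + 4 + offset, stored little-endian); `encode_j` and `find_self_jumps` are hypothetical helper names, not part of any SDK or of this library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Encode an Xtensa `j` placed at address `pc` that jumps to `target`.
// Assumed layout: bits 0..5 = 0x06, bits 6..23 = signed 18-bit offset,
// where target = pc + 4 + offset.
uint32_t encode_j(uint32_t pc, uint32_t target) {
    int32_t offset = static_cast<int32_t>(target - pc) - 4;
    return ((static_cast<uint32_t>(offset) & 0x3FFFFu) << 6) | 0x06u;
}

// Return every byte offset in `data` where the self-jump pattern
// 0x06 0xff 0xff starts. This is a raw byte match: it can also hit
// literal data or the tail of another instruction, so treat the
// results as candidates, not confirmed infinite loops.
std::vector<size_t> find_self_jumps(const uint8_t *data, size_t len) {
    std::vector<size_t> hits;
    for (size_t i = 0; i + 3 <= len; ++i) {
        if (data[i] == 0x06 && data[i + 1] == 0xff && data[i + 2] == 0xff) {
            hits.push_back(i);
        }
    }
    return hits;
}
```

A jump to itself needs an offset of -4, which encodes to the 24-bit word 0xFFFF06, that is, the bytes 0x06, 0xff, 0xff in memory order. Running `find_self_jumps` over an extracted SDK object or firmware image gives a list of candidate locations to inspect in a disassembly.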
I suspect the root cause or stimulus for these is low heap space with OOM events.
To be continued ...