Pitfalls & Troubleshooting
Introduction
In this article, we will walk you through the important concepts when developing on the BOLOS platform and provide some analysis of common failure scenarios you might experience while developing applications. This article can help you if you are developing your application from scratch or if you are going through important customization of a forked application.
Not Enough RAM
At the time of this writing, the default link script provided by the SDK for the Ledger Nano S allocates 4.5 KiB of RAM for applications to use. This 4.5 KiB has to be enough to store all global non-const and non-NVRAM variables as well as the call stack.
This is the linker error you will get if you declare too many global non-const and non-NVRAM variables to fit in RAM:
bin/app.elf section `.bss' will not fit in region `SRAM'
The only solution to this problem is using less RAM. To do so, make your application's memory layout more efficient.
Stack Overflows
To address RAM limitations and stack overflows, consider optimizing your application's memory layout and implementing a stack canary for overflow detection.
To create a stack canary, you have to set a magic value at the start of the stack area. The stack grows towards lower addresses, so a canary at the start of this area will be located at the top of the stack.
The canary is then checked regularly. If the canary was modified, this means there was a stack overflow.
Here is the recommended way to implement a stack canary:
// This symbol is defined by the link script to be at the start of the stack
// area.
extern unsigned long _stack;
#define STACK_CANARY (*((volatile uint32_t*) &_stack))
void init_canary() {
STACK_CANARY = 0xDEADBEEF;
}
void check_canary() {
if (STACK_CANARY != 0xDEADBEEF)
THROW(EXCEPTION_OVERFLOW);
}
The canary must be checked regularly. You can run the check every time io_event(...)
is called.
Exception Handling
With our error model, there are two common failure scenarios.
-
You must be careful always to close a
TRY
context when jumping out of it. For example, in the block of code below, theCLOSE_TRY
macro must be used to close theTRY
context before returning from the function in the case thati > 0
. However, in theCATCH
clause, theTRY
has already been closed automatically, soCLOSE_TRY
is unnecessary.bool is_positive(int8_t i) { BEGIN_TRY { TRY { if (i == 0) THROW(EXCEPTION); if (i > 0) { CLOSE_TRY; return true; } } CATCH_OTHER(e) { return false; } FINALLY {} } END_TRY; return false; }
-
When modifying variables within a
try
/catch
/finally
context, always declare those variablesvolatile
. This will prevent the compiler from making invalid assumptions when optimizing your code because it doesn't understand how our exception model works.uint16_t multiply(uint8_t a, uint8_t b) { volatile uint16_t product = 0; volatile uint8_t multiplier = b; while (true) { BEGIN_TRY { TRY { if (multiplier == 0) THROW(1); multiplier--; product += a; THROW(2); } CATCH_OTHER(e) { if (e == 1) return product; } FINALLY {} } END_TRY; } // Suppress compiler warning return 0; }
In the above example,
a
does not need to be declaredvolatile
because it is never modified.
For greater clarity and coherence, we recommend using the error codes defined in the SDKs wherever possible (see EXCEPTION
, INVALID_PARAMETER
, etc. in os.h
). If you decide to use custom error codes, use custom error codes for each error case.
The Application crashed
An application crashing when running on the device (the device's screen freezes and stops responding to the APDU) can be caused by a number of issues. For example:
- The Secure Element is isolated due to invalid handling of SEPROXYHAL packets
- There's a core fault on the device (perhaps due to a misaligned memory access or an attempt to access restricted memory)
If it occurs, you can start by trying it on Speculos which support GDB debugging. You can also simplify the app and strip away as much code as possible until the problem can be isolated.
Unaligned RAM access
uint16_t *ptr16 = &tmp_ctx.signing_context.buffer[processed];
PRINTF("uint16_t: %d", ptr16[0]);
ptr16[0]
access can make the application crash, even though tmp_ctx.signing_context.buffer[processed]
(unsigned char*
) can be accessed. This happens when a pointer isn't word-aligned, but a word is accessed in RAM.
To prevent crashes due to unaligned RAM access, consider copying the buffer into a properly aligned location using memmove
or a similar function.
Please refer to the alignment page for further information.
Printf
The firmware supports the PRINTF
macro debugging feature, which is defined it in the Makefile.
DEFINES += HAVE_SPRINTF HAVE_PRINTF PRINTF=screen_printf
You can use PRINTF with the Speculos emulator.
Usually, PRINTF
is defined as void, with this line DEFINES += PRINTF\(...\)=
.
Check if PRINTF
is already defined somewhere in your Makefile, and if so, comment out this definition so it doesn't override.
The PRINTF
macro is a debugging feature, it must not be used in production.
When compiling an application for release, please check that PRINTF
is disabled in the Makefile:
insert the line DEFINES += PRINTF(...)=
and comment out the other one.