Pre-Boot environment weirdness
The world of early board boot is a weird one. Hard-and-fast rules of programming don't necessarily apply here, and things have an annoying tendency to work only sometimes. I finally revisited the pre-boot environment this week, due in part to bringing up the dual-core version of Novena, which differs from the quad-core version in a few subtle ways.
The most obvious difference is the reduced on-chip SRAM size. The quad offers 256 kB of on-chip RAM we can use for bootloader initialization, but the dual only offers 128 kB. We actually get less than that due to page tables and the like, so in the end we have around 68 kB to work with. Our previous U-Boot image was 190 kB, so there was a long way to go to trim down the image.
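To put numbers on that gap, here is a tiny sketch of the budget arithmetic using only the figures quoted above. The macro and function names are made up for illustration and do not come from U-Boot.

```c
#include <stdbool.h>

/* Figures quoted in the text, in kB. Names are hypothetical. */
#define SRAM_USABLE_KB   68u   /* roughly what is left of the dual's 128 kB */
#define UBOOT_FULL_KB   190u   /* size of the previous full U-Boot image */

/* True if an image of image_kb fits in budget_kb of on-chip SRAM. */
static bool fits_in_sram(unsigned image_kb, unsigned budget_kb)
{
    return image_kb <= budget_kb;
}

/* kB that must be trimmed before the image fits (0 if it already fits). */
static unsigned overshoot_kb(unsigned image_kb, unsigned budget_kb)
{
    return image_kb > budget_kb ? image_kb - budget_kb : 0u;
}
```

With these figures, overshoot_kb(UBOOT_FULL_KB, SRAM_USABLE_KB) comes to 122, so nearly two thirds of the old image would have to go before it could run from SRAM.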
DDR3 is very sensitive when it comes to features such as circuit trace length and impedance, and these values must be calibrated for each board and chip combination. Most other i.MX6 platforms, and indeed most embedded ARM platforms, have a fixed amount of DDR3 RAM soldered onto the board with fixed timings. In these situations, each board is identical, so they use a script to set up the DDR3 timing registers before their bootloader gets loaded. Fixed values get blasted in by a "pokefile", so there's no timing variance to worry about. Since Novena allows a user to swap in any DDR3 SO-DIMM with any vaguely conforming impedance or trace length, we must perform a DDR3 calibration at boot time.
The solution is in the form of the U-Boot Secondary Program Loader, or SPL. The SPL is a stripped-down environment that allows limited access to many drivers, and is designed to allow the developer to set up enough of the system to load a full bootloader. The i.MX6 SPL contains I2C, MMC, and FAT drivers, which is exactly what we need to load a secondary bootloader (or even a full Linux kernel). The compiled SPL image clocks in at 48 kB, which easily fits within even our smallest budget and allows us to run entirely out of on-chip SRAM. It's weird booting a laptop without any RAM installed, but it actually works.
The U-Boot SPL calibrates DDR3, then loads a file called "bootloader" off of the internal MMC card to DDR3 memory and jumps to it. This is a perfect setup, because the U-Boot SPL can hide in the first 48 kB of the SD card, and the actual bootloader can reside on a Windows-readable FAT partition. Fantastic.
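As a rough sketch of that sequence (calibrate, load, jump), here is a host-runnable model. This is not the actual U-Boot SPL code: struct spl_ops, spl_boot(), and the fake_* stubs are all hypothetical stand-ins, and the "DDR3" is just a small buffer. Only the file name "bootloader" comes from the description above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical hooks standing in for the real drivers. */
struct spl_ops {
    int  (*calibrate_ddr3)(void);                           /* 0 on success */
    int  (*load_file)(const char *name, void *dest, size_t max); /* bytes read, or -1 */
    void (*jump)(void *entry);
};

/* Run the three steps in order, stopping at the first failure. */
static int spl_boot(const struct spl_ops *ops, void *ddr_base, size_t ddr_size)
{
    if (ops->calibrate_ddr3() != 0)
        return -1;                 /* no usable DDR3, nowhere to load to */
    if (ops->load_file("bootloader", ddr_base, ddr_size) <= 0)
        return -2;                 /* file missing or too large */
    ops->jump(ddr_base);           /* on real hardware this would not return */
    return 0;
}

/* --- Fake stubs that only record the order in which they were called --- */
static char   trace[8];
static size_t trace_len;

static void record(char c)
{
    if (trace_len < sizeof trace - 1)
        trace[trace_len++] = c;
}

static int fake_calibrate(void) { record('C'); return 0; }

static int fake_load(const char *name, void *dest, size_t max)
{
    static const char payload[] = "stage2";
    record('L');
    if (strcmp(name, "bootloader") != 0 || max < sizeof payload)
        return -1;
    memcpy(dest, payload, sizeof payload);
    return (int)sizeof payload;
}

static void fake_jump(void *entry) { (void)entry; record('J'); }

/* Runs the fake boot once and returns the recorded step order. */
static const char *demo_boot_order(void)
{
    static char fake_ddr[32];
    const struct spl_ops ops = { fake_calibrate, fake_load, fake_jump };

    trace_len = 0;
    memset(trace, 0, sizeof trace);
    spl_boot(&ops, fake_ddr, sizeof fake_ddr);
    return trace;
}
```

The ordering is the important part: nothing can be loaded into DDR3 until calibration has succeeded, which is why everything up to that point has to run from on-chip SRAM.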
There were a bunch of problems with setting up this situation, but perhaps the most bizarre was one in which my variables weren't getting set properly. In normal programming, you expect this code to work:
    #include <stdio.h>

    int i = 64;

    void f(void) {
        printf("%d", i); // prints 64
    }
Except that wasn't what was happening. Instead of printing 64, it would print 74298175, or 480228053, or some other seemingly random value. Worse, the bad values were showing up in the dlmalloc() code, so malloc was failing even though it had plenty of memory available.
The solution I came up with was to re-initialize all of the malloc variables when U-Boot initializes dlmalloc. I zeroed out values that should be zero, and set up boundary values that needed to be set. For some reason, malloc had been trying to align memory on byte boundaries of several million, rather than on boundaries of 8. I don't quite understand why the re-initialization is necessary. My best guess is that it has something to do with one value overlapping another in the BSS, which might not be getting zeroed properly. But I don't know for certain.
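To illustrate the shape of that workaround, here is a minimal sketch built around a toy bump allocator rather than dlmalloc itself. The struct heap_state layout, heap_init(), heap_alloc(), and the pool buffer are all hypothetical. The only point is that the init routine sets every field explicitly instead of trusting static initializers or a zeroed BSS.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical allocator state; NOT dlmalloc's real layout. */
struct heap_state {
    char  *start;      /* first byte of the heap */
    char  *brk;        /* next free byte */
    char  *end;        /* one past the last byte */
    size_t alignment;  /* expected to be 8 */
};

static struct heap_state heap;
static unsigned char pool[64];  /* stand-in for the real heap region */

/* Explicitly set every field, assuming nothing about its prior contents. */
static void heap_init(void *base, size_t size)
{
    memset(&heap, 0, sizeof heap);   /* zero what should be zero */
    heap.start     = base;           /* then set the boundary values */
    heap.brk       = base;
    heap.end       = (char *)base + size;
    heap.alignment = 8;
}

/* Bump allocation; sizes are rounded up to a multiple of the alignment. */
static void *heap_alloc(size_t n)
{
    size_t a = heap.alignment;

    /* A garbage alignment (zero or not a power of two) means bad state. */
    if (a == 0 || (a & (a - 1)) != 0)
        return NULL;

    size_t rounded = (n + a - 1) & ~(a - 1);
    if (rounded < n || rounded > (size_t)(heap.end - heap.brk))
        return NULL;                 /* overflow or out of memory */

    void *p = heap.brk;
    heap.brk += rounded;
    return p;
}
```

If heap.alignment holds a garbage value such as 74298175, heap_alloc() refuses every request no matter how much memory is free. After heap_init(pool, sizeof pool), allocations succeed and land at offsets that are multiples of 8 from the start of the pool.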
By re-initializing the malloc parameters, I finally got U-Boot's SPL to work reliably. Now I have the dual-lite version booting Linux, and I'm ready to start working on speeding up boot.