E-Paper Calendar V2: The Rabbit Hole of AI Hardware Integration

This article was migrated from Medium and translated by Gemini 2.5 Pro.


(This post was generated by Gemini 2.5 Pro, with personal revisions.)

In the last post, we finally validated the V2.1 hardware design and got the sleep current down to 70 µA. That means the hardware has “graduated” and doesn’t need another revision!

So, I joyfully returned to the world of software, ready to start integrating the main program. My plan was as follows:

  1. Use a State Machine to manage the main flow, making the code structure clearer.
  2. Start with an empty state machine skeleton (e.g., INIT -> IDLE -> SLEEP).
  3. Add features one by one, like integrating hw_init first, then wifi_connect… This avoids letting the AI leap too far ahead, which makes debugging difficult.
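
The empty skeleton from step 2 might look roughly like the sketch below. It is a host-side illustration, not the project's actual code: only the INIT, IDLE and SLEEP names come from the plan above, while `app_state_t`, `run_state()` and `run_until_sleep()` are hypothetical names chosen for this example.

```c
#include <stdio.h>

/* Only INIT, IDLE and SLEEP come from the post; the type name is assumed. */
typedef enum {
    STATE_INIT,
    STATE_IDLE,
    STATE_SLEEP,
} app_state_t;

/* Returns a printable name so every transition can be logged. */
const char *state_name(app_state_t s)
{
    switch (s) {
    case STATE_INIT:  return "INIT";
    case STATE_IDLE:  return "IDLE";
    case STATE_SLEEP: return "SLEEP";
    }
    return "UNKNOWN";
}

/* Runs one state and returns the next one. The handlers are empty on
 * purpose: features such as hw_init or wifi_connect are added one by one. */
app_state_t run_state(app_state_t s)
{
    switch (s) {
    case STATE_INIT:  return STATE_IDLE;   /* later: hardware init goes here */
    case STATE_IDLE:  return STATE_SLEEP;  /* later: real work goes here */
    case STATE_SLEEP: return STATE_SLEEP;  /* terminal state in this sketch */
    }
    return STATE_SLEEP;
}

/* Drives the machine from INIT until it reaches SLEEP and returns the
 * number of transitions taken. */
int run_until_sleep(void)
{
    app_state_t s = STATE_INIT;
    int steps = 0;
    while (s != STATE_SLEEP) {
        app_state_t next = run_state(s);
        printf("%s -> %s\n", state_name(s), state_name(next));
        s = next;
        steps++;
    }
    return steps;
}
```

Logging every transition is a cheap habit here: when a later state hangs silently, the last printed transition at least shows which handler was entered.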

I naively thought that since the hardware was fully verified, the integration would be easy street.

Turns out, this street is full of traps.

Trap 1: The Silent STATE_HW_INIT

I started with the STATE_HW_INIT state. I asked gemini-cli to help me integrate all the hardware initialization code (GPIO, I2C, SPI) from my old hw_selftest project.

Code generated, compiled, flashed… and then, the program hung.

No error, no panic, not even a single line of log output. It just ran through the boot-up logs, entered app_main, and then vanished into the abyss.

At first, I thought the AI was being dumb again. Did it write some bad logic? I started simplifying. I kept only the GPIO initialization: it worked. I kept only the I2C initialization: it also worked. But as soon as spi_init() was added, the program hung.

This was weird. In my hw_selftest project, both I2C and SPI had passed their tests!

After spending a lot of time simplifying and retrying, I still couldn’t find the problem. I eventually turned to the web version of Gemini to discuss the symptoms. It asked: “Could this be a hardware resource conflict?”

Following that line of thought, I started commenting/uncommenting things one by one, and finally discovered the devil in the details: If SPI is initialized before I2C, the I2C driver hangs during its own initialization!

Looking back, I had an epiphany. My hw_selftest project, to keep tests isolated, worked like this:

[Code block 1: hw_selftest flow]

Because I2C and SPI basically never existed at the same time in the selftest, this conflict was never discovered. But in the main program, I needed them both initialized, and this hidden bug exploded.
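
To make the difference concrete, here is a small host-side model of the two flows. It is not the original hw_selftest code: every function is a stub, `i2c_init()`, `i2c_deinit()`, `spi_deinit()`, `selftest_flow()` and `main_flow()` are hypothetical names, and the `max_active` counter exists only to show how many buses were ever up at the same time.

```c
#include <stdbool.h>
#include <stdio.h>

/* Host-side stand-ins for the real drivers; nothing here touches hardware. */
static bool i2c_up = false;
static bool spi_up = false;
int max_active = 0;   /* most buses ever initialized at the same time */

/* Records the peak number of simultaneously initialized buses. */
static void note_active(void)
{
    int active = (i2c_up ? 1 : 0) + (spi_up ? 1 : 0);
    if (active > max_active) {
        max_active = active;
    }
}

void i2c_init(void)   { i2c_up = true;  note_active(); }
void i2c_deinit(void) { i2c_up = false; }
void spi_init(void)   { spi_up = true;  note_active(); }
void spi_deinit(void) { spi_up = false; }

/* Selftest style: each bus is brought up, tested and torn down alone. */
void selftest_flow(void)
{
    i2c_init();
    puts("test I2C");
    i2c_deinit();

    spi_init();
    puts("test SPI");
    spi_deinit();
}

/* Main-program style: both buses stay initialized together. */
void main_flow(void)
{
    spi_init();
    i2c_init();
    puts("both buses initialized");
}
```

In this model `selftest_flow()` never pushes `max_active` above 1, while `main_flow()` raises it to 2, which is exactly the situation the selftest never exercised.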

Trap 2: Wi-Fi “Hardware Conflict” Déjà Vu

After finally solving the hw_init problem, I moved on to the next state: STATE_SYNC_TIME. This state’s job is to connect to Wi-Fi, then access an IP geolocation service (ip-api.com) to automatically get the timezone.

And… the exact same nightmare happened again.

The program would get to the HTTP request, hang right there, give no log output, and eventually be rebooted by the Watchdog Timer (WDT).

Again, I spent hours simplifying the task and tweaking the code, but it still hung at the same spot.

This time, I was smarter. I created a brand new, clean project and copied the Wi-Fi and timezone-fetching code into it verbatim to test.

The result: No problem at all. Wi-Fi connected, and it successfully fetched the JSON data.

Now I was certain. I took this 100% working code and ported it back into my main project… and it hung again.

That feeling of dread washed over me. I remembered the earlier “hardware conflict” scare. I went back to my STATE_HW_INIT state, commented out the spi_init() line…

And sure enough, the Wi-Fi and HTTP request magically started working.

Reflection: AI’s Limits vs. Human Experience

These two painful debugging experiences gave me a new understanding of using AI in low-level development.

  1. AI Can’t Read “Silence”: When an ESP32 has a hardware conflict or resource deadlock (like an I2C bus being pulled low, causing a while(1) wait), it doesn’t print any logs. Lacking any clues, the AI can’t identify this as a “hardware conflict.” It just keeps guessing that some software logic is wrong.
  2. AI Lacks “Debugging Experience”: When the program hangs and eventually triggers the WDT, the AI easily goes down the wrong path. It sees the WDT log and keeps making suggestions in the wrong direction, like “Why did the WDT trigger?” or “You need to feed the watchdog,” which wasted a lot of my time.

In this scenario, you still have to rely on human experience. That intuition to “test in a new project” or “try commenting out spi_init” is something AI can’t provide right now. AI can help us write 80% of the code, but that last, most difficult 20% of bugs still has to be solved by a human brain.

New Strategy and Conclusion

After these two shocks, I decided to adopt a new software strategy: The SPI bus will remain in a “de-initialized” state by default.

I will only call init_spi() in the STATE_DISPLAY_UPDATE state when I truly need to update the screen. Immediately after the update, I will call deinit_spi() to release the resources. This will minimize the possibility of it conflicting with the I2C or Wi-Fi modules.
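
A minimal sketch of that pattern, assuming stubbed drivers: `init_spi()` and `deinit_spi()` are the names used above, but their bodies here are placeholders, and `epd_draw_frame()`, `state_display_update()`, `spi_is_ready()` and `spi_init_count` are hypothetical names added for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Stubbed SPI state; the real functions would call the SPI driver. */
static bool spi_ready = false;
int spi_init_count = 0;   /* how many times the bus was brought up */

void init_spi(void)   { spi_ready = true; spi_init_count++; }
void deinit_spi(void) { spi_ready = false; }

bool spi_is_ready(void) { return spi_ready; }

/* Hypothetical display call: it must only run while the bus is up. */
void epd_draw_frame(void)
{
    assert(spi_ready);
    puts("frame pushed to panel");
}

/* Handler for the display-update state: the SPI bus exists only for the
 * duration of the update and is released immediately afterwards. */
void state_display_update(void)
{
    init_spi();
    epd_draw_frame();
    deinit_spi();
}
```

Outside `state_display_update()` the bus is always de-initialized in this sketch, so code running in other states never sees SPI active. Whether that fully avoids the conflict on real hardware still has to be verified on the board.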

The pitfalls of hardware integration are far deeper than I imagined. I hope that after solving these two major conflicts, the rest of the integration work can go a little more smoothly!

This post is licensed under CC BY 4.0 by the author.