ESP32-CAM Monitor: DIY Auto Flash for Dark Scenes
Why
I built a surveillance camera with ESP32-S3 before, and it worked well. Later, while rummaging through a drawer, I found an AI-Thinker ESP32-CAM development board — that classic board costing about ten bucks with a built-in OV2640 camera and TF card slot. No reason to let it go to waste, so I built another one: ai-thinker-esp32-cam.
This time I wrote the firmware from scratch using ESP-IDF again, with similar capabilities to the previous project but with lots of adaptations for the AI-Thinker board. Here’s what it ended up doing:
- Real-time MJPEG stream viewable in browser by opening the device IP
- Motion detection with auto photo capture
- Auto flash for photo capture in dark environments
- Save photos to TF card, with WebDAV upload to NAS
- Web management interface, configurable from phone
- Prometheus /metrics endpoint for monitoring integration
- WiFi AP/STA dual mode, phone-based network config on first boot
The features aren’t complex, but the flash logic took me several days. The reason is simple: I was too lazy to add a photosensor. Saved a few wires, but wrote a few hundred extra lines of code.
Flash Logic: The Price of Skipping a Sensor
The requirement was one sentence; the implementation took days
Automatically turn on the flash when it gets dark — that’s the whole requirement. Adding a photoresistor to an ADC pin and reading a voltage value would take about half an hour. But the AI-Thinker ESP32-CAM has tight GPIO availability — the flash uses GPIO4, TF card uses GPIO2/14/15, the camera has a bunch of pins, and there aren’t many left. Adding another sensor would require a breadboard, Dupont wires, a voltage divider resistor… just thinking about it felt like too much work.
So why not use the camera itself to sense brightness? The camera is a light sensor, after all — bright scene means bright image, dark scene means dark image. The catch is that the camera is always outputting in JPEG mode (for MJPEG streaming and motion detection), so I needed to determine ambient brightness without disrupting normal operation.
Reading OV2640 exposure registers — didn’t work
My first thought was to read the OV2640’s AEC (Auto Exposure Control) registers. In theory, the exposure value reflects ambient brightness: darker scenes mean higher exposure, zero-latency zero-overhead reading. Perfect.
In practice, this register was completely unreliable in continuous JPEG output mode — the aec_value stayed locked at the max value of 671 regardless of actual lighting conditions. I checked the OV2640 datasheet but found no clear explanation. It might be a bug in the AEC feedback loop under continuous output mode, or just a “special feature” of this sensor. Either way, this approach was a dead end.
Grayscale pixel sampling: using the camera as a photosensor
Since reading registers didn’t work, let’s read pixels directly. The approach is straightforward:
- Do brightness probing every 30 seconds
- Temporarily switch the camera from JPEG mode to grayscale mode (GRAYSCALE + QQVGA 160×120)
- Grab one frame, iterate through all pixels, calculate average brightness
- Switch back to JPEG mode and continue
QQVGA 160×120 totals 19,200 pixels — iterating through them takes a few hundred microseconds, negligible performance impact. In grayscale mode, each pixel is one byte (0~255), just sum and average, simple and crude.
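The averaging step can be sketched in plain C as below. This is a minimal illustration rather than the project's actual code: `mean_brightness` is a hypothetical name, and the buffer is assumed to be the raw 8-bit frame returned after switching to grayscale QQVGA.

```c
#include <stddef.h>
#include <stdint.h>

/* Average brightness (0..255) of an 8-bit grayscale frame.
 * For a QQVGA probe, `len` would be 160 * 120 = 19200 bytes.
 * Hypothetical helper, not taken from the project source. */
uint8_t mean_brightness(const uint8_t *pixels, size_t len)
{
    if (pixels == NULL || len == 0) {
        return 0; /* assumption: treat a missing frame as fully dark */
    }

    uint64_t sum = 0; /* wide accumulator so larger frames cannot overflow */
    for (size_t i = 0; i < len; i++) {
        sum += pixels[i];
    }
    return (uint8_t)(sum / len); /* integer division rounds down */
}
```

On the device, the caller would hand in the frame buffer's data pointer and length, then return the buffer and switch the sensor back to JPEG.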
However, there’s a prerequisite: if someone is viewing the MJPEG stream in a browser, you can’t do grayscale probing — switching modes would break the MJPEG stream. In that case, we fall back to the alternative below.
JPEG size as brightness proxy: a zero-cost hack
When someone is watching the stream, you can’t switch modes, but you still need to determine brightness. What to do? The answer is hidden in every JPEG frame: JPEG file size itself is a brightness indicator.
The principle is simple. In dark scenes, most of the frame is black, and large areas of uniform color compress very efficiently in JPEG, so the files come out small. A brighter scene shows more detail, so its files are larger. I tested at SVGA resolution with quality=10 and got these numbers:
- Dark (lights off): 12~14 KB
- Normal indoor: 14~17 KB
- Bright (facing window): 17~25 KB
Based on these numbers, I built a linear mapping from JPEG frame size to an estimated brightness value.
Less accurate than grayscale sampling, but with zero additional overhead — motion detection already grabs frames for frame-differencing, and JPEG size is obtained as a side effect without doing anything extra.
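A linear mapping of this kind might look like the sketch below. The endpoints are assumptions picked from the measurements above (12 KB treated as fully dark, 25 KB as fully bright), and the 0 to 255 output scale and the function name are hypothetical. The real firmware may use different constants.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed endpoints, taken from the SVGA / quality=10 measurements. */
#define JPEG_DARK_BYTES   ((size_t)12 * 1024) /* at or below: brightness 0   */
#define JPEG_BRIGHT_BYTES ((size_t)25 * 1024) /* at or above: brightness 255 */

/* Map a JPEG frame size to a rough brightness estimate (0..255),
 * interpolating linearly between the two endpoints and clamping outside. */
uint8_t brightness_from_jpeg_size(size_t jpeg_len)
{
    if (jpeg_len <= JPEG_DARK_BYTES) {
        return 0;
    }
    if (jpeg_len >= JPEG_BRIGHT_BYTES) {
        return 255;
    }
    size_t span = JPEG_BRIGHT_BYTES - JPEG_DARK_BYTES;
    return (uint8_t)(((jpeg_len - JPEG_DARK_BYTES) * 255u) / span);
}
```

Because the constants depend on resolution and JPEG quality, they would need re-measuring if either setting changes.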
The relationship between the two approaches:
- Grayscale probing (method=2): Accurate, but requires mode switching, not usable when someone is viewing the stream
- JPEG size estimation (method=1): Moderate accuracy, but always available with zero extra cost
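The choice between the two methods can be sketched as a small pure function. The method numbers 1 and 2 come from the list above. The function names, the millisecond timestamps, and the `probe_valid` flag are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define PROBE_MAX_AGE_MS 30000u /* grayscale result is trusted for 30 s */

typedef enum {
    BRIGHTNESS_FROM_JPEG_SIZE = 1, /* always available, moderate accuracy */
    BRIGHTNESS_FROM_GRAYSCALE = 2  /* accurate, needs a mode switch       */
} brightness_method_t;

/* Pick the brightness source for the current frame.
 * Timestamps are milliseconds from a monotonic clock (assumption). */
brightness_method_t pick_brightness_method(uint32_t now_ms,
                                           uint32_t last_probe_ms,
                                           bool probe_valid,
                                           unsigned mjpeg_clients)
{
    /* Unsigned subtraction stays correct across counter wraparound. */
    bool fresh = probe_valid &&
                 (uint32_t)(now_ms - last_probe_ms) <= PROBE_MAX_AGE_MS;

    if (fresh && mjpeg_clients == 0) {
        return BRIGHTNESS_FROM_GRAYSCALE;
    }
    return BRIGHTNESS_FROM_JPEG_SIZE;
}

/* A scene counts as dark when the estimate is below the threshold. */
bool is_dark_scene(uint8_t brightness, uint8_t threshold)
{
    return brightness < threshold;
}
```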
flowchart TD
A["Per-frame motion detection"] --> B@{shape: diam, label: "Grayscale probing available?"}
B -->|"Probed within 30s & no MJPEG clients"| C["Use grayscale result"]
B -->|"MJPEG client online or timeout"| D["Estimate via JPEG size"]
C --> E@{shape: diam, label: "Brightness < threshold?"}
D --> E
E -->|"Yes"| F["Mark as dark scene"]
E -->|"No"| G["Mark as bright scene"]
F --> H["Flash ON when motion detected"]
G --> I["Normal photo capture"]
classDef decision fill:#fff3e0,stroke:#ff9800,stroke-width:2px
classDef process fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
classDef result fill:#e8f5e9,stroke:#4caf50,stroke-width:2px
class B,E decision
class A,C,D,H,I process
class F,G result
Flash control: 80% duty cycle to be safe
The AI-Thinker ESP32-CAM flash is connected to GPIO4. There’s a hardware trap here: the board has no current-limiting resistor for the flash. The LED is directly connected to the GPIO. Running at full power could exceed current limits and potentially damage the board over time.
So I drive the flash with LEDC PWM and cap the duty cycle at about 80% (205 out of 255 at 8-bit resolution) to keep a safety margin.
Note that the Timer and Channel can’t be chosen arbitrarily. The camera XCLK uses Timer 0 / Channel 0, so the flash must use Timer 1 / Channel 1, otherwise the two peripherals will conflict. This kind of trap isn’t mentioned in the ESP-IDF documentation — you only find out when things don’t work.
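A setup along these lines might look like the following sketch. The 205 cap, GPIO4, and Timer 1 / Channel 1 come from the text above. The 5 kHz PWM frequency, the low-speed mode, and the function names are assumptions. The ESP-IDF part is guarded so the clamping helper also compiles on a host machine.

```c
#include <stdint.h>

#define FLASH_GPIO     4
#define FLASH_DUTY_MAX 205u /* about 80% of the 8-bit range (0..255) */

/* Clamp any requested duty to the safety cap. */
uint32_t flash_clamp_duty(uint32_t requested)
{
    return requested > FLASH_DUTY_MAX ? FLASH_DUTY_MAX : requested;
}

#ifdef ESP_PLATFORM /* defined by the ESP-IDF build system */
#include "driver/ledc.h"
#include "esp_err.h"

/* Configure Timer 1 / Channel 1 so the flash stays clear of the
 * camera XCLK on Timer 0 / Channel 0. */
esp_err_t flash_init(void)
{
    ledc_timer_config_t timer = {
        .speed_mode      = LEDC_LOW_SPEED_MODE, /* assumption */
        .duty_resolution = LEDC_TIMER_8_BIT,
        .timer_num       = LEDC_TIMER_1,
        .freq_hz         = 5000,                /* assumption */
        .clk_cfg         = LEDC_AUTO_CLK,
    };
    esp_err_t err = ledc_timer_config(&timer);
    if (err != ESP_OK) {
        return err;
    }

    ledc_channel_config_t channel = {
        .gpio_num   = FLASH_GPIO,
        .speed_mode = LEDC_LOW_SPEED_MODE,
        .channel    = LEDC_CHANNEL_1,
        .intr_type  = LEDC_INTR_DISABLE,
        .timer_sel  = LEDC_TIMER_1,
        .duty       = 0, /* start with the flash off */
        .hpoint     = 0,
    };
    return ledc_channel_config(&channel);
}

/* Apply a duty value, never exceeding the cap. */
esp_err_t flash_set(uint32_t duty)
{
    esp_err_t err = ledc_set_duty(LEDC_LOW_SPEED_MODE, LEDC_CHANNEL_1,
                                  flash_clamp_duty(duty));
    if (err != ESP_OK) {
        return err;
    }
    return ledc_update_duty(LEDC_LOW_SPEED_MODE, LEDC_CHANNEL_1);
}
#endif /* ESP_PLATFORM */
```

Routing every write through `flash_clamp_duty` means a stray full-power request elsewhere in the firmware still ends up at 205.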
Flash only turns on during photo capture, never during detection
This design decision is crucial. The flash only turns on when frame-differencing detects motion and a photo needs to be saved. The flash is never on during frame-differencing comparison.
The reason is obvious: turning the flash on and off causes dramatic brightness changes between frames. The frame-differencing algorithm would see the entire scene “moving” and report 80%+ differences, all false alarms.
After turning on the flash, I wait 200ms before capturing. Right after the flash turns on, the white balance hasn’t caught up yet — colors would be off. Waiting lets the OV2640’s auto white balance stabilize for a usable photo.
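The capture sequence can be sketched as a small planner that lists the steps in order, which keeps the logic testable off the device. All names here are hypothetical. On the ESP32, the wait step would presumably be a FreeRTOS delay such as `vTaskDelay(pdMS_TO_TICKS(200))`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLASH_SETTLE_MS 200u /* time for auto white balance to settle */

typedef enum {
    STEP_FLASH_ON,
    STEP_WAIT,
    STEP_CAPTURE,
    STEP_FLASH_OFF
} step_kind_t;

typedef struct {
    step_kind_t kind;
    uint32_t    wait_ms; /* only meaningful for STEP_WAIT */
} capture_step_t;

/* Fill `steps` with the actions for one photo after motion was detected.
 * Returns the number of steps written (at most 4).
 * Frame-differencing itself never goes through this path, so the flash
 * stays off while frames are being compared. */
size_t plan_motion_capture(bool dark_scene, capture_step_t steps[4])
{
    size_t n = 0;

    if (dark_scene) {
        steps[n++] = (capture_step_t){ STEP_FLASH_ON, 0 };
        steps[n++] = (capture_step_t){ STEP_WAIT, FLASH_SETTLE_MS };
    }
    steps[n++] = (capture_step_t){ STEP_CAPTURE, 0 };
    if (dark_scene) {
        steps[n++] = (capture_step_t){ STEP_FLASH_OFF, 0 };
    }
    return n;
}
```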
Auto-lowering motion detection threshold in dark scenes
One more detail: in dark scenes, JPEG frame-differencing values are naturally lower (darker scenes have fewer details, less inter-frame difference after compression). Using normal thresholds might miss motion events. So in dark scenes, the threshold is automatically reduced to one quarter of its normal value.
No need to elaborate — a moment’s thought explains why.
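For completeness, the one-quarter rule might be written as below. The function names and the use of an unsigned integer difference score are assumptions, since the units of the frame-differencing value are not specified here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Threshold actually applied to the frame-difference score.
 * In dark scenes it drops to one quarter (integer division). */
uint32_t effective_motion_threshold(uint32_t normal_threshold, bool dark_scene)
{
    return dark_scene ? normal_threshold / 4u : normal_threshold;
}

/* Motion is reported when the score reaches the effective threshold. */
bool motion_detected(uint32_t frame_diff, uint32_t normal_threshold,
                     bool dark_scene)
{
    return frame_diff >= effective_motion_threshold(normal_threshold,
                                                    dark_scene);
}
```

With a normal threshold of 80, a difference score of 25 is ignored in a bright scene but counts as motion in a dark one.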
Other Features
The flash consumed the most development time. The rest is straightforward, just a listing:
- MJPEG stream: Double buffering + PSRAM, viewable directly in browser
- WiFi management: AP/STA dual mode, AP mode for first-time network config, then auto STA
- TF card: GPIO14 is shared with camera — must mount TF card after camera init, wrong order causes crashes
- NAS upload: WebDAV/HTTP POST, auto-upload after capture
- Web interface: Configure parameters, check status, download photos — all from phone
- Prometheus metrics: /metrics endpoint, feeds into a Grafana dashboard
Closing Thoughts
Looking back, soldering a photoresistor would have taken half an hour. But using the camera itself for brightness detection forced me to thoroughly understand OV2640 mode switching, JPEG compression characteristics, LEDC PWM resource allocation, and other details. Was it worth it? Hard to say, but the tinkering process was certainly interesting.
All code is on GitHub: Mi-Bee-Studio/ai-thinker-esp32-cam, with build and flash instructions in the README.