Zephyr Acquisition Framework

Status Quo

Most of the firmware I design is for acquisition systems that gather data from some number of ADCs and/or integrated I2C/SPI sensors and send it to a host system through some PHY (USB, BLE, etc.) so the user can see and record data in real time. Each device has its own firmware project, and little is shared between them because the code is highly specialized for each microcontroller and is tested by hand. Testing by hand means it is safer to keep code completely separate so that changes in one project don't break another. Some code can be copied from project to project as long as the same microcontroller is used.

The interface can vary from device to device and consists of enums for command indexing and structs for data transport. Larger structs are used to configure the device and rely on the C++ acquisition software laying out structs in memory exactly as the firmware does, which is a bad assumption because padding and alignment can differ between compilers and targets.

The acquisition software "knows" what the device can do because each device has its own class with hardcoded values about the hardware and channel mapping. This has the advantage of keeping devices somewhat isolated in the acquisition software, but it means every firmware change has to be mirrored in the acquisition software. If different firmware revisions change or expand channel functionality, the software must check the firmware revision and carry hardcoded decisions for each one.

Ideals

The status quo introduces some challenges that I aim to mitigate through a common framework.

Testability

When a bug is discovered in firmware, it is fixed and tested for each device by hand. If a bug is discovered while developing a new project, you have to remember to check older projects that use the same or similar code. Combing through many different projects is tedious and distracting when you are trying to focus on problems within the current project.

Any code that is common between firmwares has to be kept quite simple to limit the risk that a bug fixed for one firmware causes issues in another. Since testing is done by hand, bugs are only discovered if I carefully test each firmware that uses the modified shared code. The acquisition core can be separated into common code, but its reliance on hardware-specific peripherals requires a layer of abstraction.

Application logic can be tested with unit tests, but they would need to be maintained for each project. Zephyr provides both the hardware abstraction layer and a test runner (Twister) that discovers and runs all tests within the acquisition module.

I add every device that uses my Zephyr framework to my test bench, which is just a Linux server with a lot of USB hubs that are supported by uhubctl. Each device has a modified bootloader that always enters DFU mode when the device is reset, so at the start of every test I simply cycle the power with uhubctl, flash the test app with dfu-util, and run the test, all of which is easily scripted. The USB port is set up as a composite device so I can easily associate the serial device in Linux with its hardware and provide useful logs for debugging test failures.
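The power-cycle and flash steps can be sketched as a small host-side script. This is a minimal sketch, not my actual bench code: the function names, the hub location, the port number, the serial number, and the image path are all hypothetical. It assumes each device is selected by its USB serial number and only uses the documented uhubctl options (-l, -p, -a) and dfu-util options (-S, -D).

```python
import subprocess


def bench_commands(hub_location, hub_port, dfu_serial, image_path):
    """Return the commands for one power-cycle-and-flash pass (hypothetical helper)."""
    return [
        # Cycle power on one hub port so the bootloader comes up in DFU mode.
        ["uhubctl", "-l", hub_location, "-p", str(hub_port), "-a", "cycle"],
        # Download the test image to the DFU device with the matching serial number.
        ["dfu-util", "-S", dfu_serial, "-D", image_path],
    ]


def flash_device(hub_location, hub_port, dfu_serial, image_path, run=subprocess.run):
    """Run each step in order; a non-zero exit, including a DFU failure, raises."""
    for cmd in bench_commands(hub_location, hub_port, dfu_serial, image_path):
        run(cmd, check=True)
```

Because check=True turns any failing step into an exception, a DFU failure stops the run before the test itself starts. A real script would likely also wait for the device to re-enumerate between the two commands.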

The core of my acquisition test is the same for each device, but the specific functionality to test is determined by the devicetree. For this purpose I've created test fixture nodes such as a loopback node where a DAC feeds an ADC input, so the entire test runs on the device without any hardware dependencies; everything goes through USB and is managed by the microcontroller. This keeps all documentation about the test jig right next to the firmware.

Implementing DFU from day one is a necessity, and it needs to be very thoroughly tested. My strategy is to keep the bootloader immutable so I always have a recovery path for the device, which requires the bootloader to be simple and the most heavily tested component of the system. Having the test flash devices via DFU simplifies the test setup but also exercises the DFU flow for every test. A DFU failure is a test failure.

Portability

I work with a number of microcontrollers from Nordic and Nuvoton including the nRF52 and M480 respectively. The framework needs to be portable between these microcontrollers.

Scalability

Modular, well-tested code enables quicker deployment and adaptability. Portability also plays a role in how well the application can scale to changes in hardware requirements.

Implementation

There are many different approaches to solving the problems I've identified above. Zephyr suited my needs nicely and provides the devicetree, a great tool for abstracting hardware that I use to map hardware to my acquisition framework. The build system (CMake and Kconfig, driven by west) lets me implement my requirements efficiently through feature enablement (Kconfig) and conditional inclusion through preprocessor directives, which is awesome for keeping binaries trim with no runtime cost.

Device Interface

Enforcing a common interface for every device, built on an extensible serialization scheme such as protobuf, simplifies the testing approach. It also improves readability: tests can describe behavior in stages rather than duplicating common steps just to exercise the specific functionality of each device.

Zephyr provides integration with nanopb which made it easy to adopt for serialization/deserialization of interface packets (messages for acquisition control, setup, etc.). It allows me to efficiently pack data and statically allocate variably-sized struct members that change depending on the device's capability (such as acquisition channel descriptors).

Acquisition Mapping

I decided to map acquisition channels to hardware resources using the devicetree. I considered writing my own YAML parser for this but decided I could get what I needed more readily with the devicetree. There are some challenges, because everything in the devicetree has to be stated explicitly, which is not always ideal for my application. For example, I want to create channel nodes to map hardware resources where the position of each channel node determines its ID. This results in somewhat complex and unintuitive macro expansions to get the index of each channel node while accounting for the node being enabled or disabled through its 'status' property. Alternatively, I could just set the channel index manually, but I wanted to keep the acquisition map as clutter-free as possible so it could succinctly describe the system while avoiding indexing errors that aren't always easy to detect. If I set the channel ID explicitly for each node and wanted to use a devicetree overlay for different device configurations, I would have to either copy the entire map into each overlay file or manually compare each overlay with the base map to make sure I wasn't missing or redefining channel IDs.
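The indexing rule is easier to see outside of macro expansions. The sketch below models it in Python rather than with devicetree macros, and it assumes that a disabled node does not consume an ID. The function and channel names are hypothetical.

```python
def assign_channel_ids(nodes):
    """Map node name -> channel ID by position, skipping disabled nodes.

    `nodes` is a list of (name, status) pairs in devicetree order.
    """
    ids = {}
    for name, status in nodes:
        if status == "okay":
            ids[name] = len(ids)  # next free index
    return ids


# Hypothetical base map with three enabled channels.
base = [("ch_temp", "okay"), ("ch_flow", "okay"), ("ch_pressure", "okay")]
# A hypothetical overlay disables one node; later IDs shift down on their own.
overlay = [("ch_temp", "okay"), ("ch_flow", "disabled"), ("ch_pressure", "okay")]

print(assign_channel_ids(base))
print(assign_channel_ids(overlay))
```

With position-based IDs, the overlay only has to flip one 'status' property. No channel IDs need to be restated, so none can be missed or duplicated.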

Because the acquisition map lives in a single location (the build system merges any overlays into a single acquisition map), I can easily generate an acquisition descriptor map that I send back to the host system to define the capability and channel mapping of the device.

Node Interpretation

There are a few different base types of node that I want to encapsulate in the acquisition map: acquisition endpoints, acquisition sources, and output sources. The diagram below shows how they fit together at runtime for a heater controller:

Acquisition Endpoints

Acquisition endpoints define destinations where acquired data can be sent, and by default they can also process command/control packets. There are a couple of data formats, each with different benefits depending on the endpoint type and the sample rate(s). For example, if you are recording data from multiple channels at different sample rates over USB, it is simpler to just resample the data and clock out one frame of data at the highest sample rate. This isn't a problem because I always have an excess of bandwidth on high-speed USB at the sample rates I am interested in. PHYs with higher energy costs and/or reduced throughput, such as BLE, require a different approach because resampled data can get expensive quickly.
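To make the cost of resampling concrete, here is a minimal sketch that builds frames at the highest channel rate by repeating the most recent sample of each slower channel. Sample-and-hold is an assumption of this sketch, not necessarily what the firmware does, and the sketch also assumes every rate divides the highest rate evenly. The function name is hypothetical.

```python
def resample_frames(channel_samples, channel_rates_hz):
    """Build frames at the highest channel rate by repeating slower samples.

    channel_samples: one list of samples per channel, covering the same time span.
    channel_rates_hz: the sample rate of each channel; each must divide the
    highest rate evenly (an assumption of this sketch).
    """
    frame_rate = max(channel_rates_hz)
    ratios = [frame_rate // rate for rate in channel_rates_hz]
    n_frames = min(len(s) * r for s, r in zip(channel_samples, ratios))
    return [
        [samples[i // ratio] for samples, ratio in zip(channel_samples, ratios)]
        for i in range(n_frames)
    ]


fast = [1, 2, 3, 4]  # 100 Hz channel
slow = [10, 20]      # 50 Hz channel
frames = resample_frames([fast, slow], [100, 50])
print(frames)  # [[1, 10], [2, 10], [3, 20], [4, 20]]
```

Six raw samples become eight transmitted values here. The overhead grows with the ratio between the fastest and slowest channels, which is harmless on high-speed USB but adds up quickly on BLE.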

Acquisition Sources

Acquisition sources provide the data consumed by acquisition endpoints. Each source is bound to a specific Zephyr API such as sensor, ADC, or GPIO. Each acquisition source node is a child of a parent node that contains all acquisition sources, so I can easily parse every source into the code that is common to all acquisition sources.

A source can split data into multiple channels so I have two types of channels: compound and single. Compound channels are used when it makes more sense to read a channel in one operation then parse the values into multiple channels. This is currently only implemented for the sensor API which I use to split values from an I2C accelerometer into a channel for each axis.
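The compound case can be sketched as one read that fans out to several channels. This is a simplified Python model, not the firmware code: the function names, the channel IDs, and the sample values are hypothetical, and the fake read stands in for a single sensor fetch.

```python
def read_compound(read_fn, channel_ids):
    """Perform one read and fan the values out to one channel per value."""
    values = read_fn()
    if len(values) != len(channel_ids):
        raise ValueError("source returned a different number of values than channels")
    return dict(zip(channel_ids, values))


def fake_accel_read():
    # Stand-in for one I2C transaction returning (x, y, z); values are made up.
    return (12, -7, 981)


print(read_compound(fake_accel_read, [3, 4, 5]))  # {3: 12, 4: -7, 5: 981}
```

A single channel is the degenerate case with one value per read. The compound form avoids three bus transactions where one is enough and keeps the axes sampled at the same instant.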

Output Sources

Another requirement for the acquisition systems I work on is synchronous output to provide stimulus or support acquisition (such as a pump). I need to be able to feed back the output value to the user so they can correlate stimulus with a response (from a sensor, ADC, etc.) which means output sources also need to be able to write into the acquisition buffer.

The output source nodes I've written thus far adapt to Zephyr's GPIO and PWM APIs. Each output node has access to a sequencer by default, though I can redefine the size of the sequencer buffer reserved for each output source as required by the application. For instance, some outputs such as a pump for a gas analyzer only need to be set once and don't need to be sequenced during a recording.

Example

Let's take a look at a working example for a temperature-controlled heat bed to get a better sense of how this works in practice:

device_mapping: device-mapping {
	compatible = "iworx,device-map";
	status = "okay";

	device-name = "Sample_Heater_Controller";

	acq_endpoints: acq-endpoints {
		compatible = "iworx,acq-endpoints";
		status = "okay";

		acq_ep_usb: acq-ep-usb {
			compatible = "iworx,acq-ep-usb";
			status = "okay";

			phy = "usb";
			bus = <&hsusbd>;
			is-host-port;
		};

		acq_ep_shell: acq-ep-shell {
			compatible = "iworx,acq-ep-uart";
			status = "okay";

			phy = "uart";
			bus = <&uart1>;
			is-host-port;
		};
	};

	channel_sources: channel_sources {
		compatible = "iworx,acq-sources";
		status = "okay";

		heater_ntc_source: heater-ntc-source {
			compatible = "iworx,acq-ch-src-sensor";
			status = "okay";

			sensor = <&heater_ntc>;

			heater_thermistor_ch: heater_thermistor_ch {
				compatible = "iworx,acq-ch";
				status = "okay";

				ch-unit = "AMBIENT_TEMP";
				max-sample-rate-hz = <10000>;
			};
		};
	};

	output_sources: output-sources {
		compatible = "iworx,output-sources";
		status = "okay";

		pwm_output {
			compatible = "iworx,output-ch-src-pwm";
			status = "okay";
			
			pwms = <&epwm1 5 PWM_USEC(100) PWM_POLARITY_NORMAL>; // Heater PWM

			heater_drv_ch: heater_drv_ch {
				compatible = "iworx,output-ch";
				status = "okay";

				min-output = <0>;
				max-output = <70>;

				unit = "LIGHT_DUTY_CYCLE";
				scale = <1>;
				max-val = <255>;
				units-per-lsb = <1>;
			};
		};
	};

	feedback_sources: feedback-sources {
		compatible = "iworx,feedback-sources";
		status = "okay";

		heater_ctrl: heater-ctrl {
			compatible = "iworx,feedback";
			status = "okay";

			default-mode = "PID";
			input = <&heater_thermistor_ch>;
			output = <&heater_drv_ch>;
			use-callback;

			heater_pid: heater-pid {
				compatible = "iworx,fb-ctrl-pid";
				status = "okay";

				p = <600>;
				i = <10>;
				d = <0>;
				div = <1000>;
			};
		};
	};
};

Notice that all my acquisition endpoint nodes sit under a node with the "iworx,acq-endpoints" compatible, all my acquisition sources under an "iworx,acq-sources" node, and all my output sources under an "iworx,output-sources" node. Grouping each of these three basic types under its own parent node simplifies parsing from the devicetree into an array managed by common code for each respective type. The example also contains a feedback-sources group, which ties the thermistor channel to the heater output channel through a PID controller.
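The heater-pid node in the example carries integer gains together with a div property. One plausible reading, and it is only an assumption here, is that each gain is divided by div so that p = <600> with div = <1000> acts as a gain of 0.6 without floating point. The sketch below models one control step under that assumption and clamps the result to the output channel's min-output and max-output. The class name is hypothetical, and the sketch has no anti-windup.

```python
class IntPid:
    """Integer PID step; gains are divided by `div` (assumed fixed-point scaling)."""

    def __init__(self, p, i, d, div, out_min, out_max):
        self.p, self.i, self.d, self.div = p, i, d, div
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0
        self.prev_error = 0

    def step(self, setpoint, measured):
        error = setpoint - measured
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        # Python's // floors; C integer division truncates toward zero instead.
        raw = (self.p * error + self.i * self.integral + self.d * derivative) // self.div
        # Clamp to the output channel's limits.
        return max(self.out_min, min(self.out_max, raw))


# Gains and limits taken from the example devicetree; units are arbitrary.
pid = IntPid(p=600, i=10, d=0, div=1000, out_min=0, out_max=70)
print(pid.step(setpoint=100, measured=50))  # 30
print(pid.step(setpoint=100, measured=90))  # 6
```

Keeping the gains as integers in the devicetree means the controller can be tuned per device from an overlay without touching the common code.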

Device Info

The device name is parsed by a macro and expands to an enum defined in the protobuf file that describes the interface for every acquisition device. This is one of the fields populated in a device's info packet, which is also defined in the acquisition device proto file. The phy and name of each acquisition endpoint are also added to the device info packet, along with the acquisition and output channels. When the acquisition software requests info about the device, the device returns an info packet, which the software uses to populate the UI.

Acquisition Configuration

When configuration and setup data is received by an acquisition endpoint that has message parsing enabled, nanopb parses it into a struct defined in the acquisition device proto file. The struct is then published on zbus, Zephyr's message bus. Every packet type is defined in a single device packet union using protobuf's oneof to keep deserialization straightforward.

Build System

The acquisition core connects to Zephyr's device APIs through intermediate "API adapters" which parse acquisition messages and set up each enabled device accordingly. For example, when a node with the "iworx,acq-ch-src-gpios" compatible is defined and enabled, the "CONFIG_IWORX_ACQ_CH_SRC_GPIO" symbol is enabled automatically: it defaults to enabled but depends on the "DT_HAS_IWORX_ACQ_CH_SRC_GPIOS_ENABLED" symbol, which Zephyr's build system generates from the compatible.

Summary

The greatest benefit of all this systemization and modularity can be summarized simply: reliability. Whether you have just one product or many, there is enormous benefit in pipelining firmware to pull in as many well-tested components as possible, so you can start working on your next great piece of firmware with an inherent testing framework, a single interface, and a solid update strategy.