Avro Schemas
Apache Avro is a compact, row-oriented serialization format designed for high-throughput data systems. MAPS treats Avro as a first-class schema type, with tight integration into the Typed Event pipeline.
1. Format Overview
Avro defines data using a JSON schema and encodes records in a compact binary format.
Key characteristics:
- Schema stored as JSON, data encoded as binary
- Strong typing with support for:
  - records, arrays, maps
  - enums, unions, fixed, logical types
- Well-suited for:
  - telemetry streams
  - log/event pipelines
  - long-lived topic-based data with evolution over time
Why use Avro in MAPS?
- Efficient binary encoding
- Built-in schema evolution features (defaults, aliases, unions)
- Good fit for high-volume IoT and analytics streams
- Plays well with downstream big-data / lake / warehouse tooling
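Part of Avro's compact encoding comes from how it writes integers: a long is zigzag-mapped (so small negative and positive values both become small) and then emitted as variable-length base-128 bytes, costing one byte for small magnitudes. A minimal stand-alone sketch of that encoding in plain Java (this is an illustration, not the Avro library itself):

```java
import java.io.ByteArrayOutputStream;

public class ZigZagVarInt {

  // Encode a long the way Avro does: zigzag-map it, then emit
  // base-128 varint bytes, least-significant group first.
  public static byte[] encodeLong(long n) {
    long z = (n << 1) ^ (n >> 63);            // zigzag mapping
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    while ((z & ~0x7FL) != 0) {
      out.write((int) ((z & 0x7F) | 0x80));   // continuation bit set
      z >>>= 7;
    }
    out.write((int) z);                       // final byte, no continuation bit
    return out.toByteArray();
  }

  public static void main(String[] args) {
    System.out.println(encodeLong(0).length);         // 1 byte
    System.out.println(encodeLong(-1).length);        // 1 byte
    System.out.println(encodeLong(1_000_000).length); // 3 bytes
  }
}
```

This is why a telemetry record full of small counters and timestamps is usually far smaller in Avro than the equivalent JSON text.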
2. SchemaConfig for Avro
All Avro schemas in MAPS are stored as a SchemaConfig:
- `format` must be `"avro"`.
- `schema` holds the Avro JSON schema.
- `schemaBase64` is typically null for Avro.
- `labels` carry routing and discovery metadata (including the CoAP interface/resource when exposed over CoAP).
2.1 Required fields for Avro
At the SchemaConfig level:
- `format` → `"avro"`
- `name` → logical schema name
- `versionId` → logical schema version
- `schema` → valid Avro JSON schema
- `labels.matchExpression` → regex mapping topics to this schema
- `labels.uniqueId` → stable schema identifier
- `labels.interface` → optional: CoAP `if` value if exposed via CoAP
- `labels.resource` → optional: CoAP `rt` value if exposed via CoAP
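Put together, a minimal Avro SchemaConfig satisfying these required fields might look like the following skeleton (the name, regex and uniqueId values here are illustrative placeholders):

```json
{
  "format": "avro",
  "name": "example-sensor",
  "versionId": "1",
  "schema": { "type": "record", "name": "Example", "fields": [] },
  "labels": {
    "matchExpression": "sensors/.*/example",
    "uniqueId": "00000000-0000-0000-0000-000000000000",
    "interface": "sensor.example",
    "resource": "sensor"
  }
}
```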
3. Example Avro SchemaConfig (BME688)
Below is an example Avro-based SchemaConfig for the BME688 sensor payload.
{
"versionId": "1",
"name": "BME688-Avro",
"description": "BME688 VOC, pressure, temperature and humidity telemetry (Avro-encoded)",
"labels": {
"comments": "I2C device BME688 VOC, Pressure, Temperature and Humidity Sensor",
"uniqueId": "b1dc43de-4c9b-5d86-9425-cf958eeb598d",
"resource": "sensor",
"interface": "sensor.bme688"
},
"format": "avro",
"schema": {
"type": "record",
"name": "BME688Reading",
"namespace": "io.mapsmessaging.sensors",
"fields": [
{
"name": "temperature",
"type": "double",
"doc": "Unit: °C, range -40.0 to 85.0"
},
{
"name": "humidity",
"type": "double",
"doc": "Unit: %RH, range 10.0 to 90.0"
},
{
"name": "pressure",
"type": "double",
"doc": "Unit: hPa, range 300.0 to 1100.0"
},
{
"name": "gas",
"type": "double",
"doc": "Unit: Ω, range 0.0 to 65535.0"
},
{
"name": "heaterStatus",
"type": "string"
},
{
"name": "gasMode",
"type": "string"
},
{
"name": "dewPoint",
"type": "double",
"doc": "Unit: °C, range -50.0 to 100.0"
},
{
"name": "condensationRisk",
"type": "double",
"doc": "Risk score in [0.0, 1.0]"
},
{
"name": "timestamp",
"type": {
"type": "long",
"logicalType": "timestamp-millis"
},
"doc": "Event time, epoch millis"
}
]
}
}
Notes:
- The Avro schema sits directly in `schema` as standard Avro JSON.
- `timestamp` uses Avro's `logicalType: "timestamp-millis"` to align with MAPS' normalised time handling.
- Ranges and units are carried in the Avro `doc` field.
4. How MAPS Uses Avro Schemas
At runtime, MAPS:
- Resolves the SchemaConfig by topic via `matchExpression` / bindings.
- Loads the Avro JSON schema from `schema`.
- Uses the Avro schema to decode binary Avro payloads into a Typed Event:
  - field names and types come from the Avro schema
  - logical types (like timestamps) are normalised internally
- The Typed Event flows through:
  - filtering
  - transformations
  - statistics
  - format conversion (e.g. Avro → JSON / Protobuf / CBOR)
Schema evolution rules defined at the Avro level (e.g. added fields with defaults) are respected when decoding.
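For instance, a `timestamp-millis` field arrives on the wire as a plain long of epoch milliseconds; normalising it into an actual time value is, in effect, the following (a stdlib sketch of the idea, not the MAPS internals):

```java
import java.time.Instant;

public class TimestampNormalise {

  // An Avro timestamp-millis field is encoded as a long of epoch
  // milliseconds; normalisation turns it into a real point in time.
  public static Instant fromTimestampMillis(long raw) {
    return Instant.ofEpochMilli(raw);
  }

  public static void main(String[] args) {
    System.out.println(fromTimestampMillis(0L)); // 1970-01-01T00:00:00Z
  }
}
```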
5. Warnings & Best Practices
- Keep `namespace` stable; it forms part of the Avro type identity.
- Prefer `double` for sensor telemetry to avoid unnecessary rounding artefacts.
- Use Avro logical types where appropriate:
  - `timestamp-millis` / `timestamp-micros` for event time
  - `date` for date-only values
- When changing schemas:
  - add fields with sensible defaults
  - avoid incompatible type changes
  - use aliases when renaming fields
- Only use `schemaBase64` for Avro if you truly need to store a compiled/binary representation; otherwise keep the canonical form as Avro JSON in `schema`.
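As an illustration of those evolution rules, a second version of a reading schema might add a new field with a default and rename an existing one via an alias (the field names here are illustrative):

```json
{
  "type": "record",
  "name": "BME688Reading",
  "namespace": "io.mapsmessaging.sensors",
  "fields": [
    { "name": "temperature", "type": "double" },
    { "name": "relativeHumidity", "type": "double", "aliases": ["humidity"] },
    { "name": "iaqScore", "type": "double", "default": 0.0 }
  ]
}
```

Old readers ignore `iaqScore`, new readers fill it with `0.0` when decoding old data, and the alias lets new readers resolve records written with the old `humidity` name.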
6. Example
This Java example loads an Avro schema from a file and constructs an AvroSchemaConfig ready for use:
import com.google.gson.JsonElement;
import com.google.gson.JsonParser;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

// UuidGenerator, NamedVersions and AvroSchemaConfig are MAPS library classes;
// 'uuid' below is the application's namespace UUID for name-based generation,
// defined elsewhere in the enclosing class.
public static AvroSchemaConfig getAvroSchema(String name, String title, String description, String matcher, String type) throws IOException {
  // Read the .avsc schema definition from disk
  String schemaJson;
  File file = new File("./src/main/avro/" + name + ".avsc");
  try (InputStream is = new FileInputStream(file)) {
    schemaJson = new String(is.readAllBytes(), StandardCharsets.UTF_8);
  }

  // Derive a stable, name-based (SHA-1) UUID for the schema from its file path
  UUID schemaId;
  try {
    schemaId = UuidGenerator.getInstance().generate(NamedVersions.SHA1, uuid, file.getAbsolutePath());
  } catch (NoSuchAlgorithmException e) {
    e.printStackTrace();
    schemaId = UuidGenerator.getInstance().generate(); // fall back to a random UUID
  }

  // Parse the schema text and populate the SchemaConfig
  JsonElement element = JsonParser.parseString(schemaJson);
  AvroSchemaConfig config = new AvroSchemaConfig();
  config.setSchema(element.getAsJsonObject());
  config.setComments(description);
  config.setTitle(title);
  config.setVersion(1);
  config.setMatchExpression(matcher);
  config.setUniqueId(schemaId);
  config.setResourceType(type);
  return config;
}
Example of a file called ballast.avsc
{
"type": "record",
"name": "BallastTelemetry",
"namespace": "io.mapsmessaging.ship",
"fields": [
{ "name": "fore_tank_level", "type": "float" },
{ "name": "aft_tank_level", "type": "float" },
{ "name": "stbd_tank_level", "type": "float" },
{ "name": "port_tank_level", "type": "float" }
]
}
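For comparison, wrapping ballast.avsc into a SchemaConfig by hand would look something like this (the `matchExpression` and `uniqueId` values are illustrative):

```json
{
  "versionId": "1",
  "name": "ballast",
  "format": "avro",
  "labels": {
    "matchExpression": "ship/ballast/.*",
    "uniqueId": "00000000-0000-0000-0000-000000000000",
    "resource": "telemetry"
  },
  "schema": {
    "type": "record",
    "name": "BallastTelemetry",
    "namespace": "io.mapsmessaging.ship",
    "fields": [
      { "name": "fore_tank_level", "type": "float" },
      { "name": "aft_tank_level", "type": "float" },
      { "name": "stbd_tank_level", "type": "float" },
      { "name": "port_tank_level", "type": "float" }
    ]
  }
}
```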
Example of a file called cargo.avsc
{
"type": "record",
"name": "CargoMonitorTelemetry",
"namespace": "io.mapsmessaging.ship",
"fields": [
{ "name": "container_temp", "type": "float" },
{ "name": "humidity", "type": "float" },
{ "name": "shock_detected", "type": "boolean" }
]
}
Example of a file called engine-room.avsc
{
"type": "record",
"name": "EngineRoomTelemetry",
"namespace": "io.mapsmessaging.ship",
"fields": [
{ "name": "rpm", "type": "int" },
{ "name": "oil_pressure", "type": "float" },
{ "name": "temperature", "type": "float" }
]
}