Snowplow Java tracker 0.12.0 released


We are pleased to announce the release of our Java tracker version 0.12.0. In this release, we focused on improving how events are buffered and sent.
The headline new feature is retry, with exponential backoff, for event sending. Events that fail to send (that is, the collector returns an HTTP response code other than 2xx) are returned to the buffer for a subsequent retry. Previously, handling events that didn’t send was the responsibility of the developer: users could provide callbacks, with the intention that the callback would re-track the failed event. We’re pleased to have now replaced the event-sending callbacks with a more sophisticated, automatic retry solution.
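To make the idea concrete, here is a minimal sketch of retry with exponential backoff. It is not the tracker's actual implementation: the class name RetrySketch, the 100 ms base delay, the 10 second cap and the blocking loop are all assumptions chosen for illustration. The tracker itself returns failed events to the buffer rather than blocking a thread like this.

```java
import java.util.function.IntSupplier;

// Hypothetical illustration only; names and delay values are assumptions,
// not taken from the Snowplow Java tracker.
final class RetrySketch {
    static final long BASE_DELAY_MS = 100;    // assumed base delay
    static final long MAX_DELAY_MS = 10_000;  // assumed upper bound

    // Delay before the next attempt: base * 2^attempt, capped at the maximum.
    static long backoffDelayMs(int attempt) {
        int shift = Math.min(Math.max(attempt, 0), 20); // keep the shift small
        return Math.min(BASE_DELAY_MS << shift, MAX_DELAY_MS);
    }

    // Calls `send` (which returns an HTTP status code) until it returns 2xx
    // or maxAttempts is reached. Returns true if the send succeeded.
    static boolean sendWithRetry(IntSupplier send, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int status = send.getAsInt();
            if (status >= 200 && status < 300) {
                return true;
            }
            if (attempt == maxAttempts - 1) {
                break; // no point waiting after the final attempt
            }
            try {
                Thread.sleep(backoffDelayMs(attempt));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

With these assumed values the waits grow as 100 ms, 200 ms, 400 ms and so on, so a collector that is briefly unavailable is not hammered with requests.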
Adding the retry mechanism got us thinking about HTTP requests and response codes. Until now, our trackers have considered response codes to be either “successful” (2xx) or “unsuccessful” (anything else). If unsuccessful, then the tracker should keep trying to send the events: not losing data is our priority. But some response codes suggest that retrying is undesirable. For example, a collector that’s returning “401 Unauthorized” or “403 Forbidden” is unlikely to change its mind the next time it receives that request. If a request contains a huge event, and the collector returns “413 Payload Too Large”, there’s no point wasting bandwidth sending it again. This is an ongoing conversation within the Snowplow Trackers team. Let’s be clear: no retry means the event is deleted. As a temporary solution, the Java tracker now allows configuration of response codes not to retry on.
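The decision described above can be sketched as a three-way classification. This is an illustration under stated assumptions, not the tracker's API: the class RetryPolicySketch, the Outcome enum and the chosen set of codes are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not the Snowplow Java tracker's API.
final class RetryPolicySketch {
    enum Outcome { SENT, RETRY, DROP }

    // 2xx means sent; a configured no-retry code means the event is dropped
    // (deleted); anything else goes back to the buffer for a later retry.
    static Outcome classify(int statusCode, Set<Integer> noRetryCodes) {
        if (statusCode >= 200 && statusCode < 300) {
            return Outcome.SENT;
        }
        if (noRetryCodes.contains(statusCode)) {
            return Outcome.DROP;
        }
        return Outcome.RETRY;
    }

    // An example configuration; which codes to list is the user's choice.
    static Set<Integer> exampleNoRetryCodes() {
        return new HashSet<>(Arrays.asList(401, 403, 413));
    }
}
```

Because DROP means data loss, it is worth keeping the no-retry set as small as possible.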
To provide greater control over event buffering, we’ve provided a new EventStore interface, and default InMemoryEventStore class. This class stores the payloads in a queue, specifically a LinkedBlockingDeque. We strongly recommend setting the maximum capacity of the default event buffer queue at initialization. This is the number of events that can be stored. When the buffer is full, new tracked payloads are dropped (data loss again!), so choosing the right capacity is important. The default buffer capacity is that of a LinkedBlockingDeque: Integer.MAX_VALUE. It’s likely your application would run out of memory before buffering that many events.
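To show why the capacity matters, here is a small sketch of a bounded buffer built directly on java.util.concurrent.LinkedBlockingDeque. The class BoundedBufferSketch and the String payloads are hypothetical stand-ins; this is not the InMemoryEventStore source.

```java
import java.util.concurrent.LinkedBlockingDeque;

// Hypothetical illustration only; not the tracker's InMemoryEventStore.
final class BoundedBufferSketch {
    private final LinkedBlockingDeque<String> queue;

    // A capacity-bounded deque; the no-argument constructor would instead
    // allow up to Integer.MAX_VALUE elements.
    BoundedBufferSketch(int capacity) {
        this.queue = new LinkedBlockingDeque<>(capacity);
    }

    // offer() does not block: it returns false when the deque is full,
    // which in this sketch means the payload is dropped.
    boolean add(String payload) {
        return queue.offer(payload);
    }

    int size() {
        return queue.size();
    }
}
```

With a capacity of 2, the third call to add returns false and the stored size stays at 2.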
The Java tracker provides two Emitter classes: BatchEmitter and SimpleEmitter. These classes have responsibility for buffering tracked events, and sending them to the event collector. Both work asynchronously; the difference is that BatchEmitter uses POST requests and SimpleEmitter uses GET. We couldn’t think of any occasion when BatchEmitter wouldn’t be a better choice, so SimpleEmitter has been deprecated. If you need to send events via GET, or synchronously, there’s also an Emitter interface so you can use your own class.
The refactoring in this version was wide-ranging, including updating the Event classes. In the Java tracker, events are tracked by passing an Event object to a Tracker object, using Tracker.track(Event). The built-in event types, such as PageView or Structured events, are all Event subclasses. We’ve removed a couple of outdated methods from Event. The eventId and deviceCreatedTimestamp are no longer Event properties, but generated automatically during Tracker.track(). The main purpose of the eventId is to provide a UUID for events once they have been received by the collector and are in the pipeline. Allowing custom UUIDs could have accidentally led to non-unique “unique” identifiers, which causes big problems for pipelines, and risks data loss. In case you needed to use the eventId elsewhere in your app, or in a third-party app, we’ve added a return type to Tracker.track(): the eventId UUID string of the tracked event payload. If the event buffer is full, the event is lost. In this case, null will be returned instead.
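The return-value contract described above can be sketched as follows. This is a stand-in, not the real Tracker class: TrackSketch, its constructor and its String-based payload are hypothetical, and only the behaviour from the text (a generated UUID string on success, null when the buffer is full) is modelled.

```java
import java.util.UUID;
import java.util.concurrent.LinkedBlockingDeque;

// Hypothetical illustration only; not the Snowplow Tracker class.
final class TrackSketch {
    private final LinkedBlockingDeque<String> buffer;

    TrackSketch(int bufferCapacity) {
        this.buffer = new LinkedBlockingDeque<>(bufferCapacity);
    }

    // Generates the eventId internally, so callers cannot supply a duplicate.
    // Returns the eventId if the payload was buffered, or null if the buffer
    // was full and the event was lost.
    String track(String eventName) {
        String eventId = UUID.randomUUID().toString();
        String payload = eventId + ":" + eventName; // stand-in for a real payload
        return buffer.offer(payload) ? eventId : null;
    }
}
```

Callers that rely on the returned eventId should check for null before using it, since null signals that the event never reached the buffer.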
Code aside, we are very pleased to announce brand new Java tracker documentation. The new Javadoc API docs are hosted here. On top of that, the Snowplow Java tracker docs have been completely rewritten, with additional information and a more intuitive structure.
For more information about the new Java tracker v0.12, check out the GitHub release notes and the migration guide. If you have any comments or questions, or if you find a bug, please raise an issue in the GitHub repository. Alternatively, get in touch via our Discourse forums. We look forward to hearing from you.