Routing Events Directly to a Parallel Apache Geode AsyncEventQueue

Barry Oglesby
3 min readAug 30, 2020

--

Introduction

An Apache Geode AsyncEventQueue is used to asynchronously process events after they have been applied to a Region. They are normally used to replay Region events into a relational database or other remote data store. Other use cases want to take advantage of asynchronously processing events in parallel without actually storing entries in a Region. In these cases, each event just needs to be routed directly to the AsyncEventQueue. This behavior is effectively possible with serial AsyncEventQueues and replicated Regions. All servers can define a Region as a REPLICATE_PROXY and operations are allowed on that Region. Events go through the Region without being applied to it and are delivered to the serial AsyncEventQueue. The same cannot be done with parallel AsyncEventQueues and partitioned Regions. If all servers define a Region as PARTITION_PROXY, an operation on that Region will fail with a PartitionedRegionStorageException.

This article shows how to route events directly to a parallel AsyncEventQueue using Functions.

Parallel AsyncEventQueue Architecture

Normal Usage With Region Operations

Normally, events get delivered to AsyncEventsQueues as a result of Region operations like create, update and destroy. Here is a simplified diagram of that architecture:

Usage With Function Invocations

An alternate architecture bypasses the Region operation and replication in the diagram above and instead uses Function invocations to route the data between the primary and redundant servers. Here is a simplified diagram of that architecture:

Implementation

All of the code for this example is here.

The implementation consists mainly of a PrimaryRoutingFunction and a SecondaryRoutingFunction.

The PrimaryRoutingFunction:

  • Retrieves the routing partitioned Region
  • Creates an EntryEventImpl with the Region, key, value and EventID as parameters
  • Uses the key’s BucketRegion to set the EntryEventImpl’s tail key (which is the key in the AsyncEventQueue queue Region)
  • Gets the redundant DistributedMembers for the key
  • Invokes the SecondaryRoutingFunction on those DistributedMembers with the Region name, key, value, EventID and tail key as arguments
  • Gets the (primary) AsyncEventQueue for the routing Region
  • Gets the AsyncEventQueue’s GatewaySender
  • Distributes the EntryEventImpl to the GatewaySender

The SecondaryRoutingFunction:

  • Retrieves the routing partitioned Region
  • Creates an EntryEventImpl with the Region, key, value, EventID and tail key as parameters
  • Gets the (secondary) AsyncEventQueue for the routing Region
  • Gets the AsyncEventQueue’s GatewaySender
  • Distributes the EntryEventImpl to the GatewaySender

The BaseFunction provides methods common to both functions.

The RoutingAsyncEventListener is an AsyncEventListener that processes AsyncEvents by logging and counting them.

Create EntryEventImpl

An EntryEventImpl is created by both functions. It represents the Region operation and is delivered to the AsyncEventQueue.

The createEvent method:

  • creates the EntryEventImpl with the Region, key, value and EventID as parameters
  • sets the tail key (in the constructor if secondary; through the BucketRegion if primary)

Invoke SecondaryRoutingFunction

The PrimaryRoutingFunction invokes the SecondaryRoutingFunction on any redundant DistributedMembers. This invocation replaces the Region event replication in the normal usage. The routeToSecondaryMembers method:

  • gets the redundant DistributedMembers for the key
  • invokes the SecondaryRoutingFunction on those redundant DistributedMembers

Deliver EntryEventImpl to AsyncEventQueue

The EntryEventImpl is delivered to the AsyncEventQueue by both functions. The deliverToAsyncEventQueue method:

  • gets the AsyncEventQueue for the event’s Region
  • gets the AsyncEventQueue’s GatewaySender
  • distributes the event to the GatewaySender

Server Logging Output

Normal Usage

In normal usage, the PrimaryRoutingFunction and RoutingAsyncEventListener in the primary server will log messages like:

The SecondaryRoutingFunction in the secondary servers will log messages like:

Killed Server

If a server is killed while routing events, the server logs will contain messages like below.

In this example, the last log messages from the PrimaryRoutingFunction in the killed server were:

This server was killed before the RoutingAsyncEventListener could process these six events.

Of these six events, three of them were processed by the SecondaryRoutingFunction in one redundant server. Once the server was killed, these events were processed as possible duplicates by the RoutingAsyncEventListener:

The other redundant server processed the other three events in the same way:

Future

Two useful additions to Apache Geode would be:

  • Allowing configuration of a PARTITION_PROXY Region on all members into which operations can be done
  • An API to deliver events directly to AsyncEventQueues

--

--

Barry Oglesby
Barry Oglesby

No responses yet