Logging Apache Geode PartitionedRegion Primary and Secondary Bucket Locations
An Apache Geode PartitionedRegion partitions its entries into buckets distributed among all the servers where it is defined. Properties that affect the number and location of the buckets include total-num-buckets and redundant-copies. The total-num-buckets property configures the number of buckets across all the members of the DistributedSystem. The redundant-copies property configures the number of additional copies of each bucket. The primary copy of a bucket is hosted on one server, and if redundant-copies is greater than zero, the secondary copies are hosted on other servers.
In addition, the redundancy-zone property helps determine where buckets are located. If two redundancy zones are defined and redundant-copies is one (meaning two copies of each bucket), then the primary bucket will be on a member in one zone, and the secondary bucket will be on a member in the other zone.
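To make the relationship between these properties concrete, here is a minimal arithmetic sketch. BucketCopyMath is a hypothetical helper written for this illustration and is not part of Apache Geode. It assumes redundancy is fully satisfied, so every bucket has one primary plus redundant-copies secondaries.

```java
// Illustrative arithmetic only; BucketCopyMath is a hypothetical helper, not a Geode class.
final class BucketCopyMath {

    // Each bucket has one primary copy plus redundantCopies secondary copies.
    static int copiesPerBucket(int redundantCopies) {
        return redundantCopies + 1;
    }

    // Total bucket copies hosted across all members, assuming redundancy is fully satisfied.
    static int totalBucketCopies(int totalNumBuckets, int redundantCopies) {
        return totalNumBuckets * copiesPerBucket(redundantCopies);
    }

    public static void main(String[] args) {
        // 113 is Geode's default total-num-buckets; one redundant copy doubles the hosted copies.
        System.out.println(totalBucketCopies(113, 1));
    }
}
```

With two redundancy zones and redundant-copies set to one, each zone would then be expected to host exactly one copy of every bucket.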
This article is a companion to my Logging Apache Geode PartitionedRegion Entry Details Per Bucket article. It provides an example of a compact view of the primary and secondary bucket locations per server and redundancy zone.
All source code described in this article as well as an example usage is available here.
The GetBucketIdsFunction is executed on each server. It:
- gets the PartitionedRegion for the input region name
- gets the member’s redundancy zone
- gets the configured number of buckets for the PartitionedRegion
- gets the list of local bucket ids for the PartitionedRegion
- gets the list of local primary bucket ids for the PartitionedRegion
- creates and returns a ServerBucketIds object containing these values
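The values listed above can be pictured as a small serializable value object. The following is only a sketch: the class name ServerBucketIdsSketch, its field names, and the derived getSecondaryBucketIds method are assumptions made for illustration and may differ from the article's actual ServerBucketIds class. It uses plain Java and none of the Geode API.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the article's ServerBucketIds; names are assumptions.
final class ServerBucketIdsSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String server;
    private final String redundancyZone;
    private final int configuredNumberOfBuckets;
    private final List<Integer> bucketIds;
    private final List<Integer> primaryBucketIds;

    ServerBucketIdsSketch(String server, String redundancyZone, int configuredNumberOfBuckets,
                          List<Integer> bucketIds, List<Integer> primaryBucketIds) {
        this.server = server;
        this.redundancyZone = redundancyZone;
        this.configuredNumberOfBuckets = configuredNumberOfBuckets;
        // Defensive copies so later changes on the caller's side do not leak in.
        this.bucketIds = new ArrayList<>(bucketIds);
        this.primaryBucketIds = new ArrayList<>(primaryBucketIds);
    }

    String getServer() { return server; }
    String getRedundancyZone() { return redundancyZone; }
    int getConfiguredNumberOfBuckets() { return configuredNumberOfBuckets; }
    List<Integer> getBucketIds() { return new ArrayList<>(bucketIds); }
    List<Integer> getPrimaryBucketIds() { return new ArrayList<>(primaryBucketIds); }

    // Secondary bucket ids are the local bucket ids that are not primary on this server.
    List<Integer> getSecondaryBucketIds() {
        List<Integer> secondaries = new ArrayList<>(bucketIds);
        secondaries.removeAll(primaryBucketIds);
        return secondaries;
    }
}
```

Because the local bucket ids include the primaries, the secondary bucket ids never need to be sent separately; they can be derived on the client.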
The AllBucketIds object contains:
- all bucket ids per server
- primary bucket ids per server
- all bucket ids per redundancy zone
- primary bucket ids per redundancy zone
- total number of bucket ids
- total number of primary bucket ids
- missing bucket ids per redundancy zone
- extra bucket ids per redundancy zone
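The article does not spell out how the missing and extra bucket ids are computed, so the following sketch rests on an assumption: a bucket id is missing from a redundancy zone when no server in that zone hosts it, and extra when the zone hosts it more than once. ZoneBucketCheckSketch and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; the definitions of "missing" and "extra" are assumptions.
final class ZoneBucketCheckSketch {

    // Bucket ids in [0, totalNumBuckets) that no server in the zone hosts.
    static List<Integer> missing(int totalNumBuckets, List<Integer> zoneBucketIds) {
        Set<Integer> present = new HashSet<>(zoneBucketIds);
        List<Integer> missing = new ArrayList<>();
        for (int id = 0; id < totalNumBuckets; id++) {
            if (!present.contains(id)) {
                missing.add(id);
            }
        }
        return missing;
    }

    // Bucket ids the zone hosts more than once, each reported once in ascending order.
    static List<Integer> extra(List<Integer> zoneBucketIds) {
        Set<Integer> seen = new HashSet<>();
        Set<Integer> extra = new TreeSet<>();
        for (Integer id : zoneBucketIds) {
            if (!seen.add(id)) {
                extra.add(id);
            }
        }
        return new ArrayList<>(extra);
    }
}
```

Under this reading, a zone with no missing and no extra bucket ids holds exactly one copy of every bucket.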
Execute the GetBucketIdsFunction
The GetBucketIdsFunction execute method first gets the PartitionedRegion. The PartitionedRegion provides the configured number of buckets. Its PartitionedRegionDataStore provides the local bucket ids and the local primary bucket ids. The redundancy zone is retrieved from the DistributionConfig. Finally, the Function creates and returns the ServerBucketIds object.
Process the ServerBucketIds Result
The GetBucketIdsResultCollector addResult method is called on the client when the ServerBucketIds result from each server is received. The method passes the ServerBucketIds object to the AllBucketIds process method.
The AllBucketIds process method sets the region name, configured number of buckets and the redundancy zones per server. In addition, it updates the bucket ids per server and redundancy zone.
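The hand-off from collector to aggregate might look roughly like the sketch below. All three classes are hypothetical plain-Java stand-ins: CollectorSketch does not implement Geode's ResultCollector interface, and the field names are assumptions. The synchronized methods reflect an assumption that results from different servers may arrive on different threads.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical per-server result; a stand-in for the article's ServerBucketIds.
final class ServerResultSketch {
    final String regionName;
    final String server;
    final String redundancyZone;
    final int configuredNumberOfBuckets;

    ServerResultSketch(String regionName, String server, String redundancyZone,
                       int configuredNumberOfBuckets) {
        this.regionName = regionName;
        this.server = server;
        this.redundancyZone = redundancyZone;
        this.configuredNumberOfBuckets = configuredNumberOfBuckets;
    }
}

// Hypothetical aggregate; a stand-in for the article's AllBucketIds.
final class AggregateSketch {
    String regionName;
    int configuredNumberOfBuckets;
    final Map<String, String> redundancyZonePerServer = new TreeMap<>();

    void process(ServerResultSketch result) {
        // Every server reports the same region name and configured bucket count.
        this.regionName = result.regionName;
        this.configuredNumberOfBuckets = result.configuredNumberOfBuckets;
        this.redundancyZonePerServer.put(result.server, result.redundancyZone);
        // The per-server and per-zone bucket id updates would follow here.
    }
}

// Plain-Java stand-in for the client-side collector.
final class CollectorSketch {
    private final AggregateSketch aggregate = new AggregateSketch();

    // Called once for each server's result.
    synchronized void addResult(ServerResultSketch result) {
        aggregate.process(result);
    }

    synchronized AggregateSketch getResult() {
        return aggregate;
    }
}
```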
The AllBucketIds updateBucketsPerServer method:
- sorts each server’s bucket ids and primary bucket ids
- updates all bucket ids per server and primary bucket ids per server
- increments the total number of bucket ids and the total number of primary bucket ids
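The per-server update can be sketched in plain Java as follows. PerServerBucketsSketch and its member names are hypothetical and only mirror the steps listed above; the actual AllBucketIds implementation may differ.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the per-server part of the article's AllBucketIds.
final class PerServerBucketsSketch {
    private final Map<String, List<Integer>> bucketIdsPerServer = new TreeMap<>();
    private final Map<String, List<Integer>> primaryBucketIdsPerServer = new TreeMap<>();
    private int totalBucketIds;
    private int totalPrimaryBucketIds;

    void updateBucketsPerServer(String server, Collection<Integer> bucketIds,
                                Collection<Integer> primaryBucketIds) {
        // Sort copies of the server's ids so the caller's collections stay untouched.
        List<Integer> sortedBucketIds = new ArrayList<>(bucketIds);
        Collections.sort(sortedBucketIds);
        List<Integer> sortedPrimaryBucketIds = new ArrayList<>(primaryBucketIds);
        Collections.sort(sortedPrimaryBucketIds);

        // Record the sorted ids for this server.
        bucketIdsPerServer.put(server, sortedBucketIds);
        primaryBucketIdsPerServer.put(server, sortedPrimaryBucketIds);

        // Keep running totals across all servers.
        totalBucketIds += sortedBucketIds.size();
        totalPrimaryBucketIds += sortedPrimaryBucketIds.size();
    }

    Map<String, List<Integer>> getBucketIdsPerServer() { return bucketIdsPerServer; }
    Map<String, List<Integer>> getPrimaryBucketIdsPerServer() { return primaryBucketIdsPerServer; }
    int getTotalBucketIds() { return totalBucketIds; }
    int getTotalPrimaryBucketIds() { return totalPrimaryBucketIds; }
}
```

The sketch assumes each server reports exactly once; a server reporting twice would be counted twice in the totals.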
The AllBucketIds updateBucketsPerRedundancyZone method:
- gets the server’s redundancy zone
- adds the server’s bucket ids to the redundancy zone bucket ids
- adds the server’s primary bucket ids to the redundancy zone primary bucket ids
- sorts the redundancy zone bucket ids and primary bucket ids
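A comparable sketch for the per-zone update is shown below. PerZoneBucketsSketch is hypothetical, and the "(no zone)" placeholder for servers without a configured redundancy zone is an assumption of this sketch rather than something the article describes.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the per-zone part of the article's AllBucketIds.
final class PerZoneBucketsSketch {
    // Assumed placeholder key for servers that have no redundancy zone configured.
    static final String NO_ZONE = "(no zone)";

    private final Map<String, List<Integer>> bucketIdsPerZone = new TreeMap<>();
    private final Map<String, List<Integer>> primaryBucketIdsPerZone = new TreeMap<>();

    void updateBucketsPerRedundancyZone(String zone, Collection<Integer> bucketIds,
                                        Collection<Integer> primaryBucketIds) {
        String key = (zone == null || zone.isEmpty()) ? NO_ZONE : zone;
        merge(bucketIdsPerZone, key, bucketIds);
        merge(primaryBucketIdsPerZone, key, primaryBucketIds);
    }

    // Adds the server's ids to the zone's list, then re-sorts the zone's list.
    private static void merge(Map<String, List<Integer>> perZone, String zone,
                              Collection<Integer> ids) {
        List<Integer> zoneIds = perZone.computeIfAbsent(zone, z -> new ArrayList<>());
        zoneIds.addAll(ids);
        Collections.sort(zoneIds);
    }

    Map<String, List<Integer>> getBucketIdsPerZone() { return bucketIdsPerZone; }
    Map<String, List<Integer>> getPrimaryBucketIdsPerZone() { return primaryBucketIdsPerZone; }
}
```

Keeping duplicates in the per-zone lists, rather than using a set, is deliberate here: a bucket id that appears twice in one zone stays visible.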
Display the Results
The AllBucketIds getDisplayString method builds the message containing the primary and secondary bucket locations per server and redundancy zone.
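One way such a message could be assembled is sketched below. The layout is invented for illustration and is not the article's actual output format; BucketDisplaySketch and its parameters are hypothetical. Secondary bucket ids are derived by removing each server's primary bucket ids from its full list of bucket ids.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical formatter; the message layout is an assumption, not the article's format.
final class BucketDisplaySketch {

    static String displayString(String regionName,
                                Map<String, List<Integer>> bucketIdsPerServer,
                                Map<String, List<Integer>> primaryBucketIdsPerServer) {
        StringBuilder builder = new StringBuilder();
        builder.append("Region ").append(regionName).append(" bucket locations:");
        // A TreeMap copy gives a stable, alphabetical server order.
        for (Map.Entry<String, List<Integer>> entry
                : new TreeMap<>(bucketIdsPerServer).entrySet()) {
            List<Integer> primaries =
                primaryBucketIdsPerServer.getOrDefault(entry.getKey(), new ArrayList<Integer>());
            // Secondaries are the server's bucket ids that are not primary there.
            List<Integer> secondaries = new ArrayList<>(entry.getValue());
            secondaries.removeAll(primaries);
            builder.append(System.lineSeparator())
                   .append("  ").append(entry.getKey())
                   .append(" primary=").append(primaries)
                   .append(" secondary=").append(secondaries);
        }
        return builder.toString();
    }
}
```

The same loop could be repeated over the per-zone maps to add a redundancy zone section to the message.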
Client Logging Output
Executing the GetBucketIdsFunction causes the client to log a message showing the primary and secondary bucket locations per server and redundancy zone.
A gfsh command and Function that provide PartitionedRegion primary and secondary bucket locations per server and redundancy zone, like this example, would be a useful addition to Apache Geode.