Smart Global Replication Using Reinforcement Learning at KubeCon 2023
There are many good reasons to replicate data across Kubernetes clusters in different geographic regions: disaster recovery and low-latency user experiences, to name two. Unfortunately, global replication is not easy, not only because of the consistency reasoning it introduces, but also because of the cost of provisioning multiple volumes and the ingress and egress traffic incurred by duplicating every block to every region. Wouldn’t it be great if our systems could learn the optimal placement of storage blocks so that full replication was not necessary? Wouldn’t it be even better if replication messaging were reduced so that communication occurred only between the minimally necessary set of storage nodes? We present a system that uses multi-armed bandits to perform this optimization, dynamically adjusting how data is replicated based on usage. We demonstrate the savings achieved and the system’s performance using a real-world system: the TRISA Global Travel Rule Compliance Directory.
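To make the idea concrete, the sketch below shows one way a multi-armed bandit could drive replica placement: each candidate region is an "arm", and the bandit balances exploring regions with exploiting the one that has yielded the best observed reward (for example, cache hit rate or latency saved for requests served locally). This is a minimal epsilon-greedy illustration under assumed region names, reward signal, and exploration rate; it is not the system described in the talk.

```go
// Minimal epsilon-greedy bandit sketch for replica placement.
// Region names, the reward signal, and epsilon are illustrative assumptions.
package main

import (
	"fmt"
	"math/rand"
)

// placementBandit treats each candidate region as an arm and tracks the
// running mean of observed rewards for placing a replica there.
type placementBandit struct {
	regions []string
	counts  []int
	values  []float64 // running mean reward per region
	epsilon float64   // probability of exploring a random region
}

func newPlacementBandit(regions []string, epsilon float64) *placementBandit {
	return &placementBandit{
		regions: regions,
		counts:  make([]int, len(regions)),
		values:  make([]float64, len(regions)),
		epsilon: epsilon,
	}
}

// choose picks the next region to replicate to: explore a random region
// with probability epsilon, otherwise exploit the best-performing region.
func (b *placementBandit) choose() int {
	if rand.Float64() < b.epsilon {
		return rand.Intn(len(b.regions))
	}
	best := 0
	for i, v := range b.values {
		if v > b.values[best] {
			best = i
		}
	}
	return best
}

// update folds an observed reward (e.g., latency or egress savings for
// requests served from that region) into the running mean for the arm.
func (b *placementBandit) update(arm int, reward float64) {
	b.counts[arm]++
	b.values[arm] += (reward - b.values[arm]) / float64(b.counts[arm])
}

func main() {
	bandit := newPlacementBandit([]string{"us-east-1", "eu-west-1", "ap-southeast-1"}, 0.1)

	// Simulated feedback: pretend most traffic originates near eu-west-1,
	// so placing replicas there yields the highest reward.
	simulated := []float64{0.2, 0.9, 0.4}
	for i := 0; i < 1000; i++ {
		arm := bandit.choose()
		reward := simulated[arm] + rand.NormFloat64()*0.05
		bandit.update(arm, reward)
	}

	for i, region := range bandit.regions {
		fmt.Printf("%s: estimated reward %.2f (chosen %d times)\n",
			region, bandit.values[i], bandit.counts[i])
	}
}
```

In a real deployment the reward would come from observed access patterns rather than a simulation, so the bandit would gradually concentrate replicas in the regions that actually serve traffic instead of replicating everywhere.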