Using OTP / Presence for Real-time Analytics by Jacqui Manzi & Matt McClure
A nice talk on how each requirement was implemented with features available in Elixir, OTP, and Phoenix.
Notes
- Problem - Subscribing to data updates isn't an option
  Solution - In Elixir, create a process-driven subscription system that polls the database
- Problem - Does it work for more than one user? We don't know whether a subscription already exists, so a second connection could spin up a duplicate process
  Solution - Register the subscription process under the subscription ID. We can check whether that name is already registered before creating a subscription supervision tree
- Problem - Nodes are in a connected cluster behind a load balancer. A websocket connection can be routed to any node in the cluster, which can produce duplicate subscription supervision trees
  Solution - Use globally registered processes that are accessible across nodes. Register the subscription names globally and check whether one exists before adding a subscription
- Problem - If a node hosting a global process goes down, connections on other nodes will stop receiving messages. We need a way to re-create the process on another node
  Solution - Monitor the global process from another node. When the process is killed abnormally, we can re-create it on a node that is still running
- Problem - We need a way to remove subscription supervision trees when all sockets have left a topic
  Solution - Use Phoenix Presence to track when all users have left a channel topic, and clean up the subscription supervision trees accordingly
- Phoenix Presence
  - Channel topic registry
  - Replicated transparently across nodes
  - Communicates over Phoenix PubSub
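The first four solutions in the notes above (poll on a timer, register under the subscription ID, register globally, monitor and re-create) can be sketched with plain Elixir and Erlang's `:global`. This is not the speakers' code: the module names, `ensure_started/3`, `latest/1`, `watch/3`, the `fetch_fun` argument, and the `{:subscription, id}` naming scheme are all assumptions made for illustration, and the supervision tree and the Presence-driven cleanup are left out.

```elixir
defmodule SubscriptionSketch do
  use GenServer

  # Hypothetical naming scheme: one global name per subscription ID.
  defp global_name(sub_id), do: {:subscription, sub_id}

  # Return the existing subscription process for `sub_id`, or start one.
  def ensure_started(sub_id, fetch_fun, interval_ms \\ 1_000) do
    case :global.whereis_name(global_name(sub_id)) do
      pid when is_pid(pid) ->
        {:ok, pid}

      :undefined ->
        opts = [name: {:global, global_name(sub_id)}]

        case GenServer.start(__MODULE__, {fetch_fun, interval_ms}, opts) do
          {:ok, pid} -> {:ok, pid}
          # Another caller won the registration race; reuse its process.
          {:error, {:already_started, pid}} -> {:ok, pid}
        end
    end
  end

  # Read the most recently polled value.
  def latest(sub_id), do: GenServer.call({:global, global_name(sub_id)}, :latest)

  @impl true
  def init({fetch_fun, interval_ms}) do
    # Poll once right away, then on a timer.
    send(self(), :poll)
    {:ok, %{fetch: fetch_fun, interval: interval_ms, latest: nil}}
  end

  @impl true
  def handle_info(:poll, state) do
    Process.send_after(self(), :poll, state.interval)
    # `fetch` stands in for the database query (an assumption).
    {:noreply, %{state | latest: state.fetch.()}}
  end

  @impl true
  def handle_call(:latest, _from, state), do: {:reply, state.latest, state}
end

defmodule SubscriptionWatcher do
  # Hypothetical watcher: monitors the global process and re-creates it
  # from the watcher's own node after an abnormal exit.
  def watch(sub_id, fetch_fun, interval_ms \\ 1_000) do
    {:ok, pid} = SubscriptionSketch.ensure_started(sub_id, fetch_fun, interval_ms)
    ref = Process.monitor(pid)

    receive do
      # A normal exit is treated as deliberate cleanup: stop watching.
      {:DOWN, ^ref, :process, _pid, :normal} -> :ok
      # Anything else: loop, which starts a fresh process if none exists.
      {:DOWN, ^ref, :process, _pid, _reason} -> watch(sub_id, fetch_fun, interval_ms)
    end
  end
end
```

A watcher would typically run in its own process, for example `spawn(fn -> SubscriptionWatcher.watch("sub-1", fetch_fun) end)`. `GenServer.start/3` is used instead of `start_link/3` only so the sketch runs outside a supervisor; a real system would presumably start the poller under the subscription supervision tree the notes mention.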
- Phoenix Presence
- Isn't Phoenix Presence eventually consistent? That seems like a bad idea for keeping track of state
  Yes, but we only rely on Presence to tell us when a channel topic's list is empty, and we know the node hosting our subscription process will eventually (and quickly) receive that information
- Do process removals always work?
  99.9999999% of the time they do. We were able to use Presence and Erlang to create tools to notify and remove zombie processes
- What about the race condition when...
  Yes, there are scenarios where a global subscription could be registered on two nodes. However, Erlang will eventually detect the name clash and call the resolve function
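For context on that last answer: Erlang's `:global.register_name/3` accepts a three-argument resolve function that is called with the name and both pids when a clash is detected, and it returns the pid that keeps the name. The sketch below is one possible resolver, not the one used in the talk. `ClashSketch`, its tie-breaking rule (keep the process on the node that sorts first), and the `{:subscription, id}` name are assumptions.

```elixir
defmodule ClashSketch do
  # Called by :global with the clashing name and both registered pids.
  # Keeps the process on the node that sorts first and kills the other.
  def resolve(_name, pid1, pid2) do
    {keep, drop} =
      if node(pid1) <= node(pid2), do: {pid1, pid2}, else: {pid2, pid1}

    Process.exit(drop, :kill)
    keep
  end

  # Register `pid` globally with the custom resolver instead of the
  # default one (:global.random_exit_name/3).
  def register(sub_id, pid) do
    :global.register_name({:subscription, sub_id}, pid, &__MODULE__.resolve/3)
  end
end
```

`register/2` returns `:yes` on success and `:no` if the name is already taken. A process registered through `name: {:global, ...}` gets the default resolver, so a custom one would have to be supplied by calling `:global.register_name/3` directly, for example from the process's own `init/1`.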