pooler is implemented as an OTP application. Consumers interact with the pooler gen_server. Pools and pool members are managed in a supervision tree described below (see Supervision).
Server state includes the all_members dict, which maps each member pid to a {PoolName, Status, Time} tuple. Status is either 'free' or the pid of the consumer process currently using the member; Time is a timestamp from os:timestamp/0 recording when the member was last returned to the pool (it is used to cull inactive members). Each pool also keeps track of its parameters, such as the maximum number of members to allow, the number of initial members to start, the number of members in use, and a list of free members.
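As a sketch, the bookkeeping for a single member might look like the following (the function and variable names are illustrative assumptions, not pooler's actual internals):

    %% Record a newly started member as free in the all_members dict.
    %% Illustrative sketch only.
    add_member(MemberPid, PoolName, AllMembers) ->
        Entry = {PoolName, free, os:timestamp()},
        dict:store(MemberPid, Entry, AllMembers).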
Since our motivating use-case is Riak's pb client, we opt to reuse a given client as much as possible to avoid unnecessary vector clock growth; members are taken from the head of the free list and returned to the head of the free list.
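A minimal sketch of that head-of-list (LIFO) discipline, with names assumed for illustration:

    %% Take from the head of the free list; return to the head, so the
    %% most recently used member is handed out first.
    take_from_free([Pid | Rest]) -> {Pid, Rest};
    take_from_free([])           -> error_no_members.

    return_to_free(Pid, FreeList) -> [Pid | FreeList].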
pooler is a system process and traps exits. Before giving out a member, it links to the requesting consumer process. This way, if the consumer process crashes, pooler can recover the member. When the member is returned, the link to the consumer process will be severed. Since the state of the member is unknown in the case of a crashing consumer, we will destroy the member and add a fresh one to the pool.
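Sketched against the all_members dict described above (illustrative only; pooler's actual state and helpers differ):

    %% Because pooler traps exits and links to consumers and members,
    %% a crash of either arrives as an 'EXIT' message.
    handle_info({'EXIT', Pid, _Reason}, AllMembers) ->
        NewMembers =
            case dict:is_key(Pid, AllMembers) of
                true ->
                    %% A member died; drop it (a replacement would then
                    %% be started to refill the pool).
                    dict:erase(Pid, AllMembers);
                false ->
                    %% A consumer died; its members' state is unknown,
                    %% so kill them and drop them.
                    kill_members_of(Pid, AllMembers)
            end,
        {noreply, NewMembers}.

    kill_members_of(Consumer, AllMembers) ->
        dict:fold(
          fun(MemberPid, {_Pool, Status, _Time}, Acc)
                when Status =:= Consumer ->
                  exit(MemberPid, kill),
                  dict:erase(MemberPid, Acc);
             (_MemberPid, _Entry, Acc) ->
                  Acc
          end, AllMembers, AllMembers).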
The member starter MFA should use start_link so that pooler will be linked to the members. This way, when members crash, pooler will be notified and can refill the pool with new pids.
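For the Riak pb client, for example, the starter MFA could be something like this (host and port are placeholders):

    %% A member-starter MFA using start_link, so pooler is linked to
    %% each member it starts. Host and port are placeholders.
    {riakc_pb_socket, start_link, ["127.0.0.1", 8087]}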
The top-level pooler supervisor, pooler_sup, supervises the pooler gen_server and the pooler_pool_sup supervisor. pooler_pool_sup supervises individual pool supervisors (pooler_pooled_worker_sup). Each pooler_pooled_worker_sup supervises the members of a pool.
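In (simplified) child-spec form, pooler_sup might be sketched like this; the restart strategy and timeouts here are assumptions:

    %% Sketch of pooler_sup:init/1: the pooler gen_server plus the
    %% pooler_pool_sup supervisor, one_for_one.
    init([]) ->
        Pooler  = {pooler, {pooler, start_link, []},
                   permanent, 5000, worker, [pooler]},
        PoolSup = {pooler_pool_sup, {pooler_pool_sup, start_link, []},
                   permanent, infinity, supervisor, [pooler_pool_sup]},
        {ok, {{one_for_one, 5, 10}, [Pooler, PoolSup]}}.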
pooler will allocate up to max_count members in a pool before returning error_no_members when take_member is called. If you set max_count to a large value, you may end up with a very large pool after a traffic spike. These members will be culled if unused for cull_after minutes.
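Consumers should therefore be prepared for error_no_members. A usage sketch (the riakc_pb_socket call is a placeholder workload):

    %% Always handle error_no_members when taking a member.
    case pooler:take_member() of
        error_no_members ->
            {error, no_members};
        Pid ->
            Result = riakc_pb_socket:get(Pid, <<"bucket">>, <<"key">>),
            pooler:return_member(Pid, ok),
            Result
    end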
Whenever a member is returned to the pool, either explicitly or implicitly via member or consumer exit, a check for inactive members is made if no such check has been made in the last cull_after minutes. To cull inactive members, all pool members are examined using the all_members dict; those marked 'free' with a timestamp older than cull_after minutes are removed from the pool, killed, and not replaced.
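A sketch of that sweep over the all_members dict (names are illustrative; the kill step is assumed here to be a plain exit/2):

    %% Kill and forget free members idle for more than CullAfterMin
    %% minutes; in-use members are left alone.
    cull_members(AllMembers, CullAfterMin) ->
        Now = os:timestamp(),
        MaxIdleMicros = CullAfterMin * 60 * 1000000,
        dict:fold(
          fun(Pid, {_Pool, free, LastReturn}, Acc) ->
                  case timer:now_diff(Now, LastReturn) > MaxIdleMicros of
                      true  -> exit(Pid, kill), dict:erase(Pid, Acc);
                      false -> Acc
                  end;
             (_Pid, _InUseEntry, Acc) ->
                  Acc
          end, AllMembers, AllMembers).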
For Riak pb, you should probably call the client's ping function at startup, and possibly on a schedule, to cull bad members.
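For example, with the riakc_pb_socket client (a sketch; scheduling the check is left out):

    %% Ping a member and return it marked 'fail' if unhealthy, so
    %% pooler destroys it and adds a fresh one.
    check_member(Pid) ->
        case catch riakc_pb_socket:ping(Pid) of
            pong -> pooler:return_member(Pid, ok);
            _    -> pooler:return_member(Pid, fail)
        end.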
Or perhaps provide a load-balancing scheme based on choosing the pool with the smallest in_use_count (this assumes same-sized pools). We could also track a free count (to avoid calling length(free_pids)) and pick the pool with the largest free count.
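A sketch of the smallest-in_use_count choice, assuming pools are summarized as {PoolName, InUseCount} pairs (a representation assumed here purely for illustration):

    %% Pick the pool with the fewest members currently in use.
    pick_pool(Pools) ->
        [{Name, _InUse} | _] = lists:keysort(2, Pools),
        Name.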
Would it be better to make a bulk call for randomness and cache it in server state rather than calling into the crypto module for each take_member call?
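One way to sketch that idea, assuming crypto:strong_rand_bytes/1 as the randomness source: fetch a block of bytes once, keep the remainder in server state, and refill only when it runs out:

    %% Serve one cached random byte per call; hit crypto only when the
    %% cache is empty.
    next_rand(<<R, Rest/binary>>) -> {R, Rest};
    next_rand(<<>>)               -> next_rand(crypto:strong_rand_bytes(1024)).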
It is an error to add a pool with a name that already exists.
Pool removal has two forms. In one, a pid_stopper is specified in the pool parameters: no pids are handed out from this pool's free list; as pids are returned, they are shut down; and when the pool is empty, the pool is removed.

The public API, in spec form:

    -spec(take_member() -> pid() | error_no_members).
    -spec(return_member(pid(), ok | fail) -> ignore).
    -spec(status() -> [term()]).
    -type(pid_type_opt() :: {name, string()} |
                            {max_pids, int()} |
                            {min_free, int()} |
                            {init_size, int()} |
                            {pid_starter_args, [term()]}).
    -type(pid_type_spec() :: [pid_type_opt()]).
    -spec(add_type(pid_type_spec()) -> ok | {error, Why}).
    -spec(remove_type(string()) -> ok | {error, Why}).
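Per pid_type_spec() above, adding and later removing a pool type might look like this (all values are placeholders):

    %% Register a pool type, then shut it down by name.
    Spec = [{name, "riak_pool"},
            {max_pids, 50},
            {min_free, 2},
            {init_size, 10},
            {pid_starter_args, ["127.0.0.1", 8087]}],
    ok = pooler:add_type(Spec),
    ok = pooler:remove_type("riak_pool").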