@@ -1,149 +1,174 @@
-* pidq - A Process Pool Library for Erlang
-
-*Note:* this is all work very much in progress. If you are
-  interested, drop me a note. Right now, it is really just a readme
-  and no working code.
-
-** Use pidq to manage pools of processes (pids).
-
-- Protect the pids from being used concurrently. The main pidq
-  interface is =pidq:take_pid/0= and =pidq:return_pid/2=. The pidq
-  server will keep track of which pids are *in use* and which are
-  *free*.
-
-- Maintain the size of the pid pool. Specify a maximum number of pids
-  in the pool. Trigger pid creation when the free count drops below a
-  minimum level or when a pid is marked as failing.
-
-- Organize pids by type and randomly load-balance pids by type. This
-  is useful when the pids represent client processes connected to a
-  particular node in a cluster (think database read slaves). Separate
-  pools are maintained for each type and a request for a pid will
-  randomly select a type.
+* pooler - An OTP Process Pool Application
+
+The pooler application allows you to manage pools of OTP behaviors
+such as gen_servers, gen_fsms, or supervisors, and provide consumers
+with exclusive access to pool members using =pooler:take_member=.
+
+** What pooler does
+
+- Protects the members of a pool from being used concurrently. The
+  main pooler interface is =pooler:take_member/0= and
+  =pooler:return_member/2=. The pooler server will keep track of
+  which members are *in use* and which are *free*. There is no need
+  to call =pooler:return_member= if the consumer is a short-lived
+  process; in this case, pooler will detect the consumer's normal
+  exit and reclaim the member (see the sketch after this list).
+
+- Maintains the size of the pool. You specify an initial and a
+  maximum number of members in the pool. Pooler will trigger member
+  creation when the free count drops to zero (as long as the in-use
+  count is less than the maximum). New pool members are added to
+  replace members that crash. If a consumer crashes, the member it
+  was using will be destroyed and replaced.
+
+- Manages multiple pools. A common configuration is to have each pool
+  contain client processes connected to a particular node in a cluster
+  (think database read slaves). By default, pooler will randomly
+  select a pool to fetch a member from.

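+For example, here is a minimal sketch of the short-lived consumer
+pattern (=do_work/1= is a hypothetical placeholder for your own
+code):
+
+#+BEGIN_SRC erlang
+spawn(fun() ->
+              M = pooler:take_member(),
+              % no return_member call needed: pooler detects this
+              % process's normal exit and reclaims M
+              do_work(M)
+      end).
+#+END_SRC
+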
** Motivation

-The need for the pidq kit arose while writing an Erlang-based
-application that uses [[https://wiki.basho.com/display/RIAK/][Riak]] for data storage. When using the Erlang
-protocol buffer client for Riak, one should avoid accessing a given
-client concurrently. This is because each client is associated with a
-unique client ID that corresponds to an element in an object's vector
-clock. Concurrent action from the same client ID defeats the vector
-clock. For some further explaination, see [1] and [2].
-
-I wanted to avoid spinning up a new client for each request in the
-application. Riak's protocol buffer client is a =gen_server= process
-that initiates a connection to a Riak node and my intuition is that
-one doesn't want to pay for the startup time for every request you
-send to an app. This suggested a pool of clients with some management
-to avoid concurrent use of a given client. On top of that, it seemed
-convenient to add the ability to load balance between clients
-connected to different nodes in the Riak cluster. The load-balancing
-is a secondary feature; even if you end up setting up [[http://haproxy.1wt.eu/][HAProxy]] for that
-aspect, you might still want the client pooling.
+The need for pooler arose while writing an Erlang-based application
+that uses [[https://wiki.basho.com/display/RIAK/][Riak]] for data storage. Riak's protocol buffer client is a
+=gen_server= process that initiates a connection to a Riak node. A
+pool is needed to avoid spinning up a new client for each request in
+the application. Reusing clients also has the benefit of keeping the
+vector clocks smaller since each client ID corresponds to an entry in
+the vector clock.
+
+When using the Erlang protocol buffer client for Riak, one should
+avoid accessing a given client concurrently. This is because each
+client is associated with a unique client ID that corresponds to an
+element in an object's vector clock. Concurrent action from the same
+client ID defeats the vector clock. For some further explanation,
+see [1] and [2]. Note that concurrent access to Riak's pb client is
+actually OK as long as you avoid updating the same key at the same
+time. So the pool needs checkout/checkin semantics that give
+consumers exclusive access to a client.
+
+On top of that, in order to evenly load a Riak cluster and be able to
+continue in the face of Riak node failures, consumers should spread
+their requests across clients connected to each node. The client pool
+provides an easy way to load balance.

[1] http://lists.basho.com/pipermail/riak-users_lists.basho.com/2010-September/001900.html
[2] http://lists.basho.com/pipermail/riak-users_lists.basho.com/2010-September/001904.html

** Usage and API

-*** Startup configuration
+*** Pool Configuration

-The idea is that you would wire up pidq to be a supervised process in
-your application. When you start pidq, you specify a module and
-function to use for creating new pids. You also specify the
-properties for each pool that you want pidq to manage, including the
-arguments to pass to the pid starter function.
-
-An example configuration looks like this:
+Pool configuration is specified in the pooler application's
+environment. This can be provided in a config file using =-config= or
+set at startup using =application:set_env(pooler, pools,
+Pools)=. Here's an example config file that creates three pools of
+Riak pb clients, each talking to a different node in a local cluster:

#+BEGIN_SRC erlang
-  Pool1 = [{name, "node1"},
-           {max_pids, 10},
-           {min_free, 2},
-           {init_size, 5}
-           {pid_starter_args, Args1}],
-
-  Pool2 = [{name, "node2"},
-           {max_pids, 100},
-           {min_free, 2},
-           {init_size, 50}
-           {pid_starter_args, Args2}],
-
-  Config = [{pid_starter, {M, F}},
-            {pid_stopper, {M, F}},
-            {pools, [Pool1, Pool2]}]
-
-  % either call this directly, or wire this
-  % call into your application's supervisor
-  pidq:start(Config)
-
+% pooler.config
+% Start Erlang as: erl -config pooler
+% -*- mode: erlang -*-
+% pooler app config
+[
+ {pooler, [
+         {pools, [
+                  [{name, "rc8081"},
+                   {max_count, 5},
+                   {init_count, 2},
+                   {start_mfa,
+                    {riakc_pb_socket, start_link, ["localhost", 8081]}}],
+
+                  [{name, "rc8082"},
+                   {max_count, 5},
+                   {init_count, 2},
+                   {start_mfa,
+                    {riakc_pb_socket, start_link, ["localhost", 8082]}}],
+
+                  [{name, "rc8083"},
+                   {max_count, 5},
+                   {init_count, 2},
+                   {start_mfa,
+                    {riakc_pb_socket, start_link, ["localhost", 8083]}}]
+                 ]}
+          ]}
+].
#+END_SRC

-Each pool has a unique name, a maximum number of pids, an initial
-number of pids, and a minimum free pids count. When pidq starts, it
-will create pids to match the =init_size= value. If there are =min_free=
-pids or fewer, pidq will add a pid as long as that doesn't bring the
-total used + free count over =max_pids=.
+Each pool has a unique name, an initial and maximum number of members,
+and an ={M, F, A}= describing how to start members of the pool. When
+pooler starts, it will create members in each pool according to
+=init_count=.

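+As an alternative to a config file, the same pools can be set from
+code before startup. A minimal sketch, assuming the first pool from
+the example config above:
+
+#+BEGIN_SRC erlang
+% equivalent to the config file entry for "rc8081"
+Pools = [[{name, "rc8081"},
+          {max_count, 5},
+          {init_count, 2},
+          {start_mfa,
+           {riakc_pb_socket, start_link, ["localhost", 8081]}}]],
+application:set_env(pooler, pools, Pools),
+application:start(pooler).
+#+END_SRC
+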
-Specifying a =pid_stopper= function is optional. If not specified,
-=exit(pid, kill)= will be used to shutdown pids in the case of error,
-pidq shutdown, or pool removal. The function specified will be passed
-a pid as returned by the =pid_starter= function.
+*** Using pooler

-*** Getting and returning pids
+Here's an example session:

-Once started, the main interaction you will have with pidq is through
-two functions, =take_pid/0= and =return_pid/2=.
+#+BEGIN_SRC erlang
+application:start(pooler).
+P = pooler:take_member(),
+% use P
+pooler:return_member(P, ok).
+#+END_SRC

-Call =pidq:take_pid()= to obtain a pid from the pool. When you are done
-with it, return it to the pool using =pidq:return_pid(Pid, ok)=. If
-you encountered an error using the pid, you can pass =fail= as the
-second argument. In this case, pidq will permently remove that pid
-from the pool and start a new pid to replace it.
+Once started, the main interaction you will have with pooler is through
+two functions, =take_member/0= and =return_member/2=.
+
+Call =pooler:take_member()= to obtain a member from a randomly
+selected pool. When you are done with it, return it to the pool using
+=pooler:return_member(Pid, ok)=. If you encountered an error using
+the member, you can pass =fail= as the second argument. In this case,
+pooler will permanently remove that member from the pool and start a
+new member to replace it. If your process is short-lived, you can
+omit the call to =return_member=. In this case, pooler will detect
+the normal exit of the consumer and reclaim the member.

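+To make the =ok= versus =fail= distinction concrete, here is a
+hedged sketch of the error path; =riakc_pb_socket:ping/1= stands in
+for whatever operation you run against the member:
+
+#+BEGIN_SRC erlang
+P = pooler:take_member(),
+case riakc_pb_socket:ping(P) of
+    pong ->
+        % member is healthy; put it back on the free list
+        pooler:return_member(P, ok);
+    _Error ->
+        % member looks broken; pooler destroys it and starts a
+        % replacement
+        pooler:return_member(P, fail)
+end.
+#+END_SRC
+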
*** Other things you can do

-You can get the status for the system via =pidq:status()=. This will
+You can get the status for the system via =pooler:status()=. This will
return some informational details about the pools being managed.

-You can also add or remove new pools while pidq is running using
-=pidq:add_pool/1= and =pidq:remove_pool/1=. Each pid
+You can also add or remove pools while pooler is running using
+=pooler:add_pool/1= and =pooler:remove_pool/1= (not yet implemented).

** Details

-pidq is implemented as a =gen_server=. Server state consists of:
+pooler is implemented as a =gen_server=. Server state consists of:

- A dict of pools keyed by pool name.
-- A dict mapping in-use pids to their pool name and the pid of the
-  consumer that is using the pid.
-- A dict mapping consumer process pids to the pid they are using.
-- A module and function to use for starting new pids.
+- A dict of pool supervisors keyed by pool name.
+- A dict mapping in-use members to their pool name and the pid of the
+  consumer that is using the member.
+- A dict mapping consumer process pids to the member they are using.

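+A rough sketch of what that state could look like (this record and
+its field names are illustrative assumptions, not pooler's actual
+source):
+
+#+BEGIN_SRC erlang
+% hypothetical shape of the pooler server state
+-record(state, {
+          pools = dict:new(),          % pool name -> pool parameters
+          pool_sups = dict:new(),      % pool name -> pool supervisor pid
+          in_use_pids = dict:new(),    % member pid -> {pool name, consumer pid}
+          consumer_to_pid = dict:new() % consumer pid -> member pid
+         }).
+#+END_SRC
+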
-Each pool keeps track of its parameters, such as max pids to allow,
-initial pids to start, number of pids in use, and a list of free pids.
+Each pool keeps track of its parameters, such as max members to allow,
+initial members to start, number of members in use, and a list of free
+members.

Since our motivating use-case is Riak's pb client, we opt to reuse a
given client as much as possible to avoid unnecessary vector clock
-growth; pids are taken from the head of the free list and returned
+growth; members are taken from the head of the free list and returned
to the head of the free list.

-pidq is a system process and traps exits. Before giving out a pid, it
-links to the requesting consumer process. This way, if the consumer
-process crashes, pidq can recover the pid. When the pid is returned,
-the link to the consumer process will be severed. Since the state of
-the pid is unknown in the case of a crashing consumer, we will destroy
-the pid and add a fresh one to the pool.
+pooler is a system process and traps exits. Before giving out a
+member, it links to the requesting consumer process. This way, if the
+consumer process crashes, pooler can recover the member. When the
+member is returned, the link to the consumer process will be severed.
+Since the state of the member is unknown in the case of a crashing
+consumer, we will destroy the member and add a fresh one to the pool.
+
+The member starter MFA should use =start_link= so that pooler will be
+linked to the members. This way, when members crash, pooler will be
+notified and can refill the pool with new members.

-The pid starter MFA should use spawn_link so that pidq will be linked
-to the pids (is it confusing that we've taken the term "pid" and
-turned it into a noun of this system?). This way, when pids crash,
-pidq will be notified and can refill the pool with new pids.
+*** Supervision
+
+The top-level pooler supervisor, pooler_sup, supervises the pooler
+gen_server and the pooler_pool_sup supervisor. pooler_pool_sup
+supervises individual pool supervisors (pooler_pooled_worker_sup).
+Each pooler_pooled_worker_sup supervises the members of a pool.
+
+[[./pidq_appmon.jpg]]

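+For reference, a minimal sketch of how the top-level supervisor
+could wire this up (the child specs and restart strategy here are
+assumptions for illustration, not pooler's actual source):
+
+#+BEGIN_SRC erlang
+-module(pooler_sup_sketch).
+-behaviour(supervisor).
+-export([init/1]).
+
+init([]) ->
+    Server = {pooler, {pooler, start_link, []},
+              permanent, 5000, worker, [pooler]},
+    PoolSup = {pooler_pool_sup, {pooler_pool_sup, start_link, []},
+               permanent, 5000, supervisor, [pooler_pool_sup]},
+    % one_for_all keeps the server's view of the pool supervisors
+    % consistent: if either side dies, both restart together
+    {ok, {{one_for_all, 5, 10}, [Server, PoolSup]}}.
+#+END_SRC
+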
-Also note that an alternative to a consumer explicitly returning a pid
-is for the consumer to exit normally. pidq will receive the normal
-exit and can reclaim the pid. In fact, we might want to implement pid
-return as "fake death" by sending pidq exit(PidqPid, normal).

*** Pool management
@@ -213,3 +238,11 @@ pidq_pooled_worker_sup supervisor. You use this to create a new pool
and specify the M,F,A of the pooled worker at start.
**** pidq_pooled_worker_sup
Another simple_one_for_one that is used to create actual workers.
+
+* Rename plan
+genpool, gs_pool, pooler
+pid => worker, member, gs, gspid
+pooler:take_member/0
+pooler:return_member/2
+
+#+OPTIONS: ^:{}