While going through the RubyConf 2008 presentations and picking out the ones that caught my attention (I'm still working through them, there are too many!), I came across Mike Perham's talk on "patterns in distributed processing", in which he presents a distributed processing implementation built on top of memcached ("high-performance, distributed memory object caching system").
This can be quite useful when you have a computational problem that can be split up for parallel processing.
It has the following dependencies:
- memcached - the mechanism to elect a leader amongst a set of peers.
- DRb - the mechanism to communicate between peers.
- mDNS - the mechanism to discover peers.
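To make the leader-election idea more concrete, here is a minimal sketch of one common way to elect a leader on top of a cache: every peer tries to "add" the same key, the store accepts only the first one, and the entry expires after a while so that a dead leader eventually gets replaced. ToyCache, elect and the 60-second lease are assumptions made for this illustration; I have not checked that politics does it exactly this way, and a real setup would talk to memcached instead of an in-memory hash.

```ruby
# Toy in-memory stand-in for a cache with memcached-style "add" semantics
# (store only if the key is absent). Time is passed in explicitly so the
# sketch is deterministic.
class ToyCache
  def initialize
    @data = {}
  end

  # Store value under key only if no live entry exists; returns true on success.
  def add(key, value, ttl, now)
    entry = @data[key]
    return false if entry && entry[:expires_at] > now
    @data[key] = { :value => value, :expires_at => now + ttl }
    true
  end

  # Return the live value for key, or nil if it is missing or expired.
  def get(key, now)
    entry = @data[key]
    entry && entry[:expires_at] > now ? entry[:value] : nil
  end
end

# Every peer tries to claim the leader key; whoever adds it first leads
# until the entry expires.
def elect(cache, peers, ttl, now)
  peers.each { |peer| cache.add('leader', peer, ttl, now) }
  cache.get('leader', now)
end

cache = ToyCache.new
peers = ['peer-1', 'peer-2', 'peer-3']
puts elect(cache, peers, 60, 0)          # peer-1 claims the key first
puts elect(cache, peers.drop(1), 60, 30) # peer-1 is gone, but its lease is still live
puts elect(cache, peers.drop(1), 60, 61) # lease expired, peer-2 takes over
```

The second call is the interesting one: the surviving peers cannot take over until the old entry expires, which is why a failover takes roughly one lease period.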
Installing memcached:
~# sudo apt-get install memcached
Reading package lists... Done
Building dependency tree
Reading state information... Done
Suggested packages:
libcache-memcached-perl
The following NEW packages will be installed:
memcached
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 44.6kB of archives.
After this operation, 176kB of additional disk space will be used.
Get:1 http://us.archive.ubuntu.com hardy/universe memcached 1.2.2-1 [44.6kB]
Fetched 44.6kB in 2min1s (367B/s)
Selecting previously deselected package memcached.
(Reading database ... 169259 files and directories currently installed.)
Unpacking memcached (from .../memcached_1.2.2-1_i386.deb) ...
Setting up memcached (1.2.2-1) ...
Starting memcached: memcached.
Installing mDNS:
~# sudo gem install net-mdns
Successfully installed net-mdns-0.4
1 gem installed
Updating class cache with 1251 classes...
Installing the memcached client (for Ruby):
~# sudo gem install memcache-client
Successfully installed rubyforge-1.0.1
Successfully installed rake-0.8.3
Successfully installed hoe-1.8.2
Successfully installed ZenTest-3.11.0
Successfully installed memcache-client-1.5.0
5 gems installed
Installing ri documentation for rubyforge-1.0.1...
Installing ri documentation for rake-0.8.3...
Installing ri documentation for hoe-1.8.2...
Installing ri documentation for ZenTest-3.11.0...
Installing ri documentation for memcache-client-1.5.0...
Installing RDoc documentation for rubyforge-1.0.1...
Installing RDoc documentation for rake-0.8.3...
Installing RDoc documentation for hoe-1.8.2...
Installing RDoc documentation for ZenTest-3.11.0...
Installing RDoc documentation for memcache-client-1.5.0...
The next step is to start the memcached server:
~# /etc/init.d/memcached start
Starting memcached: memcached.
We now have everything we need installed. Next we get politics and run a small example:
NOTE: downloading the library requires git (sudo apt-get install git && sudo apt-get install git-core)
~# git clone git://github.com/mperham/politics.git
Initialized empty Git repository in /home/castor/projects/politics/.git/
remote: Counting objects: 165, done.
remote: Compressing objects: 100% (158/158), done.
remote: Total 165 (delta 76), reused 0 (delta 0)
Receiving objects: 100% (165/165), 32.50 KiB, done.
Resolving deltas: 100% (76/76), done.
Done. We now have a folder called politics; we just move into it and take a look at the examples folder.
~# cd politics/
~# ls
examples History.rdoc lib LICENSE Manifest politics.gemspec Rakefile README.rdoc test
~# cd examples/
~# ls
queue_worker_example.rb token_worker_example.rb
OK, now let's try to run one of the examples, in this case queue_worker_example.rb.
Let's see what it looks like inside:
~# cat queue_worker_example.rb
#gem 'mperham-politics'
require 'politics'
require 'politics/static_queue_worker'

# Test this example by starting memcached locally and then in two irb sessions, run this:
#
=begin
require 'queue_worker_example'
p = Politics::QueueWorkerExample.new
p.start
=end
#
# You can then watch as one of them is elected leader. You can kill the leader and verify
# the backup process is elected after approximately iteration_length seconds.
#
module Politics
  class QueueWorkerExample
    include Politics::StaticQueueWorker
    TOTAL_BUCKETS = 20

    def initialize
      register_worker 'queue-example', TOTAL_BUCKETS, :iteration_length => 60, :servers => memcached_servers
    end

    def start
      process_bucket do |bucket|
        puts "PID #{$$} processing bucket #{bucket}/#{TOTAL_BUCKETS} at #{Time.now}..."
        sleep 1.5
      end
    end

    def memcached_servers
      ['127.0.0.1:11211']
    end
  end
end
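To see the shape of this example without a memcached server, here is a small in-process sketch of the same pattern: a module is mixed into a class, the class registers itself, and a block is called once per bucket. ToyQueueWorker and ToyExample are hypothetical stand-ins, not the real Politics::StaticQueueWorker; the sketch runs a single pass with no leader, no network and no sleeping, and it hands out the highest bucket first only because that is the order seen in the run further down.

```ruby
# Hypothetical stand-in for Politics::StaticQueueWorker, for illustration only.
module ToyQueueWorker
  # Remember the group name and build the list of bucket numbers.
  def register_worker(name, bucket_count, options = {})
    @group = name
    @buckets = (0...bucket_count).to_a
    @options = options
  end

  # Yield each pending bucket exactly once, highest number first.
  def process_bucket
    yield(@buckets.pop) until @buckets.empty?
  end
end

class ToyExample
  include ToyQueueWorker

  def initialize
    register_worker 'queue-example', 4, :iteration_length => 60
  end

  # Collect the buckets in the order the block receives them.
  def run
    seen = []
    process_bucket { |bucket| seen << bucket }
    seen
  end
end

p ToyExample.new.run
```

The block passed to process_bucket plays the same role as the one in QueueWorkerExample#start: it is the only place where the actual work for a bucket happens.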
Great, so all that's left is to create a test file that runs several QueueWorkers (as the comment at the top of the file says). The memcached servers are specified in the "memcached_servers" method; in our case one is running locally, and if there are other memcached servers you just add them to that list.
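If the server list needs to change between machines, one option is to read it from the environment instead of hard-coding it. This is only a sketch: MEMCACHED_SERVERS is a variable name invented for this example, and the 10.0.0.x addresses are placeholders.

```ruby
# Build the server list from a comma-separated environment variable,
# falling back to the local server used in this post.
# MEMCACHED_SERVERS is a hypothetical name chosen for this sketch.
def memcached_servers(env = ENV)
  env.fetch('MEMCACHED_SERVERS', '127.0.0.1:11211').split(',').map { |s| s.strip }
end

p memcached_servers({})
# => ["127.0.0.1:11211"]
p memcached_servers({ 'MEMCACHED_SERVERS' => '10.0.0.1:11211, 10.0.0.2:11211' })
# => ["10.0.0.1:11211", "10.0.0.2:11211"]
```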
The test file would look more or less like this:
#test.rb
require 'rubygems'
require 'queue_worker_example'
p = Politics::QueueWorkerExample.new
p.start
It is very important to require 'rubygems' so that the script can find the gems we installed earlier.
Now let's run the test in several consoles (3 in this case) and see what they do.
~# ruby test.rb
/usr/local/lib/site_ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require': no such file to load -- politics (LoadError)
from /usr/local/lib/site_ruby/1.8/rubygems/custom_require.rb:31:in `require'
from ./queue_worker_example.rb:2
from /usr/local/lib/site_ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require'
from /usr/local/lib/site_ruby/1.8/rubygems/custom_require.rb:31:in `require'
from test.rb:2
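The trace says that require 'politics' cannot be resolved: the library only exists in the cloned lib folder, which is not on Ruby's load path. As a side note, one way around this without touching the example is to put that folder on the load path from the test script. The sketch below assumes the script lives in politics/examples as in this post; add_lib_to_load_path is a helper name made up for the illustration.

```ruby
# Put the clone's lib directory (a sibling of examples/) on the load path.
# add_lib_to_load_path is a hypothetical helper for this sketch.
def add_lib_to_load_path(examples_dir)
  lib_dir = File.expand_path(File.join(examples_dir, '..', 'lib'))
  $LOAD_PATH.unshift(lib_dir) unless $LOAD_PATH.include?(lib_dir)
  lib_dir
end

here = File.dirname(__FILE__)
add_lib_to_load_path(here)

# Load the example only when it is actually present next to this script,
# so the sketch also runs outside the clone.
example = File.expand_path(File.join(here, 'queue_worker_example.rb'))
require example if File.exist?(example)
```

With the folder on the load path, the original require 'politics' lines resolve regardless of the directory the script is started from.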
We get an error because Ruby cannot find the 'politics' module. The fix is simply to replace lines two and three of queue_worker_example.rb:
require 'politics'
require 'politics/static_queue_worker'
with
require '../lib/politics'
require '../lib/politics/static_queue_worker'
Done. Now the script will run. To try it out I will do the following small experiment:
1. Start the three processes; one of them will be elected leader and will be in charge of handing out the work.
2. Kill that leader and watch a new leader get elected.
3. The new leader should hand all the work to the third process (the only worker left), which then does all of it.
4. Wait for the last worker process to finish and call the experiment done.
Step 1
console 1
~# ruby test.rb
I, [2009-01-02T18:03:20.062033 #1925] INFO -- : Registered # in group queue-example at port 58354
I, [2009-01-02T18:03:20.065879 #1925] INFO -- : druby://pc:58354 has been elected leader
console 2
~# ruby test.rb
I, [2009-01-02T18:03:22.317582 #1930] INFO -- : Registered # in group queue-example at port 48267
I, [2009-01-02T18:03:24.347506 #1930] INFO -- : druby://pc:48267 is processing 19
PID 1930 processing bucket 19/20 at Fri Jan 02 18:03:24 -0600 2009...
I, [2009-01-02T18:03:25.852976 #1930] INFO -- : druby://pc:48267 is processing 17
PID 1930 processing bucket 17/20 at Fri Jan 02 18:03:25 -0600 2009...
I, [2009-01-02T18:03:27.357718 #1930] INFO -- : druby://pc:48267 is processing 15
PID 1930 processing bucket 15/20 at Fri Jan 02 18:03:27 -0600 2009...
I, [2009-01-02T18:03:28.864040 #1930] INFO -- : druby://pc:48267 is processing 13
PID 1930 processing bucket 13/20 at Fri Jan 02 18:03:28 -0600 2009...
I, [2009-01-02T18:03:30.367844 #1930] INFO -- : druby://pc:48267 is processing 11
PID 1930 processing bucket 11/20 at Fri Jan 02 18:03:30 -0600 2009...
I, [2009-01-02T18:03:31.872520 #1930] INFO -- : druby://pc:48267 is processing 9
PID 1930 processing bucket 9/20 at Fri Jan 02 18:03:31 -0600 2009...
console 3
~# ruby test.rb
I, [2009-01-02T18:03:23.748698 #1934] INFO -- : Registered # in group queue-example at port 47170
I, [2009-01-02T18:03:24.777129 #1934] INFO -- : druby://pc:47170 is processing 18
PID 1934 processing bucket 18/20 at Fri Jan 02 18:03:24 -0600 2009...
I, [2009-01-02T18:03:26.280513 #1934] INFO -- : druby://pc:47170 is processing 16
PID 1934 processing bucket 16/20 at Fri Jan 02 18:03:26 -0600 2009...
I, [2009-01-02T18:03:27.793545 #1934] INFO -- : druby://pc:47170 is processing 14
PID 1934 processing bucket 14/20 at Fri Jan 02 18:03:27 -0600 2009...
I, [2009-01-02T18:03:29.295903 #1934] INFO -- : druby://pc:47170 is processing 12
PID 1934 processing bucket 12/20 at Fri Jan 02 18:03:29 -0600 2009...
I, [2009-01-02T18:03:30.800739 #1934] INFO -- : druby://pc:47170 is processing 10
PID 1934 processing bucket 10/20 at Fri Jan 02 18:03:30 -0600 2009...
I, [2009-01-02T18:03:32.305096 #1934] INFO -- : druby://pc:47170 is processing 8
PID 1934 processing bucket 8/20 at Fri Jan 02 18:03:32 -0600 2009...
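In the output above the two workers take turns: one gets buckets 19, 17, 15... and the other 18, 16, 14..., and every peer is identified by a druby:// address. Here is a rough sketch of how such a hand-out can work over DRb, with a leader object that gives out one bucket per call. ToyLeader, next_bucket and run_demo are names invented for this illustration, not politics' API, and everything runs in a single process on 127.0.0.1 for simplicity, whereas in the real run each worker is a separate process.

```ruby
require 'drb/drb'

# Hypothetical leader: hands out bucket numbers one at a time, highest first.
class ToyLeader
  def initialize(total)
    @pending = (0...total).to_a
    @lock = Mutex.new
  end

  # DRb may serve callers on separate threads, so guard the shared array.
  # Returns nil once every bucket has been handed out.
  def next_bucket
    @lock.synchronize { @pending.pop }
  end
end

# Start a leader, let two workers take turns asking it for buckets,
# and return which buckets each worker received.
def run_demo(total)
  # Port 0 asks the operating system for any free port.
  DRb.start_service('druby://127.0.0.1:0', ToyLeader.new(total))
  uri = DRb.uri
  workers = { 'worker A' => [], 'worker B' => [] }
  proxies = workers.keys.map { |name| [name, DRbObject.new_with_uri(uri)] }
  loop do
    progressed = false
    proxies.each do |name, leader|
      bucket = leader.next_bucket
      next if bucket.nil?
      workers[name] << bucket
      progressed = true
    end
    break unless progressed
  end
  workers
ensure
  DRb.stop_service
end

p run_demo(6)
```

With 6 buckets, worker A ends up with 5, 3, 1 and worker B with 4, 2, 0, the same alternating pattern as in the two consoles.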
Steps 2 and 3
Now we kill the first process (console 1) with Ctrl+C and see what happens in the other consoles.
console 1
../lib/politics/static_queue_worker.rb:195:in `sleep': Interrupt
from ../lib/politics/static_queue_worker.rb:195:in `relax'
from ../lib/politics/static_queue_worker.rb:97:in `process_bucket'
from ./queue_worker_example.rb:26:in `start'
from test.rb:4
console 2
E, [2009-01-02T18:03:33.376720 #1930] ERROR -- : Error talking to leader: druby://pc:58354 - #
I, [2009-01-02T18:04:33.380112 #1930] INFO -- : druby://pc:48267 has been elected leader
console 3
E, [2009-01-02T18:03:33.826448 #1934] ERROR -- : Error talking to leader: druby://pc:58354 - #
I
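Note the timestamps in console 2: the error talking to the old leader is logged at 18:03:33 and the new leader is only elected at 18:04:33. A quick calculation with those two timestamps shows that the gap matches the :iteration_length => 60 passed to register_worker, which fits the comment in the example about the backup being elected "after approximately iteration_length seconds". The helper name failover_gap is made up for this sketch.

```ruby
require 'time'

ITERATION_LENGTH = 60 # seconds, from register_worker in the example

# Seconds between losing the old leader and electing the new one,
# using the two timestamps copied from the console 2 log.
def failover_gap
  lost    = Time.parse('2009-01-02T18:03:33.376720')
  elected = Time.parse('2009-01-02T18:04:33.380112')
  elected - lost
end

puts format('%.3f s without a leader (iteration_length is %d s)', failover_gap, ITERATION_LENGTH)
```

The result is just over 60 seconds, so in this run the group sat idle for one full iteration before work resumed.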
Step 4
The leader is now in console 2, so the third process simply runs the remaining work.
console 3
I, [2009-01-02T18:04:33.829416 #1934] INFO -- : druby://pc:47170 is processing 19
PID 1934 processing bucket 19/20 at Fri Jan 02 18:04:33 -0600 2009...
I, [2009-01-02T18:04:35.334133 #1934] INFO -- : druby://pc:47170 is processing 18
PID 1934 processing bucket 18/20 at Fri Jan 02 18:04:35 -0600 2009...
I, [2009-01-02T18:04:36.836062 #1934] INFO -- : druby://pc:47170 is processing 17
PID 1934 processing bucket 17/20 at Fri Jan 02 18:04:36 -0600 2009...
I, [2009-01-02T18:04:38.340842 #1934] INFO -- : druby://pc:47170 is processing 16
PID 1934 processing bucket 16/20 at Fri Jan 02 18:04:38 -0600 2009...
I, [2009-01-02T18:04:39.844057 #1934] INFO -- : druby://pc:47170 is processing 15
PID 1934 processing bucket 15/20 at Fri Jan 02 18:04:39 -0600 2009...
I, [2009-01-02T18:04:41.349662 #1934] INFO -- : druby://pc:47170 is processing 14
PID 1934 processing bucket 14/20 at Fri Jan 02 18:04:41 -0600 2009...
I, [2009-01-02T18:04:42.853141 #1934] INFO -- : druby://pc:47170 is processing 13
PID 1934 processing bucket 13/20 at Fri Jan 02 18:04:42 -0600 2009...
I, [2009-01-02T18:04:44.356687 #1934] INFO -- : druby://pc:47170 is processing 12
PID 1934 processing bucket 12/20 at Fri Jan 02 18:04:44 -0600 2009...
I, [2009-01-02T18:04:45.860273 #1934] INFO -- : druby://pc:47170 is processing 11
PID 1934 processing bucket 11/20 at Fri Jan 02 18:04:45 -0600 2009...
I, [2009-01-02T18:04:47.363697 #1934] INFO -- : druby://pc:47170 is processing 10
PID 1934 processing bucket 10/20 at Fri Jan 02 18:04:47 -0600 2009...
I, [2009-01-02T18:04:48.869261 #1934] INFO -- : druby://pc:47170 is processing 9
PID 1934 processing bucket 9/20 at Fri Jan 02 18:04:48 -0600 2009...
I, [2009-01-02T18:04:50.372352 #1934] INFO -- : druby://pc:47170 is processing 8
PID 1934 processing bucket 8/20 at Fri Jan 02 18:04:50 -0600 2009...
I, [2009-01-02T18:04:51.873700 #1934] INFO -- : druby://pc:47170 is processing 7
PID 1934 processing bucket 7/20 at Fri Jan 02 18:04:51 -0600 2009...
I, [2009-01-02T18:04:53.376756 #1934] INFO -- : druby://pc:47170 is processing 6
PID 1934 processing bucket 6/20 at Fri Jan 02 18:04:53 -0600 2009...
I, [2009-01-02T18:04:54.880311 #1934] INFO -- : druby://pc:47170 is processing 5
PID 1934 processing bucket 5/20 at Fri Jan 02 18:04:54 -0600 2009...
I, [2009-01-02T18:04:56.384764 #1934] INFO -- : druby://pc:47170 is processing 4
PID 1934 processing bucket 4/20 at Fri Jan 02 18:04:56 -0600 2009...
I, [2009-01-02T18:04:57.888371 #1934] INFO -- : druby://pc:47170 is processing 3
PID 1934 processing bucket 3/20 at Fri Jan 02 18:04:57 -0600 2009...
I, [2009-01-02T18:04:59.392192 #1934] INFO -- : druby://pc:47170 is processing 2
PID 1934 processing bucket 2/20 at Fri Jan 02 18:04:59 -0600 2009...
I, [2009-01-02T18:05:00.897304 #1934] INFO -- : druby://pc:47170 is processing 1
PID 1934 processing bucket 1/20 at Fri Jan 02 18:05:00 -0600 2009...
I, [2009-01-02T18:05:02.399987 #1934] INFO -- : druby://pc:47170 is processing 0
PID 1934 processing bucket 0/20 at Fri Jan 02 18:05:02 -0600 2009...
I, [2009-01-02T18:05:03.905741 #1934] INFO -- : No more buckets in this iteration, sleeping for 29.474557 sec
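The last line is worth a quick check. The 20 buckets at 1.5 seconds each take about 30 seconds, and the worker then sleeps for roughly another 29.5 seconds. That is what you would expect if the worker simply waits for the rest of the 60-second iteration. The sketch below estimates the remaining time, assuming (my guess, not something I verified in the politics source) that the iteration began when console 2 logged its election; remaining_sleep is a helper invented for this calculation.

```ruby
require 'time'

# Seconds left in an iteration that started at iteration_start,
# measured at the moment given by now.
def remaining_sleep(iteration_start, now, iteration_length)
  (Time.parse(iteration_start) + iteration_length) - Time.parse(now)
end

estimate = remaining_sleep('2009-01-02T18:04:33.380112', # leader elected (console 2)
                           '2009-01-02T18:05:03.905741', # "No more buckets" (console 3)
                           60)
puts format('%.3f', estimate) # prints 29.474
```

The estimate agrees with the logged 29.474557 seconds to within a fraction of a millisecond.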
Done: we installed the prerequisites, got 'politics', and tried it out with an example.
In a future post, I will use 'politics' to distribute work across several memcached servers.