.. include:: header.txt

==============
 Introduction
==============

Threads, processes and the GIL
==============================

To run more than one piece of code at the same time on the same
computer one has the choice of either using multiple processes or
multiple threads.

Although a program can be made up of multiple processes, these
processes are in effect completely independent of one another:
different processes are not able to cooperate with one another unless
one sets up some means of communication between them (such as by using
sockets).  If a lot of data must be transferred between processes then
this can be inefficient.
     
On the other hand, multiple threads within a single process are
intimately connected: they share their data but often can interfere
badly with one another.  It is often argued that the only way to make
multithreaded programming "easy" is to avoid relying on any shared
state and for the threads to only communicate by passing messages to
each other.

CPython has a *Global Interpreter Lock* (GIL) which in many ways makes
threading easier than it is in most languages by making sure that only
one thread can manipulate the interpreter's objects at a time.  As a
result, it is often safe to let multiple threads access data without
using any additional locking as one would need to in a language such
as C.

One downside of the GIL is that on multi-processor (or multi-core)
systems a multithreaded Python program can only make use of one
processor at a time.  This is a problem that can be overcome by using
multiple processes instead.

Python gives little direct support for writing programs using multiple
process.  This package allows one to write multi-process programs
using much the same API that one uses for writing threaded programs.


Forking and spawning
====================

There are two ways of creating a new process in Python:

* The current process can *fork* a new child process by using the
  `os.fork()` function.  This effectively creates an identical copy
  of the current process which is now able to go off and perform some
  task set by the parent process.  This means that the child process
  inherits *copies* of all variables that the parent process had.

  However, `os.fork()` is not available on every platform: in
  particular Windows does not support it.

* Alternatively, the current process can spawn a completely new Python
  interpreter by using the `subprocess` module or one of the
  `os.spawn*()` functions.

  Getting this new interpreter in to a fit state to perform the task
  set for it by its parent process is, however, a bit of a challenge.

The `processing` package uses `os.fork()` if it is available since
it makes life a lot simpler.  Forking the process is also more
efficient in terms of memory usage and the time needed to create the
new process.


The Process class
=================

In the `processing` package processes are spawned by creating a
`Process` object and then calling its `start()` method.
`processing.Process` follows the API of `threading.Thread`.  A
trivial example of a multiprocess program is ::

   from processing import Process     

   def f(name):
       print 'hello', name

   if __name__ == '__main__':
       p = Process(target=f, args=['bob'])
       p.start()
       p.join()

Here the function `f` is run in a child process. 

For an explanation of why (on Windows) the `if __name__ == '__main__'`
part is necessary see `Programming guidelines
<programming-guidelines.html>`_.


Exchanging objects between processes
====================================

`processing` supports two types of communication channel between
processes:

**Queues**:
    The function `Queue()` returns a near clone of `Queue.Queue`
    -- see the Python standard documentation.  For example ::

        from processing import Process, Queue

        def f(q):
            q.put([42, None, 'hello'])

        if __name__ == '__main__':
            q = Queue()
            p = Process(target=f, args=[q])
            p.start()
            print q.get()    # prints "[42, None, 'hello']"
            p.join()

    Queues are thread and process safe.  

    Note that if you need to guarantee that the `put()` method
    will succeed without blocking -- which is sometimes very
    useful when ensuring that a deadlock cannot occur -- then you
    can use `BufferedQueue()` instead of `Queue()`.
    See `Queues <processing-ref.html#pipes-and-queues>`_.

**Pipes**:
    The `Pipe()` function returns a pair of connection objects
    connected by a duplex (two-way) pipe, for example ::

        from processing import Process, Pipe

        def f(conn):
        conn.send([42, None, 'hello'])

        if __name__ == '__main__':
            parent_conn, child_conn = Pipe()
            p = Process(target=f, args=[child_conn])
            p.start()
            print parent_conn.recv()   # prints "[42, None, 'hello']"
            p.join()

    The two connection objects returned by `Pipe()` represent the
    two ends of the pipe.  Each connection object has `send()` and
    `recv()` methods (among others).  Note that data in a pipe may
    become corrupted if two processes (or threads) try to
    read/write to the *same* end of the pipe at the same time.  Of
    course there is no risk of corruption from processes using
    different ends of the pipe at the same time.  See `Pipes
    <processing-ref.html#pipes-and-queues>`_.


Synchronization between processes
=================================

`processing` contains equivalents of all the synchronization
primitives from `threading`.  For instance one can use a lock to
ensure that only one process prints to standard output at a time::
       
       from processing import Process, Lock
       
       def f(l, i):
           l.acquire()
           print 'hello world', i
           l.release()
           
       if __name__ == '__main__':
           lock = Lock()
           
           for num in range(10):
               Process(target=f, args=[lock, num]).start()
               
Without using the lock output from the different processes is liable
to get all mixed up.


Sharing state between processes
===============================

As mentioned above, when doing threaded programming it is usually best
to avoid using shared state as far as possible.  This is particularly
true when using multiple processes.

However, if you really do need to use some shared data then you can
create by using a *manager*.  Managers can store their data in one of
two ways:

**Shared memory**: 
   Data can be stored in a shared memory map managed by an instance of
   `LocalManager`.  Only those data types supported by the `struct`
   and `array` modules are supported.  For example the following code ::
      
       from processing import Process, LocalManager

       def f(n, s, a):
           n.value = 42
           s.value = (0.75, 'hello')
           for i in range(len(a)):
               a[i] = -a[i]

       if __name__ == '__main__':
           manager = LocalManager()

           number = manager.SharedValue('i', 0)
           struct = manager.SharedStruct('d256p', (0.0, ''))
           array = manager.SharedArray('i', range(10))
           
           p = Process(target=f, args=[number, struct, array])
           p.start()
           p.join()
           
           print number
           print struct
           print array
           
   will print ::
        
       SharedValue('i', 42)
       SharedStruct('d256p', (0.75, 'hello'))
       SharedArray('i', [0, -1, -2, -3, -4, -5, -6, -7, -8, -9])

   Note that `SharedValue`, `SharedStruct` and `SharedArray` objects
   are thread and process safe.  The `'i'` and `'d256p'` arguments
   used when creating `num`, `struct` and `array` are format strings
   of the type used by the `struct` and `array` modules: `'i'`
   indicates a signed integer, `'d'` indicates a double precision
   float and `'256p'` indicates a string of length less than 256.

**Server process**:
   A manager object returned by `processing.Manager()`
   represents a server process which holds python objects and allows
   other processes to manipulate them using proxies.
                   
   In addition to `SharedValue`, `SharedStruct` and `SharedArray` a
   manager returned by `processing.Manager()` will support types
   `list`, `dict`, `Namespace`, `Lock`, `RLock`, `Semaphore`,
   `BoundedSemaphore`, `Condition`, `Event`, `Queue`.  For example::

       from processing import Process, Manager

       def f(d, l):
           d[1] = '1'
           d['2'] = 2
           d[0.25] = None
           l.reverse()

       if __name__ == '__main__':
           manager = Manager()

           d = manager.dict()
           l = manager.list(range(10))

           p = Process(target=f, args=[d, l])
           p.start()
           p.join()

           print d
           print l

   will print ::

       {0.25: None, 1: '1', '2': 2}
       [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

   Creating managers which support other types is not hard --- see
   `Customized managers <manager-objects.html#customized-managers>`_.
   
   Proxy objects are picklable and can be transferred between
   processes.

   Server process managers are more flexible than shared memory
   managers because they support more data types (and can be made to
   support arbitrary types).  Also, a single server process manager
   can be shared by different computers over a network.  They are,
   however, slower than using shared memory.


Using a pool of workers
=======================

The `Pool()` function returns an object representing a pool of worker
processes.  It has methods which allows tasks to be offloaded to the
worker processes in a few different ways.

For example::

    from processing import Pool

    def f(x):
        return x*x

    if __name__ == '__main__':
        pool = Pool(processes=4)              # start 4 worker processes
        result = pool.apply_async(f, [10])    # evaluate "f(10)" asynchronously
        print result.get(timeout=1)           # prints "100" unless your computer is *very* slow
        print pool.map(f, range(10))          # prints "[0, 1, 4,..., 81]"

See `Process pools <pool-objects.html>`_.


Speed
=====

Passing small objects between processes using `processing.Queue` is
not significantly slower than passing the same objects between threads
using `Queue.Queue`.  One can also use `processing.Pipe`.

The following benchmarks were performed on a single core Pentium 4,
2.5Ghz laptop running Windows XP and Ubuntu Linux 6.10 --- see
`test_speed.py <../test/test_speed.py>`_.


*Number of 256 byte string objects passed between processes/threads per sec*:

================================== ========== ==================
Connection type                    Windows    Linux
================================== ========== ==================
Queue.Queue                         49,000    17,000-50,000 [1]_
processing.Queue                    22,000    21,000
processing.SimpleQueue              33,000    43,000
processing.PosixQueue                  n/a    62,000
Queue managed by server              6,900     6,500
processing.Pipe                     52,000    57,000
================================== ========== ==================

.. [1] For some reason the performance of `Queue.Queue` is very
       variable on Linux.


*Number of acquires/releases of a lock per sec*:

============================== ========== ==========
Lock type                       Windows    Linux
============================== ========== ==========
threading.Lock                   850,000    560,000
processing.Lock                  430,000    510,000
Lock managed by server            10,000      8,400
threading.RLock                   93,000     76,000
processing.RLock                 430,000    500,000
RLock managed by server            8,800      7,400
============================== ========== ==========


*Number of interleaved waits/notifies per sec on a
condition variable by two processes*:

============================== ========== ==========
 Condition type                  Windows    Linux
============================== ========== ==========
threading.Condition             26,000     28,000
processing.Condition            22,000     18,000
Condition managed by server      6,600      6,000
============================== ========== ==========


*Number of integers retrieved from a sequence per sec*:

============================== ========== ==========
Sequence type                   Windows    Linux
============================== ========== ==========
list                           6,000,000  4,800,000
shared memory array              120,000    110,000
list managed by server            20,000     17,000
============================== ========== ==========

.. _Prev: index.html
.. _Up: index.html
.. _Next: processing-ref.html

