We describe the runtime system for the EARTH Architecture (Efficient
Architecture for Running THreads) operating on a multi-processor/multi-node,
distributed-shared memory cluster. Some distinguishing characteristics
of the EARTH model include: (1) the activities on each node are performed
by two independent units: an Execution Unit (EU) and a Synchronization
Unit (SU); (2) there are two levels of threading: threaded functions, which
are themselves decomposed into fibers; (3) the EU performs the "useful work", i.e.,
executes the fibers, while the SU is in charge of synchronization, inter-node
communication, scheduling, and load balancing; (4) threads, fibers and
synchronization are explicitly expressed using the Threaded-C language,
a variant of the C language. EARTH was initially conceived for implementation
on a machine that allowed the EARTH runtime system to manage the inter-node
network directly, including network-generated interrupts. Thus,
in the earlier implementations of EARTH, the runtime system decided when
and how often to poll the network to check for incoming data or synchronization
signals.
We designed and implemented an EARTH runtime system for a cluster of
processors. For portability, we built this runtime system on top of
Pthreads under Linux. This implementation enables the overlapping of
communication and computation on a cluster of Symmetric
Multi-Processors (SMP), and lets the interrupts generated by the
arrival of new data drive the system, rather than relying on network polling.
An interesting research question for the implementation of a multi-threading
system on a multi-processor/multi-node system, which we try to answer,
is how to arrange the execution and synchronization activities to make
the best use of the resources available, and how to design the interaction
between the local processing and the network activities.
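The division of labor described above can be illustrated with a minimal Pthreads sketch. It is not the EARTH runtime itself: all names (`node_t`, `su_main`, `eu_main`, `run_demo`, `add_fiber`) are hypothetical, a pipe stands in for the inter-node network, and a fiber is reduced to a function pointer with an integer payload. The point it tries to show is that the SU thread sleeps in a blocking `read()` until data arrives, instead of polling, and then hands ready fibers to the EU thread through a bounded queue.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

#define QUEUE_CAP 8               /* small on purpose, so back-pressure is exercised */

struct node;
typedef void (*fiber_fn)(struct node *n, int payload);
typedef struct { fiber_fn fn; int payload; } fiber_t;   /* hypothetical fiber record */

typedef struct node {
    fiber_t ready[QUEUE_CAP];     /* ready queue: filled by the SU, drained by the EU */
    int head, count, closed;
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
    int net_fd;                   /* read end of a pipe standing in for the network */
    long sum;                     /* touched only by the EU thread */
} node_t;

/* The "useful work" of one fiber: accumulate its payload. */
static void add_fiber(node_t *n, int payload) { n->sum += payload; }

/* SU: blocks until data arrives, then makes a fiber ready. No polling loop. */
static void *su_main(void *p) {
    node_t *n = p;
    int payload;
    while (read(n->net_fd, &payload, sizeof payload) == (ssize_t)sizeof payload) {
        pthread_mutex_lock(&n->lock);
        while (n->count == QUEUE_CAP)
            pthread_cond_wait(&n->not_full, &n->lock);
        n->ready[(n->head + n->count) % QUEUE_CAP] = (fiber_t){ add_fiber, payload };
        n->count++;
        assert(n->count <= QUEUE_CAP);
        pthread_cond_signal(&n->not_empty);
        pthread_mutex_unlock(&n->lock);
    }
    pthread_mutex_lock(&n->lock); /* end of input: tell the EU to finish */
    n->closed = 1;
    pthread_cond_signal(&n->not_empty);
    pthread_mutex_unlock(&n->lock);
    return NULL;
}

/* EU: runs ready fibers; sleeps on a condition variable when none are ready. */
static void *eu_main(void *p) {
    node_t *n = p;
    for (;;) {
        pthread_mutex_lock(&n->lock);
        while (n->count == 0 && !n->closed)
            pthread_cond_wait(&n->not_empty, &n->lock);
        if (n->count == 0) { pthread_mutex_unlock(&n->lock); return NULL; }
        fiber_t f = n->ready[n->head];
        n->head = (n->head + 1) % QUEUE_CAP;
        n->count--;
        pthread_cond_signal(&n->not_full);
        pthread_mutex_unlock(&n->lock);
        f.fn(n, f.payload);       /* execute the fiber outside the lock */
    }
}

/* Sends payloads 1..n_events over the pipe; returns their sum, or -1 on error. */
long run_demo(int n_events) {
    int fds[2];
    node_t n = { .head = 0 };
    pthread_t su, eu;
    if (pipe(fds) != 0) return -1;
    n.net_fd = fds[0];
    pthread_mutex_init(&n.lock, NULL);
    pthread_cond_init(&n.not_empty, NULL);
    pthread_cond_init(&n.not_full, NULL);
    if (pthread_create(&su, NULL, su_main, &n) != 0) {
        close(fds[0]); close(fds[1]);
        return -1;
    }
    if (pthread_create(&eu, NULL, eu_main, &n) != 0) {
        close(fds[1]);            /* SU sees end of input and exits */
        pthread_join(su, NULL);
        close(fds[0]);
        return -1;
    }
    for (int i = 1; i <= n_events; i++)
        if (write(fds[1], &i, sizeof i) != (ssize_t)sizeof i) break;
    close(fds[1]);                /* end of "network" traffic */
    pthread_join(su, NULL);
    pthread_join(eu, NULL);
    close(fds[0]);
    pthread_cond_destroy(&n.not_empty);
    pthread_cond_destroy(&n.not_full);
    pthread_mutex_destroy(&n.lock);
    return n.sum;
}
```

Calling `run_demo(10)` from a `main` function, in a program linked with `-pthread`, sends ten payloads through the pipe and returns their sum. Because the fiber runs outside the queue lock, the EU can compute while the SU is already blocked waiting for the next arrival, which is one simple way to overlap communication and computation; a real runtime would also have to handle synchronization counts, scheduling, and load balancing, which this sketch omits.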