Communication and Data Representation


Interprocess Communication

Distributed processes communicate by passing messages. Sockets are perhaps the most fundamental way to achieve this. A process can open a TCP or UDP socket and send a series of bytes to a process on another machine. This is easy to do in a language like Java. However, this approach provides little or no location transparency: the programmer must name the remote host and port explicitly, so it is always apparent that remote resources are being used.
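
To make the socket approach concrete, here is a minimal sketch (in Python rather than Java, purely for brevity). It assumes both processes run on the same machine, and the helper names run_echo_server and send_and_receive are made up for this illustration. Note how the client has to spell out the host and port, which is exactly the lack of location transparency described above.

```python
import socket
import threading

def run_echo_server(server_sock):
    """Accept one connection and echo back whatever bytes arrive."""
    conn, _addr = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:          # empty read: the client closed its side
                break
            conn.sendall(data)

def send_and_receive(host, port, payload):
    """Open a TCP socket, send raw bytes, and collect the reply."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(payload)
        sock.shutdown(socket.SHUT_WR)   # signal that we are done sending
        chunks = []
        while True:
            chunk = sock.recv(1024)
            if not chunk:               # the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)

# Bind to port 0 so the OS picks a free port; both ends run on localhost here.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
host, port = server.getsockname()
threading.Thread(target=run_echo_server, args=(server,), daemon=True).start()

reply = send_and_receive(host, port, b"hello")
server.close()
print(reply)
```

The application only ever sees bytes; deciding what those bytes mean is left entirely to the two processes.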

Remote Procedure Call (RPC)

The RPC model provides a layer of abstraction between the application and the invocation of a remote procedure. The key observation is that the purpose of a message is typically to invoke a procedure on a remote machine. RPC allows a local process to invoke a remote procedure as if it were local by routing communication through a middleware layer.

RPC works as follows:

  1. Procedure calls made by the client are routed through a stub.
  2. The stub marshals any parameters and sends the request to the server. Marshaling involves converting parameters into a format that can be sent across the network. There are two main challenges with respect to marshaling. First, different platforms represent data in different ways, so both sides must agree on a common wire format. For example, some architectures use big-endian ordering (the most significant byte comes first) and others use little-endian ordering (the most significant byte comes last). Second, parameters may be data structures, not simply primitive types, and these must be flattened into a sequence of bytes.
  3. The stub at the server side receives the request, unmarshals the parameters, and passes the request to the server process.
  4. When the server process finishes executing the procedure, it passes the return value to the server side stub, which marshals the result and sends it to the client stub.
  5. The client stub unmarshals the result and passes it to the client side process.
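
The five steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the "network" is a direct function call (the transport parameter), the only parameter type is a 4-byte integer, and all names (marshal_ints, server_stub, client_stub_add) are invented for the example. Marshaling uses network byte order (big-endian) so that both sides agree on the representation regardless of their native architecture.

```python
import struct

# Marshaling: fix a wire format so both sides agree regardless of platform.
# "!" selects network (big-endian) byte order; "i" is a 4-byte signed integer.
def marshal_ints(*values):
    return struct.pack("!%di" % len(values), *values)

def unmarshal_ints(data):
    count = len(data) // 4
    return struct.unpack("!%di" % count, data)

# Server side
def add(a, b):                            # the actual remote procedure
    return a + b

def server_stub(request):
    a, b = unmarshal_ints(request)        # step 3: unmarshal the parameters
    result = add(a, b)                    #         and call the procedure
    return marshal_ints(result)           # step 4: marshal the return value

# Client side
def client_stub_add(a, b, transport=server_stub):
    request = marshal_ints(a, b)          # step 2: marshal the parameters
    response = transport(request)         #         "send" the request bytes
    (result,) = unmarshal_ints(response)  # step 5: unmarshal the result
    return result

total = client_stub_add(2, 40)            # step 1: looks like a local call

# The same integer laid out in the two byte orders mentioned in step 2.
big = struct.pack(">i", 1)                # b"\x00\x00\x00\x01"
little = struct.pack("<i", 1)             # b"\x01\x00\x00\x00"
print(total, big, little)
```

In a real RPC system the stubs are usually generated from an interface description, and the transport is a socket rather than a function call.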

Distributed objects are essentially the object-oriented version of RPC. Java Remote Method Invocation (RMI) is a well-known implementation of distributed objects. An RMI server registers objects with a registry. A client can then retrieve references to those objects via the registry and invoke methods just as if the objects were local. Parameters passed by the client can either be remote, in which case they are passed by reference, or serializable, in which case they are copied and passed to the server. Serialization is essentially the marshaling of objects.
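
The practical consequence of passing a serializable parameter by copy is that changes made on the server are not visible in the client's original object. The sketch below illustrates this in Python, with JSON standing in for Java's serialization format; the Account class and the remote_deposit function are hypothetical, and the "remote" call is simulated locally.

```python
import json

class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def serialize(self):
        # Flatten the object's fields into bytes that can cross the network.
        fields = {"owner": self.owner, "balance": self.balance}
        return json.dumps(fields).encode("utf-8")

    @classmethod
    def deserialize(cls, data):
        fields = json.loads(data.decode("utf-8"))
        return cls(fields["owner"], fields["balance"])

def remote_deposit(serialized_account, amount):
    """Stand-in for a server-side method that receives a serialized argument."""
    account = Account.deserialize(serialized_account)  # server builds its own copy
    account.balance += amount
    return account.serialize()                         # result is serialized back

client_account = Account("alice", 100)
returned = Account.deserialize(remote_deposit(client_account.serialize(), 50))

print(client_account.balance)  # the client's original object is unchanged
print(returned.balance)        # the updated copy that came back from the "server"
```

Had the parameter been a remote object instead, the server would hold a reference and its update would be applied to the client's object.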

Common Object Request Broker Architecture (CORBA) provides yet another alternative. The Object Request Broker (ORB) brokers communication between the client and the server, providing the equivalent of the RMI registry. Unlike RMI, CORBA is not tied to a particular language.

The Web/Network Services paradigm provides RPC-style functionality in the context of the web. A service provider, such as Google or Amazon, provides the ability for a user to programmatically access services such as search. One approach is the Representational State Transfer (REST) model, which is very simple. It restricts the "methods" your service can provide to the methods supported by HTTP. Your parameters are encoded in the URI used to access the service. There is no concept of a registry; instead, service providers typically provide documentation via a web page. You'll notice that this does not provide the transparency that other RPC mechanisms provide.
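
As a rough illustration of how a REST-style request carries its parameters, the sketch below builds a request from an HTTP method and a URI. The endpoint http://api.example.com/search and the parameter names q and limit are hypothetical, and no request is actually sent; the last line shows what a service could recover from the URI on its side.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URI = "http://api.example.com/search"    # hypothetical endpoint

def build_request(method, params):
    """Return the (method, URI) pair a client would send."""
    # Only a subset of the methods defined by HTTP is listed here.
    if method not in ("GET", "POST", "PUT", "DELETE"):
        raise ValueError("unsupported HTTP method: " + method)
    return method, BASE_URI + "?" + urlencode(params)

method, uri = build_request("GET", {"q": "distributed systems", "limit": 10})
print(method, uri)

# What the service would recover from the URI (all values arrive as strings).
recovered = parse_qs(urlsplit(uri).query)
print(recovered)
```

The client has to know the URI layout from the provider's documentation, which is why this style offers less transparency than a stub-based RPC call.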

The SOAP approach to web services is a bit different. The data exchanged between client and server is encoded in XML, which is convenient and human readable but verbose and therefore inefficient. Services can be described using the Web Services Description Language (WSDL) and registered in a Universal Description, Discovery, and Integration (UDDI) registry.
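
To give a feel for the trade-off, the following sketch wraps a simple call in a SOAP-style XML envelope and compares its size with a compact binary encoding of the same two integers. It is a simplification, not a complete SOAP client: the operation name Add and the parameters a and b are invented, there is no WSDL, and nothing is sent over the network.

```python
import struct
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope namespace

def build_envelope(operation, params):
    """Wrap an operation name and its parameters in an Envelope/Body pair."""
    envelope = ET.Element("{%s}Envelope" % SOAP_NS)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_NS)
    call = ET.SubElement(body, operation)
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope)          # bytes ready to be sent

def read_params(xml_bytes):
    """Recover the operation name and parameters on the receiving side."""
    envelope = ET.fromstring(xml_bytes)
    body = envelope.find("{%s}Body" % SOAP_NS)
    call = body[0]                        # first child of Body is the call
    return call.tag, {child.tag: child.text for child in call}

xml_message = build_envelope("Add", {"a": 2, "b": 40})
operation, params = read_params(xml_message)

# The same two integers as 4-byte big-endian values, for a size comparison.
binary_message = struct.pack("!2i", 2, 40)
print(len(xml_message), len(binary_message))
```

The XML message is readable without any tools, but it is many times larger than the 8-byte binary form, and every value travels as text that the receiver has to parse.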


Sami Rollins
Wednesday, 07-Jan-2009 15:13:38 PST