Web Services


Overview

There is some debate about the definition of a web service. The W3C provides this traditional definition:

A Web service is a software system designed to support interoperable machine-to-machine interaction over a network. It has an interface described in a machine-processable format (specifically WSDL). Other systems interact with the Web service in a manner prescribed by its description using SOAP messages, typically conveyed using HTTP with an XML serialization in conjunction with other Web-related standards.

Richardson and Ruby use the nickname "Big Web Services" for services built on SOAP, WSDL, and the WS-* stack. They advocate a more flexible definition of web services and draw a useful distinction between the human web (standard HTTP+HTML) and the programmable web. In the human web, browsers issue HTTP requests to servers that return HTML, which is easily rendered for human consumption. For example, to find interesting Flickr photos for January 5, 2008, you can point your browser at http://flickr.com/explore/interesting/2008/01/05 and get back an HTML page that your browser renders nicely. However, consider what you get back when you retrieve the same page with a program (for example, curl): the raw HTML markup. One can write programs to screen scrape (retrieve HTML and extract the interesting information), but this is burdensome when a program, not a human, is the intended recipient of the information. The programmable web makes it easy for clients to use services without human intervention.
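To make the screen-scraping burden concrete, here is a minimal sketch. The HTML fragment is hypothetical (it is not Flickr's real markup) and stands in for a page fetched with curl; the function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical HTML fragment, standing in for what curl would fetch from
# a page designed for browsers.
sample_html() {
  cat <<'EOF'
<html><body>
<div class="photo"><a href="/photos/alice/1"><img src="a.jpg" alt="Sunset"></a></div>
<div class="photo"><a href="/photos/bob/2"><img src="b.jpg" alt="Harbor"></a></div>
</body></html>
EOF
}

# Screen scraping: pull the photo links out of markup meant for a browser.
# This depends on the exact way the page writes its href attributes.
scrape_photo_links() {
  grep -o 'href="[^"]*"' | sed 's/^href="//; s/"$//'
}

sample_html | scrape_photo_links
```

The scraper breaks as soon as the page layout changes, which is the kind of fragility a programmable interface is meant to avoid.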

What does this have to do with distributed systems?

Fundamentally, a web service and the clients that utilize it form a distributed system. The web service model provides a flexible and very straightforward mechanism for building distributed systems that communicate by passing messages over the Internet (typically using HTTP). As we previously discussed, this model provides a simpler alternative to something like CORBA or RMI.


SOAP and RPC-Style Architectures

[Figure: SOAP web service architecture]

The "traditional" web service architecture (shown above) consists of a server that communicates with clients by exchanging XML messages over HTTP, SMTP, or some other transport protocol. The Simple Object Access Protocol (SOAP) specifies the rules for using XML to package messages. The Web Services Description Language (WSDL) is used to describe the interface to the service, and a WSDL description of a service may be published in a Universal Description, Discovery, and Integration (UDDI) registry.

SOAP: SOAP is a protocol for exchanging XML messages, typically via HTTP. It evolved out of XML-RPC, a protocol for performing remote procedure calls that uses XML as its message-encoding language. A SOAP message consists of an envelope that wraps the message and, inside it, a body that contains the details of the service being invoked. On a request, the body contains the method being invoked and any parameters. On a reply, the body contains the result of the method's execution.
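The envelope/body structure can be sketched as follows. This is only an illustration: the method name getInterestingPhotos, the namespace urn:example, and the reply contents are hypothetical and do not come from any real service.

```shell
#!/bin/sh
# Wrap a body fragment (read from stdin) in a SOAP 1.2 envelope.
soap_envelope() {
  printf '<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope">\n'
  printf '  <s:Body>\n'
  sed 's/^/    /'
  printf '  </s:Body>\n'
  printf '</s:Envelope>\n'
}

# Request: the body names the method and carries its parameters.
# (getInterestingPhotos and urn:example are hypothetical names.)
printf '%s\n' \
  '<x:getInterestingPhotos xmlns:x="urn:example">' \
  '  <date>2008-01-05</date>' \
  '</x:getInterestingPhotos>' | soap_envelope

# Reply: the body carries the result of the method's execution.
printf '%s\n' \
  '<x:getInterestingPhotosResponse xmlns:x="urn:example">' \
  '  <photo id="1" title="Sunset"/>' \
  '</x:getInterestingPhotosResponse>' | soap_envelope
```

The same wrapper serves both directions; only the contents of the body differ between a request and a reply.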

Flickr provides both a SOAP interface and an XML-RPC interface to their photo application. The following script retrieves interesting photos for January 5, 2008 using the SOAP interface:

#!/bin/sh
# POST a SOAP envelope to the Flickr SOAP endpoint; the Body names the
# method to invoke and carries its parameters.
curl --stderr /dev/null \
-d "<s:Envelope
   xmlns:s=\"http://www.w3.org/2003/05/soap-envelope\"
   xmlns:xsi=\"http://www.w3.org/1999/XMLSchema-instance\"
   xmlns:xsd=\"http://www.w3.org/1999/XMLSchema\">
   <s:Body>
      <x:FlickrRequest xmlns:x=\"urn:flickr\">
         <method>flickr.interestingness.getList</method>
         <date>2008-01-05</date>
         <api_key>c230713ad24611e09f3d91a64280d194</api_key>
      </x:FlickrRequest>
   </s:Body>
</s:Envelope>" http://api.flickr.com/services/soap/

The following script retrieves the same data using the XML-RPC interface:

#!/bin/sh
# POST an XML-RPC methodCall to the Flickr XML-RPC endpoint. Flickr methods
# take a single struct parameter whose members are the named arguments.
# XML-RPC requests are sent with a Content-Type of text/xml.
curl --stderr /dev/null \
-H "Content-Type: text/xml" \
-d "<methodCall>
<methodName>flickr.interestingness.getList</methodName>
<params>
  <param>
    <value>
      <struct>
        <member>
          <name>api_key</name>
          <value>c230713ad24611e09f3d91a64280d194</value>
        </member>
        <member>
          <name>date</name>
          <value>2008-01-05</value>
        </member>
      </struct>
    </value>
  </param>
</params>
</methodCall>" http://api.flickr.com/services/xmlrpc/

You can also request other response formats, for example JSON.

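WSDL: WSDL is an XML-based language for describing a programmatic interface to a Web Service. A WSDL document typically specifies the operations the service offers, the messages those operations exchange, the protocol bindings, and the address of the service endpoint.

Flickr does not provide a WSDL. For an example of WSDL, see the Amazon WSDL.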

Many web services are SOAP based; however, many developers argue that SOAP does not fulfill the promise of being "simple". Further, developers argue that the RPC-style SOAP model does not take advantage of the benefits provided by the web. By many accounts, SOAP hasn't really caught on and has been overshadowed by the REST approach, which makes every effort to exploit the fundamental properties of the web. Evidence that REST has overshadowed SOAP includes Google's deprecation of its SOAP API and reports that up to 85% of developers choose the Amazon REST API over the SOAP API.


Web Services Using REST

In his 2000 dissertation, Roy Fielding coined the term Representational State Transfer (REST). REST is not really a technology or a protocol; it is a model for designing and building web services. The fundamental idea is that everything on the web is a resource. A web service allows you to retrieve an XML representation of a resource or create/overwrite a representation of a resource. Let's start with an example. The following URL shows how to use Flickr's "REST" interface:

http://www.flickr.com/services/rest/?method=flickr.interestingness.getList&api_key=c230713ad24611e09f3d91a64280d194&date=2008-01-05

You'll notice that the method name and parameter list, which the SOAP and XML-RPC interfaces carry inside the body of the message, are here encoded in the URI. I enclosed REST in quotation marks above because this interface is what Richardson and Ruby refer to as a REST-RPC hybrid. You might also see this referred to as HTTP+POX (plain old XML). It has some RESTful features, namely that everything is encoded in the URI. However, it also has a non-RESTful feature: it clearly invokes a remote procedure (notice the method parameter in the URI).

In a RESTful service, the list of interesting photos would be a resource, identified by a unique URI. The client would retrieve a representation of that resource using an HTTP GET. The URI might look as follows (which looks very similar to the human-readable version of the URI):

http://www.flickr.com/photos/interesting/2008/01/05
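The difference between the two URI styles can be sketched with a pair of helper functions. The function names are made up for illustration, and the RESTful URI is the hypothetical one shown above, not an interface Flickr actually offers.

```shell
#!/bin/sh
# REST-RPC hybrid: one endpoint; the method name and its parameters
# travel as query variables.
hybrid_uri() {
  printf 'http://www.flickr.com/services/rest/?method=%s&api_key=%s&date=%s\n' \
    "$1" "$2" "$3"
}

# RESTful: the resource itself is named by the path.
# $1 is a date in YYYY-MM-DD form.
restful_uri() {
  printf 'http://www.flickr.com/photos/interesting/%s\n' \
    "$(printf '%s' "$1" | tr '-' '/')"
}

hybrid_uri flickr.interestingness.getList c230713ad24611e09f3d91a64280d194 2008-01-05
restful_uri 2008-01-05
```

In the hybrid form the client names a procedure to run; in the RESTful form it names the thing it wants and lets HTTP GET supply the verb.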

Resources

Resources are the fundamental building block for RESTful web services. The first step in designing a RESTful web service is identifying the resources you wish to expose. Flickr resources might include all of samirollins' photos, all interesting photos posted on January 5, 2008, or all photos tagged with San Francisco. In the auto parts application discussed here: http://www.xfront.com/REST-Web-Services.html, resources include the list of available parts and each individual part.

URIs

Each resource is addressable with a unique URI. The URI should contain all the information necessary to identify the particular resource. Dates, part numbers, and locations might all be part of the URI. An interesting question is whether one should use query variables, such as ?date=2008-01-05, or simply embed the scoping information in the path of the URI itself, such as /2008/01/05. Richardson and Ruby address this point and suggest that query variables are acceptable when providing input to an algorithmic resource, such as a search engine. So, if you wanted to search for REST on Google, it would make more sense to have the URI google.com/search?query=REST rather than google.com/search/REST.
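Here is a rough sketch of the two styles. The function names are hypothetical, the photo URI follows the hypothetical RESTful layout used earlier, and the search URI is the illustrative one from the paragraph above rather than a documented interface. The percent-encoder is an assumption-laden helper that handles ASCII input only.

```shell
#!/bin/sh
# Percent-encode a string for use as a query value (ASCII input only).
urlencode() {
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character
    case $c in
      [A-Za-z0-9._~-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
    s=$rest
  done
  printf '%s\n' "$out"
}

# Hierarchical scoping information goes in the path.
photos_by_date_uri() {
  printf 'http://www.flickr.com/photos/interesting/%s/%s/%s\n' "$1" "$2" "$3"
}

# Input to an algorithmic resource goes in a query variable.
search_uri() {
  printf 'http://google.com/search?query=%s\n' "$(urlencode "$1")"
}

photos_by_date_uri 2008 01 05
search_uri 'San Francisco'
```

Query values are free-form text, so they need encoding; path segments such as a date are chosen by the service designer and can be kept URL-safe by construction.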

Uniform Interface

In a purely RESTful web service, every entity is a resource and a client may only retrieve or change/overwrite the representation of the resource. This leads to the principle of the uniform interface. The only "methods" that can be performed in a purely RESTful system are those defined by HTTP, principally GET, POST, PUT, DELETE, and HEAD. This model is distinctly different from the SOAP/RPC-style model. In the RPC model, you ask a remote service to perform an action (give me all photos tagged with San Francisco). "Give me all photos" is the procedure and "tagged with San Francisco" is the parameter to the procedure. In the REST model, you get a representation of a resource (GET the representation of the resource "photos tagged with San Francisco"). The difference can be subtle. Also note that in the REST example, you still need to send scoping information (the equivalent of parameters in the RPC model). The scoping information is embedded in the URI.
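The contrast can be sketched by printing the requests each style would send. The paths, the /services/rpc endpoint, and the getPhotos method name are all hypothetical, and the output is simplified (no Host or other headers).

```shell
#!/bin/sh
# REST: the HTTP method is the verb and the URI names the resource.
rest_request() {
  printf '%s %s HTTP/1.1\n' "$1" "$2"
}

# RPC style: always POST to a single endpoint; the procedure name and
# its parameter travel in the message body.
rpc_request() {
  printf 'POST /services/rpc HTTP/1.1\n\n'
  printf 'method=%s&tag=%s\n' "$1" "$2"
}

rest_request GET    /photos/tags/San%20Francisco
rest_request DELETE /photos/tags/San%20Francisco
rpc_request getPhotos San%20Francisco
```

With the uniform interface, a client that knows a URI already knows every operation it can attempt on it; with RPC, each service defines its own vocabulary of procedures.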

Designing the Representation

To retrieve a resource, a client may issue an HTTP GET using the URI of the resource. The developer must decide the format of the representation returned. One option is to return an XML document to represent the resource. Some applications return other formats, including JSON.
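As a small illustration, the same resource can be rendered in more than one representation. The photo fields and the function name are hypothetical, and the sketch does no escaping of special characters in the values.

```shell
#!/bin/sh
# Render one hypothetical photo resource in the requested representation.
# Usage: represent_photo ID TITLE FORMAT
represent_photo() {
  id=$1 title=$2 format=$3
  case $format in
    xml)  printf '<photo id="%s" title="%s"/>\n' "$id" "$title" ;;
    json) printf '{"id": "%s", "title": "%s"}\n' "$id" "$title" ;;
    *)    printf 'unsupported format: %s\n' "$format" >&2
          return 1 ;;
  esac
}

represent_photo 1 Sunset xml
represent_photo 1 Sunset json
```

The resource and its URI stay the same; only the format of the document sent back to the client changes.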

Read/Write Services

Read-only services are fairly straightforward to design. Read/write services can be a bit more complicated, and do not always seem to fit well within the REST model. The ideal vision is that everything is a resource, so a write is simply a POST (or PUT) to the URI for the resource. For example, suppose I wanted to post a new photo named Paris in my Flickr account. One RESTful way to do that would be to POST (or, since the client is choosing the URI of the new resource, PUT) to a URI such as flickr.com/samirollins/photos/Paris. Notice that I added the name that I wanted to give the photo to the URI. Not all applications may be so cleanly designed.
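One common convention for choosing between the two methods, sketched below under assumptions: PUT when the client picks the URI of the new resource, POST to the containing collection when the server picks it. The function name is made up, and the sketch only prints the method and URI rather than sending anything.

```shell
#!/bin/sh
# Decide how to write a new resource.
# Usage: write_request COLLECTION_URI [NAME]
write_request() {
  collection=$1 name=$2
  if [ -n "$name" ]; then
    # The client chose the name, so it knows the final URI: PUT to it.
    printf 'PUT %s/%s\n' "$collection" "$name"
  else
    # The server will assign the URI: POST to the collection.
    printf 'POST %s\n' "$collection"
  fi
}

write_request http://flickr.com/samirollins/photos Paris
write_request http://flickr.com/samirollins/photos
```

With curl, an upload along these lines could be attempted as curl -X PUT --data-binary @paris.jpg followed by the URI, where paris.jpg is a hypothetical local file.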

Authentication

Authentication can be a bit tricky. Most RESTful or REST-RPC hybrid web services provide some kind of authentication API that enables you to retrieve a token, which is then sent with each request (for example, as part of the URI or in the HTTP Authorization header).
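The two ways of sending a token can be sketched as follows. Everything specific here is an assumption: the auth_token parameter name, the Bearer scheme, the example.com URI, and the token value are placeholders, since each service defines its own.

```shell
#!/bin/sh
# Option 1: append the token to the URI as a query variable.
# (auth_token is a hypothetical parameter name.)
token_in_uri() {
  case $1 in
    *\?*) sep='&' ;;   # URI already has a query string
    *)    sep='?' ;;
  esac
  printf '%s%sauth_token=%s\n' "$1" "$sep" "$2"
}

# Option 2: send the token in the HTTP Authorization header.
# (The Bearer scheme is an assumption; services define their own.)
token_in_header() {
  printf 'Authorization: Bearer %s\n' "$1"
}

TOKEN=example-token   # hypothetical; a real token comes from the service's auth API
token_in_uri 'http://example.com/photos?date=2008-01-05' "$TOKEN"
token_in_header "$TOKEN"
```

With curl, the header form could be passed as curl -H "$(token_in_header "$TOKEN")" followed by the URI. Keeping the token out of the URI avoids leaking it into logs and bookmarks.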


Sami Rollins
Wednesday, 07-Jan-2009 15:13:20 PST