Introduction

Overview

Name some of your favorite applications.

Can you name a few non-distributed applications that you use on a regular basis?

According to Coulouris et al., "a distributed system is one in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages." Most of your favorite applications leverage a distributed system in some way. Peer-to-peer file sharing is a good example: peers pass messages to other peers in order to locate and download desired content.

Characteristics of distributed systems:

Concurrency - work happens on multiple computers simultaneously and must be coordinated.
Independent failures - each component of the system may fail independently.
No global clock - there is no global notion of the correct time.
Heterogeneity - components may be connected via different types of networks, run on different architectures, run different OSs, and/or be implemented in different languages. They may also be under different administrative domains.
High latency communication - components may be distributed around the world.

Generally, in a distributed system, components are fairly loosely coupled. In contrast, Prof. Pacheco's class will largely focus on parallel computing in environments where components are tightly coupled. Those components likely share a fast communication link, share a clock, are under a single administrative domain, and are homogeneous.

What makes distributed systems complex?

Heterogeneity - components may be connected via different types of networks, run on different architectures, run different OSs, and/or be implemented in different languages.
Openness - "The openness of distributed systems is determined primarily by the degree to which new resource-sharing services can be added and be made available for use by a variety of client programs." (Coulouris et al.) Gnutella is a good example of an open system. There are lots of Gnutella applications that adhere to the Gnutella interface and interoperate with other Gnutella applications.
Security - information should be confidential when necessary, protected against corruption, and available (resistant to DoS attacks).
Scalability - "A system is described as scalable if it will remain effective when there is a significant increase in the number of resources and the number of users." (Coulouris et al.) Why is the Internet scalable?
Failure Handling - detecting failures, masking failures, tolerating failures, recovering from failures.
Concurrency - services should handle multiple requests simultaneously.
Transparency - local and remote resources should be accessible in the same way.

Case Study: DNS

DNS stands for domain name system. DNS is a great example of a distributed system. It maps hostnames such as www.cs.usfca.edu to IP addresses such as 138.202.170.2. Messages sent across the Internet must contain (in the IP header) the IP address of the destination (and the source) of the message.

Clearly, you'd hate to have to remember that my web page was http://138.202.170.2/~srollins/ (since I am sure that all of you frequently surf my web page). Fortunately, you can use the more friendly URL http://www.cs.usfca.edu/~srollins/ and your browser will translate the www.cs.usfca.edu part into an IP address before sending the HTTP request to the CS web server. Your browser uses DNS for this purpose.

Your browser performs the following operations:

The hostname is extracted from the URL.
The browser sends a query to the DNS server. (How does it know where to find the DNS server?)
The server replies with the IP address.
The browser opens a TCP connection and sends the HTTP request.
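
The first three steps above can be sketched in Python. This is only a minimal sketch, not how any real browser is implemented. The function name lookup_host and its resolver parameter are assumptions made for illustration. By default the sketch calls the standard library's socket.gethostbyname, which hands the query to the operating system's resolver.

```python
import socket
from urllib.parse import urlparse


def lookup_host(url, resolver=socket.gethostbyname):
    """Extract the hostname from a URL and resolve it to an IPv4 address.

    `resolver` is a hypothetical hook so the lookup can be swapped out;
    by default the operating system's resolver is used.
    """
    # Step 1: pull the hostname out of the URL.
    hostname = urlparse(url).hostname
    if hostname is None:
        raise ValueError(f"no hostname in URL: {url!r}")
    # Steps 2-3: send the query and receive the IP address.
    address = resolver(hostname)
    return hostname, address
```

Calling lookup_host("http://www.cs.usfca.edu/~srollins/") returns the hostname together with whatever address DNS currently reports for it. Step 4, opening the TCP connection and sending the HTTP request, is left out of the sketch.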

How do you suppose DNS is implemented? One option would be to set up a single, high-speed computer and store all of the mappings in a database on that computer. Your browser could then query that computer for the appropriate IP address.

Why wouldn't this work?

Single point of failure
Does not scale
High latency
Difficult to maintain

Instead, DNS is implemented as a distributed, hierarchical database. There are three classes of DNS servers:

Root servers - There are 13 root name servers worldwide. These servers maintain the IP addresses of all of the top-level domain servers. The 12 organizations that run root servers include the DoD, VeriSign, and USC.
Top-level domain servers - The top-level domain (TLD) servers are each responsible for a particular domain (e.g., com, edu). Different organizations are responsible for each domain. Educause maintains the EDU TLD.
Authoritative servers - Each organization (e.g., USF or Amazon) maintains the DNS server that can provide authoritative mappings from hostname to IP.

How it works:

Your computer asks its local DNS server for a mapping.
Your local DNS server (how is this found?!) contacts a root DNS server to ask for the mapping.
The root DNS server responds with the IP address of the TLD DNS server for the relevant domain.
Your local DNS server contacts the TLD server and the TLD server responds with the address of the authoritative server for the domain in question.
Your local DNS server contacts the authoritative server and (finally!) gets the correct IP address.
Your local DNS server returns the address to your computer.
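
The walk from root to TLD to authoritative server can be simulated with a few dictionaries. The sketch below is a toy model of what the local DNS server does, not real DNS code. The server labels (tld-edu, auth-usfca) are invented for illustration, and only the www.cs.usfca.edu mapping comes from these notes. It also assumes the organization's domain is always the last two labels of the hostname, which is a simplification.

```python
# Toy data: each class of server is modeled as a dictionary.
# Server labels are made up; they stand in for server IP addresses.
ROOT = {"edu": "tld-edu", "com": "tld-com"}
TLD = {
    "tld-edu": {"usfca.edu": "auth-usfca"},
    "tld-com": {},
}
AUTHORITATIVE = {
    "auth-usfca": {"www.cs.usfca.edu": "138.202.170.2"},
}


def resolve(hostname):
    """Resolve a hostname step by step, as a local DNS server would."""
    labels = hostname.split(".")
    tld = labels[-1]                  # e.g. "edu"
    domain = ".".join(labels[-2:])    # e.g. "usfca.edu" (simplifying assumption)

    trace = []
    # Ask a root server which TLD server handles this domain.
    tld_server = ROOT[tld]
    trace.append(("root", tld_server))
    # Ask the TLD server which authoritative server to use.
    auth_server = TLD[tld_server][domain]
    trace.append(("tld", auth_server))
    # Ask the authoritative server for the actual mapping.
    address = AUTHORITATIVE[auth_server][hostname]
    trace.append(("authoritative", address))
    return address, trace
```

The returned trace records which class of server answered at each step, which makes it easy to see that the local server, not the client, does all of the contacting. An unknown name raises KeyError in this sketch.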

Other miscellaneous facts:

It is possible to configure DNS to use recursive queries. In this case, the root server would contact the TLD server directly, and the TLD server would contact the authoritative server. The result would then propagate back through the chain.
Each server may cache results and serve the cached results later instead of resubmitting the query.
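
Caching can be illustrated with a small wrapper around any lookup function. This is a sketch under stated assumptions: the class name CachingResolver, the upstream callable, and the fixed ttl_seconds value are all hypothetical. Real DNS records carry their own time-to-live, which this sketch replaces with a single constant for simplicity.

```python
import time


class CachingResolver:
    """Remember answers from an upstream lookup for a limited time."""

    def __init__(self, upstream, ttl_seconds=300, clock=time.monotonic):
        self.upstream = upstream   # any function mapping hostname -> address
        self.ttl = ttl_seconds     # assumed fixed lifetime for every entry
        self.clock = clock         # injectable so the cache can be tested
        self.cache = {}            # hostname -> (address, expiry time)

    def resolve(self, hostname):
        now = self.clock()
        entry = self.cache.get(hostname)
        if entry is not None and entry[1] > now:
            # Cache hit: answer without resubmitting the query.
            return entry[0]
        # Cache miss or expired entry: ask upstream and remember the answer.
        address = self.upstream(hostname)
        self.cache[hostname] = (address, now + self.ttl)
        return address
```

Expiring entries is a design choice that trades freshness against load: a longer lifetime means fewer upstream queries but a longer window in which a stale mapping may be served.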

How does DNS address the complexities of distributed computing?

Heterogeneity - DNS servers and clients can run on different hardware, use different operating systems, and/or be implemented in different languages. They are also maintained by lots of different organizations.
Openness - DNS is described in several RFCs.
Security - There are some vulnerabilities, for example cache poisoning, but also ways to ensure a reasonable level of security.
Scalability - The hierarchical nature of DNS makes it very scalable. Each organization manages its own authoritative information.
Failure Handling - The failure of one DNS server does not impact the ability of others to continue to do resolution.
Concurrency - Many requests can happen simultaneously and data is distributed over a large number of hosts.
Transparency - the user (and end host) need not know about root or TLD servers.

Case Study: CDNs

The goal of a content distribution/delivery network is to reduce the latency of delivering content to an end user by caching it throughout the Internet. When you visit a web page like MySpace you will notice that lots of images are displayed. To reduce the amount of time it takes for the end user to load those images, companies like MySpace distribute their content using CDNs provided by companies like Akamai. By caching content close to the user, the load on the origin server and the network latency are both reduced. In addition, crowded or faulty network paths can be avoided.

The process works as follows:

CDNs maintain lots of servers in data centers around the world.
A content provider (typically one that delivers lots of multimedia content) contracts with a CDN.
When the content provider creates a new piece of content, for example a new video, it sends that content to the CDN.
The CDN replicates that content on all of its servers.
Typically, the content provider serves basic content (e.g., HTML pages) from its origin servers. Links to other content (e.g., videos) list the CDN as the server.
When the client does a DNS lookup on the CDN hostname, the CDN DNS server gets the IP address of the client (in practice, often that of the client's local DNS server) and uses a proprietary algorithm to determine the IP address of the best server for that particular piece of content.
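
The server-selection step might look something like the sketch below. The actual algorithms used by CDNs are proprietary, so this is only an assumption about one plausible approach: choose the healthy server with the lowest estimated latency from the client's region. The region names, server names, and latency numbers are all invented for illustration.

```python
# Hypothetical latency estimates in milliseconds, keyed by client region.
# All names and numbers are invented for illustration.
LATENCY_MS = {
    "us-west": {"sfo-1": 8, "nyc-1": 70, "lon-1": 140},
    "europe": {"sfo-1": 150, "nyc-1": 80, "lon-1": 12},
}


def pick_server(client_region, healthy_servers):
    """Return the healthy server with the lowest estimated latency."""
    estimates = LATENCY_MS[client_region]
    # Only consider servers that are up and that we have an estimate for.
    candidates = [s for s in healthy_servers if s in estimates]
    if not candidates:
        raise LookupError("no healthy server available")
    return min(candidates, key=lambda server: estimates[server])
```

Because the choice is made only among healthy servers, a failed server is simply skipped and the next-closest replica is returned instead.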

How does a CDN address the complexities of distributed computing?

A proprietary CDN like Akamai is not necessarily heterogeneous or open. As a result, security is much easier to address.

Scalability - A couple of scalability concerns arise. First, there may be issues of scale associated with building the authoritative name server. Second, there are lots of interesting questions with respect to where and how data are replicated.
Failure Handling - Because content is replicated, several servers can take over for a failed server.
Concurrency - Many requests can happen simultaneously and data is distributed over a large number of hosts.
Transparency - the user (and end host) need not know that data is coming from a CDN and not the origin server.

Sami Rollins

Wednesday, 07-Jan-2009 15:13:20 PST