Introduction to Globus
What is Globus?
Development Philosophy
The Globus Approach
Goals for the Globus Toolkit
The Globus Hourglass
Where We Are
Example Application Projects
Learning More

Introduction to Globus

The Globus Project website notes that the "development of the World Wide Web has changed the way that we think about information. We do not think twice about accesses to Web pages that are spread across the world. The goal of the Globus project is to bring about a similar revolution with respect to computation. We can hardly imagine the types of applications we might construct if we had instantaneous access to a supercomputer from our desktop! The Globus project is developing the technology that can make this vision a reality."

What is Globus?

The Globus project is developing basic software infrastructure for computations that integrate geographically distributed computational and information resources. Globus concepts are being tested on a global scale by participants in the Globus Ubiquitous Supercomputing Testbed Organization (GUSTO). GUSTO currently spans over seventy institutions and includes some of the largest computers in the world.

Led by Ian Foster and Carl Kesselman, Globus is the work of a project team at several sites.


Globus is a best-practices application of the lessons learned from running I-WAY at SC95. "Best practices" refers both to the optimum way(s) to perform a process and to the goals set by organizations striving for excellence.

Out of SC95 came:

  • the research agenda for Grid computation (defining the actual problems to be solved in order to get applications running)
  • the design of the software that would begin to address the research agenda (specifically how those problems would be solved and implementing solutions into actual code)

Development Philosophy

The Globus team identified four main elements to its implementation philosophy:

  1. The basic research in Grid technologies and applications will identify and address the hard research problems that need to be solved.
  2. The tools to solve those problems must be provided by developing the Globus toolkit. A related element is to codify the things learned from both research and common practice into the toolkit for others to use.
  3. The toolkit must be tested on as many systems and as many different types of systems as possible. This could be accomplished by getting the tools into the hands of the researchers and applications developers, which is one of the primary purposes of GUSTO.
  4. The tools need to be used in real applications to fully test them and their functionality. Because using the tools for actual applications will generate a whole new set of basic research problems to be solved, the process is iterative with the toolkit getting better and better as time goes by.

Globus started with I-WAY, and it's still going from there.

The Globus Approach

First and foremost, Globus offers itself as a "tools for your toolbox" model, rather than as a complete solution. The idea behind Globus is that the researcher or applications developer can pick and choose those elements of Globus that are useful and ignore the rest.

For example, the SF-Express application from DARPA only required the ability to remotely start existing code. Nothing more. Globus provided that functionality without requiring a rewrite of hundreds of thousands of lines of legacy code.

Another example is using the Globus GASS service to access secondary storage elsewhere on the Grid. Changing existing open() calls in the code to globus_gass_open() calls makes that secondary storage available as if it were local.
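The idea of swapping one open call for another can be pictured with a small sketch. This is not the Globus GASS API: remote_aware_open is a hypothetical Python stand-in, and it assumes that remote data is named by a URL and staged into a local cache directory before being opened.

```python
import os
import shutil
import tempfile
import urllib.request
from urllib.parse import urlparse


def remote_aware_open(location, mode="r", cache_dir=None):
    """Hypothetical stand-in for an open() replacement; not the Globus API.

    Plain paths go straight to open(). Anything that looks like a URL is
    first copied into a local cache directory, then opened from there.
    """
    if "://" not in location:
        return open(location, mode)
    if not mode.startswith("r") or "+" in mode:
        raise ValueError("this sketch only reads remote data")

    cache_dir = cache_dir or tempfile.mkdtemp(prefix="remote_cache_")
    name = os.path.basename(urlparse(location).path) or "download"
    local_copy = os.path.join(cache_dir, name)

    # Stage the remote bytes locally so the caller sees an ordinary file.
    with urllib.request.urlopen(location) as response:
        with open(local_copy, "wb") as staged:
            shutil.copyfileobj(response, staged)
    return open(local_copy, mode)
```

The calling code keeps its existing read logic; only the name of the open call changes, which is the property the text attributes to GASS.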

A second important piece of the Globus approach involves inter-domain issues. Instead of clustering and forcing everyone using Globus to do things the same way, Globus prefers to be a bridge between sites that have different policies. The idea is to make Globus work with the sites, rather than making the sites work with Globus. Globus will be successful because it is designed to work within a site's environment, rather than forcing the site to adopt a set of arbitrary standards or a special environment for Globus.

A third important piece is the ability to distinguish between global and local services. Globus is aware of local and global differences and doesn't try to hide or mask that.

Globus prefers to behave as an information-rich environment, providing researchers and applications developers with the information needed to make decisions, but not bogging them down in useless detail. For example, imagine a dataset stored in two locations, with processing taking place at a third location, and asymmetrical bandwidth between them. Suppose the bandwidth between the computing site and one copy of the dataset is 10 Mb/s and the bandwidth to the other copy is 1 Mb/s. The researcher may want to take that into consideration rather than just being handed a lowest-common-denominator solution.

By providing good information about what's going on and allowing researchers to make their own decisions, Globus offers the best of both worlds. A reasonable default is provided if the researcher does not want to make a decision about other options.
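The dataset example can be worked through in a few lines. The sketch below is only an illustration of "information plus a reasonable default": the function names, the site labels, and the 100 MB dataset size are assumptions, not part of Globus.

```python
def transfer_seconds(size_megabytes, bandwidth_mbps):
    """Time to move a dataset, ignoring latency and protocol overhead."""
    return size_megabytes * 8 / bandwidth_mbps


def choose_replica(replicas, preferred=None):
    """Return the caller's preferred site, else the highest-bandwidth one."""
    if preferred is not None:
        if preferred not in replicas:
            raise KeyError(f"unknown replica site: {preferred}")
        return preferred
    return max(replicas, key=replicas.get)


# Bandwidth in Mb/s from the compute site to each copy of the dataset
# (site names are hypothetical; the figures come from the example above).
replicas = {"site_a": 10, "site_b": 1}

default_site = choose_replica(replicas)
for site, mbps in replicas.items():
    print(site, transfer_seconds(100, mbps), "seconds for 100 MB")
```

Under these assumptions the faster copy takes about 80 seconds and the slower one about 800 seconds. The default picks the faster site, while a researcher who has another reason to prefer the slower copy can still pass preferred="site_b".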

Another important piece of the Globus approach is to enable incremental development of grid-enabled tools and applications. This is a corollary to the "tools for your toolbox" model. If the project is an MPI application that runs fine and just needs to be started remotely, Globus enables that. Globus doesn't force a complete rewrite of an existing application to do this.

The Globus developers actively seek collaborators to help make the tools more robust. Additionally, the tools may be used for functions that were never expected or intended. In light of this, Globus must be responsive. The Grid changes continuously, and Globus must respond to become a better tool.

Goals for the Globus Toolkit

Through the many items in its "toolkit," Globus addresses a range of best-practices issues:

  • Authenticate once
    Only one point of authentication should be required to keep the environment simple and secure. That single point of authentication should work for everything else the application needs to be able to do. The Globus GSI (Globus Security Infrastructure) does this.
  • Specify the resources
    All the needed resources to run an application should be specified in a single, common language. The Globus RSL (Resource Specification Language) does this.
  • Locate the necessary resources
    All the resources to run an application must be located prior to use. The Globus MDS (Metacomputing Directory Service) handles this.
  • Process the request for resources
    Using the list of required resources and a centralized database of available resources, the resources must be allocated and managed. The GRAM (Globus Resource Allocation Manager) does this.
  • Acquire the resources
    Lock in the resources needed to run the application. Globus has several tools to do this.
  • Initiate the computation
    The Globus GRAM does this.
  • Provide access to remote datasets
    The Globus GASS (Globus Access to Secondary Storage) does this.
  • Steer the computation and collaborate on the results
    The Nexus library and other Globus tools provide the underlying infrastructure necessary to facilitate both control of the run and collaboration on it.
  • Account for usage
    Instead of forcing local administrators to run Globus applications as a generic "globus" user, Globus enables applications to be run as any user local site policy dictates, allowing usage information to be tracked by application.
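To make the "Specify the resources" goal more concrete, the sketch below renders a resource request as a single string in an RSL-style notation. It is a minimal illustration, not the RSL specification: the helper name build_rsl, the quoting rule, and the attribute names used in the example are assumptions.

```python
def build_rsl(attributes):
    """Render a dict as an RSL-style conjunction: &(name=value)(name=value)."""
    parts = []
    for name, value in attributes.items():
        text = str(value)
        if '"' in text:
            raise ValueError("this sketch does not handle embedded quotes")
        # Assumed rule: quote values with spaces or parentheses so they
        # stay a single token.
        if any(ch in text for ch in " ()"):
            text = '"' + text + '"'
        parts.append(f"({name}={text})")
    return "&" + "".join(parts)


request = build_rsl({"executable": "/bin/hostname", "count": 4})
print(request)
```

The point of a common language is that the same request string can be handed to any site, leaving the local resource manager to interpret it.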

The Globus Hourglass

The elements of the Grid toolkit do not assume that all local environments are adapted to it. The toolkit was designed and implemented to adapt to the many and various local environments under which it does and will run.

Globus offers a set of core services as a basic infrastructure. These core services are then used to construct high-level, domain-specific solutions. Three key design principles that Globus follows are to keep participation costs low, to enable local control wherever possible, and to provide support for adaptation of the toolkit to the specific needs of each local site and project.

The bottom of the Globus hourglass is the myriad of underlying resources upon which the Globus services are built: the local operating systems of the various machines running Globus, as well as the various networks, job scheduling systems, file systems, and so on. The middle of the hourglass consists of the core services Globus offers, and the top of the hourglass is the higher-level services Globus offers, as well as actual applications written with Globus.

The local implementations of Globus services for a particular operating system free both the core services and the higher level services from needing to address OS-specific issues. Only the local services need know what the underlying OS is, relieving the applications programmers from needing to know details of the local OS of each and every one of the various systems that an application might run on.
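The separation described above resembles a conventional adapter layout, sketched below. All class names, the run_job function, and the command-line layout are hypothetical; they only illustrate how a core service can stay identical while the local piece varies per site.

```python
from abc import ABC, abstractmethod


class LocalJobStarter(ABC):
    """The narrow interface every local implementation must provide."""

    @abstractmethod
    def start_command(self, executable, processes):
        """Return the site-specific command line that launches the job."""


class ForkStarter(LocalJobStarter):
    def start_command(self, executable, processes):
        # Hypothetical: a site with no scheduler just runs the program.
        return [executable]


class BatchQueueStarter(LocalJobStarter):
    def __init__(self, submit_tool):
        self.submit_tool = submit_tool

    def start_command(self, executable, processes):
        # Hypothetical flag layout; real schedulers each differ.
        return [self.submit_tool, "--procs", str(processes), executable]


def run_job(starter, executable, processes=1):
    """Core service: identical for every site; only the starter varies."""
    return starter.start_command(executable, processes)


print(run_job(ForkStarter(), "./a.out"))
print(run_job(BatchQueueStarter("submit"), "./a.out", processes=8))
```

Application code calls run_job the same way everywhere; knowledge of the local system lives only in the starter classes, which corresponds to the narrow waist of the hourglass.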

[Figure: Globus Hourglass]

A more detailed view of the services offered by Globus, and how they layer on top of each other, is offered here:

[Figure: Globus Layers]

  • Communication Infrastructure (Nexus)
    Allows processes to communicate with each other over a variety of protocols.
  • Metacomputing Directory Service (MDS)
    Provides properties of all resources currently available to Globus on the Grid.
  • Network performance monitoring (Gloperf)
    Starts a daemon at either end of a connection and runs experiments to determine the network performance available to Globus over that connection.
  • Heartbeat Monitor (HBM)
    Starts a daemon that any given process can register with, and informs the user if the process dies.
  • Remote file and executable management (GASS and GEM)
    Provides access to secondary storage over Globus. Remote storage is made available to the running application as if it were local.
  • Resource management (GRAM)
    Processes requests for resources for remote application execution, allocates the required resources, and manages the active jobs.
  • Globus Security Infrastructure (GSI)
    "Authenticate-once" security for Globus is implemented via public/private key pairs and a special version of SSL.
  • DUROC: co-allocation of multiple systems
    Handles submission of multiple simultaneous jobs across the Grid.
  • Nimrod: high-throughput computing
    Enables parameter testing by allowing automated submission of a job with many different sets of parameters.
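The parameter-testing idea in the last item can be sketched generically. The code below is not Nimrod; parameter_sweep, the executable name, and the parameter names are hypothetical, and the sketch only shows how one job description per combination of parameter values might be generated for submission.

```python
from itertools import product


def parameter_sweep(executable, parameter_space):
    """Yield one job description per combination of parameter values."""
    names = list(parameter_space)
    for values in product(*(parameter_space[n] for n in names)):
        yield {"executable": executable, "parameters": dict(zip(names, values))}


jobs = list(
    parameter_sweep(
        "./simulate",
        {"temperature": [280, 300], "pressure": [1, 2, 5]},
    )
)
print(len(jobs), "jobs")
```

Two temperatures and three pressures give six independent jobs, each of which could be handed to a resource manager separately.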

Where We Are

Version 1.0 was released in November of 1998. Work on version 1.1 is going well with a code freeze expected by the end of June 1999, a beta release of version 1.1 in early July, and a final release later in July 1999. New services were added for security, resource management, tools, and fault detection. All core services are complete, relatively robust, and documented. All core services are available on most Unix platforms; NT client support will be in 1.1. Many tool projects are leveraging the Globus investment in infrastructure, and interesting applications are emerging, although most are still in demo mode.

Example Application Projects

  • Computed microtomography (ANL, ISI)
    Real-time, collaborative analysis of data from X-Ray source (and electron microscope)
  • Hydrology (ISI, UMD, UT; also NCSA, Wisc.)
    Interactive modeling and data analysis
  • Collaborative engineering ("tele-immersion")
    CAVERNsoft at EVL, Metro at ANL
  • X-Ray crystallography (ANL, SUNY)
    High-throughput computing for Shake ‘n Bake
  • Distributed interactive simulation - SF-Express (CIT, ISI)
    SF-Express, a large-scale distributed interactive simulation (DIS) developed at the California Institute of Technology with DARPA funding, achieves record-setting levels of performance by coordinating its execution across several large-scale parallel computers.
  • Remote visualization and steering for astrophysics
    Including trans-Atlantic experiments
  • Data-intensive computing experiments (with LBNL and SLAC: "Clipper" project)

Learning More

If you are interested in learning more about Globus, link to the:

