Kepler Scientific Workflows

The Kepler Project is dedicated to furthering and supporting the capabilities, use, and awareness of Kepler, a free and open-source scientific workflow application. Kepler builds on the Ptolemy II system created within CHESS.

Kepler is designed to help scientists, analysts, and computer programmers create, execute, and share models and analyses across a broad range of scientific and engineering disciplines. Kepler can operate on data stored in a variety of formats, locally and over the internet. It is also an effective environment for integrating disparate software components, such as merging "R" scripts for statistical analysis with compiled C code, or facilitating remote, distributed execution of models via web services and computational grid infrastructures.

Using Kepler's graphical user interface, users select and connect the relevant analytical components and data sources to create a "scientific workflow": an executable representation of the steps required to generate results. The Kepler software helps users share and reuse data, workflows, and components developed by the scientific community to address common needs. It also helps scientists track and record the provenance of the data produced by their computational experiments.

Kepler is an open-source, Java-based application that is maintained for the Windows, Mac OS X, and Linux operating systems. The Kepler Project supports the official code base for Kepler development and provides materials and mechanisms for learning how to use Kepler, sharing experiences with other workflow developers, reporting bugs, and suggesting enhancements.

Kepler is an open collaboration with many contributors from diverse domains of science and engineering.
The project was started by researchers at the National Center for Ecological Analysis and Synthesis (NCEAS) at UC Santa Barbara, the San Diego Supercomputer Center (SDSC) at UC San Diego, and UC Davis as part of the SEEK (Science Environment for Ecological Knowledge) and SDM (Scientific Data Management) projects. Kepler participants include the above organizations plus CHESS, Lawrence Livermore National Laboratory (LLNL), the University of Kansas, Monash University (Australia), the University of New Mexico, James Cook University (JCU, Australia), and North Carolina State University. The effort is supported by CHESS, the National Science Foundation, and the Department of Energy.
The accompanying example is a simple illustrative Kepler model, created by Ilkay Altintas of the San Diego Supercomputer Center (SDSC) at UC San Diego, that shows the use of web services and data transformation actors. The workflow uses a remote genomics data service to retrieve a genetic sequence. The Kepler web service actor reads the WSDL description of the service to make its capabilities available to the model builder. In this simple example, the genetic sequence is displayed in three different ways: first in its native format (XML); second as a sequence element that has been extracted from the XML; and third as an HTML document that might be used for display on a web site. The latter two transformations are each performed by a composite actor that hides some of the complexity of the underlying operation. These composites can be thought of as "sub-workflows" that execute a potentially complex set of tasks when called. One composite uses XPath to perform its transformation and the other uses XSLT, illustrating the ability to leverage pre-existing infrastructure within Kepler workflows.
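The two transformations described above can be sketched outside Kepler using the XPath and XSLT support in the standard Java library. This is a minimal illustration, not the workflow's actual implementation: the record layout, the stylesheet, and the names `SequenceTransforms`, `extractSequence`, and `toHtml` are all assumptions made for the example, since the schema returned by the genomics service is not given here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class SequenceTransforms {

    // Hypothetical record; the real service's XML schema is not specified in the text.
    public static final String SAMPLE_XML =
        "<record><id>EX0001</id><sequence>ACGTACGT</sequence></record>";

    // Hypothetical stylesheet that renders the record as a small HTML page.
    public static final String SAMPLE_XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\" indent=\"no\"/>"
      + "<xsl:template match=\"/record\">"
      + "<html><body><h1><xsl:value-of select=\"id\"/></h1>"
      + "<pre><xsl:value-of select=\"sequence\"/></pre></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Second view: extract the text of the sequence element with an XPath expression.
    public static String extractSequence(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate("/record/sequence",
                              new InputSource(new StringReader(xml)));
    }

    // Third view: apply an XSLT stylesheet to turn the XML record into HTML.
    public static String toHtml(String xml, String xslt) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(SAMPLE_XML);                       // native XML
        System.out.println(extractSequence(SAMPLE_XML));      // extracted sequence
        System.out.println(toHtml(SAMPLE_XML, SAMPLE_XSLT));  // HTML rendering
    }
}
```

In the workflow itself, each of these steps is wrapped in a composite actor, so the model builder sees a single component rather than the parsing and transformation details.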