The work on SOAP based Web Services at CBS is done within the framework
of the EMBRACE Network of Excellence. Currently CBS offers around 50 services
on the WWW, including whole genome visualization and analysis, gene finding
and analysis of DNA microarray data alongside predictions of protein sorting,
post-translational modifications, structure and function. The services are
currently available as traditional paste-and-click (interactive) HTML
forms. Following the EMBRACE technology recommendations we are in the process
of implementing them as Web Services proper. The activities involve the
following:
- Development of an environment for implementing SOAP based Web Services
on the server side;
- Implementation of individual services (main task);
- Development of client side solutions aiming at using Web Services
in practice;
- Work on Web Service interoperability and data type integration.
Below we describe the activities above in greater detail. Specifically,
we describe the current status of the work and outline the future plans.
1. Server side environment
We have developed a stable environment for implementing SOAP based Web
Services on the server side. The implementation is in Perl/SOAP::Lite.
It features easy-to-customize template scripts, a queue and a logging
function. It allows for asynchronous calls when that is preferable.
Most services implemented so far are fully asynchronous; typically, the
usage is split into the three operations 'runService', 'pollQueue' and
'fetchResult'. All the services are described in documented WSDL files;
they have been tested with SoapUI, Perl/SOAP::Lite and Python/SOAPpy
scripts on the client side.
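As a rough illustration of the three-operation scheme described above, the following Python sketch shows how a client might drive an asynchronous service. Only the operation names 'runService', 'pollQueue' and 'fetchResult' are taken from the text. Everything else (the in-memory FakeService class, the status strings 'QUEUED' and 'FINISHED', the job identifier format and the result layout) is a hypothetical stand-in chosen for illustration, not the actual CBS interface.

```python
import time


class FakeService:
    """Hypothetical in-memory stand-in for a remote asynchronous service."""

    def __init__(self, polls_until_done=2):
        self._jobs = {}
        self._polls_until_done = polls_until_done

    def runService(self, sequence):
        # Submit a job and return its identifier immediately.
        jobid = "job%d" % (len(self._jobs) + 1)
        self._jobs[jobid] = {"polls": 0, "input": sequence}
        return jobid

    def pollQueue(self, jobid):
        # Report the job status; the status strings are assumptions.
        job = self._jobs[jobid]
        job["polls"] += 1
        if job["polls"] >= self._polls_until_done:
            return "FINISHED"
        return "QUEUED"

    def fetchResult(self, jobid):
        # Return a (made-up) result for a finished job.
        return {"jobid": jobid, "length": len(self._jobs[jobid]["input"])}


def run_and_wait(service, sequence, interval=0.0, max_polls=100):
    """Submit a job, poll until it is finished, then fetch the result."""
    jobid = service.runService(sequence)
    for _ in range(max_polls):
        if service.pollQueue(jobid) == "FINISHED":
            return service.fetchResult(jobid)
        time.sleep(interval)
    raise TimeoutError("job %s did not finish after %d polls" % (jobid, max_polls))
```

In a real client the FakeService object would be replaced by a SOAP proxy generated from the service's WSDL file, and the polling interval would be set to a few seconds rather than zero.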
The key implementation decisions have been as follows:
- Technology: the choice of Perl/SOAP::Lite has been purely pragmatic,
reflecting the existing programming skills in our lab. We are aware of
the existence of alternative technologies. Still, we found that using
Perl/SOAP::Lite we were able to implement stable services with a
reasonable effort.
- Asynchronous calls: these were necessary because most of our services
are computationally demanding and the response is seldom immediate. In
the absence of a commonly accepted standard we have developed a simple
scheme based on the three operations described above. This approach is
in agreement with the EMBRACE technology recommendations, pending a
common standard for asynchronous calls.
2. Implementation of individual services
The complete list of the electronic services at CBS is provided on the
prediction servers and bioinformatics tools pages. All
the services are currently implemented as traditional paste-and-click
(interactive) HTML forms. The effort to implement them as SOAP based
Web Services is in progress.
We intend to make the corresponding Web Services easy to understand
and use. To that end we have given special consideration to the following
issues:
- Extensive and easy-to-find documentation
The Web Services implemented so far are available from the CBS Web
Services page. Each Web Service has its own WWW page (e.g.
http://www.cbs.dtu.dk/ws/SignalP) where the basic information about the
service can be found:
- service name and basic function;
- link to the WSDL file defining the service;
- list of links to the schema definitions used
by the service;
- examples of client side scripts (Perl, Python)
using the service;
- documentation (identical to the documentation
found in the WSDL file).
- Suitable extent of implementation
Typically the implemented services offer the same functionality as
their paste-and-click counterparts. However, we have focused on the
scientific content of the output and limited the generation of graphics,
extended formats etc. We expect our Web Services to be used mainly
for efficient automatic processing of large amounts of data. Clients
with a narrow focus, e.g. just one protein or a few genes, are likely
to prefer the traditional paste-and-click services for detailed study.
This policy may change if user feedback indicates otherwise.
- Ease of parsing
The programming effort involved in parsing the output of a Web Service
is greatly reduced if the output is structured, i.e. expressed in
strictly defined data types. Therefore, in most cases we have chosen
to produce strongly typed output in an easy-to-access form, making it
convenient to integrate with local routines and other Web Services.
In fact, the WP3 recommendations discourage loose data typing as
it weakens the advantages of the selected technology. The matter is
discussed in greater detail in our comments on implementation strategy.
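To make the point about ease of parsing concrete, here is a minimal Python sketch of how a client might turn structured XML output into native values using only the standard library. The element names ('result', 'prediction', 'id', 'score', 'position') and the sample values are invented for illustration; the real element names are defined in the schema linked from each service page.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body; real services define their own element names.
RESPONSE = """<result>
  <prediction>
    <id>seq1</id>
    <score>0.93</score>
    <position>22</position>
  </prediction>
  <prediction>
    <id>seq2</id>
    <score>0.12</score>
    <position>0</position>
  </prediction>
</result>"""


def parse_predictions(xml_text):
    """Convert each <prediction> element into a dict of typed values."""
    root = ET.fromstring(xml_text)
    records = []
    for pred in root.findall("prediction"):
        records.append({
            "id": pred.findtext("id"),
            "score": float(pred.findtext("score")),
            "position": int(pred.findtext("position")),
        })
    return records
```

Because every field sits in a named, typed element, the client needs no regular expressions or column counting, which would be unavoidable with free-text output.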
The setup described above is quite new; we would be very grateful for
comments and suggestions for improvement. The effort is in progress; the setup
is being refined and more services are being added. The order in which
this is done depends very much on user feedback.
3. Client side solutions
Our effort on the client side focuses on development of scripts and programs
calling SOAP based Web Services. Such scripts can be executed directly
from the command line or included in larger software, greatly extending
the local bioinformatics toolbox with limited programming
effort. The operations of any Web Service, local or remote, are easily
turned into functions (subroutines) in the scripts and programs created
locally. We feel that this approach makes the best of the Web Service
technology and its power to provide a truly integrated data and tool
environment. We see client side scripting as a useful supplement
to the existing and emerging graphical interfaces to Web Services.
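The idea of turning a Web Service operation into a local function can be sketched in a few lines of Python. The example below uses only the standard library and is deliberately simplified: it builds a SOAP 1.1 request envelope and hands it to a 'send' callable supplied by the caller. The helper names 'build_envelope' and 'make_operation', the 'send' transport and the unqualified parameter elements are assumptions made for illustration; a real client would normally rely on a SOAP library and the service's WSDL file instead.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_envelope(operation, namespace, params):
    """Serialize one operation call as a SOAP 1.1 envelope (simplified)."""
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV)
    call = ET.SubElement(body, "{%s}%s" % (namespace, operation))
    for key, value in params.items():
        # Parameters are written as unqualified child elements (an assumption).
        ET.SubElement(call, key).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def make_operation(operation, namespace, send):
    """Return a plain Python function that performs the named operation.

    'send' is a hypothetical transport: any callable that takes the request
    XML as a string and returns the (already decoded) response.
    """
    def call(**params):
        return send(build_envelope(operation, namespace, params))
    call.__name__ = operation
    return call
```

With a suitable transport, `run_service = make_operation("runService", namespace, send)` yields a function that is called like any local subroutine, e.g. `run_service(sequence="MKT")`.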
The key implementation decision on the client side was the choice of
language. Perl was the obvious starting point as the server side work
was done in that language. However, we have found that
Web Services support in Python is much better; the object oriented
features of Python agree very well with Web Services. Although the
commonly used Python module SOAPpy suffers from a lack of updates (the
most recent version, 0.12, is from Feb 2005), we are in the process of
supplementing our Perl client scripts with Python counterparts using
this module.
On Feb 6-8, 2008 CBS will host an EMBRACE workshop on Client Side
Scripting for Web Services. The workshop will focus on practical use of
Web Services in scripts and programs developed on the user's own computer.
There will be several hands-on exercises; the general concepts involved
in calling Web Services from a script will be explained in detail. On-line
registration is currently open with the deadline on December 3, 2007.
On Jan 24-26, 2007 CBS organized an EMBRACE workshop on Bioinformatics
of Immunology. During the workshop exercises the participants were
given the opportunity to develop client side WS scripts in Perl. For
convenience, the exercise cookbooks and script examples remain on-line,
linked from the workshop program.
4. Web Service interoperability and data type integration
Client side applications often need to use the output of one Web Service
as input to another. If the data types match, the task of linking Web Services
becomes straightforward. If they do not match, linking is still possible but
requires additional programming effort, slowing down software
development. If the Web Services involved operate on untyped data, the
client's programming effort is even greater. Therefore, we are strong
advocates of strictly typed Web Services and standard data types.
In our own implementation effort we have tried to reuse data types whenever
possible. We have defined a number of data types of general validity (see
examples). We intend to use those data types consistently in our Web
Services, simplifying implementation and usage. We address Web Service
interoperability in greater detail in our comments on implementation
strategy.
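The benefit of shared data types can be illustrated with a small Python sketch. The 'SequenceRecord' type and the two functions below are hypothetical local stand-ins for Web Service operations, not actual CBS data types or services; the point is only that when both operations agree on one type, the output of the first can be passed to the second without any conversion code.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SequenceRecord:
    """Hypothetical shared data type: one identified sequence."""
    id: str
    seq: str


def filter_by_length(records: List[SequenceRecord], min_length: int) -> List[SequenceRecord]:
    """Stand-in for a first service: keep sequences of at least min_length."""
    return [r for r in records if len(r.seq) >= min_length]


def count_residue(records: List[SequenceRecord], residue: str) -> Dict[str, int]:
    """Stand-in for a second service: count one residue per sequence."""
    return {r.id: r.seq.count(residue) for r in records}


records = [SequenceRecord("seq1", "MKTAYIAK"), SequenceRecord("seq2", "MK")]

# The output of the first operation is directly usable as input to the second.
counts = count_residue(filter_by_length(records, min_length=5), "K")
```

If the two operations used different record layouts, the client would need an extra conversion step between the calls, which is exactly the additional programming effort discussed above.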
Acknowledgements
Thanks are due to Jan Christian Bryne at the Computational Biology Unit
(CBU) in Bergen, Norway, who organised a workshop on Web Services in
May 2006. At that workshop we gained our initial understanding of
SOAP based Web Services and charted the implementation possibilities.
We also attended the joint EMBRACE Work Package 1 and 2 workshop at
EBI in Hinxton, UK in June 2006. We are grateful to Peter Rice and Alan Bleasby for leading
the discussions that allowed us to define the main issues to be
addressed in our own implementation effort.
Pall Isolfur Olason created the first prototype Web Services at CBS.
In the current effort we benefited greatly from his experience and help.