Last modified on 01/15/09 23:24:12

DAISY Pipeline Web Service API

Basic Pipeline API

This section describes the minimalist API to the Pipeline functionality.

Script Management

Pipeline jobs are based on Pipeline script files. The user can choose to maintain their own collection of scripts, or can instead rely on the built-in collection that ships with the Pipeline distribution. In the latter case, the Pipeline-maintained script collection can be accessed via:

public List<String> getScriptNames()
Returns the list of the names (unique ID) of the scripts available in the system.
public Script getScript(String name)
Optional. Returns the script with the given name.
public List<Script> getScripts()
Optional. Returns a list of the scripts available in the system.

The Pipeline-maintained collection of scripts could also be made editable via:

public void addScript(Script script)
Adds the given script to the Pipeline-maintained collection of scripts.
public void removeScript(String name)
Removes the script with the given name from the Pipeline-maintained collection of scripts.
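
The script management methods above could be backed by something as simple as an in-memory map. The following is a minimal sketch only: the Script class shown here (a name plus raw content) and the ScriptRegistry class are hypothetical stand-ins, since this page does not define the Script type or any implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the Script type, which this page does not define.
final class Script {
    private final String name;
    private final String content;

    Script(String name, String content) {
        this.name = name;
        this.content = content;
    }

    String getName() { return name; }
    String getContent() { return content; }
}

// In-memory sketch of the Pipeline-maintained script collection.
class ScriptRegistry {
    // Keyed by the unique script name; keeps insertion order.
    private final Map<String, Script> scripts = new LinkedHashMap<>();

    public synchronized List<String> getScriptNames() {
        return new ArrayList<>(scripts.keySet());
    }

    // Returns null for an unknown name (an assumption of this sketch).
    public synchronized Script getScript(String name) {
        return scripts.get(name);
    }

    public synchronized List<Script> getScripts() {
        return new ArrayList<>(scripts.values());
    }

    // Replaces any existing script that has the same name.
    public synchronized void addScript(Script script) {
        scripts.put(script.getName(), script);
    }

    public synchronized void removeScript(String name) {
        scripts.remove(name);
    }
}
```

Returning copies from the getters keeps callers from modifying the collection behind the registry's back; a real service would also have to decide how to report unknown or duplicate names.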

Job Execution

To execute a Pipeline job, the user needs to give the Pipeline a reference to the script the job is based on, and a set of valued parameters for configuring this job.

If the user maintains their own collection of scripts, they must forward the script to the Pipeline with each execution request, using:

public Long execute(Script script, List<Parameters> parameters)
Executes the Job defined by the given script configured with the given parameters. Returns the ID of the job being executed, for later reference.

If, on the contrary, the user wants to execute a job based on a script maintained by the Pipeline, they will use:

public Long execute(String scriptName, List<Parameters> parameters)
Executes the Job defined by the Pipeline-maintained script with the given name and configured with the given parameters. Returns the ID of the job being executed, for later reference.
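
Because execute returns a job ID "for later reference", it presumably returns before the job has finished. One way to sketch that behaviour is with a thread pool and a counter for IDs. Everything below is an assumption made for illustration: the Parameters class (a name/value pair), the JobExecutor class, the await helper, and the placeholder runScript method are not part of the API described on this page.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical name/value pair; this page does not define the Parameters type.
final class Parameters {
    final String name;
    final String value;

    Parameters(String name, String value) {
        this.name = name;
        this.value = value;
    }
}

class JobExecutor {
    private final AtomicLong nextId = new AtomicLong(1);
    private final ExecutorService pool = Executors.newSingleThreadExecutor();
    private final Map<Long, Future<?>> jobs = new ConcurrentHashMap<>();

    // Queues the job and returns its ID immediately, without waiting for completion.
    public Long execute(String scriptName, List<Parameters> parameters) {
        Long jobId = nextId.getAndIncrement();
        Future<?> future = pool.submit(() -> runScript(scriptName, parameters));
        jobs.put(jobId, future);
        return jobId;
    }

    // Placeholder for the real script execution, which is out of scope here.
    private void runScript(String scriptName, List<Parameters> parameters) {
        System.out.println("Running " + scriptName + " with " + parameters.size() + " parameter(s)");
    }

    // Helper for this sketch only: blocks until the given job has finished.
    public void await(Long jobId) {
        try {
            jobs.get(jobId).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public void shutdown() {
        pool.shutdown();
    }
}
```

The overload that takes a Script instead of a script name could follow the same pattern, with the script object passed to the worker instead of being looked up by name.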

Because most Pipeline scripts require one or more input files as parameters, it may be useful to provide a separate method for uploading this content to the Pipeline service (for cases where the service runs on a remote host):

public String upload(byte[] file)
Uploads the byte content to the Pipeline service, and returns the path (or ID) of the content on the Pipeline system, for later use as a parameter.
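
On the service side, upload could be sketched as writing the bytes to a uniquely named file and returning its path. The UploadStore class, the storage directory, and the "upload-" file prefix are assumptions made for this example; the page leaves open whether the returned value is a path or an opaque ID.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class UploadStore {
    private final Path directory;

    // The directory must already exist and be writable.
    UploadStore(Path directory) {
        this.directory = directory;
    }

    // Stores the bytes under a unique file name and returns the absolute path,
    // which the client can later pass as a job parameter.
    public String upload(byte[] file) {
        try {
            Path target = Files.createTempFile(directory, "upload-", ".bin");
            Files.write(target, file);
            return target.toAbsolutePath().toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Passing a whole file as a single byte[] keeps the signature simple but holds the entire content in memory, so large inputs would probably need a streaming variant.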

Message Notification

The Pipeline can deliver the messages raised during a job execution using either a push or a pull pattern.

In the "pull" approach, the client asks the Pipeline for the messages of a particular job, via:

public List<Message> getMessages(Long jobID)
Returns the list of messages that were sent during the execution of the job with the given ID.
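
Since getMessages returns every message sent so far, a polling client has to work out for itself which ones are new. The sketch below shows one way to do that by remembering how many messages it has already seen. The Message class (a single text field), the MessageLog class standing in for the service, its record method, and the PollingClient class are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the Message type, which this page does not define.
final class Message {
    private final String text;

    Message(String text) { this.text = text; }

    String getText() { return text; }
}

// Service-side sketch: keeps the messages of each job in arrival order.
class MessageLog {
    private final Map<Long, List<Message>> byJob = new ConcurrentHashMap<>();

    // Called by the (hypothetical) job runner whenever a message is raised.
    void record(Long jobID, Message message) {
        byJob.computeIfAbsent(jobID, id -> Collections.synchronizedList(new ArrayList<Message>()))
             .add(message);
    }

    // Returns a snapshot of all messages for the job; empty if the job is unknown.
    public List<Message> getMessages(Long jobID) {
        List<Message> messages = byJob.get(jobID);
        if (messages == null) {
            return new ArrayList<>();
        }
        synchronized (messages) {
            return new ArrayList<>(messages);
        }
    }
}

// Client-side sketch: polls one job and reports only the unseen messages.
class PollingClient {
    private final MessageLog log;
    private final Long jobID;
    private int seen = 0;

    PollingClient(MessageLog log, Long jobID) {
        this.log = log;
        this.jobID = jobID;
    }

    List<Message> pollNew() {
        List<Message> all = log.getMessages(jobID);
        List<Message> fresh = new ArrayList<>(all.subList(seen, all.size()));
        seen = all.size();
        return fresh;
    }
}
```

If jobs produce many messages, a variant of getMessages that takes an offset or timestamp would avoid resending the full list on every poll.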

In the "push" approach, the Pipeline notifies the client whenever a job execution raises a new message. This requires the client to provide the Pipeline service with a reply-to address in the job execution request. The client must also expose the following method:

public void receiveMessage(Message message, Long jobID)
Receives a Pipeline message coming from the execution of the job with the given ID.
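
The push pattern can be illustrated in-process with a callback interface, where registering a receiver plays the role of supplying a reply-to address. This is only a sketch: the MessageReceiver interface, the MessageDispatcher class, and the Message class are hypothetical, and a real Web Service would call back over the network rather than through a direct method call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the Message type, which this page does not define.
final class Message {
    private final String text;

    Message(String text) { this.text = text; }

    String getText() { return text; }
}

// Client-side callback mirroring the receiveMessage method described above.
interface MessageReceiver {
    void receiveMessage(Message message, Long jobID);
}

// Service-side sketch: forwards each message to the receiver registered for its job.
class MessageDispatcher {
    private final Map<Long, MessageReceiver> receivers = new ConcurrentHashMap<>();

    // Stands in for the reply-to address given in the job execution request.
    void register(Long jobID, MessageReceiver receiver) {
        receivers.put(jobID, receiver);
    }

    // Delivers the message if a receiver is known for the job; otherwise drops it.
    void publish(Long jobID, Message message) {
        MessageReceiver receiver = receivers.get(jobID);
        if (receiver != null) {
            receiver.receiveMessage(message, jobID);
        }
    }
}
```

A client could register with a lambda such as `(message, jobID) -> System.out.println(jobID + ": " + message.getText())`.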

Status and Results Retrieval

The client shall be able to query the status of a job, especially in a "pull" approach, via:

public Status getStatus(Long jobID)
Returns the current status of the job with the given ID.

Alternatively, in a "push" approach, the Pipeline could notify the client of any status change. Note that this implies that the client provides the Pipeline service with a reply-to address in the job execution request.

public void statusChanged(Long jobID, Status status)
Notifies that the status of the job with the given ID changed.

In the previous methods, the Status type is essentially an enumeration; it could also be represented as a String.
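
As a sketch of how the two status methods could fit together, the example below models Status as a Java enum and notifies a listener only when a job's status actually changes. The four status values are hypothetical, as this page does not list the real ones, and the StatusListener interface, the StatusTracker class, and its update method are likewise assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical status values; this page only says Status is essentially an enumeration.
enum Status { QUEUED, RUNNING, COMPLETED, FAILED }

// Client-side callback mirroring the statusChanged method described above.
interface StatusListener {
    void statusChanged(Long jobID, Status status);
}

class StatusTracker {
    private final Map<Long, Status> statuses = new ConcurrentHashMap<>();
    private final StatusListener listener; // may be null for pull-only clients

    StatusTracker(StatusListener listener) {
        this.listener = listener;
    }

    // Pull: returns the last recorded status, or null for an unknown job.
    public Status getStatus(Long jobID) {
        return statuses.get(jobID);
    }

    // Push: records the new status and notifies the listener only on a real change.
    void update(Long jobID, Status status) {
        Status previous = statuses.put(jobID, status);
        if (previous != status && listener != null) {
            listener.statusChanged(jobID, status);
        }
    }
}
```

If the service exchanges statuses as Strings instead, the enum maps to and from text with `status.name()` and `Status.valueOf(text)`.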

Additionally, the service may provide similar methods to query the status of each transformer step in a job.

Once the Pipeline job has been executed, the client can retrieve an archive containing the results with:

public byte[] getJobResults(Long jobID)
Returns an archive containing the result file set of the job with the given ID.
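
The page does not say which archive format getJobResults returns. Assuming ZIP, the sketch below shows how the service might pack a result file set into a byte[] and how a client might unpack it, using java.util.zip from the standard library. The ResultArchives class and its pack and unpack methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ResultArchives {
    // Service side: packs the result files (name -> content) into an in-memory ZIP archive.
    static byte[] pack(Map<String, byte[]> files) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                zip.putNextEntry(new ZipEntry(file.getKey()));
                zip.write(file.getValue());
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed at this point, so the archive is complete.
        return buffer.toByteArray();
    }

    // Client side: unpacks the bytes returned by getJobResults into name -> content.
    static Map<String, byte[]> unpack(byte[] archive) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            byte[] chunk = new byte[4096];
            while ((entry = zip.getNextEntry()) != null) {
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                int read;
                // read() returns -1 at the end of the current entry.
                while ((read = zip.read(chunk)) != -1) {
                    content.write(chunk, 0, read);
                }
                files.put(entry.getName(), content.toByteArray());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }
}
```

A client that writes the unpacked entries to disk should validate the entry names first, since an archive can contain paths that point outside the intended target directory.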

Extended PipeOnline API

This section describes the API to the extended PipeOnline system. PipeOnline is intended to be a robust, database-backed application with built-in support for secured user management, execution queues, email notification, and persistence of usage statistics.

The PipeOnline system is currently available via the PipeOnline web application, but it would make sense to expose its functionality via a Web Service API that would extend the basic Pipeline API.