Tool nodes

A tool node wraps one tool from the registry and runs it once per execution.

align = g.add_node("clustalo")

Creation installs the tool

add_node():

  1. waits for the container backend (Docker) to be ready;

  2. adds the tool to the project and pulls its image (bv add / bv sync), restoring from the on-disk image cache when present;

  3. reads the tool’s manifest (bv show --format json) and stores it on the node.

The manifest is what the rest of the system reasons over: the tool’s typed inputs and outputs drive connector generation, and its command template drives execution. See Tools and the registry for the manifest format.

What happens when it runs

During execute(), a tool node:

  1. receives its inputs (mapped from upstream outputs by the edge connectors);

  2. is assigned a fresh output directory, results/<tool>_output_N. N auto-increments by scanning the filesystem, so repeated runs of the same tool never collide;

  3. runs the tool in its container via bv inside a temporary sandbox, then harvests the declared outputs back into that output directory;

  4. returns a dictionary mapping each output name to its produced path (a directory path for type = "dir" outputs, a file path otherwise).
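The directory numbering in step 2 can be pictured with a small sketch. This is not biocomposer's own code: the helper name next_output_dir, the starting index of 1, and the rule "one past the highest existing N" are assumptions made for illustration only.

```python
import re
import tempfile
from pathlib import Path


def next_output_dir(results_root: Path, tool: str) -> Path:
    """Create and return results_root/<tool>_output_N for the next free N.

    Hypothetical helper: numbering starts at 1 and continues one past the
    highest N already present on disk.
    """
    results_root.mkdir(parents=True, exist_ok=True)
    pattern = re.compile(rf"^{re.escape(tool)}_output_(\d+)$")

    # Scan the filesystem for directories already claimed by this tool.
    highest = 0
    for entry in results_root.iterdir():
        match = pattern.match(entry.name)
        if entry.is_dir() and match:
            highest = max(highest, int(match.group(1)))

    out_dir = results_root / f"{tool}_output_{highest + 1}"
    out_dir.mkdir()
    return out_dir


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "results"
    first = next_output_dir(root, "clustalo")
    second = next_output_dir(root, "clustalo")
    other = next_output_dir(root, "trimal")
    names = [first.name, second.name, other.name]
    print(names)
```

Because the counter is derived from what is on disk rather than held in memory, a second run of the same tool gets a new directory even in a fresh process.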

Cardinality determines composition

A tool input declared with cardinality = "one" consumes exactly one item per run. If an upstream node produces a collection and the next tool’s input is "one", wrap that tool in a gather node so it runs once per item. This is the single most common structural decision when building a pipeline. An input with cardinality = "many" accepts a list directly, so no gather node is needed.
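The rule above reduces to a two-input decision. The sketch below states it as a plain function; needs_gather is a hypothetical name used only for illustration and is not part of the biocomposer API.

```python
def needs_gather(upstream_is_collection: bool, cardinality: str) -> bool:
    """Return True when the downstream tool should be wrapped in a gather node.

    Hypothetical helper that restates the cardinality rule: a gather node is
    needed only when a collection feeds an input that takes one item per run.
    """
    if cardinality not in ("one", "many"):
        raise ValueError(f"unknown cardinality: {cardinality!r}")
    return upstream_is_collection and cardinality == "one"


# Collection feeding a "one" input: run the tool once per item.
print(needs_gather(True, "one"))
# Collection feeding a "many" input: the tool takes the list directly.
print(needs_gather(True, "many"))
# Single item feeding a "one" input: a plain tool node is enough.
print(needs_gather(False, "one"))
```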

Overriding the command

By default a node runs the manifest’s entrypoint with its argument template. Two overrides change that; {slot} placeholders map to the node’s input/output names.

Different binary, for images that expose several programs:

g.add_node("trimal", entrypoint_override="readal")

Different argument template, to change flags or argument order:

g.add_node("colabfold",
           args_override="--num-recycle 0 {fasta} {output_dir}")

Both, common when invoking a non-default binary:

g.add_node("trimal",
           entrypoint_override="readal",
           args_override="-in {alignment} -out {trimmed} -fasta")

Overrides replace the corresponding manifest field for that node only; everything else (image, typed I/O) is unchanged.
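To make the {slot} substitution concrete, here is a minimal sketch of how an entrypoint and an argument template could be turned into an argument list. It assumes placeholders behave like Python str.format fields, which this page does not state; render_command and the example paths are hypothetical and not biocomposer's implementation.

```python
import shlex


def render_command(entrypoint: str, args_template: str, slots: dict) -> list:
    """Fill {slot} placeholders and return an argv-style list.

    Hypothetical helper: each value is shell-quoted before substitution so a
    path containing spaces stays a single argument after splitting.
    """
    quoted = {name: shlex.quote(str(value)) for name, value in slots.items()}
    # str.format raises KeyError if the template names a slot we do not have.
    rendered = args_template.format(**quoted)
    return [entrypoint] + shlex.split(rendered)


argv = render_command(
    "readal",
    "-in {alignment} -out {trimmed} -fasta",
    {"alignment": "in/aln.fa", "trimmed": "out/trimmed.fa"},
)
print(argv)
```

Building a list instead of a single shell string avoids a second round of shell parsing when the command is eventually executed.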

Reference

class biocomposer.Node(tool_name: str, inNodes: list = None, args_override: str = None, entrypoint_override: str = None)

Bases: object

get_node_input()
install()
run(inputs: dict) → dict