Graph
Graph is the pipeline: a set of nodes and the directed edges
between them. Building a graph only describes the pipeline; nothing runs until
execute() is called (see Execution).
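from biocomposer import Graph
g = Graph()
a = g.add_input_node(sequences="/vol/inputs/family.fasta")  # input node holding the FASTA path
b = g.add_node("clustalo")  # tool node
g.add_edge((a, b))  # wire the input node into clustalo
g.set_output_node(b)  # execute() will return clustalo's result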
Graphs are defined using nodes (bioinformatics tools) and edges (auto-generated mapper functions) that enable data flow between nodes. There are six types of nodes, covered in Nodes. Execution (run order, input merging, and results) is covered in Execution.
Edges
add_edge takes one or more (upstream, downstream) tuples and records the
wiring; it also tracks fan-out (how many downstream nodes each node feeds), which the
executor uses for caching. Rules:
- A node may have several incoming edges.
- An InputNode may only be a source.
- Both endpoints may be nodes or subgraphs.
- Edge order matters when two upstreams collide on a key; see Input merging and key clashes on the Execution page.
g.add_edge((rfdiffusion, proteinmpnn), (mpnn_in, proteinmpnn)) # two edges, one call
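The bookkeeping described above (recording the wiring and counting fan-out) can be pictured with a small stand-alone sketch. This is not biocomposer's implementation: `EdgeRegistry`, its attributes, and the `"scorer"` node name are hypothetical, and nodes are represented by plain strings to keep the example self-contained.

```python
from collections import defaultdict


class EdgeRegistry:
    """Hypothetical stand-in for edge bookkeeping; not biocomposer code."""

    def __init__(self):
        self.edges = []                  # (upstream, downstream), in insertion order
        self.fan_out = defaultdict(int)  # upstream -> number of downstream nodes it feeds
        self.input_nodes = set()         # names that may only act as sources

    def add_edge(self, *edges):
        for edge in edges:
            if len(edge) != 2:
                raise ValueError(f"expected (upstream, downstream), got {edge!r}")
            upstream, downstream = edge
            # Rule from the text: an input node may only be a source.
            if downstream in self.input_nodes:
                raise ValueError(f"{downstream!r} is an input node and cannot be a target")
            self.edges.append((upstream, downstream))
            self.fan_out[upstream] += 1


registry = EdgeRegistry()
registry.input_nodes.add("mpnn_in")
# Two edges in one call, mirroring the example above.
registry.add_edge(("rfdiffusion", "proteinmpnn"), ("mpnn_in", "proteinmpnn"))
# A second downstream for rfdiffusion raises its fan-out to 2.
registry.add_edge(("rfdiffusion", "scorer"))
```

Keeping the edges in insertion order matters in this sketch because the text states that edge order decides what happens when two upstreams collide on a key; how the real executor resolves such a clash is described on the Execution page, not here.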
Output nodes
set_output_node(node) marks which node’s result execute()
returns. Execution starts from the output node(s) and walks backward.
A graph may have several outputs; execute() returns
one result per output node. See Output nodes.
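The backward walk from the output node(s) can be illustrated with a depth-first traversal over an upstream lookup table. The sketch below is an assumption about the general technique, not the library's executor: `execution_order` is a hypothetical helper, nodes are plain strings, and cycle detection is omitted.

```python
def execution_order(edges, outputs):
    """Return the nodes needed to produce `outputs`, upstream nodes first."""
    # Map each node to its upstream nodes, preserving edge order.
    upstreams = {}
    for up, down in edges:
        upstreams.setdefault(down, []).append(up)

    order, visited = [], set()

    def visit(node):
        if node in visited:
            return
        visited.add(node)
        for up in upstreams.get(node, []):
            visit(up)        # resolve dependencies before the node itself
        order.append(node)

    for out in outputs:
        visit(out)
    return order


edges = [("a", "b"), ("b", "c"), ("a", "d"), ("x", "y")]
print(execution_order(edges, ["c"]))       # ['a', 'b', 'c']
print(execution_order(edges, ["c", "d"]))  # ['a', 'b', 'c', 'd']
```

Nodes that no output depends on (`x` and `y` here) never appear in the order, which is the practical consequence of starting from the outputs rather than from the inputs.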
Reference
- class biocomposer.Graph
Bases: object
- add_decision_node(score_fn: str, conditions: list, modifier_tool) DecisionNode
- add_edge(*edges)
Wire nodes together. Each edge is a 2-tuple (upstream_node, downstream_node).
- add_gather_node(tool_name: str, split_key: str, gpu: str = None, args_override: str = None, entrypoint_override: str = None) GatherNode
Scatter an upstream collection (the upstream output key split_key) over a cardinality-one inner tool, running it once per item and gathering the per-item outputs into a dict-of-lists.
- add_input_node(**kwargs) InputNode
- add_node(tool_name: str, gpu: str = None, args_override: str = None, entrypoint_override: str = None) Node
- execute()
- set_output_node(node: Node = None)
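The scatter/gather behaviour described for add_gather_node (run a one-item tool once per element of split_key, then collect the per-item outputs into a dict-of-lists) can be sketched in plain Python. Everything below is illustrative: `scatter_gather` and `fake_tool` are hypothetical names, and the sketch assumes the tool returns a dict with the same keys for every item.

```python
def scatter_gather(tool, upstream_output, split_key):
    """Run `tool` once per item under `split_key` and gather results by key."""
    gathered = {}
    for item in upstream_output[split_key]:
        result = tool(item)  # one invocation per item (the scatter step)
        for key, value in result.items():
            # Gather step: append each per-item value to that key's list.
            gathered.setdefault(key, []).append(value)
    return gathered


def fake_tool(sequence):
    # Hypothetical cardinality-one tool: takes one sequence, returns one dict.
    return {"length": len(sequence), "upper": sequence.upper()}


out = scatter_gather(fake_tool, {"sequences": ["acg", "tt"]}, "sequences")
print(out)  # {'length': [3, 2], 'upper': ['ACG', 'TT']}
```

In this sketch, position i in every output list corresponds to item i of the input collection; that alignment only holds if each invocation returns the same set of keys.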