Connectors
==========

A connector is an LLM-generated Python function that maps one tool's output
dict onto the next tool's input dict. One connector is generated per edge.

Generation
----------

The first time an edge executes, biocomposer builds a prompt from:

- the downstream tool's input schema (each field's ``type``, ``cardinality``,
  ``format``, ``required``);
- the upstream tool's output schema;
- a snapshot of the **actual runtime data** (directory listings with file
  heads, list shapes), so the mapping is decided against what really exists,
  not just the declared types.

The LLM returns ``edge__to_(output) -> dict``.

Caching
-------

A connector is keyed by edge identity ``(upstream_id, downstream_id)`` and
written to a connectors file. It is then reused for every later call: across
decision-loop iterations and across every item of a gather node. A loop or an
N-way fan-out reuses one generated function; there is no per-iteration or
per-item regeneration.

What a connector does
---------------------

The prompt directs the model to reason in steps:

- **Entity match**: map a field only when it holds the same semantic data as a
  downstream field; matching ``type`` alone is not enough.
- **Format conversion**: when the downstream expects a different format (e.g. a
  FASTA file vs. an in-memory sequence), write the conversion.
- **Cardinality**: when the downstream wants one item but the upstream is a
  directory or list of many, select the single primary file, excluding
  trajectory/intermediate/temporary artifacts. For a **map** edge the same step
  instead returns the *list* of per-item input dicts (:doc:`nodes/gather`).

Input nodes bypass connectors
-----------------------------

Connectors are not generated for user-provided :doc:`input node ` values; those
are applied verbatim, so the model can never substitute a manifest example
value for a real one. This is also why input-node values override upstream
outputs on a key clash (:ref:`merge-order`).