Core Concepts
This page explains the fundamental concepts behind BoCoFlow and how workflows function.
Workflows
A workflow in BoCoFlow is a directed graph where:
- Nodes represent computational operations or data sources/sinks
- Edges represent the flow of data between nodes
- Execution follows the dependencies defined by the connections
Workflows are stored as JSON files that contain:
- The visual layout of nodes
- Node configurations
- Connection information
- Environment settings
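As a rough illustration, the sketch below loads a small workflow document and derives the directed graph from it. The key names (`nodes`, `edges`, `position`, `config`, `environment`) and the node ids are hypothetical, chosen only to mirror the four kinds of information listed above; they are not BoCoFlow's actual file schema.

```python
import json

# Hypothetical workflow document. Key names are illustrative assumptions,
# not BoCoFlow's real schema.
workflow_json = """
{
  "nodes": [
    {"id": "read_csv",  "position": {"x": 0,   "y": 0}, "config": {"file": "rel:data/input.csv"}},
    {"id": "filter",    "position": {"x": 200, "y": 0}, "config": {"column": "age"}},
    {"id": "write_csv", "position": {"x": 400, "y": 0}, "config": {"file": "rel:out/result.csv"}}
  ],
  "edges": [
    {"source": "read_csv", "target": "filter"},
    {"source": "filter",   "target": "write_csv"}
  ],
  "environment": {"default_env": "base"}
}
"""


def build_adjacency(workflow: dict) -> dict[str, list[str]]:
    """Map each node id to the ids of the nodes it feeds data into."""
    adjacency = {node["id"]: [] for node in workflow["nodes"]}
    for edge in workflow["edges"]:
        adjacency[edge["source"]].append(edge["target"])
    return adjacency


workflow = json.loads(workflow_json)
adjacency = build_adjacency(workflow)
print(adjacency)
```

The adjacency map is the "directed graph" view of the file: layout and configuration stay attached to each node, while the edges alone determine how data flows.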
Nodes
Nodes are the building blocks of BoCoFlow workflows. Each node represents a specific operation or function.
Node Anatomy
- Header: Contains the node name and action buttons
- Ports: Input and output connection points
  - Input ports: Accept data from preceding nodes
  - Output ports: Send data to subsequent nodes
  - Flow ports: Special ports for flow control connections
- Status indicator: Shows the current execution state
Node Configuration
Nodes are configured through a configuration modal that appears when you click the ⚙️ icon. Configuration options vary by node type but typically include:
- File paths: Locations for input/output files
- Parameters: Processing options
- Flow variables: Dynamic parameters from flow control nodes
Node Types
BoCoFlow includes several categories of nodes:
- I/O Nodes: Read and write data (CSV, text, databases, etc.)
- Manipulation Nodes: Transform and process data
- Visualization Nodes: Create visual representations of data
- Flow Control Nodes: Manage workflow execution and variables
Data Flow
Data flows through your workflow from one node to the next, following these principles:
- Node Execution: When a node executes, it processes input data and produces output data
- Data Passing: Output data from one node becomes input data to connected nodes
- Formats: Data typically passes as JSON-serialized structures
- Caching: Executed node results are cached to avoid redundant computation
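The principles above can be sketched in a few lines. This is a simplified model, not BoCoFlow's implementation: `run_node`, the in-memory `cache`, and the `calls` log are hypothetical names, and the cache is keyed by node id only, whereas a real system would also need to account for changed inputs or configuration.

```python
import json

cache: dict[str, str] = {}   # node id -> JSON-serialized output
calls: list[str] = []        # records which nodes actually computed


def run_node(node_id, func, input_json=None):
    """Run a node unless a cached result exists; data crosses nodes as JSON text."""
    if node_id in cache:
        return cache[node_id]            # cache hit: skip recomputation
    data = json.loads(input_json) if input_json is not None else None
    calls.append(node_id)
    output_json = json.dumps(func(data))
    cache[node_id] = output_json
    return output_json


# A source node produces rows; a downstream node receives them as JSON.
rows_json = run_node("source", lambda _: [{"x": 1}, {"x": 2}, {"x": 3}])
doubled_json = run_node(
    "double", lambda rows: [{"x": r["x"] * 2} for r in rows], rows_json
)

# Running "double" again returns the cached output without calling the function.
again = run_node("double", lambda rows: [], rows_json)
print(json.loads(again))
```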
Working Directory
The working directory anchors a workflow project on disk:
- It serves as the root location for your workflow project
- Relative paths in nodes are resolved against this directory
- Workflow status and execution logs are stored here
- Keeping project files under this directory and referring to them with relative paths helps keep the workflow portable
Path Handling
BoCoFlow uses a prefix system to handle file paths:
- The "abs:" prefix indicates an absolute path on your system
- The "rel:" prefix indicates a path relative to the working directory
Because "rel:" paths are resolved against the working directory, this system helps keep workflows portable across different environments.
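A minimal sketch of how such prefixes could be resolved is shown below. The function name `resolve_path` is hypothetical, and raising an error for a value with no prefix is an assumption; the source does not say how BoCoFlow treats unprefixed paths.

```python
from pathlib import Path


def resolve_path(value: str, working_dir: str) -> Path:
    """Turn a prefixed path string into a concrete filesystem path."""
    if value.startswith("abs:"):
        # Absolute path: use as given, ignoring the working directory.
        return Path(value[len("abs:"):])
    if value.startswith("rel:"):
        # Relative path: resolve against the working directory.
        return Path(working_dir) / value[len("rel:"):]
    # Assumption: unprefixed values are rejected rather than guessed at.
    raise ValueError(f"path must start with 'abs:' or 'rel:': {value!r}")


print(resolve_path("rel:data/input.csv", "/home/user/project"))
print(resolve_path("abs:/tmp/input.csv", "/home/user/project"))
```

Moving the project to another machine then only requires changing the working directory; every "rel:" path follows along.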
Flow Variables
Flow variables provide a way to parameterize your workflows:
- Global flow variables: Available to all nodes in a workflow
- Local flow variables: Created by flow control nodes and available to downstream nodes
- Use cases: Controlling file paths, setting processing parameters, creating dynamic workflows
Creating Flow Variables
- Use the "Global Flow Menu" or add flow control nodes
- Configure with a variable name and default value
- Reference in node configurations
Using Flow Variables
When configuring a node:
- Check the "Use Flow Variable" option for a parameter
- Select the desired variable from the dropdown
- The variable's value will be used during execution
Execution Model
BoCoFlow's execution follows a dependency-based model:
- Dependency Analysis: The system determines the execution order based on node connections
- Selective Execution: Only nodes that need to be executed are processed
- Force Execution: The "force_to_run" option allows re-execution even when cached results exist
- Status Tracking: Execution status is tracked and visualized for each node
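The dependency analysis and selective execution described above can be approximated with a topological sort. In this sketch, `plan_execution` is a hypothetical helper built on Python's standard `graphlib` module. The rule that re-running a node also invalidates everything downstream of it is an assumption about how caching interacts with "force_to_run".

```python
from graphlib import TopologicalSorter


def plan_execution(dependencies: dict, cached: set, forced: set) -> list:
    """Return the nodes that need to run, in a valid dependency order.

    dependencies maps each node to the set of nodes it depends on.
    """
    to_run = []
    will_run = set()
    for node in TopologicalSorter(dependencies).static_order():
        upstream_reran = any(dep in will_run for dep in dependencies.get(node, ()))
        # Run if there is no cached result, if forced, or if an input changed.
        if node not in cached or node in forced or upstream_reran:
            to_run.append(node)
            will_run.add(node)
    return to_run


dependencies = {
    "read": set(),
    "filter": {"read"},
    "plot": {"filter"},
    "write": {"filter"},
}
cached = {"read", "filter", "plot"}

# Only the node without a cached result is scheduled.
print(plan_execution(dependencies, cached, forced=set()))

# Forcing "filter" also schedules the nodes that consume its output.
print(plan_execution(dependencies, cached, forced={"filter"}))
```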
Conda Integration
BoCoFlow can integrate with Conda environments to manage Python dependencies:
- Conda Path: Path to your Conda installation
- Default Environment: The Conda environment to use for node execution
- Environment Management: Different nodes can use different environments
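One way to picture per-node environments is as a command prefix built with `conda run -n <env>`, which runs a command inside a named environment. The sketch below only constructs the command and does not execute it. The `conda_env` key, the `run_node.py` script name, and passing the Conda executable directly (rather than the installation directory) are illustrative assumptions, not BoCoFlow's actual mechanism.

```python
def build_node_command(node: dict, conda_executable: str, default_env: str,
                       script: str = "run_node.py") -> list[str]:
    """Build the command that would run a node's script in its Conda environment."""
    # Assumption: a node may name its own environment; otherwise use the default.
    env_name = node.get("conda_env") or default_env
    return [conda_executable, "run", "-n", env_name, "python", script]


conda = "/opt/conda/bin/conda"   # hypothetical location of the conda executable

print(build_node_command({"id": "read_csv"}, conda, default_env="base"))
print(build_node_command({"id": "train", "conda_env": "ml"}, conda, default_env="base"))
```

A list like this could be passed to `subprocess.run` to launch the node in the selected environment.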
Next Steps
Now that you understand the core concepts, learn more about: