Flowfile is a visual ETL tool and Python library suite that combines drag-and-drop workflow building with the speed of Polars dataframes. Build data pipelines visually, transform data using powerful nodes, or define data flows programmatically with Python and analyze results, all with high-performance data processing. Export your visual flows as standalone Python/Polars code for production deployment.
Perform complex joins (fuzzy matching), text-to-rows transformations, and advanced filtering/grouping using a visual interface.
Export your visual flows as standalone Python/Polars scripts. Deploy workflows without Flowfile dependencies or share ETL logic as readable code.
Standardize data formats and handle messy Excel files efficiently.
Built to scale out-of-core using Polars for lightning-fast data processing.
Save flows as human-readable YAML or JSON files, making them portable and version-control friendly.
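To see why a serialized flow diffs well under version control, consider a nodes-and-edges document like the one below. The schema shown is purely illustrative (the node types and field names are invented for this sketch, not Flowfile's actual format):

```python
import json

# Illustrative only: the real Flowfile schema may differ. The point is
# that a flow serializes to a plain nodes-and-edges document that is
# easy to diff and review in version control.
flow = {
    "name": "example_flow",
    "nodes": [
        {"id": 1, "type": "read_csv", "settings": {"path": "input.csv"}},
        {"id": 2, "type": "filter", "settings": {"expr": "value > 150"}},
    ],
    "edges": [{"from": 1, "to": 2}],
}

serialized = json.dumps(flow, indent=2)
roundtrip = json.loads(serialized)
```

Adding or reordering a node shows up as a small, reviewable diff rather than an opaque binary change.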
Flowfile is designed to be flexible. Choose the installation method that fits your workflow.
- Python 3.10+
- Node.js 16+ (for frontend development)
- Poetry (Python package manager)
- Docker & Docker Compose (optional, for Docker setup)
- Make (optional, for build automation)
Install Flowfile directly from PyPI. This gives you both the visual UI and the programmatic flowfile_frame API.
```bash
pip install Flowfile
```

Launch the Visual UI: Start the web-based UI with a single command:

```bash
flowfile run ui
```

Use the FlowFrame API: Create pipelines programmatically using a Polars-like syntax:
```python
import flowfile as ff
from flowfile import col, open_graph_in_editor

# Create a data pipeline
df = ff.from_dict({
    "id": [1, 2, 3, 4, 5],
    "category": ["A", "B", "A", "C", "B"],
    "value": [100, 200, 150, 300, 250]
})

# Process the data
result = df.filter(col("value") > 150).with_columns([
    (col("value") * 2).alias("double_value")
])

# Open the graph in the web UI
open_graph_in_editor(result.flow_graph)
```

For more details, see the flowfile_frame documentation.
Run the full suite (Frontend, Core, Worker) using Docker Compose. Ideal for server deployments or local isolation.
```bash
git clone https://github.com/edwardvaneechoud/Flowfile.git
cd Flowfile
docker compose up -d
```

Access the app at http://localhost:8080.
The desktop version offers the best experience for non-technical users with a native interface and integrated backend services.
Option A: Download Pre-built Application
Download the latest release from GitHub Releases and run the installer for your platform (Windows, macOS, or Linux).
Note: You may see security warnings since the app isn't signed with a developer certificate yet.
- Windows: Click "More info" → "Run anyway"
- macOS: If you see an "app is damaged" error, run this in Terminal:

```bash
find /Applications/Flowfile.app -exec xattr -c {} \;
```

Then open the app normally. This clears the quarantine flag that macOS sets on downloaded apps.
Option B: Build from Source
```bash
git clone https://github.com/edwardvaneechoud/Flowfile.git
cd Flowfile

# Build packaged executable
make  # Creates platform-specific executable

# Or manually:
poetry install
poetry run build_backends
cd flowfile_frontend
npm install
npm run build
```

For a zero-setup experience, try the WASM version. It runs entirely in your browser using Pyodide (no server required).
Live Demo: demo.flowfile.org
This lite version includes 14 essential nodes for data transformation:
- Input: Read CSV, Manual Input
- Transformation:
  - Basic: Filter, Select, Sort, Unique, Take Sample
  - Reshape: Group By, Pivot, Unpivot, Join
  - Advanced: Polars Code (write custom Python/Polars logic)
- Output: Preview (view in browser), Download (CSV or Parquet)
For contributors who need hot-reloading and direct access to services.
```bash
git clone https://github.com/edwardvaneechoud/Flowfile.git
cd Flowfile
poetry install

# Start backend services
poetry run flowfile_worker  # Starts worker on :63579
poetry run flowfile_core    # Starts core on :63578

# Start frontend (in a new terminal)
cd flowfile_frontend
npm install && npm run dev:web  # Starts web interface on :8080
```

One of the most powerful features is the ability to visualize your data transformation pipelines:
- Inspect Data Flow: See exactly how your data is transformed step by step
- Debugging: Identify issues in your data pipeline visually
- Documentation: Share your data transformation logic with teammates
- Iteration: Modify your pipeline in the Designer UI and export it back to code
Flowfile operates as three interconnected services:
- Designer (Electron + Vue): Visual interface for building data flows
- Core (FastAPI): ETL engine using Polars for data transformations (:63578)
- Worker (FastAPI): Handles computation and caching of data operations (:63579)
Each flow is represented as a directed acyclic graph (DAG), where nodes represent data operations and edges represent data flow between operations. You can export any visual flow as standalone Python/Polars code for production use.
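The execution model behind a DAG can be sketched with the standard library: each node lists its dependencies, and a topological sort guarantees every operation runs only after its inputs exist. This is a simplified sketch, not Flowfile's actual scheduler:

```python
from graphlib import TopologicalSorter

# Simplified sketch of DAG execution ordering (not Flowfile's scheduler).
# Keys are nodes; values are the nodes they depend on.
flow = {
    "read_csv": set(),
    "filter": {"read_csv"},
    "group_by": {"filter"},
    "join": {"filter", "read_csv"},
    "write_parquet": {"group_by", "join"},
}

# static_order() yields nodes so that dependencies always come first.
order = list(TopologicalSorter(flow).static_order())
print(order)
```

Independent branches (here, `group_by` and `join`) have no ordering constraint between them, which is what makes parts of a DAG parallelizable.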
For a deeper dive, check out this article on our architecture.
- Add cloud storage support
  - S3 integration
  - Azure Data Lake Storage (ADLS)
- Multi-flow execution support
- Polars code reverse engineering
  - Generate Polars code from visual flows (via the "Generate code" button)
  - Import existing Polars scripts and convert to visual flows
- Add comprehensive docstrings
- Create detailed node documentation
- Add architectural documentation
- Improve inline code comments
- Create user guides and tutorials
- Implement proper testing
- Add CI/CD pipeline
- Improve error handling
- Add monitoring and logging