DataFlow: An AI-Powered Data Annotation Platform
How DataFlow uses Meta's SAM 3 to auto-label image datasets and export them in COCO format — and why I built it the way I did.
Data labeling is one of the biggest hidden bottlenecks in ML projects. In this post, I walk through how I built DataFlow — a local, AI-powered annotation platform that takes a zip of images, accepts your label classes, and uses Meta's SAM 3 to automatically generate pixel-perfect segmentation masks across your entire dataset. No cloud, no manual polygon drawing, and no labeling team required. The result is a clean COCO-format dataset ready for any training pipeline — in an afternoon.
There's a problem that doesn't get talked about enough in the ML world: labeling data is genuinely miserable.
You have a dataset of, say, 2,000 product images. You need pixel-perfect segmentation masks. Your options are either to pay a labeling service that takes weeks and costs more than your compute budget, or to spend evenings manually drawing polygons around objects in some clunky annotation tool — only to realize you missed a class halfway through.
I ran into this wall on a personal project. And instead of accepting it, I built DataFlow — a local, end-to-end annotation platform that takes a zip of images, accepts your label classes, runs Meta's Segment Anything Model 3 (SAM 3) to auto-label everything, shows you the results with visual analytics, and exports a clean COCO-format dataset ready to plug into any training pipeline.
This post walks through why I made the decisions I did, how the pipeline actually works under the hood, and what surprised me along the way.
The Problem With Existing Annotation Tools
Before building anything, I spent time with the popular options — LabelImg, CVAT, Roboflow, Label Studio. They're all solid tools. But they share a fundamental assumption: a human is doing the labeling.
Even the ones with "AI assist" features treat automation as a secondary helper. You still click, you still drag, you still review every single polygon. For small datasets that's fine. For anything at scale, it becomes a bottleneck that quietly kills your momentum.
What I actually wanted was a tool where I could say: here are my images, here are my 4 classes — and then walk away. Come back to a labeled dataset. That didn't exist in a simple, local, open-source form. So I built it.
The Stack
Before getting into the pipeline, here's what DataFlow is built on:
- Meta SAM 3 — the segmentation backbone
- Streamlit — for the entire UI (more on this choice later)
- Python — orchestrating everything in between
- COCO JSON format — the export target
No cloud. No API keys. Runs entirely on your machine.
How the Pipeline Works
Step 1 — Upload Your Dataset
The user starts by uploading a .zip file containing their image dataset. DataFlow unpacks it, reads the image files, and queues them up for processing.
This sounds simple, but the zip-based approach was a deliberate choice. Most annotation workflows assume you're working with images already on a server. A zip upload means DataFlow works completely offline, which matters for teams working with proprietary or sensitive datasets that can't leave their network.
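As a rough sketch of what that intake step involves (the function name and extension list here are illustrative, not DataFlow's actual code):

```python
import io
import zipfile
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}

def unpack_dataset(zip_bytes: bytes, workdir: str) -> list[Path]:
    """Extract an uploaded zip and return the image paths found inside."""
    out = Path(workdir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        zf.extractall(out)
    # Queue up every image file, skipping directories and non-image files
    return sorted(p for p in out.rglob("*") if p.suffix.lower() in IMAGE_EXTS)
```

Everything downstream works off this list of local paths, which is what makes the fully offline workflow possible.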
Step 2 — Define Your Labels
Next, the user types in their label classes — things like ["car", "pedestrian", "road", "sky"]. Whatever they need.
Here's the key design decision: the number of classes the user enters directly determines the segmentation configuration. If you enter 4 labels, DataFlow configures SAM 3 to produce 4-class segmentation across the entire dataset. There's no separate configuration step. The label list is the configuration.
This keeps the UX dead simple. You don't need to understand model internals. You just describe what you're looking for.
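The "label list is the configuration" idea can be sketched in a few lines (the helper name is mine, not DataFlow's internals):

```python
def labels_to_categories(labels: list[str]) -> list[dict]:
    """Turn the user's label list directly into COCO-style categories.
    The list *is* the configuration: its length sets the number of classes."""
    cleaned = [label.strip() for label in labels if label.strip()]
    return [{"id": i + 1, "name": name} for i, name in enumerate(cleaned)]
```

The same structure is reused verbatim as the `categories` section of the final COCO export, so there is no second place where class definitions can drift out of sync.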
Step 3 — SAM 3 Does the Heavy Lifting
This is where the real work happens.
Meta's SAM 3 (Segment Anything Model 3) is a prompt-driven segmentation model. It can take point prompts, bounding box prompts, or text prompts and return precise segmentation masks. DataFlow uses it in an automated loop — iterating through each image, running inference, and mapping the model's output masks to the user's defined label classes.
For each image, DataFlow generates prompts automatically based on the label classes, runs inference, collects the masks, and stores them alongside the image metadata. The whole thing runs sequentially through your dataset — no manual intervention needed once it starts.
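The shape of that loop looks roughly like this. `segment_fn` is a hypothetical stand-in for the actual SAM 3 call — the real prompt construction and inference details live inside the model wrapper, and the exact API is an assumption here:

```python
def auto_label(image_paths, labels, segment_fn):
    """Sequentially run segmentation over the whole dataset.
    `segment_fn(image_path, label)` stands in for the SAM 3 inference call;
    it returns a mask record, or None when nothing matches the prompt."""
    results = []
    for path in image_paths:
        for label in labels:  # one prompt per user-defined class
            mask = segment_fn(path, label)
            if mask is not None:
                results.append({"image": path, "label": label, "mask": mask})
    return results
```

Keeping the model behind a plain callable also makes the pipeline easy to test with a stub before pointing it at real weights.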
Step 4 — Review the Results
Once the labeling pass is complete, Streamlit renders a results dashboard. This includes:
- A sample grid showing annotated images with their segmentation masks overlaid
- Per-class distribution charts (so you can immediately spot class imbalance)
- Confidence score histograms from SAM 3's output
- Total annotation counts across the dataset
This step matters more than it might seem. Automated labeling is not perfect — no model is. The analytics view lets you catch systematic errors fast. If one class consistently has low confidence scores, that's a signal to review those annotations before training on them. Better to catch it here than after a 12-hour training run.
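A minimal sketch of the statistics behind those charts — the record fields (`label`, `score`) and the 0.5 threshold are assumptions about the internal format, not DataFlow's exact values:

```python
from collections import Counter

def class_stats(annotations, low_conf=0.5):
    """Summarize per-class counts and flag classes whose mean confidence
    falls below a threshold -- the signal to review before training."""
    counts = Counter(a["label"] for a in annotations)
    scores_by_class = {}
    for a in annotations:
        scores_by_class.setdefault(a["label"], []).append(a["score"])
    flagged = [c for c, s in scores_by_class.items() if sum(s) / len(s) < low_conf]
    return dict(counts), flagged
```

The counts feed the distribution chart (where class imbalance shows up instantly), and the flagged list drives the "review these before training" warning.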
Step 5 — Export to COCO Format
The final step is export. One click, and DataFlow generates a COCO-format JSON file alongside the images.
COCO is the industry standard for image segmentation datasets. Every major training framework — PyTorch, Detectron2, MMDetection, Ultralytics YOLO — can ingest it directly. The export structure looks like this:
```json
{
  "images": [
    { "id": 1, "file_name": "image_001.jpg", "width": 1280, "height": 720 }
  ],
  "annotations": [
    {
      "id": 1,
      "image_id": 1,
      "category_id": 2,
      "segmentation": [[x1, y1, x2, y2, ...]],
      "area": 14520.0,
      "bbox": [120, 80, 300, 200],
      "iscrowd": 0
    }
  ],
  "categories": [
    { "id": 1, "name": "car" },
    { "id": 2, "name": "pedestrian" }
  ]
}
```
Download the zip. Drop it into your training pipeline. Done.
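Assembling that structure is mostly bookkeeping once the annotations exist. A simplified sketch (the input record shapes are my assumptions, not DataFlow's internal types):

```python
import json

def build_coco(images, annotations, labels):
    """Assemble the COCO export dict from per-image and per-annotation
    records. Field names follow the COCO segmentation schema."""
    return {
        "images": [
            {"id": i + 1, "file_name": im["file_name"],
             "width": im["width"], "height": im["height"]}
            for i, im in enumerate(images)
        ],
        "annotations": [
            {"id": i + 1, "image_id": a["image_id"],
             "category_id": a["category_id"],
             "segmentation": a["segmentation"],
             "area": a["area"], "bbox": a["bbox"], "iscrowd": 0}
            for i, a in enumerate(annotations)
        ],
        "categories": [{"id": i + 1, "name": n} for i, n in enumerate(labels)],
    }
```

The result serializes straight to JSON with `json.dumps`, which is all the export step really is.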
Why Streamlit?
I knew this question would come up. Streamlit isn't the obvious choice for something like this — it's usually associated with quick demos and dashboards, not full annotation platforms.
But for DataFlow, it was exactly right. Here's my honest reasoning:
The goal was to eliminate friction. A React frontend with a FastAPI backend would have been more flexible, but it would have also meant more setup, more configuration, more things to break. With Streamlit, the entire app — file upload, label input, progress tracking, analytics, export — lives in a single Python file. You clone the repo, run `streamlit run app.py`, and you're working.
For a tool that's meant to be local and fast to spin up, that matters enormously.
The trade-off is that Streamlit's interactivity model has limits. Real-time progress during SAM inference required some workarounds with st.progress() and session state. But it's manageable, and the simplicity elsewhere more than compensates.
What Actually Surprised Me
SAM 3 is remarkably good at zero-shot segmentation. I expected to spend a lot of time tuning prompt strategies. In practice, the masks it produces — especially on clean, well-lit images — are genuinely impressive without any fine-tuning.
Class imbalance shows up immediately. The analytics dashboard made it obvious, on the very first dataset I tested, that one class was severely underrepresented. Without the distribution chart, I would have trained on that dataset and only discovered the problem after the fact.
COCO format is both great and annoying. It's the right choice — universal support, well-documented, battle-tested. But generating correct polygon segmentation arrays from binary masks requires careful handling of contour extraction. Getting this right took longer than anything else in the project.
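To make the contour point concrete: the `bbox` and `area` fields fall straight out of a binary mask, while the polygon `segmentation` is the part that needs careful contour extraction (e.g. via OpenCV). Here's a sketch of the easy half only — simplified, and not DataFlow's actual code:

```python
import numpy as np

def mask_to_bbox_area(mask: np.ndarray):
    """Derive COCO `bbox` ([x, y, w, h]) and `area` from a binary mask.
    The polygon `segmentation` field additionally requires contour
    extraction, which is where the fiddly work lives."""
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None, 0.0  # empty mask: nothing to annotate
    x0, y0 = int(xs.min()), int(ys.min())
    w = int(xs.max()) - x0 + 1
    h = int(ys.max()) - y0 + 1
    return [x0, y0, w, h], float(mask.sum())
```

Edge cases like empty masks, holes, and disconnected regions are exactly what made the full polygon conversion the slowest part of the project.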
Who DataFlow Is For
If you're an ML engineer who regularly works with custom image datasets and wants to stop spending evenings in annotation tools, DataFlow is built for you.
If you're building a computer vision product and need to iterate quickly on labeled datasets without a dedicated labeling team, this fits that workflow.
If you're a founder or PM evaluating whether a vision model is feasible for your use case, DataFlow lets you go from raw images to a trainable dataset in an afternoon — which is exactly the kind of fast feedback loop that matters early in a product.
What's Next
DataFlow works well for what it is, but there are real gaps I want to close:
- Manual correction UI — SAM 3 isn't perfect, and right now there's no way to fix a bad mask in-app. A simple polygon editor would close this loop.
- Batch class assignment — right now you can't reassign a label after the fact without re-running inference. That needs to change.
- Support for video datasets — SAM 3 has video capabilities. Extending DataFlow to handle frame-by-frame video annotation is the most interesting direction from a technical standpoint.
Final Thoughts
The annotation bottleneck is real, and it quietly slows down a huge number of ML projects that would otherwise be viable. DataFlow doesn't solve every part of that problem, but it solves the part that frustrated me most — the manual, repetitive labeling work on datasets where you already know what you're looking for.
SAM 3 makes automated segmentation genuinely tractable now in a way it wasn't even two years ago. Wrapping it in a simple, local tool that any developer can run without infrastructure setup felt like the obvious thing to build.
If you're working on something where this would be useful, I'd genuinely love to hear about your use case.
Built with Meta SAM 3, Streamlit, and Python. COCO format export. Runs locally — no cloud required.