Orkana — One platform for data and agents

The integration tax

Enterprises want AI agents.
They get stuck wiring 10+ tools.

To put one production agent in front of real enterprise data, teams stitch together a stack of disconnected tools — each with its own auth, deployment, vendor, and on-call rotation.

Most projects never finish the plumbing. The agent ships late, on stale data, behind a queue of integration tickets.

Orkana replaces all of it

One platform. One auth. One deployment. One on-call.

13+ vendors to wrangle

13 contracts · 13 SSOs · 13 on-call rotations + custom governance code

Airbyte · Connectors
Airflow · Pipelines
Jupyter · Notebooks
Apache Spark · Processing
Trino · Federation
Pinecone · Vector DB
LangChain · Agents
n8n / Zapier · Glue
Apache Kafka · Streaming
Keycloak · Identity
DBeaver · Exploration
Grafana · Visualization

Time cost

12–18 mo

to wire a production stack before any AI value ships.

Money cost

$1–5M

in licenses, integration work, and headcount per enterprise, per year.

Outcome

95%

of enterprise GenAI pilots deliver zero measurable P&L impact.

Source · MIT, 2025

Product

Build, run, and govern everything in one workspace.

From raw Kafka stream to deployed agent. Visual workflows, native compute, agent runtime, dashboards, metadata explorer — all on the same data plane, same auth, same deployment.

Visual workflows

Drag-and-drop DAGs where every node runs real compute — Python, TypeScript, SQL, Spark, Trino, agents, HTTP, Kafka publish.

Agent runtime

Production-grade agents with built-in tool orchestration, retrieval, self-evaluation, and RBAC — talking to your data through the same engine.

Metadata explorer

Browse every schema, Kafka topic, S3 bucket, vector index. Columns, types, lineage and nullability inline — no second BI tool.

Dashboards & BI

Drag-and-drop charts on top of any query or workflow output. Track pipeline health, agent traces, and KPIs in the same surface.

100+ connectors

Databases, warehouses, S3, Google Drive, Slack, Telegram, Kafka, webhooks. Add a custom one in TypeScript in under an hour.

Sovereign deploy

Run on AWS, Azure, GCP, or fully on-prem and air-gapped. Domain-based isolation lets one deployment serve twenty teams safely.

Data layer

Visual workflows.
Real compute.

A single canvas where every node is a real operation — a Spark job, a Trino query, a Python function, an HTTP call, an LLM step, a Slack notification, or a Kafka publish.

Python · TypeScript · SQL runtimes inline
Spark, Trino, Kafka as first-class engines
Branching, retries, scheduling built in
Runs, logs, traces, lineage for every execution
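
To make "retries built in" and "logs for every execution" concrete, here is a minimal sketch of a DAG runner that executes nodes in dependency order, retries a failing node, and logs every attempt. All names here (`WorkflowNode`, `runWorkflow`, `maxAttempts`) are hypothetical and chosen for illustration; this is not Orkana's actual workflow API, and branching and scheduling are left out.

```typescript
// Hypothetical types for illustration only; not Orkana's workflow API.
interface WorkflowNode {
  id: string;
  dependsOn: string[]; // ids of upstream nodes
  maxAttempts: number; // 1 means no retry
  run: (inputs: Record<string, unknown>) => unknown;
}

interface RunLog {
  nodeId: string;
  attempt: number;
  ok: boolean;
}

function runWorkflow(nodes: WorkflowNode[]): { outputs: Map<string, unknown>; logs: RunLog[] } {
  const outputs = new Map<string, unknown>();
  const logs: RunLog[] = [];
  const pending = new Map<string, WorkflowNode>();
  for (const n of nodes) pending.set(n.id, n);

  while (pending.size > 0) {
    // A node is ready once every upstream node has produced an output.
    const ready = Array.from(pending.values()).filter((n) =>
      n.dependsOn.every((d) => outputs.has(d)),
    );
    if (ready.length === 0) throw new Error("cycle or missing dependency");

    for (const node of ready) {
      const inputs: Record<string, unknown> = {};
      for (const d of node.dependsOn) inputs[d] = outputs.get(d);

      let ok = false;
      for (let attempt = 1; attempt <= node.maxAttempts && !ok; attempt++) {
        try {
          outputs.set(node.id, node.run(inputs));
          ok = true;
        } catch {
          // The failure is recorded in the log below; the loop retries if attempts remain.
        }
        logs.push({ nodeId: node.id, attempt, ok });
      }
      if (!ok) throw new Error(`node ${node.id} failed after ${node.maxAttempts} attempts`);
      pending.delete(node.id);
    }
  }
  return { outputs, logs };
}
```

A node that throws once and then succeeds leaves two log entries, one failed and one successful, which is the kind of per-execution record the list above describes.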

WORKFLOW · orkana-agent-rag · ● running · 7 nodes

Agent layer

Agents that actually know your data.

Build agents on the same canvas as your pipelines. They query your warehouse, read from Drive and Slack, call internal APIs, and respect RBAC — out of the box.

RAG over pgvector with live re-indexing
Tool orchestration with execution traces
Self-evaluation loops with retry on low confidence
OIDC / SSO / RBAC end-to-end
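
One way a "self-evaluation loop with retry on low confidence" can work is sketched below: generate an answer, score it, and try again with the evaluator's feedback while the score stays under a threshold. The function names, the 0.8 threshold, and the three-round budget are assumptions made for this example; the source does not describe how the agent runtime implements the loop.

```typescript
// Hypothetical shapes for illustration; not the agent runtime's real interfaces.
interface Draft {
  answer: string;
  confidence: number; // expected in [0, 1]
}

type Generate = (question: string, feedback: string | null) => string;
type Evaluate = (question: string, answer: string) => { confidence: number; feedback: string };

function answerWithSelfEval(
  question: string,
  generate: Generate,
  evaluate: Evaluate,
  minConfidence = 0.8, // assumed threshold
  maxRounds = 3, // assumed retry budget
): Draft & { rounds: number } {
  let best: Draft = { answer: "", confidence: -1 };
  let feedback: string | null = null;

  for (let round = 1; round <= maxRounds; round++) {
    const answer = generate(question, feedback);
    const verdict = evaluate(question, answer);

    // Keep the highest-scoring draft seen so far.
    if (verdict.confidence > best.confidence) {
      best = { answer, confidence: verdict.confidence };
    }
    // Stop as soon as a draft clears the threshold.
    if (verdict.confidence >= minConfidence) {
      return { ...best, rounds: round };
    }
    // Otherwise feed the critique into the next attempt.
    feedback = verdict.feedback;
  }
  // Budget exhausted: return the best draft found.
  return { ...best, rounds: maxRounds };
}
```

In a real agent both callbacks would call a model. They are injected here so the control flow can be read and tested on its own.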

AGENT · data-analyst · ● connected · 4 tools

Top 5 customers by churn risk this quarter?

→ trino.query · crm.customers

Pulled 12,400 accounts. Top 5 by risk score (RAG + behavioral signals):

1 · Nimbus Holdings · 0.91
2 · Axiom Labs · 0.88
3 · Portcore Inc. · 0.74
4 · Merenda AG · 0.71
5 · Kite & Sons · 0.68

Sources: crm.customers, events.app_sessions, support.tickets · 3.8s

Exploration

Browse every schema.
Stop tab-hopping.

Postgres, MinIO, Kafka, vector indexes — one explorer with columns, types, nullability and lineage inline. Engineers stop bouncing between DBeaver, Grafana, and the warehouse console.

Unified metastore across all sources
Inline schema, types, nullability
SQL editor with history and run preview
Lineage graph over pipelines and agents

EXPLORER · postgres.public / kafka / s3 · ● 4 connectors · 128 tables

SOURCES

▾ ◆ postgres
  ▾ public
    ▸ ▦ customers
    ▸ ▦ db_connector
    ▸ ▦ dashboards
    ▸ ▦ workflows
    ▸ ▦ workflow_runs
▾ ◆ kafka
  ▸ ⎇ topics (12)
  ▸ ⎇ consumers (4)
▸ ◆ s3 · prod-lake
▸ ◆ snowflake

db_connector · TABLE · AGENT-VISIBLE

postgres.public.db_connector · 7 columns · 12,482 rows

Name        Type          Null  Key
id          uuid          ✕     PK
name        varchar(128)  ✕
type        enum          ✕
config      jsonb         ✓
created_at  timestamptz   ✕
created_by  uuid          ✕     FK
version     int           ✕

Insight

Dashboards on the same data plane.

Drag-and-drop charts on top of any query or workflow output. Pipeline health, agent traces, and business KPIs — without spinning up a second BI tool.

Live charts wired to any source or workflow
Mixed widgets — tables, KPIs, pies, time series
Embeddable into internal tools via signed URL
Alerts on thresholds, with Slack/Telegram routing

DASHBOARD · pipeline-health · ● live · 30s

Runs · 24h

12,482

Success

99.2%

p95 latency

412ms

Runs over time

Success · 78%
Failed · 14%
Routed · 8%

Throughput by connector · events / min

Integrations

Connect once.
Reuse everywhere.

One connector definition feeds workflows, agents, dashboards, and the metadata catalog. Add a custom connector in TypeScript in under an hour.

Warehouses & DBs — Postgres, BigQuery, Snowflake, Trino
SaaS & messaging — Slack, Telegram, Drive, S3
Streams — Kafka, webhooks, MQTT
Custom — TypeScript SDK, deploy to marketplace

100+ connectors available

Retrieval

Knowledge that stays in sync with reality.

Define a knowledge base from any source — content, code, SOPs. Pick a chunker, embeddings provider, and splitter; agents query it through the same governed retrieval engine.

Live re-indexing when source data changes
Bring-your-own embeddings — OpenAI, Cohere, local
Hybrid retrieval — vector + keyword + filters
Per-document RBAC with audit on every read
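
As a rough illustration of "vector + keyword + filters", the sketch below merges two ranked result lists with reciprocal rank fusion (RRF) and drops documents the caller's role may not read. RRF is one common fusion method and is swapped in here as an assumption; the source does not say how the retrieval engine combines results. The `Doc` shape, the `allowedRoles` field, and `k = 60` are likewise illustrative.

```typescript
// Hypothetical document record; per-document RBAC is modeled as a role allow-list.
interface Doc {
  id: string;
  allowedRoles: string[];
}

// Reciprocal rank fusion: score(d) = sum over rankings of 1 / (k + rank),
// where rank is the 1-based position of d in each ranking.
function hybridSearch(
  vectorRanking: string[], // doc ids from a vector index, best first
  keywordRanking: string[], // doc ids from a keyword index, best first
  docs: Map<string, Doc>,
  role: string,
  k = 60,
): string[] {
  const scores = new Map<string, number>();

  for (const ranking of [vectorRanking, keywordRanking]) {
    ranking.forEach((id, index) => {
      const doc = docs.get(id);
      // Filter step: skip unknown documents and ones this role cannot read.
      if (!doc || doc.allowedRoles.indexOf(role) === -1) return;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }

  // Highest fused score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

A document that ranks well in both lists beats one that tops only a single list, and a document the role cannot read never appears in the output at all.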

KNOWLEDGE BASE · product-docs● indexed · 4m ago

Source · drive://Product/Docs · 242 docs

Chunker · recursive · 1024 / 128 · char

Embeddings · openai · text-embedding-3-large · 3072d

Splitter · markdown-aware · heading

RBAC · per-document · audit on read · ON

Schedule · re-index on source change · live

Chunks

48,210

Avg length

812

Recall@10

94.1%
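
The chunker setting "1024 / 128 char" is read here as a 1024-character chunk with a 128-character overlap; that reading is an assumption. The sketch below shows plain sliding-window chunking with overlap, which is simpler than a recursive splitter and is not Orkana's chunker. The `chunkText` name is hypothetical.

```typescript
// Sliding-window character chunking: each chunk starts (size - overlap)
// characters after the previous one, so neighbors share `overlap` characters.
function chunkText(text: string, size = 1024, overlap = 128): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("require size > 0 and 0 <= overlap < size");
  }
  const chunks: string[] = [];
  const step = size - overlap;

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    // Stop once a window has reached the end of the text.
    if (start + size >= text.length) break;
  }
  return chunks;
}
```

The overlap keeps a sentence that straddles a boundary present in both neighboring chunks, at the cost of indexing some text twice.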

Architecture

Production runtime, not a notebook demo.

Java Spring Boot backend. Angular frontend. Kafka for events, Redis for state, Postgres + pgvector for data and embeddings. Spark and Trino for heavy work. Keycloak for identity. Every layer ships in the repo today.

Workflow builder, Agentic chat, AI Assistant, Code editor, Data Explorer, Charts & BI

Agents + RAG, Spark, Trino, DAG runner, Code sandboxes, Metadata catalog

Kafka, Redis, Postgres, Vector store, MinIO · S3, Keycloak IAM, OpenTelemetry

Postgres, BigQuery, Snowflake, S3, MinIO, GCS, Google Drive, Slack, Telegram, Kafka, Webhooks, MQTT, + 100 more

Docker, Kubernetes, AWS · Azure · GCP, On-prem · Air-gapped

Who it's for

Three teams. One platform.

Data engineering teams drowning in tool sprawl

Replace Airbyte + Airflow + Spark + Trino + Grafana + DBeaver with one workspace. Same DAGs, same RBAC, half the on-call.

Mid-market · Enterprise · SaaS

AI/ML teams shipping agents on real data

Skip the LangChain glue. Build agents that already have governed access to your warehouse, your docs, and your event streams.

Applied AI · RAG · Tool use

Regulated industries — banks, gov, healthcare

Deploy fully on-prem or air-gapped. Domain-based workspace isolation, audit on every read, no data leaves the perimeter.

Sovereign · FSI · Public sector

Pricing

Start in the cloud.
Scale to your perimeter.

Starter

$15 / workspace · month

For solo builders and evaluation. Hosted on Orkana Cloud, single workspace, community support.

1 workspace · 3 seats
10 connectors · 100 workflow runs/day
Built-in agent runtime
Community support
Public dashboards

Start free

One platform for data and agents.
