Multi-Tenancy Architecture
Technical architecture of Varpulis's multi-tenancy system, covering the hierarchical tenant model, quota enforcement, billing integration, data isolation, and deployment modes.
Overview
Varpulis uses a three-tier hierarchical tenant model with isolation at every layer.
Isolation is enforced across five dimensions:
- Database — per-tenant PostgreSQL schemas (data plane), shared `public` schema (control plane)
- Kubernetes — per-tenant namespaces with NetworkPolicies
- Kafka — per-tenant topic prefixes (`{slug}.*`)
- Pipelines — hierarchical inheritance (global → tenant → sub-tenant) with read-only enforcement
- RBAC — hierarchical roles with parent-tenant admin access to sub-tenants
Tenant Model
Hierarchy
The organization model supports a strict 3-level hierarchy:
| Level | org_type | parent_org_id | Infrastructure |
|---|---|---|---|
| Global | global | NULL | Dedicated namespace, manages all tenants |
| Tenant | tenant | Global org ID | Own namespace, schema, coordinators, Kafka prefix |
| Sub-tenant | sub_tenant | Tenant org ID | Shares parent's namespace, schema, Kafka prefix |
Constraints:
- Maximum 3 levels — sub-tenants cannot create sub-sub-tenants
- One global org per deployment (seeded at first startup)
- Sub-tenants inherit the parent's `db_schema`, `k8s_namespace`, and `kafka_topic_prefix`
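The depth constraint above can be sketched as a small rule — a hypothetical helper, not the shipped code — that derives a child's `org_type` from its parent and rejects a fourth level:

```rust
// Hypothetical sketch of the 3-level rule: a child's org_type is derived
// from its parent, and sub-tenants may not have children of their own.
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum OrgType { Global, Tenant, SubTenant }

/// Returns the org_type a child created under `parent` would receive,
/// or None when the maximum depth is reached.
pub fn child_org_type(parent: OrgType) -> Option<OrgType> {
    match parent {
        OrgType::Global => Some(OrgType::Tenant),
        OrgType::Tenant => Some(OrgType::SubTenant),
        OrgType::SubTenant => None, // sub-sub-tenants are rejected
    }
}
```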
Runtime Layer
The in-memory tenant model lives in crates/varpulis-runtime/src/tenant.rs and is always active, regardless of deployment mode.
TenantManager — central registry for all tenants:
```rust
pub struct TenantManager {
    tenants: HashMap<TenantId, Tenant>,
    api_key_index: HashMap<String, TenantId>, // O(1) lookup
    store: Option<Arc<dyn StateStore>>,       // persistence backend
    max_queue_depth: u64,                     // 0 = unlimited
    pending_events: Arc<AtomicU64>,           // global backpressure counter
}

pub type SharedTenantManager = Arc<RwLock<TenantManager>>;
```

Tenant — represents one isolated tenant with its own pipeline namespace:
```rust
pub struct Tenant {
    pub id: TenantId,
    pub name: String,
    pub api_key: String,
    pub quota: TenantQuota,
    pub usage: TenantUsage,
    pub pipelines: HashMap<String, Pipeline>, // per-tenant namespace
    pub created_at: Instant,
}
```

Pipeline — a deployed VPL program with its own engine instance:
```rust
pub struct Pipeline {
    pub id: String,     // UUID
    pub name: String,
    pub source: String, // VPL source
    pub engine: Arc<tokio::sync::Mutex<Engine>>,
    pub status: PipelineStatus, // Running | Stopped | Error
    pub orchestrator: Option<ContextOrchestrator>,
    pub connector_registry: Option<ManagedConnectorRegistry>,
}
```

Database Layer (SaaS)
When the saas feature is enabled and DATABASE_URL is set, Varpulis uses PostgreSQL for durable tenant state. Migrations live in crates/varpulis-db/migrations/.
Core tables:
| Table | Purpose | Key columns |
|---|---|---|
users | User accounts (OAuth + local) | id, github_id, email, username, role, email_verified |
organizations | Tenant hierarchy | id, owner_id, name, tier, status, org_type, parent_org_id, db_schema, k8s_namespace, kafka_topic_prefix |
org_members | User-org membership | org_id, user_id, role (owner/admin/member/viewer), status |
api_keys | Hashed API keys | id, org_id, key_hash (SHA-256), scopes, expires_at, revoked_at |
pipelines | Deployed pipelines | id, org_id, name, vpl_source, status, scope_level, inherited_from_org_id |
global_pipeline_templates | Admin-managed global pipelines | id, name, vpl_source, status, deployed_by |
usage_daily | Per-day event counts | org_id, date, events_processed, output_events |
All tables with org_id use ON DELETE CASCADE so removing an organization cleans up all related data.
Organization columns for isolation:
| Column | Purpose | Example |
|---|---|---|
org_type | Hierarchy level | global, tenant, sub_tenant |
parent_org_id | Parent org reference | UUID of parent tenant/global |
slug | URL-safe identifier | acme-corp |
db_schema | PostgreSQL schema name | tenant_acme_corp |
k8s_namespace | Kubernetes namespace | varpulis-tenant-a1b2c3d4 |
kafka_topic_prefix | Kafka topic prefix | acme-corp |
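Taken together, the example values suggest simple derivation rules. The helpers below are an illustrative sketch based only on those examples — the real provisioning code may differ, and `k8s_namespace` appears to use a short org-id suffix rather than the slug:

```rust
// Illustrative only — naming conventions inferred from the table's examples.
pub fn db_schema_for(slug: &str) -> String {
    // Unquoted PostgreSQL identifiers cannot contain hyphens.
    format!("tenant_{}", slug.replace('-', "_"))
}

pub fn kafka_prefix_for(slug: &str) -> String {
    slug.to_string() // topics then follow the `{slug}.*` pattern
}

pub fn k8s_namespace_for(org_id_short: &str) -> String {
    format!("varpulis-tenant-{org_id_short}")
}
```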
Tier System
Runtime Quotas (TenantQuota)
| Tier | Max Pipelines | Max EPS | Max Streams/Pipeline |
|---|---|---|---|
| Free | 5 | 500 | 10 |
| Pro | 20 | 50,000 | 100 |
| Business | 100 | 200,000 | 200 |
| Enterprise | 1,000 | 500,000 | 500 |
Constructed via TenantQuota::free(), TenantQuota::pro(), TenantQuota::business(), TenantQuota::enterprise(), or dynamically with TenantQuota::for_tier(tier_str).
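A minimal sketch of what the tier mapping might look like, using the values from the table above (the struct, field, and function names here are assumptions, not the actual definitions):

```rust
// Hedged sketch: maps a tier string to the runtime limits listed above.
#[derive(Debug, PartialEq)]
pub struct Quota {
    pub max_pipelines: u32,
    pub max_eps: u64,
    pub max_streams_per_pipeline: u32,
}

pub fn quota_for_tier(tier: &str) -> Quota {
    let (max_pipelines, max_eps, max_streams_per_pipeline) = match tier {
        "pro" => (20, 50_000, 100),
        "business" => (100, 200_000, 200),
        "enterprise" => (1_000, 500_000, 500),
        _ => (5, 500, 10), // unknown tiers fall back to free
    };
    Quota { max_pipelines, max_eps, max_streams_per_pipeline }
}
```

Falling back to free-tier limits for unrecognized strings is an assumption; the real `TenantQuota::for_tier` may handle that case differently.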
SaaS Quotas (Database organizations columns)
| Tier | Monthly Event Limit | Pipeline Limit | EPS Limit |
|---|---|---|---|
| Free | 100,000 | 5 | 500 |
| Pro | 10,000,000 | 20 | 50,000 |
| Business | 100,000,000 | 100 | 200,000 |
| Enterprise | Unlimited | 1,000 | 500,000 |
The organizations.tier column ("free", "pro", "business", "enterprise") maps to both runtime TenantQuota and the DB-level pipeline_limit, events_per_second_limit, and monthly_event_limit columns. Admins can override individual limits per-org.
Tenant Isolation
API Key to Tenant Lookup
Every API request includes an X-API-Key header (or Authorization: Bearer/ApiKey, cookie, or query parameter). The lookup is O(1) via api_key_index:
```rust
// TenantManager
pub fn get_tenant_by_api_key(&self, key: &str) -> Option<&TenantId> {
    self.api_key_index.get(key)
}
```

In SaaS mode, API keys are additionally validated against the database using SHA-256 hash comparison:
```rust
pub async fn org_id_for_api_key(&self, raw_key: &str) -> Option<Uuid> {
    let hash = hex::encode(sha2::Sha256::digest(raw_key.as_bytes()));
    let api_key = repo::get_api_key_by_hash(&pool, &hash).await.ok()??;
    Some(api_key.org_id)
}
```

Pipeline Namespace Isolation
Each tenant owns a separate HashMap<String, Pipeline>. The API handler pattern ensures tenants can only access their own pipelines:
```rust
let tenant_id = mgr.get_tenant_by_api_key(&api_key)?; // 1. identify tenant
let tenant = mgr.get_tenant_mut(&tenant_id)?;         // 2. scope to tenant
let pipeline = tenant.pipelines.get(&pipeline_id)?;   // 3. access pipeline
```

Each pipeline has its own `Arc<Mutex<Engine>>`, `mpsc::Receiver<Event>`, and `broadcast::Sender<Event>` — no cross-tenant data leakage.
Database-Level Isolation
Shared Schema (Control Plane)
The public schema stores cross-tenant data: users, organizations, org_members, api_keys, global_pipeline_templates. All queries filter by org_id.
Per-Tenant Schema (Data Plane)
Each tenant gets a dedicated PostgreSQL schema (e.g., tenant_acme_corp) containing tenant-specific data plane tables: pipelines, usage_daily. Sub-tenants share their parent's schema, scoped by org_id within the schema.
Schema provisioning flow:
- Admin creates tenant → `org_type = 'tenant'` row in `organizations`
- System creates the schema: `CREATE SCHEMA tenant_{slug}`
- Data plane migrations run within the new schema
- `organizations.db_schema` is set to `'tenant_{slug}'`
Repo functions: set_tenant_schema(), get_organization() returns db_schema.
Pipeline Inheritance
Pipelines follow a hierarchical visibility model:
| Pipeline Owner | Visible At | Editable At | scope_level |
|---|---|---|---|
| Global org | Global + All tenants + All sub-tenants | Global only | global |
| Tenant org | Tenant + Its sub-tenants | Tenant only | tenant |
| Sub-tenant | Sub-tenant only | Sub-tenant only | own |
Key columns on `pipelines`:
- `scope_level` — `'global'`, `'tenant'`, or `'own'`
- `inherited_from_org_id` — source org for inherited (read-only) pipelines
API enforcement:
- `visible_pipelines()` returns own + parent + global pipelines
- Inherited pipelines are marked `read_only: true` in API responses
- Delete/update operations are rejected for read-only pipelines
- Global pipeline templates are copied to all tenants via `create_global_pipeline_copy()`
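The visibility rule can be illustrated with a self-contained sketch (not the actual `visible_pipelines()` implementation): an org sees its own pipelines as editable, and its parent's and the global org's as read-only:

```rust
// Illustrative model of hierarchical pipeline visibility.
#[derive(Clone, Copy)]
pub struct OrgIds { pub own: u32, pub parent: Option<u32>, pub global: u32 }

pub struct PipelineRow { pub owner_org: u32, pub name: &'static str }

/// Returns (name, read_only) for every pipeline visible to `org`.
pub fn visible(org: OrgIds, all: &[PipelineRow]) -> Vec<(&'static str, bool)> {
    all.iter()
        .filter_map(|p| {
            if p.owner_org == org.own {
                Some((p.name, false)) // own pipelines are editable
            } else if Some(p.owner_org) == org.parent || p.owner_org == org.global {
                Some((p.name, true)) // inherited pipelines are read-only
            } else {
                None // other tenants' pipelines are invisible
            }
        })
        .collect()
}
```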
DB sync (saas feature): Pipeline deploy/delete in the runtime TenantManager syncs to PostgreSQL via ApiState.db_pool, ensuring the hierarchy-aware API can query all pipelines.
Kubernetes Per-Tenant Namespaces
Each tenant gets a dedicated Kubernetes namespace. Sub-tenants share their parent tenant's namespace.
Admin API endpoints:
| Endpoint | Method | Description |
|---|---|---|
/api/v1/admin/tenants/{id}/namespace | POST | Provision k8s namespace + NetworkPolicy |
/api/v1/admin/tenants/{id}/namespace | GET | Get namespace info with cluster status |
/api/v1/admin/tenants/{id}/namespace | DELETE | Deprovision namespace |
Provisioning creates the namespace and applies a default-deny NetworkPolicy via kubectl. Sub-tenants cannot provision their own namespace (returns 400).
Kafka Topic Isolation
Shared Kafka cluster with per-tenant topic prefixes. Sub-tenants use their parent's topic prefix.
Admin API endpoints:
| Endpoint | Method | Description |
|---|---|---|
/api/v1/admin/tenants/{id}/kafka | POST | Configure topic prefix (auto-generates from slug if omitted) |
/api/v1/admin/tenants/{id}/kafka | GET | Get prefix info with effective prefix (own or inherited) |
/api/v1/admin/tenants/{id}/kafka | DELETE | Remove topic prefix |
Repo functions: set_kafka_topic_prefix(), clear_kafka_topic_prefix(), get_effective_topic_prefix().
Network-Level Isolation (Kubernetes)
The SaaS overlay (deploy/kubernetes/overlays/saas/network-policies.yaml) enforces default-deny:
```yaml
# Default deny all ingress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
spec:
  podSelector: {}
  policyTypes: [Ingress]
---
# Default deny all egress (except DNS port 53)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
spec:
  podSelector: {}
  policyTypes: [Egress]
  egress:
    - ports:
        - protocol: UDP
          port: 53
```

Explicit allow rules grant:
- Coordinator to PostgreSQL (port 5432, in-namespace)
- Coordinator to external services (Kafka 9092/9093, MQTT 1883/8883, HTTPS 443 for Stripe/GitHub)
- Worker to Coordinator (port 9100) and vice versa (port 9000)
- Ingress controller to Varpulis pods (ports 9000, 9100)
Quota Enforcement
Three enforcement layers protect the system from overload.
1. Deployment Quotas
Checked at pipeline deploy time in Tenant::deploy_pipeline():
- Pipeline count: `pipelines.len() >= quota.max_pipelines` → `TenantError::QuotaExceeded`
- Streams per pipeline: parsed VPL stream count > `quota.max_streams_per_pipeline` → `TenantError::QuotaExceeded`
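The two checks reduce to a guard like the following sketch (names and signatures are assumptions; the real checks live inside `Tenant::deploy_pipeline()`):

```rust
// Hedged sketch of the deploy-time quota guard.
#[derive(Debug, PartialEq)]
pub enum DeployError { QuotaExceeded(&'static str) }

pub fn check_deploy_quota(
    current_pipelines: usize,
    max_pipelines: usize,
    stream_count: usize,
    max_streams_per_pipeline: usize,
) -> Result<(), DeployError> {
    if current_pipelines >= max_pipelines {
        return Err(DeployError::QuotaExceeded("pipeline count"));
    }
    if stream_count > max_streams_per_pipeline {
        return Err(DeployError::QuotaExceeded("streams per pipeline"));
    }
    Ok(())
}
```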
2. Rate Limiting (Per-Tenant Sliding Window)
Each tenant's TenantUsage tracks a 1-second sliding window:
```rust
pub fn record_event(&mut self, max_eps: u64) -> bool {
    self.events_processed += 1;
    let now = Instant::now();
    match self.window_start {
        Some(start) if now.duration_since(start).as_secs() < 1 => {
            self.events_in_window += 1;
            if max_eps > 0 && self.events_in_window > max_eps {
                return false; // TenantError::RateLimitExceeded
            }
        }
        _ => {
            self.window_start = Some(now);
            self.events_in_window = 1;
        }
    }
    true
}
```

Called on every event in `Tenant::process_event()`. When `max_eps == 0`, rate limiting is disabled (unlimited).
3. Monthly Usage Limits (SaaS)
The UsageTracker buffers event counts in memory and flushes to usage_daily every 60 seconds:
```
Events → UsageTracker (in-memory buffer, per org_id)
          │
          ├── 60s interval ──→ drain() → DB upsert (usage_daily)
          │
          └── reload monthly totals + limits from DB
```

Usage is checked against the tier's `monthly_event_limit`. At >80% utilization, an `ApproachingLimit` warning is returned. At 100%, requests receive `429 Too Many Requests` with `Retry-After: 3600`.
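The 80%/100% decision described above might look like the following sketch (the variant names follow the text; treating a limit of 0 as "unlimited" is an assumption mirroring the runtime quotas):

```rust
// Hedged sketch of the monthly usage check.
#[derive(Debug, PartialEq)]
pub enum UsageStatus {
    Ok,
    ApproachingLimit, // >80% of monthly_event_limit
    LimitExceeded,    // → 429 Too Many Requests, Retry-After: 3600
}

pub fn check_monthly_usage(used: u64, limit: u64) -> UsageStatus {
    if limit == 0 {
        return UsageStatus::Ok; // assumed: 0 means unlimited (enterprise)
    }
    if used >= limit {
        UsageStatus::LimitExceeded
    } else if used * 100 > limit * 80 {
        UsageStatus::ApproachingLimit
    } else {
        UsageStatus::Ok
    }
}
```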
Global Backpressure
An AtomicU64 counter tracks pending events across all tenants:
```rust
pub fn check_backpressure(&self) -> Result<(), TenantError> {
    if self.max_queue_depth == 0 { return Ok(()); }
    let current = self.pending_events.load(Ordering::Relaxed);
    if current >= self.max_queue_depth {
        return Err(TenantError::BackpressureExceeded { current, max: self.max_queue_depth });
    }
    Ok(())
}
```

The counter is incremented before processing and decremented after. `queue_pressure_ratio()` exposes the ratio as a Prometheus gauge.
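The increment/decrement pattern and the gauge can be sketched as follows (a simplified stand-in for the real `TenantManager` fields, not the shipped code):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Simplified backpressure counter: increment before processing an event,
// decrement after, and expose the fill ratio for a Prometheus gauge.
pub struct Backpressure {
    pub pending: AtomicU64,
    pub max_queue_depth: u64, // 0 = unlimited
}

impl Backpressure {
    pub fn begin(&self) { self.pending.fetch_add(1, Ordering::Relaxed); }
    pub fn end(&self) { self.pending.fetch_sub(1, Ordering::Relaxed); }

    /// 0.0 = idle, 1.0 = at the configured depth; unlimited reports 0.0.
    pub fn queue_pressure_ratio(&self) -> f64 {
        if self.max_queue_depth == 0 { return 0.0; }
        self.pending.load(Ordering::Relaxed) as f64 / self.max_queue_depth as f64
    }
}
```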
Identity & Authentication Flow
API Key Authentication
The primary path for programmatic access:
Supported header formats: X-API-Key, Authorization: Bearer <key>, Authorization: ApiKey <key>, Sec-WebSocket-Protocol: varpulis-auth.<key>, query parameter ?api_key=.
Admin Authentication
Admin operations require JWT with role: "admin":
- CLI mode: the `--admin-password` flag bootstraps an admin user with an Argon2-hashed password
- SaaS mode: JWTs issued via GitHub OAuth carry `role`, `org_id`, and `user_id` claims
SaaS Session Flow
In SaaS mode, browser sessions use JWT cookies:
- User authenticates via GitHub OAuth (or OIDC)
- Server upserts the user in the `users` table, auto-creating an organization on first login
- A JWT is issued (HMAC-SHA256, 7-day expiry) with `user_id`, `org_id`, and `role` claims
- The JWT is stored in an `HttpOnly`, `Secure`, `SameSite=Lax` cookie
API keys in SaaS mode are SHA-256 hashed before database storage — raw keys are shown once at creation time and never stored.
Billing Integration
Feature-gated behind `--features saas`. See the Stripe Setup Guide for configuration steps.
Stripe Checkout Flow
Webhook Events
| Stripe Event | Action |
|---|---|
checkout.session.completed | Save customer ID, upgrade tier, set status "active" |
customer.subscription.updated | Update tier if price changed |
customer.subscription.deleted | Downgrade to "free" tier |
All webhooks are verified using HMAC-SHA256 with STRIPE_WEBHOOK_SECRET.
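Stripe sends the signature in a `Stripe-Signature` header of the form `t=<unix-ts>,v1=<hex-hmac>`. The sketch below only parses that header; the actual HMAC-SHA256 computation over `"<t>.<payload>"` with `STRIPE_WEBHOOK_SECRET` (and the constant-time comparison) is elided:

```rust
// Parses "t=1492774577,v1=abc..." into (timestamp, v1 signature).
// Verification itself (HMAC over "<t>.<payload>") is intentionally omitted.
pub fn parse_stripe_signature(header: &str) -> Option<(u64, String)> {
    let mut ts: Option<u64> = None;
    let mut v1: Option<String> = None;
    for part in header.split(',') {
        match part.trim().split_once('=') {
            Some(("t", v)) => ts = v.parse().ok(),
            Some(("v1", v)) => v1 = Some(v.to_string()),
            _ => {} // ignore v0 and unknown keys
        }
    }
    Some((ts?, v1?))
}
```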
Billing Endpoints
| Endpoint | Method | Description |
|---|---|---|
/api/v1/billing/usage | GET | Current month's event usage |
/api/v1/billing/plan | GET | Current tier and event limit |
/api/v1/billing/checkout | POST | Create Stripe Checkout Session |
/api/v1/billing/portal | POST | Create Stripe Customer Portal session |
/api/v1/billing/webhook | POST | Stripe webhook receiver |
Configuration
| Environment Variable | Purpose |
|---|---|
STRIPE_SECRET_KEY | Stripe API secret key |
STRIPE_WEBHOOK_SECRET | Webhook signature verification |
STRIPE_PRO_PRICE_ID | Stripe price ID for Pro tier |
STRIPE_BUSINESS_PRICE_ID | Stripe price ID for Business tier |
FRONTEND_URL | Redirect URL after checkout (default: http://localhost:5173) |
Trial Lifecycle
Creation
`create_trial_organization()` inserts a new organization with:
- `status = 'trial'`
- `trial_expires_at = now() + 30 days`
- Free-tier resource limits
Automatic Expiry
spawn_trial_expiry_checker() runs as a background task, checking every hour:
```
Every 60 min:
  SELECT ... FROM organizations
  WHERE status = 'trial'
    AND trial_expires_at IS NOT NULL
    AND trial_expires_at < now()
  → For each expired org: UPDATE status = 'suspended'
```

An index on `(trial_expires_at) WHERE status = 'trial'` ensures efficient lookups.
Admin Override
Admins can extend or convert trials:
- `PUT /api/v1/admin/tenants/{id}/trial` — set a new `trial_expires_at`
- `PUT /api/v1/admin/tenants/{id}/status` — change to `"active"` (bypasses expiry)
- `PUT /api/v1/admin/tenants/{id}/tier` — upgrade tier (e.g., after payment)
Admin Operations
All admin endpoints require JWT with role: "admin". Defined in crates/varpulis-cli/src/admin.rs.
| Endpoint | Method | Description |
|---|---|---|
/api/v1/admin/tenants | GET | List all organizations with usage, hierarchy, isolation info |
/api/v1/admin/tenants | POST | Create tenant or sub-tenant |
/api/v1/admin/tenants/{id} | GET | Org details with pipelines, API keys, sub-tenants |
/api/v1/admin/tenants/{id}/tier | PUT | Change tier (free/pro/business/enterprise) |
/api/v1/admin/tenants/{id}/status | PUT | Change status (active/trial/suspended/revoked) |
/api/v1/admin/tenants/{id}/trial | PUT | Extend trial expiration date |
/api/v1/admin/tenants/{id}/limits | PUT | Override pipeline/EPS/monthly limits |
/api/v1/admin/tenants/{id}/revoke | POST | Revoke tenant (sets status="revoked") |
/api/v1/admin/tenants/{id}/namespace | POST/GET/DELETE | Provision/query/deprovision k8s namespace |
/api/v1/admin/tenants/{id}/kafka | POST/GET/DELETE | Configure/query/remove Kafka topic prefix |
/api/v1/admin/global-pipelines | POST/GET | Deploy/list global pipeline templates |
/api/v1/admin/global-pipelines/{id} | PUT/DELETE | Update/undeploy global pipeline template |
/api/v1/admin/usage | GET | Aggregate usage across all tenants |
Tenant-Level Endpoints
| Endpoint | Method | Description |
|---|---|---|
/api/v1/orgs | GET | List user's organizations (hierarchy-aware) |
/api/v1/orgs/{id}/sub-tenants | POST | Create sub-tenant under a tenant |
/api/v1/orgs/{id}/pipelines | GET | List visible pipelines (own + inherited, with read_only flag) |
The web UI includes an admin panel and org switcher with hierarchy tree for managing tenants without direct API calls.
Deployment Modes
Single-User (CLI)
```shell
varpulis server --port 9000 --api-key "my-key"
```

- A default tenant is auto-provisioned with the provided API key and enterprise-tier quotas
- No database required — state optionally persisted via `FileStore`
- Suitable for development, testing, and single-application deployments
Multi-Tenant Server
```shell
varpulis server --port 9000 --admin-password "secret"
```

- Admin creates tenants via REST API or web UI
- Each tenant receives its own API key and quota
- The runtime `TenantManager` handles isolation in-memory
- Optional `--state-dir` enables `FileStore` persistence across restarts
Full SaaS
```shell
# Requires: PostgreSQL, Stripe account, GitHub OAuth app
cargo build --release --features saas

DATABASE_URL=postgresql://... \
STRIPE_SECRET_KEY=sk_... \
GITHUB_CLIENT_ID=... \
GITHUB_CLIENT_SECRET=... \
varpulis server --port 9000
```

- Self-service signup via GitHub OAuth (or OIDC with `--features oidc`)
- Organizations auto-created on first login
- Trial lifecycle with 30-day expiry and auto-suspension
- Stripe billing for tier upgrades
- Usage tracking with 60-second flush to
usage_daily - Admin panel for tenant management
- Kubernetes deployment with NetworkPolicies for network isolation
Docker Compose for local SaaS development:
```shell
docker compose -f deploy/docker/docker-compose.saas.yml up -d
```

Persistence & Recovery
Runtime Layer (FileStore)
When --state-dir is provided, TenantManager uses FileStore for JSON snapshots:
FileStore directory layout:
```
<state-dir>/tenant/<uuid>     # TenantSnapshot (JSON)
<state-dir>/tenants/index     # List of tenant IDs (JSON array)
```

Writes are atomic (write to a `.tmp` file, then rename). `TenantManager::recover()` loads the index and restores all tenants on startup, including pipeline VPL sources (re-compiled into engines).
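The write-then-rename pattern looks roughly like this (a generic sketch, not the actual `FileStore` internals):

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

// Atomic snapshot write: data lands in a sibling .tmp file first, then a
// rename makes it visible — readers never observe a half-written snapshot.
pub fn atomic_write(path: &Path, bytes: &[u8]) -> std::io::Result<()> {
    let tmp = path.with_extension("tmp");
    {
        let mut f = fs::File::create(&tmp)?;
        f.write_all(bytes)?;
        f.sync_all()?; // flush to disk before the rename publishes it
    }
    fs::rename(&tmp, path) // atomic on POSIX filesystems
}
```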
A RocksDbStore backend is also available via --features persistence for write-heavy workloads with LZ4 compression and 64 MB write buffers.
Database Layer (SaaS)
PostgreSQL stores the authoritative tenant state in SaaS mode. Migrations auto-run on startup via sqlx. The connection pool uses up to 20 connections with a 5-second acquire timeout.
File Locations
| Component | File |
|---|---|
| TenantManager, Tenant, TenantQuota | crates/varpulis-runtime/src/tenant.rs |
| StateStore, FileStore, RocksDbStore | crates/varpulis-runtime/src/persistence.rs |
| API routes, DB sync, ApiKey extractor | crates/varpulis-cli/src/api.rs |
| Admin API (tenants, namespaces, kafka, global pipelines) | crates/varpulis-cli/src/admin.rs |
| Auth (registration, login, verification) | crates/varpulis-cli/src/auth.rs |
| Billing & UsageTracker | crates/varpulis-cli/src/billing.rs |
| OAuth & JWT sessions | crates/varpulis-cli/src/oauth.rs |
| Organization, sub-tenant, pipeline routes | crates/varpulis-cli/src/org.rs |
| Local user store & sessions | crates/varpulis-cli/src/users.rs |
| DB models (Organization, Pipeline, etc.) | crates/varpulis-db/src/models.rs |
| DB repo queries (hierarchy, isolation) | crates/varpulis-db/src/repo.rs |
| DB migrations (hierarchy, schema, namespace, kafka) | crates/varpulis-db/migrations/ |
| Kafka connector & managed connector | crates/varpulis-connectors/src/kafka.rs, managed_kafka.rs |
| Sink factory (topic resolution) | crates/varpulis-runtime/src/engine/sink_factory.rs |
| Cluster connector config & injection | crates/varpulis-cluster/src/connector_config.rs |
| Vue 3 org store (hierarchy tree) | web-ui/src/stores/org.ts |
| Vue 3 admin store (tenant detail) | web-ui/src/stores/admin.ts |
| Kubernetes NetworkPolicies | deploy/kubernetes/overlays/saas/network-policies.yaml |
| SaaS Docker Compose | deploy/docker/docker-compose.saas.yml |
| E2E tests (all phases) | web-ui/tests/*.spec.ts |
RBAC Hierarchy
| Level | Roles | Capabilities |
|---|---|---|
| Global (system admin) | admin | CRUD all tenants, deploy global pipelines, monitor all, provision namespaces/kafka |
| Tenant | owner, admin, member, viewer | Manage own + sub-tenant pipelines, create sub-tenants |
| Sub-tenant | admin, member, viewer | Manage own pipelines only |
Tenant admins can access sub-tenant data via parent_org_id lookup. Sub-tenants cannot create sub-sub-tenants (enforced at API level).
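A minimal sketch of the parent-access rule (illustrative only; the shipped check also involves roles and the `org_members` table):

```rust
// Hypothetical access check: an org may administer itself or a direct
// child reached via parent_org_id; system admins bypass this entirely.
pub fn can_administer(
    requester_org: u32,
    target_org: u32,
    target_parent: Option<u32>,
    is_system_admin: bool,
) -> bool {
    is_system_admin
        || requester_org == target_org
        || target_parent == Some(requester_org)
}
```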
See Also
- Authentication Architecture — OAuth, OIDC, and JWT session management
- Stripe Setup Guide — Stripe product and webhook configuration
- Production Deployment — Deployment checklist and security hardening
- SSO/OIDC Tutorial — Enterprise SSO provider setup
- Cluster Architecture — Distributed coordinator/worker topology