
Multi-Tenancy Architecture

Technical architecture of Varpulis's multi-tenancy system, covering the hierarchical tenant model, quota enforcement, billing integration, data isolation, and deployment modes.

Overview

Varpulis uses a three-tier hierarchical tenant model with isolation at every layer:

Tenant hierarchy

Isolation is enforced across five dimensions:

  1. Database — per-tenant PostgreSQL schemas (data plane), shared public schema (control plane)
  2. Kubernetes — per-tenant namespaces with NetworkPolicies
  3. Kafka — per-tenant topic prefixes ({slug}.*)
  4. Pipelines — hierarchical inheritance (global → tenant → sub-tenant) with read-only enforcement
  5. RBAC — hierarchical roles with parent-tenant admin access to sub-tenants

Multi-tenancy overview


Tenant Model

Hierarchy

The organization model supports a strict 3-level hierarchy:

| Level | org_type | parent_org_id | Infrastructure |
|---|---|---|---|
| Global | global | NULL | Dedicated namespace, manages all tenants |
| Tenant | tenant | Global org ID | Own namespace, schema, coordinators, Kafka prefix |
| Sub-tenant | sub_tenant | Tenant org ID | Shares parent's namespace, schema, Kafka prefix |

Constraints:

  • Maximum 3 levels — sub-tenants cannot create sub-sub-tenants
  • One global org per deployment (seeded at first startup)
  • Sub-tenants inherit parent's db_schema, k8s_namespace, and kafka_topic_prefix
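
The depth constraint can be sketched as a parent-type check. This is an illustrative sketch (the OrgType enum and allowed_child helper are hypothetical names; the real enforcement lives in the admin API):

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum OrgType {
    Global,
    Tenant,
    SubTenant,
}

/// Which org type a child may have, given its parent's type.
/// Sub-tenants are leaves: they may not create children.
pub fn allowed_child(parent: OrgType) -> Option<OrgType> {
    match parent {
        OrgType::Global => Some(OrgType::Tenant),
        OrgType::Tenant => Some(OrgType::SubTenant),
        OrgType::SubTenant => None, // no sub-sub-tenants
    }
}
```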

Runtime Layer

The in-memory tenant model lives in crates/varpulis-runtime/src/tenant.rs and is always active, regardless of deployment mode.

TenantManager — central registry for all tenants:

```rust
pub struct TenantManager {
    tenants: HashMap<TenantId, Tenant>,
    api_key_index: HashMap<String, TenantId>,  // O(1) lookup
    store: Option<Arc<dyn StateStore>>,        // persistence backend
    max_queue_depth: u64,                      // 0 = unlimited
    pending_events: Arc<AtomicU64>,            // global backpressure counter
}

pub type SharedTenantManager = Arc<RwLock<TenantManager>>;
```

Tenant — represents one isolated tenant with its own pipeline namespace:

```rust
pub struct Tenant {
    pub id: TenantId,
    pub name: String,
    pub api_key: String,
    pub quota: TenantQuota,
    pub usage: TenantUsage,
    pub pipelines: HashMap<String, Pipeline>,  // per-tenant namespace
    pub created_at: Instant,
}
```

Pipeline — a deployed VPL program with its own engine instance:

```rust
pub struct Pipeline {
    pub id: String,                                    // UUID
    pub name: String,
    pub source: String,                                // VPL source
    pub engine: Arc<tokio::sync::Mutex<Engine>>,
    pub status: PipelineStatus,                        // Running | Stopped | Error
    pub orchestrator: Option<ContextOrchestrator>,
    pub connector_registry: Option<ManagedConnectorRegistry>,
}
```

Database Layer (SaaS)

When the saas feature is enabled and DATABASE_URL is set, Varpulis uses PostgreSQL for durable tenant state. Migrations live in crates/varpulis-db/migrations/.

Core tables:

| Table | Purpose | Key columns |
|---|---|---|
| users | User accounts (OAuth + local) | id, github_id, email, username, role, email_verified |
| organizations | Tenant hierarchy | id, owner_id, name, tier, status, org_type, parent_org_id, db_schema, k8s_namespace, kafka_topic_prefix |
| org_members | User-org membership | org_id, user_id, role (owner/admin/member/viewer), status |
| api_keys | Hashed API keys | id, org_id, key_hash (SHA-256), scopes, expires_at, revoked_at |
| pipelines | Deployed pipelines | id, org_id, name, vpl_source, status, scope_level, inherited_from_org_id |
| global_pipeline_templates | Admin-managed global pipelines | id, name, vpl_source, status, deployed_by |
| usage_daily | Per-day event counts | org_id, date, events_processed, output_events |

All tables with org_id use ON DELETE CASCADE so removing an organization cleans up all related data.

Organization columns for isolation:

| Column | Purpose | Example |
|---|---|---|
| org_type | Hierarchy level | global, tenant, sub_tenant |
| parent_org_id | Parent org reference | UUID of parent tenant/global |
| slug | URL-safe identifier | acme-corp |
| db_schema | PostgreSQL schema name | tenant_acme_corp |
| k8s_namespace | Kubernetes namespace | varpulis-tenant-a1b2c3d4 |
| kafka_topic_prefix | Kafka topic prefix | acme-corp |

Tier System

Runtime Quotas (TenantQuota)

| Tier | Max Pipelines | Max EPS | Max Streams/Pipeline |
|---|---|---|---|
| Free | 5 | 500 | 10 |
| Pro | 20 | 50,000 | 100 |
| Business | 100 | 200,000 | 200 |
| Enterprise | 1,000 | 500,000 | 500 |

Constructed via TenantQuota::free(), TenantQuota::pro(), TenantQuota::business(), TenantQuota::enterprise(), or dynamically with TenantQuota::for_tier(tier_str).
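
A plausible shape for for_tier, using the limits from the table above (the exact field names are assumptions, not confirmed by the source):

```rust
#[derive(Debug, PartialEq)]
pub struct TenantQuota {
    pub max_pipelines: u64,
    pub max_eps: u64,
    pub max_streams_per_pipeline: u64,
}

impl TenantQuota {
    /// Map a tier string (as stored in organizations.tier) to runtime limits.
    /// Unknown tiers fall back to free, the conservative default.
    pub fn for_tier(tier: &str) -> Self {
        let (max_pipelines, max_eps, max_streams_per_pipeline) = match tier {
            "pro" => (20, 50_000, 100),
            "business" => (100, 200_000, 200),
            "enterprise" => (1_000, 500_000, 500),
            _ => (5, 500, 10), // "free" and anything unrecognized
        };
        Self { max_pipelines, max_eps, max_streams_per_pipeline }
    }
}
```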

SaaS Quotas (Database organizations columns)

| Tier | Monthly Event Limit | Pipeline Limit | EPS Limit |
|---|---|---|---|
| Free | 100,000 | 5 | 500 |
| Pro | 10,000,000 | 20 | 50,000 |
| Business | 100,000,000 | 100 | 200,000 |
| Enterprise | Unlimited | 1,000 | 500,000 |

The organizations.tier column ("free", "pro", "business", "enterprise") maps to both runtime TenantQuota and the DB-level pipeline_limit, events_per_second_limit, and monthly_event_limit columns. Admins can override individual limits per-org.


Tenant Isolation

API Key to Tenant Lookup

Every API request includes an X-API-Key header (or Authorization: Bearer/ApiKey, cookie, or query parameter). The lookup is O(1) via api_key_index:

```rust
// TenantManager
pub fn get_tenant_by_api_key(&self, key: &str) -> Option<&TenantId> {
    self.api_key_index.get(key)
}
```
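
The index stays in lock-step with the tenant map: registration and lookup touch the same HashMap. A toy sketch of that round trip (KeyIndex and the u64 TenantId are simplifications for illustration; the real registry holds full Tenant values):

```rust
use std::collections::HashMap;

type TenantId = u64; // the real TenantId is a UUID newtype

#[derive(Default)]
pub struct KeyIndex {
    api_key_index: HashMap<String, TenantId>,
}

impl KeyIndex {
    /// Registered at tenant creation, under the same write lock
    /// that inserts the tenant itself, so the two maps never diverge.
    pub fn register(&mut self, api_key: &str, id: TenantId) {
        self.api_key_index.insert(api_key.to_owned(), id);
    }

    /// O(1) average-case lookup on the request hot path.
    pub fn get_tenant_by_api_key(&self, key: &str) -> Option<&TenantId> {
        self.api_key_index.get(key)
    }
}
```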

In SaaS mode, API keys are additionally validated against the database using SHA-256 hash comparison:

```rust
pub async fn org_id_for_api_key(&self, raw_key: &str) -> Option<Uuid> {
    let hash = hex::encode(sha2::Sha256::digest(raw_key.as_bytes()));
    let api_key = repo::get_api_key_by_hash(&self.pool, &hash).await.ok()??;
    Some(api_key.org_id)
}
```

Pipeline Namespace Isolation

Each tenant owns a separate HashMap<String, Pipeline>. The API handler pattern ensures tenants can only access their own pipelines:

```rust
let tenant_id = mgr.get_tenant_by_api_key(&api_key)?;    // 1. identify tenant
let tenant = mgr.get_tenant_mut(&tenant_id)?;            // 2. scope to tenant
let pipeline = tenant.pipelines.get(&pipeline_id)?;      // 3. access pipeline
```

Each pipeline has its own Arc<Mutex<Engine>>, mpsc::Receiver<Event>, and broadcast::Sender<Event> — no cross-tenant data leakage.

Database-Level Isolation

Shared Schema (Control Plane)

The public schema stores cross-tenant data: users, organizations, org_members, api_keys, global_pipeline_templates. All queries filter by org_id.

Per-Tenant Schema (Data Plane)

Each tenant gets a dedicated PostgreSQL schema (e.g., tenant_acme_corp) containing tenant-specific data plane tables: pipelines, usage_daily. Sub-tenants share their parent's schema, scoped by org_id within the schema.

Schema provisioning flow:

  1. Admin creates tenant → org_type = 'tenant' row in organizations
  2. System creates schema: CREATE SCHEMA tenant_{slug}
  3. Runs data plane migrations within the new schema
  4. Sets organizations.db_schema = 'tenant_{slug}'

Repo functions: set_tenant_schema(), get_organization() returns db_schema.
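
The example values above (slug acme-corp, schema tenant_acme_corp) suggest the schema name is derived by prefixing and sanitizing the slug. A hypothetical helper sketching that derivation (the real implementation may differ):

```rust
/// Derive a PostgreSQL schema name from an org slug,
/// e.g. "acme-corp" → "tenant_acme_corp".
/// Keeps only [a-z0-9_] so the result is safe to use in CREATE SCHEMA.
pub fn tenant_schema_name(slug: &str) -> String {
    let safe: String = slug
        .to_lowercase()
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() { c } else { '_' })
        .collect();
    format!("tenant_{safe}")
}
```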

Pipeline Inheritance

Pipeline inheritance

Pipelines follow a hierarchical visibility model:

| Pipeline Owner | Visible At | Editable At | scope_level |
|---|---|---|---|
| Global org | Global + all tenants + all sub-tenants | Global only | global |
| Tenant org | Tenant + its sub-tenants | Tenant only | tenant |
| Sub-tenant | Sub-tenant only | Sub-tenant only | own |

Key columns on pipelines:

  • scope_level — 'global', 'tenant', or 'own'
  • inherited_from_org_id — source org for inherited (read-only) pipelines

API enforcement:

  • visible_pipelines() returns own + parent + global pipelines
  • Inherited pipelines are marked read_only: true in API responses
  • Delete/update operations are rejected for read-only pipelines
  • Global pipeline templates are copied to all tenants via create_global_pipeline_copy()
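
The visibility rule reduces to a union with a read-only flag. A minimal sketch, assuming pipelines are already grouped by origin (the real visible_pipelines() queries the database by org hierarchy):

```rust
#[derive(Debug, PartialEq)]
pub struct VisiblePipeline {
    pub name: String,
    pub read_only: bool,
}

/// Own pipelines are editable; anything inherited from the parent
/// chain or the global org is surfaced read-only.
pub fn visible_pipelines(
    own: &[&str],
    parent: &[&str],
    global: &[&str],
) -> Vec<VisiblePipeline> {
    let mut out: Vec<VisiblePipeline> = own
        .iter()
        .map(|n| VisiblePipeline { name: n.to_string(), read_only: false })
        .collect();
    for n in parent.iter().chain(global.iter()) {
        out.push(VisiblePipeline { name: n.to_string(), read_only: true });
    }
    out
}
```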

DB sync (saas feature): Pipeline deploy/delete in the runtime TenantManager syncs to PostgreSQL via ApiState.db_pool, ensuring the hierarchy-aware API can query all pipelines.

Kubernetes Per-Tenant Namespaces

Each tenant gets a dedicated Kubernetes namespace. Sub-tenants share their parent tenant's namespace.

Kubernetes namespace isolation

Admin API endpoints:

| Endpoint | Method | Description |
|---|---|---|
| /api/v1/admin/tenants/{id}/namespace | POST | Provision k8s namespace + NetworkPolicy |
| /api/v1/admin/tenants/{id}/namespace | GET | Get namespace info with cluster status |
| /api/v1/admin/tenants/{id}/namespace | DELETE | Deprovision namespace |

Provisioning creates the namespace and applies a default-deny NetworkPolicy via kubectl. Sub-tenants cannot provision their own namespace (returns 400).

Kafka Topic Isolation

Shared Kafka cluster with per-tenant topic prefixes. Sub-tenants use their parent's topic prefix.

Kafka topic isolation

Admin API endpoints:

| Endpoint | Method | Description |
|---|---|---|
| /api/v1/admin/tenants/{id}/kafka | POST | Configure topic prefix (auto-generates from slug if omitted) |
| /api/v1/admin/tenants/{id}/kafka | GET | Get prefix info with effective prefix (own or inherited) |
| /api/v1/admin/tenants/{id}/kafka | DELETE | Remove topic prefix |

Repo functions: set_kafka_topic_prefix(), clear_kafka_topic_prefix(), get_effective_topic_prefix().
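
Since sub-tenants use their parent's prefix, get_effective_topic_prefix() presumably falls back to the parent when an org has no prefix of its own. A minimal sketch under that assumption (struct and field names are illustrative):

```rust
pub struct Org {
    pub kafka_topic_prefix: Option<String>,
}

/// An org's own prefix wins; otherwise the parent's prefix is inherited.
/// A tenant with no prefix and no parent simply has none configured.
pub fn effective_topic_prefix(org: &Org, parent: Option<&Org>) -> Option<String> {
    org.kafka_topic_prefix
        .clone()
        .or_else(|| parent.and_then(|p| p.kafka_topic_prefix.clone()))
}
```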

Network-Level Isolation (Kubernetes)

The SaaS overlay (deploy/kubernetes/overlays/saas/network-policies.yaml) enforces default-deny:

```yaml
# Default deny all ingress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
spec:
  podSelector: {}
  policyTypes: [Ingress]
---
# Default deny all egress (except DNS port 53)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
spec:
  podSelector: {}
  policyTypes: [Egress]
  egress:
    - ports:
        - protocol: UDP
          port: 53
```

Explicit allow rules grant:

  • Coordinator to PostgreSQL (port 5432, in-namespace)
  • Coordinator to external services (Kafka 9092/9093, MQTT 1883/8883, HTTPS 443 for Stripe/GitHub)
  • Worker to Coordinator (port 9100) and vice versa (port 9000)
  • Ingress controller to Varpulis pods (ports 9000, 9100)

Quota Enforcement

Three enforcement layers protect the system from overload.

1. Deployment Quotas

Checked at pipeline deploy time in Tenant::deploy_pipeline():

  • Pipeline count: pipelines.len() >= quota.max_pipelines → TenantError::QuotaExceeded
  • Streams per pipeline: parsed VPL stream count > quota.max_streams_per_pipeline → TenantError::QuotaExceeded
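
The two checks above can be sketched as a single pure function, run before any engine is constructed (a simplified standalone form; the real checks live inside Tenant::deploy_pipeline()):

```rust
#[derive(Debug, PartialEq)]
pub enum TenantError {
    QuotaExceeded(&'static str),
}

/// Deploy-time quota checks: reject before allocating any resources.
pub fn check_deploy_quota(
    current_pipelines: usize,
    max_pipelines: usize,
    stream_count: usize,
    max_streams_per_pipeline: usize,
) -> Result<(), TenantError> {
    if current_pipelines >= max_pipelines {
        return Err(TenantError::QuotaExceeded("max_pipelines"));
    }
    if stream_count > max_streams_per_pipeline {
        return Err(TenantError::QuotaExceeded("max_streams_per_pipeline"));
    }
    Ok(())
}
```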

2. Rate Limiting (Per-Tenant Sliding Window)

Each tenant's TenantUsage tracks a 1-second sliding window:

```rust
pub fn record_event(&mut self, max_eps: u64) -> bool {
    self.events_processed += 1;
    let now = Instant::now();
    match self.window_start {
        Some(start) if now.duration_since(start).as_secs() < 1 => {
            self.events_in_window += 1;
            if max_eps > 0 && self.events_in_window > max_eps {
                return false;  // TenantError::RateLimitExceeded
            }
        }
        _ => {
            self.window_start = Some(now);
            self.events_in_window = 1;
        }
    }
    true
}
```

Called on every event in Tenant::process_event(). When max_eps == 0, rate limiting is disabled (unlimited).

3. Monthly Usage Limits (SaaS)

The UsageTracker buffers event counts in memory and flushes to usage_daily every 60 seconds:

```
Events → UsageTracker (in-memory buffer, per org_id)
         ├── 60s interval ──→ drain() → DB upsert (usage_daily)
         └── reload monthly totals + limits from DB
```

Usage is checked against the tier's monthly_event_limit. At >80% utilization, an ApproachingLimit warning is returned. At 100%, requests receive 429 Too Many Requests with Retry-After: 3600.
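
The two thresholds can be sketched as a pure check (names and the 0 = unlimited convention are assumptions; the real check runs against the DB-loaded monthly totals):

```rust
#[derive(Debug, PartialEq)]
pub enum UsageStatus {
    Ok,
    ApproachingLimit, // > 80% of the monthly allowance: warn
    LimitExceeded,    // maps to 429 + Retry-After: 3600
}

/// limit == 0 is treated as unlimited (enterprise tier).
pub fn check_monthly_usage(used: u64, limit: u64) -> UsageStatus {
    if limit == 0 {
        return UsageStatus::Ok;
    }
    if used >= limit {
        UsageStatus::LimitExceeded
    // widen to u128 so the percentage comparison cannot overflow
    } else if (used as u128) * 100 > (limit as u128) * 80 {
        UsageStatus::ApproachingLimit
    } else {
        UsageStatus::Ok
    }
}
```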

Global Backpressure

An AtomicU64 counter tracks pending events across all tenants:

```rust
pub fn check_backpressure(&self) -> Result<(), TenantError> {
    if self.max_queue_depth == 0 { return Ok(()); }
    let current = self.pending_events.load(Ordering::Relaxed);
    if current >= self.max_queue_depth {
        return Err(TenantError::BackpressureExceeded { current, max: self.max_queue_depth });
    }
    Ok(())
}
```

The counter is incremented before processing and decremented after. queue_pressure_ratio() exposes the ratio as a Prometheus gauge.
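
A sketch of the gauge side, assuming the same two fields as the snippet above (the Backpressure wrapper type is illustrative):

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

pub struct Backpressure {
    pub pending_events: Arc<AtomicU64>,
    pub max_queue_depth: u64,
}

impl Backpressure {
    /// Ratio of pending events to the configured ceiling,
    /// exported as a Prometheus gauge. 0.0 when unlimited.
    pub fn queue_pressure_ratio(&self) -> f64 {
        if self.max_queue_depth == 0 {
            return 0.0;
        }
        self.pending_events.load(Ordering::Relaxed) as f64 / self.max_queue_depth as f64
    }
}
```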


Identity & Authentication Flow

API Key Authentication

The primary path for programmatic access:

API key authentication flow

Supported header formats: X-API-Key, Authorization: Bearer <key>, Authorization: ApiKey <key>, Sec-WebSocket-Protocol: varpulis-auth.<key>, query parameter ?api_key=.
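
A sketch of how the extractor might prioritize those formats (the function and its argument shape are hypothetical; the real extractor also checks cookies and the WebSocket subprotocol):

```rust
/// Extract the raw API key from the supported credential formats,
/// checked in order: X-API-Key, Authorization (Bearer/ApiKey), ?api_key=.
pub fn extract_api_key(
    x_api_key: Option<&str>,
    authorization: Option<&str>,
    query_api_key: Option<&str>,
) -> Option<String> {
    if let Some(k) = x_api_key {
        return Some(k.to_string());
    }
    if let Some(auth) = authorization {
        for scheme in ["Bearer ", "ApiKey "] {
            if let Some(rest) = auth.strip_prefix(scheme) {
                return Some(rest.trim().to_string());
            }
        }
    }
    query_api_key.map(|k| k.to_string())
}
```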

Admin Authentication

Admin operations require JWT with role: "admin":

  • CLI mode: --admin-password flag bootstraps an admin user with Argon2-hashed password
  • SaaS mode: JWT issued via GitHub OAuth carries role, org_id, and user_id claims

SaaS Session Flow

In SaaS mode, browser sessions use JWT cookies:

  1. User authenticates via GitHub OAuth (or OIDC)
  2. Server upserts user in users table, auto-creates organization if first login
  3. JWT issued (HMAC-SHA256, 7-day expiry) with user_id, org_id, role claims
  4. JWT stored in HttpOnly, Secure, SameSite=Lax cookie

API keys in SaaS mode are SHA-256 hashed before database storage — raw keys are shown once at creation time and never stored.


Billing Integration

Feature-gated: --features saas. See Stripe Setup Guide for configuration steps.

Stripe Checkout Flow

Stripe checkout flow

Webhook Events

| Stripe Event | Action |
|---|---|
| checkout.session.completed | Save customer ID, upgrade tier, set status "active" |
| customer.subscription.updated | Update tier if price changed |
| customer.subscription.deleted | Downgrade to "free" tier |

All webhooks are verified using HMAC-SHA256 with STRIPE_WEBHOOK_SECRET.

Billing Endpoints

| Endpoint | Method | Description |
|---|---|---|
| /api/v1/billing/usage | GET | Current month's event usage |
| /api/v1/billing/plan | GET | Current tier and event limit |
| /api/v1/billing/checkout | POST | Create Stripe Checkout Session |
| /api/v1/billing/portal | POST | Create Stripe Customer Portal session |
| /api/v1/billing/webhook | POST | Stripe webhook receiver |

Configuration

| Environment Variable | Purpose |
|---|---|
| STRIPE_SECRET_KEY | Stripe API secret key |
| STRIPE_WEBHOOK_SECRET | Webhook signature verification |
| STRIPE_PRO_PRICE_ID | Stripe price ID for Pro tier |
| STRIPE_BUSINESS_PRICE_ID | Stripe price ID for Business tier |
| FRONTEND_URL | Redirect URL after checkout (default: http://localhost:5173) |

Trial Lifecycle

Creation

create_trial_organization() inserts a new organization with:

  • status = 'trial'
  • trial_expires_at = now() + 30 days
  • Free-tier resource limits

Automatic Expiry

spawn_trial_expiry_checker() runs as a background task, checking every hour:

```
Every 60 min:
  SELECT ... FROM organizations
  WHERE status = 'trial'
    AND trial_expires_at IS NOT NULL
    AND trial_expires_at < now()

  → For each expired org: UPDATE status = 'suspended'
```

An index on (trial_expires_at) WHERE status = 'trial' ensures efficient lookups.
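
The predicate behind that query is straightforward to state in code. A sketch mirroring the WHERE clause above (function name and string-typed status are illustrative simplifications):

```rust
use std::time::SystemTime;

/// Matches the expiry query: only trial orgs with a set,
/// past trial_expires_at timestamp get suspended.
pub fn should_suspend(
    status: &str,
    trial_expires_at: Option<SystemTime>,
    now: SystemTime,
) -> bool {
    status == "trial" && trial_expires_at.map_or(false, |t| t < now)
}
```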

Admin Override

Admins can extend or convert trials:

  • PUT /api/v1/admin/tenants/{id}/trial — set new trial_expires_at
  • PUT /api/v1/admin/tenants/{id}/status — change to "active" (bypasses expiry)
  • PUT /api/v1/admin/tenants/{id}/tier — upgrade tier (e.g., after payment)

Admin Operations

All admin endpoints require JWT with role: "admin". Defined in crates/varpulis-cli/src/admin.rs.

| Endpoint | Method | Description |
|---|---|---|
| /api/v1/admin/tenants | GET | List all organizations with usage, hierarchy, isolation info |
| /api/v1/admin/tenants | POST | Create tenant or sub-tenant |
| /api/v1/admin/tenants/{id} | GET | Org details with pipelines, API keys, sub-tenants |
| /api/v1/admin/tenants/{id}/tier | PUT | Change tier (free/pro/business/enterprise) |
| /api/v1/admin/tenants/{id}/status | PUT | Change status (active/trial/suspended/revoked) |
| /api/v1/admin/tenants/{id}/trial | PUT | Extend trial expiration date |
| /api/v1/admin/tenants/{id}/limits | PUT | Override pipeline/EPS/monthly limits |
| /api/v1/admin/tenants/{id}/revoke | POST | Revoke tenant (sets status="revoked") |
| /api/v1/admin/tenants/{id}/namespace | POST/GET/DELETE | Provision/query/deprovision k8s namespace |
| /api/v1/admin/tenants/{id}/kafka | POST/GET/DELETE | Configure/query/remove Kafka topic prefix |
| /api/v1/admin/global-pipelines | POST/GET | Deploy/list global pipeline templates |
| /api/v1/admin/global-pipelines/{id} | PUT/DELETE | Update/undeploy global pipeline template |
| /api/v1/admin/usage | GET | Aggregate usage across all tenants |

Tenant-Level Endpoints

| Endpoint | Method | Description |
|---|---|---|
| /api/v1/orgs | GET | List user's organizations (hierarchy-aware) |
| /api/v1/orgs/{id}/sub-tenants | POST | Create sub-tenant under a tenant |
| /api/v1/orgs/{id}/pipelines | GET | List visible pipelines (own + inherited, with read_only flag) |

The web UI includes an admin panel and org switcher with hierarchy tree for managing tenants without direct API calls.


Deployment Modes

Single-User (CLI)

```bash
varpulis server --port 9000 --api-key "my-key"
```

  • A default tenant is auto-provisioned with the provided API key and enterprise-tier quotas
  • No database required — state optionally persisted via FileStore
  • Suitable for development, testing, and single-application deployments

Multi-Tenant Server

```bash
varpulis server --port 9000 --admin-password "secret"
```

  • Admin creates tenants via REST API or web UI
  • Each tenant receives its own API key and quota
  • Runtime TenantManager handles isolation in-memory
  • Optional --state-dir enables FileStore persistence across restarts

Full SaaS

```bash
# Requires: PostgreSQL, Stripe account, GitHub OAuth app
cargo build --release --features saas

DATABASE_URL=postgresql://... \
STRIPE_SECRET_KEY=sk_... \
GITHUB_CLIENT_ID=... \
GITHUB_CLIENT_SECRET=... \
varpulis server --port 9000
```

  • Self-service signup via GitHub OAuth (or OIDC with --features oidc)
  • Organizations auto-created on first login
  • Trial lifecycle with 30-day expiry and auto-suspension
  • Stripe billing for tier upgrades
  • Usage tracking with 60-second flush to usage_daily
  • Admin panel for tenant management
  • Kubernetes deployment with NetworkPolicies for network isolation

Docker Compose for local SaaS development:

```bash
docker compose -f deploy/docker/docker-compose.saas.yml up -d
```

Persistence & Recovery

Runtime Layer (FileStore)

When --state-dir is provided, TenantManager uses FileStore for JSON snapshots:

FileStore directory layout:

```
<state-dir>/tenant/<uuid>       # TenantSnapshot (JSON)
<state-dir>/tenants/index       # List of tenant IDs (JSON array)
```

Writes are atomic (write to .tmp, then rename). TenantManager::recover() loads the index and restores all tenants on startup, including pipeline VPL sources (re-compiled into engines).
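
The write-to-temp-then-rename pattern can be sketched with the standard library (a generic sketch of the technique, not the FileStore source; the .tmp naming follows the description above):

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

/// Write to a sibling .tmp file, fsync, then rename over the target,
/// so readers never observe a half-written snapshot. rename() is atomic
/// on POSIX filesystems when source and target share a filesystem.
pub fn atomic_write(path: &Path, bytes: &[u8]) -> std::io::Result<()> {
    let tmp = path.with_extension("tmp");
    {
        let mut f = fs::File::create(&tmp)?;
        f.write_all(bytes)?;
        f.sync_all()?; // flush to disk before the rename makes it visible
    }
    fs::rename(&tmp, path)
}
```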

A RocksDbStore backend is also available via --features persistence for write-heavy workloads with LZ4 compression and 64 MB write buffers.

Database Layer (SaaS)

PostgreSQL stores the authoritative tenant state in SaaS mode. Migrations auto-run on startup via sqlx. The connection pool uses up to 20 connections with a 5-second acquire timeout.


File Locations

| Component | File |
|---|---|
| TenantManager, Tenant, TenantQuota | crates/varpulis-runtime/src/tenant.rs |
| StateStore, FileStore, RocksDbStore | crates/varpulis-runtime/src/persistence.rs |
| API routes, DB sync, ApiKey extractor | crates/varpulis-cli/src/api.rs |
| Admin API (tenants, namespaces, kafka, global pipelines) | crates/varpulis-cli/src/admin.rs |
| Auth (registration, login, verification) | crates/varpulis-cli/src/auth.rs |
| Billing & UsageTracker | crates/varpulis-cli/src/billing.rs |
| OAuth & JWT sessions | crates/varpulis-cli/src/oauth.rs |
| Organization, sub-tenant, pipeline routes | crates/varpulis-cli/src/org.rs |
| Local user store & sessions | crates/varpulis-cli/src/users.rs |
| DB models (Organization, Pipeline, etc.) | crates/varpulis-db/src/models.rs |
| DB repo queries (hierarchy, isolation) | crates/varpulis-db/src/repo.rs |
| DB migrations (hierarchy, schema, namespace, kafka) | crates/varpulis-db/migrations/ |
| Kafka connector & managed connector | crates/varpulis-connectors/src/kafka.rs, managed_kafka.rs |
| Sink factory (topic resolution) | crates/varpulis-runtime/src/engine/sink_factory.rs |
| Cluster connector config & injection | crates/varpulis-cluster/src/connector_config.rs |
| Vue 3 org store (hierarchy tree) | web-ui/src/stores/org.ts |
| Vue 3 admin store (tenant detail) | web-ui/src/stores/admin.ts |
| Kubernetes NetworkPolicies | deploy/kubernetes/overlays/saas/network-policies.yaml |
| SaaS Docker Compose | deploy/docker/docker-compose.saas.yml |
| E2E tests (all phases) | web-ui/tests/*.spec.ts |

RBAC Hierarchy

| Level | Roles | Capabilities |
|---|---|---|
| Global (system admin) | admin | CRUD all tenants, deploy global pipelines, monitor all, provision namespaces/kafka |
| Tenant | owner, admin, member, viewer | Manage own + sub-tenant pipelines, create sub-tenants |
| Sub-tenant | admin, member, viewer | Manage own pipelines only |

Tenant admins can access sub-tenant data via parent_org_id lookup. Sub-tenants cannot create sub-sub-tenants (enforced at API level).


See Also

Varpulis - Next-generation streaming analytics engine