Ember - Tech Stack & Hosting

What every layer runs on, why we chose it, and how it scales.

At a glance

Frontend layer

Frontend host

Stack: Next.js 16 (App Router), React 19, TypeScript, Tailwind v4, shadcn/ui

Why Next.js: server components for fast first paint + dynamic auth-aware rendering; server actions for safe mutations without exposing API tokens; mature ecosystem of patterns for vibe-coded apps.

Why this host: preview deploys per branch, instant rollbacks, first-class Next.js support, and the talent we recruit already knows the workflow.

Scaling: the frontend host's edge is global anycast. Static + RSC content is cached at the edge automatically. We've never needed to think about scaling the frontend - it's not where bottlenecks live.

Edge router

Stack: edge-distributed router on *.emberstudio.app

How it works: every request to *.emberstudio.app hits an edge worker. The worker extracts the subdomain, asks the control plane to resolve it to the project's current container endpoint (cached at the edge: 60s for positive results, 10s for negative ones), and proxies the request. Zero per-project DNS records.
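A minimal sketch of the subdomain extraction and the positive/negative cache described above. Only the base domain and the two TTLs come from this document; the class name, the hostname rules, and the synchronous `lookup` callback (standing in for the control-plane call, which would be asynchronous in a real edge worker) are assumptions made for illustration.

```typescript
// Hypothetical sketch: names and validation rules are illustrative, not Ember's actual worker code.
const BASE_DOMAIN = "emberstudio.app";
const POSITIVE_TTL_MS = 60_000; // 60s for resolved projects (from the text above)
const NEGATIVE_TTL_MS = 10_000; // 10s for unknown subdomains (from the text above)

/** Returns the project label for a host like "demo.emberstudio.app", or null. */
function extractSubdomain(host: string): string | null {
  const name = host.toLowerCase().replace(/:\d+$/, ""); // drop an optional port
  const suffix = "." + BASE_DOMAIN;
  if (!name.endsWith(suffix)) return null;
  const label = name.slice(0, -suffix.length);
  // Assumption: exactly one DNS label; nested subdomains are rejected.
  return /^[a-z0-9]([a-z0-9-]*[a-z0-9])?$/.test(label) ? label : null;
}

interface CacheEntry {
  endpoint: string | null; // null records a negative result
  expiresAt: number;
}

class ResolutionCache {
  private entries = new Map<string, CacheEntry>();
  private lookup: (subdomain: string) => string | null;
  private now: () => number;

  constructor(lookup: (subdomain: string) => string | null, now: () => number = Date.now) {
    this.lookup = lookup;
    this.now = now;
  }

  /** Serves from cache while the entry is fresh, otherwise asks the lookup again. */
  resolve(subdomain: string): string | null {
    const t = this.now();
    const hit = this.entries.get(subdomain);
    if (hit && hit.expiresAt > t) return hit.endpoint;
    const endpoint = this.lookup(subdomain);
    const ttl = endpoint === null ? NEGATIVE_TTL_MS : POSITIVE_TTL_MS;
    this.entries.set(subdomain, { endpoint, expiresAt: t + ttl });
    return endpoint;
  }
}
```

The shorter negative TTL means a freshly created project becomes reachable within about ten seconds even if its subdomain was requested before it existed.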

Future: when paid users want their own domain (mybrand.com → their project), we'll enable a managed custom-hostnames service. Strategy decided, implementation outline in the roadmap.

Control layer

Ember Control API

Stack: Node.js 20, Express, LLM provider SDK, managed-DB client, integration tests in ts-mocha

Why Node: the LLM ecosystem is best in Node/TS. The provider SDKs' streaming + tool-use APIs are first-class JS. We would gain nothing from Python or Go for our shape of work.

Hosting: runs as an always-on container in eu-west on our container infrastructure, co-located with the per-project workloads on the same private network for low-latency inter-service calls.

Control API config:

Region eu-west (London)
Always-on - the orchestrator must be reachable instantly when a user clicks "Build"
Memory + CPU sized for I/O-bound work (LLM calls, orchestration calls, DB queries)
Single container today (no HA); tolerable downtime in exchange for simpler state management. HA is on the roadmap pre-public-beta.

Scaling story: the control plane is stateless w.r.t. user requests - all state lives in the managed database. Horizontal scaling is mostly free: just spin up more containers behind the orchestration API. Bottleneck candidates:

LLM provider rate limits (current tier handles 1000 req/min; we're nowhere close)
Compute provisioning concurrency (deploys are sequential per project; parallelizable across projects)
DB connection pool (currently pooler-friendly; upgrade to dedicated pool when needed)
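The "sequential per project, parallel across projects" constraint on deploys can be sketched as one promise chain per project. This is an illustration only: `PerProjectQueue` and its method are hypothetical names, and the sketch keeps state in memory, which would not hold across multiple control-plane containers.

```typescript
// Hypothetical sketch: serializes jobs that share a project id, lets different projects overlap.
class PerProjectQueue {
  // Tail of the job chain for each project id (never rejects, see below).
  private tails = new Map<string, Promise<unknown>>();

  run<T>(projectId: string, job: () => Promise<T>): Promise<T> {
    const prev = this.tails.get(projectId) ?? Promise.resolve();
    // Start the job once the previous one settles, whether it succeeded or failed.
    const next = prev.then(job, job);
    // Store a non-rejecting tail so one failed deploy does not poison the chain.
    this.tails.set(projectId, next.catch(() => undefined));
    return next;
  }
}
```

Two deploys for the same project run back to back, while a deploy for another project starts immediately. A production version would also need to evict finished chains from the map and coordinate through the database once more than one control-plane container exists.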

Workload layer (per-project containers)

How we run workloads today

Every generated app, agent runtime, and iteration preview runs in its own isolated container on our container infrastructure with strict per-project boundaries. Our orchestration code (the Ember Pipeline + the supporting services around it) is what gives us:

Sub-2-second cold starts. Auto-suspend means idle containers consume effectively no compute; first request wakes the container in ~1-2s. Critical for the agent webhook flow and dev preview UX.
Per-project isolation by default. Each generated app is a distinct container with hypervisor-grade isolation, not just process boundaries. Generated code can't see other projects' code, secrets, or network namespace.
Image-baked dependencies. Pre-built base images live in our image registry. Per-project workloads deploy FROM them - no npm install at boot. Cold start drops from ~60s (with install) to ~3-5s (image pull from the same region).
Idle-suspend model. Containers that aren't serving traffic suspend their runtime. Wakes on next inbound request, transparent to the user.

Why this design

A platform that takes untrusted prompts, generates code with an LLM, and runs that code with secrets nearby has unusual requirements. Off-the-shelf container orchestration was built for a different problem: predictable workloads, trusted code, long-running services. Ember's needs are the opposite - hostile inputs, untrusted bundles, bursty short-lived containers with strict isolation, and a cost model that has to stay sustainable as the user base grows.

Our orchestration is tuned for that. Cold starts are kept fast for the hot-reload UX. Each container gets hypervisor-grade isolation by default. Idle-suspend keeps cost proportional to actual usage rather than raw project count. The pipeline that drives the lifecycle (provisioning, secret push, deploy, verify, swap) is the Ember Pipeline, our orchestration code top to bottom.
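The lifecycle named above (provisioning, secret push, deploy, verify, swap) can be sketched as an ordered list of stages that stops at the first failure. The stage order comes from the text; the function, type names, and the synchronous step signature are assumptions for illustration, not the Ember Pipeline's real interface.

```typescript
// Hypothetical sketch of an ordered, fail-fast pipeline.
const STAGES = ["provision", "push-secrets", "deploy", "verify", "swap"] as const;
type Stage = (typeof STAGES)[number];

// Assumption: each step throws on failure; real steps would be asynchronous.
type StageFn = () => void;

type PipelineResult =
  | { ok: true; completed: Stage[] }
  | { ok: false; completed: Stage[]; failedAt: Stage; error: string };

function runPipeline(steps: Record<Stage, StageFn>): PipelineResult {
  const completed: Stage[] = [];
  for (const stage of STAGES) {
    try {
      steps[stage]();
      completed.push(stage);
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      // Later stages never run, so a failed verify means swap is skipped.
      return { ok: false, completed, failedAt: stage, error };
    }
  }
  return { ok: true, completed };
}
```

Placing verify before swap means, under this sketch, that a build which fails its health check never receives traffic.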

The underlying compute layer is a managed VM provider we built on top of rather than a Kubernetes cluster we operate ourselves. That choice keeps our team small and our surface area to operate narrow. If our needs change, the compute layer is reversible - we have no architectural lock-in beyond "containers + Postgres".

Per-app config (today, target-specific)

Target	Memory	Inbound?	Auto-suspend
Web (production)	small	Yes (HTTP)	Yes, after 5 min idle
Telegram bot	small	No (polling)	No (always-on)
X bot	small	No	No
Agent runtime	small	Yes (HTTP for chat + webhooks)	Yes, after 5 min
Dev container	medium	Yes (HTTP for preview iframe + writer endpoint)	Yes, after 5 min
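The table above can be read as a typed config. In this sketch the values (memory size, inbound traffic, 5-minute idle threshold) are taken from the table, while the key names, the `TargetConfig` shape, and the `shouldSuspend` helper are hypothetical.

```typescript
// Hypothetical encoding of the per-target table; identifiers are illustrative.
type Target = "web" | "telegram-bot" | "x-bot" | "agent-runtime" | "dev-container";

interface TargetConfig {
  memory: "small" | "medium";
  inbound: boolean;                  // accepts inbound HTTP
  idleSuspendAfterMs: number | null; // null = always-on
}

const FIVE_MINUTES_MS = 5 * 60 * 1000;

const TARGETS: Record<Target, TargetConfig> = {
  "web":           { memory: "small",  inbound: true,  idleSuspendAfterMs: FIVE_MINUTES_MS },
  "telegram-bot":  { memory: "small",  inbound: false, idleSuspendAfterMs: null },
  "x-bot":         { memory: "small",  inbound: false, idleSuspendAfterMs: null },
  "agent-runtime": { memory: "small",  inbound: true,  idleSuspendAfterMs: FIVE_MINUTES_MS },
  "dev-container": { memory: "medium", inbound: true,  idleSuspendAfterMs: FIVE_MINUTES_MS },
};

/** True when a container of this target has been idle long enough to suspend. */
function shouldSuspend(target: Target, idleMs: number): boolean {
  const limit = TARGETS[target].idleSuspendAfterMs;
  return limit !== null && idleMs >= limit;
}
```

The polling bots stay always-on because nothing inbound could wake them; every target that accepts HTTP can suspend and wake on the next request.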

Data layer

Control plane database

Stack: Managed Postgres 17, Realtime, Auth (magic links), Storage

Why this design:

RLS as the second auth layer. Even if the control plane had a bug, RLS blocks cross-user reads at the database. Belt + suspenders.
Realtime out of the box. Every status change instantly streams to the UI. No polling, no manual websocket plumbing.
First-class auth. Magic links work. JWTs validate. We don't reinvent password reset flows.
Postgres specifically. Our orchestrator's correctness relies on transactional atomicity. Document stores can't give us that.

Scaling: scales with Postgres, which we know how to scale. A connection pool is already in front. Read replicas + dedicated compute are the next steps if needed.

Tenant data database

Purpose: per-project tenant database for generated apps that opt into "shared" DB mode.

When a user's generated app needs a DB, they choose:

Shared mode - we provision a dedicated schema + Postgres role here. Other projects can't see each other (verified by impersonation tests).
BYO mode - user connects their own managed-Postgres URL + service key. Their app talks directly to their DB; we just inject the connection string as a per-project entry in our secret store.

We split tenant data from control-plane data so the two can scale independently.
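As an illustration of what "a dedicated schema + Postgres role" could look like for shared mode, the sketch below plans the SQL for one project. The naming scheme (`tenant_<id>`), the id validation rule, and the exact statements are assumptions; the document does not specify how provisioning is implemented, and credential handling is left out.

```typescript
// Hypothetical sketch: plans per-project schema and role names plus the SQL to create them.
interface SharedTenantPlan {
  schema: string;
  role: string;
  statements: string[];
}

function planSharedTenant(projectId: string): SharedTenantPlan {
  // Assumption: project ids are restricted to a safe character set before they
  // are interpolated into identifiers, since identifiers cannot be bound as parameters.
  if (!/^[a-z0-9_]{1,40}$/.test(projectId)) {
    throw new Error(`unsafe project id: ${projectId}`);
  }
  const schema = `tenant_${projectId}`;
  const role = `tenant_${projectId}_role`;
  return {
    schema,
    role,
    statements: [
      // The role owns exactly one schema and cannot log in directly.
      `CREATE ROLE "${role}" NOLOGIN`,
      `CREATE SCHEMA "${schema}" AUTHORIZATION "${role}"`,
      // Unqualified table names resolve inside the tenant's own schema.
      `ALTER ROLE "${role}" SET search_path = "${schema}"`,
    ],
  };
}
```

An impersonation test in this style would switch to one tenant's role and confirm that reading another tenant's schema fails with a permission error.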

AI layer

Platform LLM provider - platform-borne inference

Used for: code generation, iteration (tiered), spec collector, theme generator, classifier calls.

Why a frontier-tier provider:

Generation quality on Next.js / Tailwind / Telegram bot code is at the leading edge.
Streaming + tool use APIs are mature and reliable. We use both heavily.
Cached prefix tokens drop iteration token-usage meaningfully.
The model family naturally separates "deep reasoning" from "production" from "fast". Maps cleanly to our tiered routing.
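A sketch of what tiered routing could look like. The three tiers and the task list come from this section; the specific task-to-tier mapping and the `changeSize` signal are assumptions chosen for illustration, not the routing rules Ember actually uses.

```typescript
// Hypothetical routing table: the mapping below is an assumption, not Ember's real policy.
type Tier = "deep-reasoning" | "production" | "fast";
type Task = "code-generation" | "iteration" | "spec-collector" | "theme-generator" | "classifier";
type ChangeSize = "small" | "medium" | "large";

interface RouteInput {
  task: Task;
  changeSize?: ChangeSize; // only consulted for iteration
}

function pickTier(input: RouteInput): Tier {
  switch (input.task) {
    case "classifier":
    case "theme-generator":
      return "fast"; // short, latency-sensitive calls
    case "spec-collector":
    case "code-generation":
      return "production";
    case "iteration":
      // Iteration is the tiered case: escalate with the size of the requested change.
      if (input.changeSize === "large") return "deep-reasoning";
      if (input.changeSize === "medium") return "production";
      return "fast";
  }
}
```

Keeping the decision in one pure function makes it easy to adjust the mapping as token usage and quality data accumulate.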

User-owned providers (agent target only)

The strategic decoupling. For agent projects, the user pastes their own LLM provider key (Anthropic, OpenAI, etc.). The agent's inference is billed directly to the user's account, not ours. Ember only handles the platform compute the agent runs on.

Why this matters strategically: generic vibe-coding tools can't ship chatty agents because per-message inference costs would destroy their unit economics. Ember can - the user's quota absorbs the variable cost. Our pricing focuses on platform, tooling, hosting, and UX, decoupled from inference volume on agents.

Code distribution

Templates registry

Purpose: template registry. Templates are referenced - not copied - into the control plane at build time. The control plane materializes them on demand.

Why git-based:

Versioned by commit SHA, easy to roll back
Templates remain editable by humans (open in any editor, edit, commit, push - done)
Pull-request flow is a natural review surface when we open template contributions later

Pricing model - the shape that matters

The cost structure deliberately stays decoupled from per-user inference volume on the agent side. That gives us three properties investors care about:

Variable costs scale linearly. Idle projects → near-zero compute via idle-suspend. A user with five projects open but no active iteration costs the platform essentially nothing.
Fixed costs barely move with growth. The control plane, edge router, image registry, and managed databases serve thousands of users at the same scale they serve one.
Agent inference is BYO. The most cost-sensitive workload (chat agents, scheduled triggers, Telegram bots) bills to the user's own LLM account. Ember earns platform + tooling + hosting + UX without underwriting their chat volume.

This is the SaaS cost shape we want - margin grows with scale because the fixed costs are amortized while the variable costs stay proportional. Our pricing tiers will be set against the value to the user (active projects, agent slots, custom domains), not against our cost basis.
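The claim that margin grows with scale follows from simple arithmetic: with fixed cost F, per-user variable cost v, and per-user price p, gross margin at N users is 1 - (F + vN) / (pN), which rises toward 1 - v/p as N grows. The sketch below encodes that formula; the function name is hypothetical and any numbers passed to it are purely illustrative, not Ember's actual costs or prices.

```typescript
// Illustrative model only: parameters are placeholders, not real financials.
interface CostModel {
  fixedCost: number;           // F: control plane, edge router, registry, databases
  variableCostPerUser: number; // v: compute that scales with active usage
  pricePerUser: number;        // p: revenue per user
}

/** Gross margin as a fraction of revenue at a given user count. */
function grossMargin(users: number, model: CostModel): number {
  if (users <= 0) throw new Error("users must be positive");
  const revenue = model.pricePerUser * users;
  const cost = model.fixedCost + model.variableCostPerUser * users;
  return (revenue - cost) / revenue;
}

/** The ceiling the margin approaches once fixed costs are fully amortized. */
function marginCeiling(model: CostModel): number {
  return 1 - model.variableCostPerUser / model.pricePerUser;
}
```

With placeholder values F = 1000, v = 2, p = 10, the margin is 0.30 at 200 users and 0.75 at 2000 users, approaching a ceiling of 0.80.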

Why this stack vs. alternatives

Choice	Alternative considered	Why we didn't (yet)
Managed VM platform	Serverless (Lambda / Cloud Run)	Cold starts too slow for our hot-reload UX; no idle-suspend equivalent at our cost target.
Managed VM platform	Self-managed Kubernetes	Massive operational tax for a small team. The simplicity of a managed platform plus our orchestration code is worth the trade today, and the lock-in is shallow.
Frontier LLM provider	Other LLM providers	Code quality on Next.js is materially better; tool use API more mature; vendor diversification is a future option not a present need.
Managed Postgres	DIY Postgres + DIY auth + DIY pubsub	Three vendors → one. RLS, auth, realtime out of the box. Standard tier is cost-efficient.
Next.js	Astro / Remix	Industry-standard for our ecosystem; React 19 server components map cleanly to our auth model.
Edge-based router	DIY-on-compute	Worker-based routers are zero-cold-start and at the edge globally; routing is a free CPU-light task. Running this on our compute layer would add ~50ms latency per request for no benefit.

Each choice is reversible - we have no architectural lock-in beyond "containers + Postgres". Migrating off the compute platform would be a few weeks. Migrating off the LLM provider would be a few days. We'd take the cost only if there's a forcing function.

Reliability + SLOs (informal today, formal pre-public-beta)

Today (closed beta):

Single-region (eu-west) deployment. No multi-region failover.
Single control-plane container. No HA.
No SLO commitments - we're a small operation, response time depends on operator availability.

Pre-public-beta:

Multi-region for production project containers
HA control plane (2-3 containers + DB failover)
Status page
Documented SLOs: 99.5% uptime control plane, 99.9% per-project apps, 99% chat / iteration latency under 30s p95
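To make the uptime targets concrete, an availability percentage converts into a downtime budget. A small sketch, assuming a 30-day window (the window length is an assumption; the document does not state one):

```typescript
/** Allowed downtime in minutes for an availability target over a window of days. */
function downtimeBudgetMinutes(sloPercent: number, windowDays: number = 30): number {
  if (sloPercent < 0 || sloPercent > 100) throw new Error("sloPercent must be within 0..100");
  const windowMinutes = windowDays * 24 * 60;
  return ((100 - sloPercent) * windowMinutes) / 100;
}
```

Over 30 days, 99.5% for the control plane allows 216 minutes of downtime (about 3.6 hours), while 99.9% for per-project apps allows roughly 43 minutes.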

Enterprise tier (when customer demand justifies):

Multi-region failover for production project containers
99.9% SLA on production tier
Compliance certifications (SOC 2, eventually ISO 27001, HIPAA-ready for healthcare deployments)

Operational story

Who deploys today: the founder, via a local CLI. The procedure is documented internally. A CI/CD pipeline is on the roadmap before the second engineer onboards.

Where logs live:

Control plane logs → the compute platform's log stream
Per-project app logs → same, scoped per container
DB query logs → managed-DB dashboard
Audit trail → audit_log table, queryable via SQL

Where alerts live: today, manual operator monitoring via the admin state endpoint. Sentry / Datadog integration is on the roadmap.

Where backups live:

Managed Postgres includes daily backups + point-in-time recovery
Templates repo is on a versioned git host (mirrored locally)
Generated code is reproducible from the original prompt (no irreplaceable user content lives only on Ember; users can re-iterate to reproduce)
Per-tenant DB content IS user data that must be backed up - relies on managed-Postgres PITR
