Ember - Tech Stack & Hosting
What every layer runs on, why we chose it, and how it scales.
At a glance
Frontend layer
Frontend host
Stack: Next.js 16 (App Router), React 19, TypeScript, Tailwind v4, shadcn/ui
Why Next.js: server components for fast first paint + dynamic auth-aware rendering; server actions for safe mutations without exposing API tokens; mature ecosystem of patterns for vibe-coded apps.
Why this host: preview deploys per branch, instant rollbacks, first-class Next.js support, and the talent we recruit already knows the workflow.
Scaling: the frontend host's edge is global anycast. Static + RSC content is cached at the edge automatically. We've never needed to think about scaling the frontend - it's not where bottlenecks live.
Edge router
Stack: edge-distributed router on *.emberstudio.app
How it works: every request to *.emberstudio.app hits an edge worker. The worker extracts the subdomain, asks the control plane to resolve it to the project's current container endpoint (cached at the edge: 60s positive, 10s negative), and proxies the request. Zero per-project DNS records.
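The resolve-and-cache step above can be sketched roughly as follows. This is a minimal illustration, not the production worker: the names EndpointCache, extractSubdomain, resolveHost, and askControlPlane are hypothetical, and only the two TTLs and the emberstudio.app domain come from the description.

```typescript
// Hypothetical names throughout; only the TTLs and the domain come from the text.
const POSITIVE_TTL_MS = 60_000; // known project: cache the endpoint for 60s
const NEGATIVE_TTL_MS = 10_000; // unknown subdomain: cache the miss for 10s

interface CacheEntry {
  endpoint: string | null; // null records a negative lookup
  expiresAt: number; // epoch milliseconds
}

// "myapp.emberstudio.app" -> "myapp"; anything else -> null.
function extractSubdomain(host: string, root: string = "emberstudio.app"): string | null {
  const name = host.toLowerCase().replace(/:\d+$/, ""); // drop an optional port
  const suffix = "." + root;
  if (!name.endsWith(suffix)) return null;
  const sub = name.slice(0, -suffix.length);
  return sub.length > 0 && !sub.includes(".") ? sub : null;
}

class EndpointCache {
  private entries = new Map<string, CacheEntry>();
  private now: () => number;

  // The clock is injectable so expiry can be exercised without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Returns undefined when there is no fresh entry (never cached, or expired).
  get(subdomain: string): CacheEntry | undefined {
    const entry = this.entries.get(subdomain);
    if (entry === undefined || entry.expiresAt <= this.now()) return undefined;
    return entry;
  }

  put(subdomain: string, endpoint: string | null): void {
    const ttl = endpoint === null ? NEGATIVE_TTL_MS : POSITIVE_TTL_MS;
    this.entries.set(subdomain, { endpoint, expiresAt: this.now() + ttl });
  }
}

// Resolve a Host header to a container endpoint, consulting the cache first.
// `askControlPlane` stands in for the real control-plane call.
async function resolveHost(
  host: string,
  cache: EndpointCache,
  askControlPlane: (subdomain: string) => Promise<string | null>,
): Promise<string | null> {
  const sub = extractSubdomain(host);
  if (sub === null) return null;
  const cached = cache.get(sub);
  if (cached !== undefined) return cached.endpoint;
  const endpoint = await askControlPlane(sub);
  cache.put(sub, endpoint);
  return endpoint;
}
```

The shorter negative TTL means a subdomain that was unknown a moment ago (for example, a project that has just been created) starts resolving within about ten seconds, while lookups for established projects hit the control plane at most once a minute per edge location.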
Future: when paid users want their own domain (mybrand.com → their project), we'll enable a managed custom-hostnames service. Strategy decided, implementation outline in the roadmap.
Control layer
Ember Control API
Stack: Node.js 20, Express, LLM provider SDK, managed-DB client, integration tests in ts-mocha
Why Node: the LLM ecosystem is best in Node/TS. The provider SDKs' streaming + tool-use APIs are first-class JS. We would gain nothing from Python or Go for our shape of work.
Hosting: runs on Ember Compute - an always-on container in eu-west, on the same network as the per-project workloads for sub-millisecond inter-service latency.
Control API config:
- Region eu-west (London)
- Always-on - the orchestrator must be reachable instantly when a user clicks "Build"
- Memory + CPU sized for I/O-bound work (LLM calls, orchestration calls, DB queries)
- Single container today (no HA); tolerable downtime in exchange for simpler state management. HA is on the roadmap pre-public-beta.
Scaling story: the control plane is stateless w.r.t. user requests - all state lives in the managed database. Horizontal scaling is therefore mostly free: spin up more containers behind the orchestration API. Bottleneck candidates:
- LLM provider rate limits (current tier handles 1000 req/min; we're nowhere close)
- Compute provisioning concurrency (deploys are sequential per project; parallelizable across projects)
- DB connection pool (currently pooler-friendly; upgrade to dedicated pool when needed)
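The provisioning constraint above (deploys sequential per project, parallel across projects) is a per-key serial queue. The sketch below shows one common way to get that behavior by chaining promises per project ID. PerProjectQueue and its method names are hypothetical, and the real orchestrator may enforce ordering differently (for example, in the database).

```typescript
// Hypothetical per-project deploy queue: tasks for one project run in order,
// tasks for different projects overlap freely.
class PerProjectQueue {
  // Tail of the promise chain for each project.
  private tails = new Map<string, Promise<void>>();

  enqueue<T>(projectId: string, task: () => Promise<T>): Promise<T> {
    const prev = this.tails.get(projectId) ?? Promise.resolve();
    // Start this task only after the previous one for the same project settles.
    const run = prev.then(task);
    // Swallow the outcome on the chain so one failed deploy does not block the next.
    const tail = run.then(
      () => undefined,
      () => undefined,
    );
    this.tails.set(projectId, tail);
    return run; // the caller still sees the real result or error
  }
}
```

A production version would also evict finished chains from the map; this one keeps them for brevity.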
Workload layer (Ember Compute - per-project containers)
Why Ember runs its own container orchestration today
Ember Compute is our container orchestration platform. Every generated app, agent runtime, and iteration preview runs in its own isolated container with strict per-project boundaries. We built proprietary tooling around:
- Sub-2-second cold starts. Auto-suspend means idle containers consume effectively no compute; first request wakes the container in ~1-2s. Critical for the agent webhook flow and dev preview UX.
- Per-project isolation by default. Each generated app is a distinct container with hypervisor-grade isolation, not just process boundaries. Generated code can't see other projects' code, secrets, or network namespace.
- Image-baked dependencies. Pre-built base images live in the Ember Image Registry. Per-project workloads deploy FROM them - no npm install at boot. Cold start drops from ~60s (with install) to ~3-5s (image pull from same region).
- Idle-suspend model. Containers that aren't serving traffic suspend their runtime. Wakes on next inbound request, transparent to the user.
Why this design
A platform that takes untrusted prompts, generates code with an LLM, and runs that code with secrets nearby has unusual requirements. Off-the-shelf container orchestration was built for a different problem: predictable workloads, trusted code, long-running services. Ember's needs are the opposite - hostile inputs, untrusted bundles, bursty short-lived containers with strict isolation, and a cost model that has to stay sustainable as the user base grows.
Ember Compute is purpose-built for that. Cold starts are tuned for the hot-reload UX. Each container gets hypervisor-grade isolation by default. Idle-suspend keeps cost proportional to actual usage rather than raw project count. The pipeline that drives it is the Ember Pipeline, custom to this platform end to end.
Choosing not to take a dependency on the generic orchestration stack means Ember can move faster on the things that matter to this product, with fewer moving parts to operate and a smaller surface area to audit. The model has grown well past MVP scale and continues to fit the workload. As the platform's needs change, the underlying compute layer is reversible - we have no architectural lock-in beyond "containers + Postgres".
Per-app config (today, target-specific)
| Target | Memory | Inbound? | Auto-suspend |
|---|---|---|---|
| Web (production) | small | Yes (HTTP) | Yes, after 5 min idle |
| Telegram bot | small | No (polling) | No (always-on) |
| X bot | small | No | No |
| Agent runtime | small | Yes (HTTP for chat + webhooks) | Yes, after 5 min |
| Dev container | medium | Yes (HTTP for preview iframe + writer endpoint) | Yes, after 5 min |
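The table above can be read as typed configuration plus a single suspend rule. The sketch below is illustrative only: the type names, target keys, and shouldSuspend are hypothetical, while the values mirror the table rows.

```typescript
type Target = "web" | "telegram-bot" | "x-bot" | "agent-runtime" | "dev-container";

interface TargetConfig {
  memory: "small" | "medium";
  inbound: boolean; // does the container accept inbound HTTP?
  idleSuspendAfterMs: number | null; // null means always-on
}

const FIVE_MINUTES_MS = 5 * 60 * 1000;

// Values copied from the per-app config table; key names are assumptions.
const TARGETS: Record<Target, TargetConfig> = {
  "web": { memory: "small", inbound: true, idleSuspendAfterMs: FIVE_MINUTES_MS },
  "telegram-bot": { memory: "small", inbound: false, idleSuspendAfterMs: null },
  "x-bot": { memory: "small", inbound: false, idleSuspendAfterMs: null },
  "agent-runtime": { memory: "small", inbound: true, idleSuspendAfterMs: FIVE_MINUTES_MS },
  "dev-container": { memory: "medium", inbound: true, idleSuspendAfterMs: FIVE_MINUTES_MS },
};

// A container is a suspend candidate once it has been idle for its target's limit.
function shouldSuspend(target: Target, idleMs: number): boolean {
  const limit = TARGETS[target].idleSuspendAfterMs;
  return limit !== null && idleMs >= limit;
}
```

In the table, the targets that auto-suspend are exactly the ones with inbound HTTP, which fits the wake-on-inbound-request model: a polling bot has no inbound request to wake it, so it stays always-on.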
Data layer
Control plane database
Stack: Managed Postgres 17, Realtime, Auth (magic links), Storage
Why this design:
- RLS as the second auth layer. Even if the control plane had a bug, RLS blocks cross-user reads at the database. Belt + suspenders.
- Realtime out of the box. Every status change instantly streams to the UI. No polling, no manual websocket plumbing.
- First-class auth. Magic links work. JWTs validate. We don't reinvent password reset flows.
- Postgres specifically. Our orchestrator's correctness relies on transactional atomicity. Document stores generally offer weaker multi-record transaction guarantees than we need.
Scaling: scales with Postgres, which we know how to scale. A connection pool is already in front. Read replicas + dedicated compute are the next steps if needed.
Tenant data database
Purpose: per-project tenant database for generated apps that opt into "shared" DB mode.
When a user's generated app needs a DB, they choose:
- Shared mode - we provision a dedicated schema + Postgres role here. Other projects can't see each other (verified by impersonation tests).
- BYO mode - user connects their own managed-Postgres URL + service key. Their app talks directly to their DB; we just inject the connection string as a per-project Ember Secret Store entry.
We split tenant data off from control-plane data so the two can scale independently.
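As a rough illustration of shared mode, the helper below builds the Postgres statements that would give a project its own role and schema. The proj_ naming scheme, the ID format, and the exact statement list are assumptions for the sketch, not the provisioning code Ember runs; credential handling is deliberately left out.

```typescript
// Hypothetical naming scheme: one role and one schema per project, same name.
function tenantIdentifier(projectId: string): string {
  // Strict allow-list so the ID can be embedded in a quoted SQL identifier safely.
  if (!/^[a-z0-9_]{1,40}$/.test(projectId)) {
    throw new Error(`invalid project id: ${projectId}`);
  }
  return `proj_${projectId}`;
}

// Returns the statements in the order they would be executed.
function provisionStatements(projectId: string): string[] {
  const name = tenantIdentifier(projectId);
  return [
    // Password / credential setup omitted from this sketch.
    `CREATE ROLE "${name}" LOGIN`,
    // The project role owns its schema and nothing else.
    `CREATE SCHEMA "${name}" AUTHORIZATION "${name}"`,
    // Defensive: make sure no other role has privileges on the schema.
    `REVOKE ALL ON SCHEMA "${name}" FROM PUBLIC`,
    // Unqualified table names resolve inside the project's own schema.
    `ALTER ROLE "${name}" SET search_path = "${name}"`,
  ];
}
```

An impersonation test of the kind mentioned above would then connect as one project's role and assert that reads against another project's schema fail with a permission error.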
AI layer
Platform LLM provider - platform-borne inference
Used for: code generation, iteration (tiered), spec collector, theme generator, classifier calls.
Why a frontier-tier provider:
- Generation quality on Next.js / Tailwind / Telegram bot code is at the leading edge.
- Streaming + tool use APIs are mature and reliable. We use both heavily.
- Prompt-prefix caching meaningfully reduces token usage on iterations.
- The model family naturally separates "deep reasoning" from "production" from "fast". Maps cleanly to our tiered routing.
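Tiered routing of this kind usually comes down to a small lookup from task to model tier. The sketch below shows the shape only: the task names follow the "Used for" list, but which task maps to which tier, and the smallEdit hint, are assumptions made for illustration.

```typescript
type Tier = "deep" | "production" | "fast";
type Task = "code-generation" | "iteration" | "spec-collector" | "theme-generator" | "classifier";

// Hypothetical default assignment; the document does not state the real mapping.
const DEFAULT_TIER: Record<Task, Tier> = {
  "code-generation": "deep",
  "iteration": "production",
  "spec-collector": "production",
  "theme-generator": "fast",
  "classifier": "fast",
};

interface RouteHints {
  smallEdit?: boolean; // assumed signal that an iteration is a trivial change
}

// Pick a tier for a task; iterations can be downgraded when the edit is small.
function pickTier(task: Task, hints: RouteHints = {}): Tier {
  if (task === "iteration" && hints.smallEdit === true) return "fast";
  return DEFAULT_TIER[task];
}
```

Keeping the mapping in one table makes it cheap to move a task between tiers as model pricing or quality shifts.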
User-owned providers (agent target only)
The strategic decoupling. For agent projects, the user pastes their own LLM provider key (Anthropic, OpenAI, etc.). The agent's inference is billed directly to the user's account, not ours. Ember only handles the platform compute the agent runs on.
Why this matters strategically: generic vibe-coding tools can't ship chatty agents because per-message inference costs would destroy their unit economics. Ember can - the user's quota absorbs the variable cost. Our pricing focuses on platform, tooling, hosting, and UX, decoupled from inference volume on agents.
Code distribution
Templates registry
Purpose: template registry. Templates are referenced - not copied - into the control plane at build time. The control plane materializes them on demand.
Why git-based:
- Versioned by commit SHA, easy to roll back
- Templates remain editable by humans (open in any editor, edit, commit, push - done)
- Pull-request flow is a natural review surface when we open template contributions later
Pricing model - the shape that matters
The cost structure deliberately stays decoupled from per-user inference volume on the agent side. That gives us three properties investors care about:
- Variable costs scale linearly. Idle projects → near-zero compute via idle-suspend. A user with five projects open but no active iteration costs the platform essentially nothing.
- Fixed costs barely move with growth. The control plane, edge router, image registry, and managed databases serve thousands of users at the same scale they serve one.
- Agent inference is BYO. The most cost-sensitive workload (chat agents, scheduled triggers, Telegram bots) bills to the user's own LLM account. Ember earns platform + tooling + hosting + UX without underwriting their chat volume.
This is the SaaS cost shape we want - margin grows with scale because the fixed costs are amortized while the variable costs stay proportional. Our pricing tiers will be set against the value to the user (active projects, agent slots, custom domains), not against our cost basis.
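The cost shape can be made concrete with a small model: cost per user is the fixed cost spread over all users plus a constant variable cost. Every number in the usage note below is hypothetical and chosen only to show the curve; none are Ember's actual costs or prices.

```typescript
interface CostModel {
  fixedMonthly: number; // control plane, edge router, registry, databases
  variablePerUser: number; // compute that scales with active usage
  pricePerUser: number; // monthly price charged per user
}

// Fixed cost is amortized across users; variable cost stays proportional.
function costPerUser(model: CostModel, users: number): number {
  return model.fixedMonthly / users + model.variablePerUser;
}

// Gross margin as a fraction of price.
function grossMargin(model: CostModel, users: number): number {
  return (model.pricePerUser - costPerUser(model, users)) / model.pricePerUser;
}

// Hypothetical figures, for illustration only.
const example: CostModel = { fixedMonthly: 2000, variablePerUser: 3, pricePerUser: 20 };
```

With these made-up figures, grossMargin(example, 100) is negative (each user carries 20 of fixed cost on top of 3 variable), grossMargin(example, 1000) is 0.75, and the margin keeps rising toward the ceiling set by the variable cost as the user count grows.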
Why this stack vs. alternatives
| Choice | Alternative considered | Why we didn't (yet) |
|---|---|---|
| Ember Compute (proprietary container platform) | Serverless (Lambda / Cloud Run) | Cold starts too slow for our hot-reload UX; no idle-suspend equivalent at our cost target. |
| Ember Compute (proprietary platform) | Generic container orchestration | The shape of our workload - hostile inputs, untrusted bundles, bursty short-lived containers, strict per-project isolation - is what off-the-shelf orchestration was not designed for. Purpose-built lets us move faster on what matters and have less surface area to operate. |
| Frontier LLM provider | Other LLM providers | Code quality on Next.js is materially better; tool use API more mature; vendor diversification is a future option not a present need. |
| Managed Postgres | DIY Postgres + DIY auth + DIY pubsub | Three vendors → one. RLS, auth, realtime out of the box. Standard tier is cost-efficient. |
| Next.js | Astro / Remix | Industry-standard for our ecosystem; React 19 server components map cleanly to our auth model. |
| Edge-based router | DIY-on-compute | Worker-based routers have zero cold start and run at the edge globally; routing is a CPU-light task that costs almost nothing there. Running this on our compute layer would add ~50ms latency per request for no benefit. |
Each choice is reversible - we have no architectural lock-in beyond "containers + Postgres". Migrating off the compute platform would take a few weeks. Migrating off the LLM provider would take a few days. We'd pay that cost only if there's a forcing function.
Reliability + SLOs (informal today, formal pre-public-beta)
Today (closed beta):
- Single-region (eu-west) deployment. No multi-region failover.
- Single control-plane container. No HA.
- No SLO commitments - we're a small operation, response time depends on operator availability.
Pre-public-beta:
- Multi-region for production project containers
- HA control plane (2-3 containers + DB failover)
- Status page
- Documented SLOs: 99.5% uptime control plane, 99.9% per-project apps, 99% chat / iteration latency under 30s p95
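To make the uptime targets above tangible, an availability percentage converts directly into a downtime budget. The helper below is a generic calculation; the function name and the 30-day window are assumptions, since the document does not say which measurement window the SLOs will use.

```typescript
// Allowed downtime, in minutes, for a given availability target over a window.
function downtimeBudgetMinutes(sloPercent: number, windowDays: number = 30): number {
  if (sloPercent < 0 || sloPercent > 100) {
    throw new Error("sloPercent must be between 0 and 100");
  }
  const windowMinutes = windowDays * 24 * 60;
  return (windowMinutes * (100 - sloPercent)) / 100;
}
```

Over a 30-day window, 99.5% for the control plane allows roughly 216 minutes (3.6 hours) of downtime, while 99.9% for per-project apps allows roughly 43 minutes.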
Enterprise tier (when customer demand justifies):
- Multi-region failover for production project containers
- 99.9% SLA on production tier
- Compliance certifications (SOC 2, eventually ISO 27001, HIPAA-ready for healthcare deployments)
Operational story
Who deploys today: the founder, via a local CLI. The procedure is documented internally. A CI/CD pipeline is on the roadmap before the second engineer onboards.
Where logs live:
- Control plane logs → Ember Compute's built-in log stream
- Per-project app logs → same, scoped per container
- DB query logs → managed-DB dashboard
- Audit trail → audit_log table, queryable via SQL
Where alerts live: today, manual operator monitoring via the admin state endpoint. Sentry / Datadog integration is on the roadmap.
Where backups live:
- Managed Postgres includes daily backups + point-in-time recovery
- Templates repo is on a versioned git host (mirrored locally)
- Generated code is reproducible from the original prompt (no irreplaceable user content lives only on Ember; users can re-iterate to reproduce)
- Per-tenant DB content IS user data that must be backed up - relies on managed-Postgres PITR