Public-information briefing · April 2026
Ubicloud — Holistic View
Company briefing + AI / inference deep-dive
Part I
Company
Founders, funding, architecture, products, regulatory, DD verdict
TL;DR
- YC W24 open-source IaaS/PaaS cloud, founded 2023, HQ San Francisco
- Three-founder team from Citus Data / Microsoft Azure / Heroku Postgres
- $16M seed (Mar 2024, YC + 500 Emerging Europe); no subsequent round disclosed
- Software abstraction layer running on leased bare-metal (Hetzner, Leaseweb, OVH, AWS Bare Metal) — not a hardware lessor
- Core services: VMs, block storage, networking, managed Postgres, managed Kubernetes (beta), GitHub Actions runners, AI inference
- Positioned as 3x–10x cheaper than AWS; Postgres claimed 9x price/performance vs RDS/Aurora
- AGPL-3.0 — core code is self-hostable; managed service runs the same code
- ClickHouse partnership (Jan 2026) — ClickHouse's native managed Postgres now runs on Ubicloud (private preview)
- The sovereign/open pitch is stronger than the certification stack that backs it
Company Basics
- Founded: 2023
- YC batch: W24 (primary partner Garry Tan)
- Offices: San Francisco (HQ), Amstelveen NL, Istanbul TR (Levent)
- Team size: ~10 at seed (Mar 2024); YC page later listed 15; current not published
- Legal entities: Ubicloud Inc. (Delaware/US) + Ubicloud B.V. (Netherlands)
- Mission framing: "What Linux is to proprietary operating systems, Ubicloud is to cloud"
- Core theses: radical cost compression, eliminate vendor lock-in, architectural transparency / self-hosting option
Founders at a Glance
Three-founder team's common thread: managed Postgres-as-a-service across Heroku, Citus, Azure, and Crunchy Bridge. Ubicloud is Daniel Farina's 4th managed-cloud control plane.
| Founder | Role | Prior |
| Umur Cubukcu | Co-founder, Co-CEO | Citus Data co-founder/CEO (YC S11), 4y Azure Postgres lead, YC Visiting Partner 2023 |
| Ozgun Erdogan | Co-founder, Co-CEO / CTO | Citus Data co-founder/CTO, Amazon distributed systems, 4y Azure engineering lead |
| Daniel Farina | Co-founder | Core Heroku Postgres engineer, primary WAL-E author, Citus Cloud, Crunchy Bridge |
Umur and Ozgun met at Stanford (along with third Citus co-founder Sumedh Pathak, who is not part of Ubicloud).
Umur Cubukcu (Co-CEO)
- Education: BS Boğaziçi (Istanbul); MS Management Science & Engineering, Stanford (~2001–2003)
- Prior roles: BCG management consultant → Citus Data co-founder & CEO (YC S11, 2011–Jan 2019) → Microsoft Azure Data, product lead for Azure Database for PostgreSQL / Hyperscale (Citus) (Jan 2019–Oct 2022) → YC Visiting Group Partner W23/S23 (Oct 2022–Oct 2023) → Ubicloud
- Public presence: O'Reilly Strata NY 2018 speaker; Citus blog author; heavily quoted in TechCrunch/SiliconANGLE launch coverage
- Signature framing: "OpenStack takes an army of people; Ubicloud is signup-to-VM in two minutes"
- Profiles: LinkedIn /umurc · X @umurc
Ozgun Erdogan (Co-CEO / CTO)
- Education: BS Galatasaray (Istanbul); MS Computer Science, Stanford
- Prior roles: Amazon distributed systems engineer (Seattle, ~2006–2010; holds patents on distributed cache consistency and load balancing) → Citus Data co-founder & CTO (technical lead on Citus distributed-Postgres planner/executor) → Microsoft Azure engineering lead for Citus/Hyperscale (~4y) → Ubicloud
- Public presence: QCon SF 2017 speaker; PostgreSQL Person of the Week; Heavybit community speaker; General Assembly instructor; Startup Reporter EU interview (2026)
- Signature framing: "The entire stack is open-source, from bare metal to application layers, so businesses can audit our privacy and security claims"
- Profiles: LinkedIn /ozgune
Daniel Farina (Co-founder, Infra)
- Education: not publicly disclosed
- Prior roles: Plumtree Software (early career) → Heroku Postgres core engineer ~2010–2015 (widely credited as primary author of WAL-E, the Postgres continuous-archiving tool) → Citus Cloud control plane ~2016–2019 → Microsoft Azure ~2019–2021 → Crunchy Bridge at Crunchy Data ~2021–2023 → Ubicloud
- Public presence: US Patent 8,484,243 (stream query processing, 2013); active PostgreSQL mailing-list contributor; RubyConf 2024 talk "Build a Cloud in Thirteen Years"
- Signature framing: Ubicloud as the 4th iteration of a 13-year Postgres-as-a-service arc; Ruby chosen for infra orchestration because REPL + mature libraries = productivity advantage for a small team
- Profiles: LinkedIn /danfarina
Funding & Capitalization
- Seed: $16M, closed Jan 2024, announced Mar 5, 2024
- Lead: Y Combinator + 500 Emerging Europe
- Other disclosed: Pioneer Fund, Liquid 2 Ventures, ScaleX Ventures (Turkish), e2vc, Rainfall, Maxitech, angels
- Valuation: not publicly disclosed
- No Series A publicly announced as of Apr 2026
- Capital efficiency thesis: software abstraction layer, not hardware lessor — avoids the multi-billion CapEx of CoreWeave-style plays
- Implied runway: strong for lean SF/NL/TR distributed team with no owned datacenter
Product Portfolio
- Elastic Compute — x86_64 and ARM64 Linux VMs; standard and burstable classes
- Block Storage — non-replicated, AES-XTS encrypted at rest, backed by local NVMe
- Virtual Networking — VPC-style private networks, dual-stack IPv4/IPv6, IPsec-encrypted tunnels, nftables firewalls
- Load Balancer
- Managed PostgreSQL (flagship) — HA across AZs, PITR, read replicas, connection pooling, ParadeDB full-text extension, automated backups
- Managed Kubernetes — public beta, single-node and 3-node HA control plane, UbiCSI driver for local NVMe PVs
- GitHub Actions runners — Standard and Premium tiers; x64 and ARM64; 10x larger cache on Premium
- AI Inference Endpoints — OpenAI-compatible API on vLLM V1 via per-model subdomains ({model}.ai.ubicloud.com/v1); open-weight models only; streaming, JSON mode, function calling; 500k tokens/month free
- EuroGPT Enterprise — €19/user/mo, Llama 3.1 405B + Llama Guard 3, all GPU processing in Germany, GDPR-compliant
- IAM with ABAC — attribute-based access control from day one
- Strategic note: deprecated raw GPU VM rentals Dec 31, 2025 — moved up-stack to managed inference
Architecture (the "Clover" stack)
- Control plane: Ruby + Roda (HTTP) + Sequel (ORM) + Rodauth (auth) + PostgreSQL (state); orchestrates hosts over SSH (no heavy agent, net-ssh library)
- Host "cloudification": Prog::Vm::HostNexus workflow installs Rhizome host-agent code, SPDK, and nftables; configures hugepages; caches boot images
- Virtualization: Linux KVM + Cloud Hypervisor (Rust-based VMM; lighter and more security-focused than QEMU); QEMU 10.1+ used specifically for Blackwell B200 GPU topology
- Tenant isolation: each Cloud Hypervisor instance in its own Linux namespace, runs unprivileged, seccomp-bpf supported
- Block storage: SPDK user-space stack; bdev_aio → vbdev_crypto (AES-XTS + envelope encryption + auto key rotation) → bdev_ubi (custom COW module for instant VM provisioning from base images)
- Networking: IPsec tunnels, nftables, Linux namespaces, dual-stack IPv4/IPv6
- Opinionated: single stack, deliberately rejects OpenStack's "support everything" complexity
Open Source Footprint
- Repo: github.com/ubicloud/ubicloud
- License: AGPL-3.0 — strong copyleft, prevents hyperscaler repackaging-as-SaaS (the "Elastic/MongoDB problem")
- Primary language: Ruby (~92.5%) — deliberate, inherited from Heroku-era experience
- Stars/Forks: ~12k / ~558
- Dual deployment: managed service at console.ubicloud.com OR self-hosted via docker compose + cloudify-your-own-bare-metal
- Third-party OSS leveraged: Cloud Hypervisor, KVM, SPDK, nftables, strongSwan/IPsec, QEMU, PostgreSQL, vLLM, Tailwind
Pricing & Cost Positioning
Representative prices (Germany region, 2026):
| Service | Ubicloud | Hyperscaler | Savings |
| VM: 2 vCPU / 8 GB | ~$26 / mo | AWS ~$69, Azure ~$65, GCP ~$62 | ~60–65% |
| VM: 32 vCPU / 128 GB | ~$416 / mo (linear scaling from the 2 vCPU price) | AWS ~$1,104 / mo | ~60–65% |
| Burstable 1 vCPU | $6.65 / mo | — | — |
| Managed Postgres Hobby | $12.41 / mo | — | — |
| Managed Postgres Standard (2 vCPU) | $49 / mo | AWS RDS ~$200 / mo | ~67% |
| Managed Kubernetes (dev) | $46 / mo | EKS control + EC2 variable | ~73% |
| GitHub Actions 2 vCPU Linux | $0.0008 / min | GitHub $0.0080 / min | 10x (90%) |
| AI inference (Qwen2.5-VL-72B) | $0.80 / M tokens (in+out) | — | — |
| AI inference (Qwen3-Embedding-8B) | $0.05 / M input tokens | — | — |
Public IPv4: $3/mo. Egress: free up to ~0.625 TB per 2 vCPUs, then $3/TB (≈30x cheaper than hyperscaler egress). Free tier on inference: 500k tokens/month. Per-token pricing for most chat models (Llama 3.3, Mistral Small 3, DeepSeek V3/R1) is dashboard-only.
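The egress rates above can be sketched as a quick cost helper (an illustration of the stated figures, not an official calculator; check current pricing before relying on it):

```python
def monthly_egress_cost(egress_tb: float, vcpus: int) -> float:
    """Estimate Ubicloud egress cost in USD/month.

    Free allowance: ~0.625 TB per 2 vCPUs; overage billed at $3/TB.
    Figures taken from the pricing notes above.
    """
    free_tb = 0.625 * (vcpus / 2)
    overage_tb = max(0.0, egress_tb - free_tb)
    return overage_tb * 3.0

# A 4-vCPU VM pushing 5 TB/month: 5 - 1.25 = 3.75 TB over, at $3/TB
print(monthly_egress_cost(5, 4))  # 11.25
```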
Performance Claims (Postgres)
Self-published benchmarks vs AWS (independent third-party verification not found):
- TPC-C (transactional): 1.4x more TPS than Aurora at 5.8x lower cost; 4.6x more TPS than RDS at 2.8x lower cost
- Latency: 1.91x lower than Aurora, 7.65x lower than RDS
- TPC-H (analytical): 2.42x faster than Aurora; 2.96x faster than RDS
- Headline: "9x price/performance" vs RDS/Aurora
- Driver: SPDK + local NVMe + Cloud Hypervisor = less I/O overhead per dollar
- Caveat: all numbers sourced from ubicloud.com — no external benchmark surfaced
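One way to read the "9x" headline is as throughput ratio times cost ratio, i.e. transactions per second per dollar. A quick check against the self-published figures (composing them into a single multiple is our interpretation, not Ubicloud's published methodology):

```python
# Price/performance multiple = (TPS ratio) x (cost ratio): more
# throughput per dollar. Input figures are the self-published
# Ubicloud numbers quoted above.
def price_perf_multiple(tps_ratio: float, cost_ratio: float) -> float:
    return tps_ratio * cost_ratio

vs_aurora = price_perf_multiple(1.4, 5.8)   # TPC-C vs Aurora
vs_rds    = price_perf_multiple(4.6, 2.8)   # TPC-C vs RDS
print(round(vs_aurora, 2), round(vs_rds, 2))  # 8.12 12.88
```

On this reading the "9x" headline sits roughly between the two implied multiples.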
Competitive Positioning
Vs hyperscalers (AWS / GCP / Azure)
- 3x–10x cheaper, open source, portable
- Opinionated and narrow — targets the 10% of services that drive 80% of spend; explicitly no Lambda/DynamoDB/SageMaker equivalents
Vs open-source cloud (OpenStack etc.)
- Offers a first-party managed service
- Opinionated stack vs pluggable-everything
- Modern components (Cloud Hypervisor, SPDK) post-dating OpenStack's design era
- Cubukcu: "OpenStack takes an army of people"
Vs bare-metal VPS (Hetzner, DO, Linode, Vultr, Scaleway, OVH)
- Adds managed PaaS layer (Postgres, K8s, runners, inference) they lack
Vs CI specialists
- RunsOn, Depot, BuildJet, Blacksmith, Namespace Labs
Vs GPU clouds
- CoreWeave, Lambda — Ubicloud exited this race (GPU rental deprecated Dec 2025); pivot to inference-as-PaaS
Key Customers & Partnerships
- ClickHouse (Jan 22, 2026) — strategic wedge. ClickHouse launched its own native managed Postgres service in private preview, powered entirely by Ubicloud. Coincided with ClickHouse's $400M Series D (Dragoneer-led). ClickHouse engineers now contribute upstream. Shifts Ubicloud toward B2B2B infrastructure play.
- Direct customers with public stories: Felt, Hatchet (formal case studies); Resmo, Windmill, PeerDB (homepage logos)
- AudienceKey — cited by third-party research as achieving 50% DB cost reduction post-migration (not independently verified on Ubicloud's site)
- Claimed scale: ~400 paying customers per a Reddit-sourced figure — unverified
- No public Turkish enterprise, government, or bank customers announced
Office & Data Center Footprint
Offices
| Office | Address |
| San Francisco (HQ) | 450 Townsend St., SF, CA 94107 |
| Amsterdam / Amstelveen | Turfschip 267, 1186XK, Amstelveen NL |
| Istanbul | Esentepe Mah. Talatpaşa Cad. No:5/1, Levent |
Production data center regions
| Region ID | Provider | Location |
| eu-central-h1 | Hetzner | Falkenstein, Germany |
| eu-north-h1 | Hetzner | Helsinki, Finland |
| us-east-a2 | Leaseweb | Manassas, Virginia, USA |
| Türkiye (Istanbul) Private | not disclosed | Istanbul — GPU-only (B200), on request, Oct 2025 |
Marketing materials reference future regions (Frankfurt, Oregon, Singapore, São Paulo) and additional bare-metal partners (OVHcloud, Latitude.sh, AWS Bare Metal). No broader MENA or APAC presence. Ubicloud owns no physical hardware.
Recent Developments (2025)
- ARM64 VMs and ARM GitHub Actions runners GA; "100x price/performance" on certain ARM CI workloads
- Premium Runners launched (2x faster builds, 10x larger cache, 100 GB free cache)
- Managed Kubernetes moved to public beta (Germany + Virginia); UbiCSI local-NVMe PV driver in preview
- Postgres dashboard overhaul (June 2025)
- AI Inference Endpoints — OpenAI-compatible API on vLLM with open-weight models, managed multi-GPU
- SOC 2 Type II certified (Feb 2025 changelog)
- Deprecated raw GPU VM runners (effective Dec 31, 2025) — strategic exit from CapEx-heavy GPU race
- B200 HGX GPU launched in Türkiye (Istanbul) Private Location (Oct 2025); 4- and 8-GPU partitions added Nov 2025
- B200 HGX GPU virtualization (Dec 15, 2025) — deep technical post on QEMU 10.1+, VFIO-PCI, NVIDIA Fabric Manager, Shared NVSwitch Multitenancy; HN front page
Recent Developments (2026)
- ClickHouse partnership (Jan 22, 2026) — ClickHouse native Postgres powered by Ubicloud; private preview; engineering cross-contributions; tied to ClickHouse's $400M Series D
- Blog output — LLM coding practices, VLM-based OCR, documentation automation, CPU-performance myths ("Does MHz still matter?"), AI Coding sober review
- EuroGPT Enterprise continuing to scale (launched Nov 2024) — privacy-first ChatGPT Enterprise alternative, €19/user/mo, Llama 3.1 405B hosted in Germany
- No new funding round publicly disclosed — most recent remains the Mar 2024 seed
EU/EMEA Regulatory Posture — the credible parts
- Dual-entity controller structure: Ubicloud B.V. (NL) and Ubicloud Inc. (US) — Schrems-II-aware
- EEA-only storage of Customer Account Data (personal data of customers themselves)
- Transfer basis: Article 45(1) adequacy + Article 46(2)(c) Standard Contractual Clauses
- SOC 2 Type II confirmed (Feb 2025 changelog; dedicated /docs/security/soc2 URL currently 404s)
- Matomo for analytics (not Google Analytics) — GDPR-friendlier choice
- Penetration test referenced, available on request
- Proactive engagement on EU Data Act — Nov 2023 blog post welcoming cloud-switching/portability provisions is their most substantive regulatory communication
- EuroGPT residency guarantee: all GPU processing stays in Germany; no customer data used for training
EU/EMEA Regulatory Posture — the gaps
Silent or not-yet-claimed despite their EU sovereignty pitch:
- No ISO 27001 / 27017 / 27018
- No C5 (German BSI — often required for Bundesverwaltung procurement, conspicuous given the German region)
- No SecNumCloud (France / ANSSI)
- No ENS (Spain)
- No EUCS claim, no Gaia-X participation
- No public DORA posture — notable given ClickHouse partnership targets financial services; DORA in force since Jan 17, 2025
- No public NIS2 posture — Ubicloud's IaaS would normally be in scope
- No public EU AI Act role classification — despite operating EuroGPT and inference APIs
- No published BAA process for HIPAA — ToS prohibits PHI absent separate written agreement
- No public SLA posted
Short version: GDPR/SOC 2 baseline is credible; certification stack is light relative to the "sovereign, open, portable" pitch.
Contract Gotchas (Terms of Service)
- Governing law: California
- Data residency not contractually guaranteed by default — ToS permits Ubicloud to move Services Content between regions at its sole discretion absent a written addendum (EuroGPT is a named exception)
- No SLA in the ToS — no uptime commitment, no service-credit regime
- Backups are the customer's responsibility — "Ubicloud does not promise to retain any preservations or backups"
- Termination at sole discretion, with or without notice; may result in immediate data destruction
- PHI and GDPR Article 9 special-category data prohibited without separate written agreement
- DPA not published — available only on request via [email protected]
- Trust Center URL resolves to an empty SPA shell for anonymous visitors
- Sub-processors (Mar 30, 2026): Hetzner (DE/FI), Latitude.sh (DE), Leaseweb (US) for workloads; Cloudflare, Stripe, GitHub, Slack, Matomo, Hubspot, etc. for account data
Risks & Open Questions
Technical / operational
- Bare-metal supply-chain dependency — margin tied to Hetzner/Leaseweb pricing
- Storage non-replicated — distributed multi-AZ replicated block storage still ahead
- Feature-parity deficit — no serverless, no DynamoDB equivalent, no object storage at scale
- Limited regions — 3 production regions; no MENA, APAC, or LatAm
Go-to-market / competitive
- Hyperscaler retaliation — aggressive discounting could erode cost advantage
- Crowded alt-cloud market — DigitalOcean, Linode/Akamai, OVH, CoreWeave, Render all well-funded
- All performance claims self-published
Regulatory / enterprise-readiness
- Certification stack light for EU regulated-sector procurement
- No DORA/NIS2/AI Act public posture
- DPA and SOC 2 report are request-only
Opacity
- Current headcount, revenue, ARR, churn not public
- No post-seed valuation
- ClickHouse deal economics not disclosed
Due-Diligence Verdict Summary
Strong fundamentals
- Elite founder pedigree (Citus/Heroku/Azure) → technical credibility
- Capital-efficient software-abstraction model → not burning GPU-cloud CapEx
- Real strategic wedge in Postgres (9x claimed price/performance)
- Landmark partnership (ClickHouse) validates the tech as embeddable infrastructure — B2B2B pivot signal
- AGPL-3.0 is a defensible legal moat against hyperscaler repackaging
Caveats for buyers and investors
- Sovereign-cloud pitch outruns the certification paperwork
- Self-published benchmarks only
- Data-residency not contractual by default
- No public SLA
- Turkish presence is operational, not commercial — no Turkey-market motion
- Regional footprint insufficient for MENA, APAC, or French public-sector workloads
Fit-for-Purpose Matrix
| Use case | Fit |
| CI/CD optimization (GitHub Actions runners) | Strong — 10x cost savings, low switching cost |
| Postgres-heavy SaaS workloads | Strong — flagship product, real performance claims |
| Stateless / ephemeral compute | Strong — 3x–10x cheaper than hyperscalers |
| Open-source LLM inference (commodity) | Strong — OpenAI-compatible API, 10x cheaper |
| European GDPR-sensitive workloads | Good — with limitations (no ISO 27001 etc.) |
| EuroGPT for GDPR-regulated EU teams | Strong niche — turnkey sovereign ChatGPT alternative |
| Build-your-own-cloud for national / sovereign deployments | Unique — AGPL + BYOC is rare in the market |
| Regulated financial services (DORA-critical) | Weak — no public DORA posture |
| Healthcare / PHI workloads | Weak — prohibited by default ToS |
| French public sector (SecNumCloud required) | No fit |
| Global edge / CDN / deeply integrated serverless | No fit — out of scope by design |
| MENA / APAC / LatAm residency | No fit for managed service; BYOC possible |
Key Sources
Ubicloud primary
Press & founders
HN threads: 37154138 (Aug 2023), 39598826 (Mar 2024), 44167607 (2025), 46312792 (B200, Dec 2025).
Part II
AI / Inference
Inference endpoints, model catalog, vLLM internals, EuroGPT, B200
Briefing · April 2026
Ubicloud AI
Open-source inference endpoints, EuroGPT Enterprise,
and B200 virtualization
TL;DR — AI Strategy
- Pivoted from raw GPU rentals to managed inference PaaS — GPU GitHub Actions runners deprecated Dec 31, 2025; GPU VMs repositioned as private/enterprise-only
- Open-weight only — no Claude/GPT/Gemini re-hosting; every model on the platform is open-weight
- Three product surfaces: inference endpoints (dev API), EuroGPT Enterprise (SaaS), private B200 VMs (enterprise/BYOC)
- Production runtime: vLLM V1 with FlashAttention-3, FlashInfer, speculative decoding, prefix caching
- Signature technical work: open-source virtualization of NVIDIA HGX B200 using QEMU 10.1+ + Fabric Manager Shared NVSwitch Multitenancy
- AI footprint: Germany (Falkenstein, Helsinki, EuroGPT processing) + Türkiye Istanbul Private Location for B200
Product Surface
Two API surfaces:
| Surface | Base URL | Purpose | Auth |
| Management | https://api.ubicloud.com | Manage API keys, endpoints, projects | Bearer JWT |
| Inference data plane | https://{model}.ai.ubicloud.com/v1 | OpenAI-compatible inference | Bearer API key |
Per-model subdomain pattern — each model gets its own hostname (e.g. llama-3-3-70b-turbo.ai.ubicloud.com/v1). There is no unified inference host.
SDK support: any OpenAI-compatible SDK (Python openai, JS); first-party Ruby SDK + ubi CLI (beta).
Free tier: 500,000 tokens / month.
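A minimal sketch of the data-plane call pattern using only the Python standard library (the subdomain pattern and Bearer auth come from the table above; the helper names are our own, and the chat call is not invoked here because it needs a real API key):

```python
import json
import urllib.request

def base_url_for(model_id: str) -> str:
    """Per-model subdomain pattern: each model gets its own hostname."""
    return f"https://{model_id}.ai.ubicloud.com/v1"

def chat(model_id: str, api_key: str, messages: list[dict]) -> dict:
    """Minimal POST /v1/chat/completions with stdlib only.

    Any OpenAI-compatible SDK (Python openai, JS) works the same way:
    point its base_url at the model's subdomain, pass a Bearer API key.
    """
    req = urllib.request.Request(
        base_url_for(model_id) + "/chat/completions",
        data=json.dumps({"model": model_id, "messages": messages}).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

print(base_url_for("llama-3-3-70b-turbo"))
# https://llama-3-3-70b-turbo.ai.ubicloud.com/v1
```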
OpenAI Compatibility
Documented and working against the per-model base URL:
- POST /v1/chat/completions — non-streaming
- POST /v1/chat/completions with stream=true — SSE streaming
- POST /v1/chat/completions with response_format={"type":"json_object"} — JSON mode
- POST /v1/chat/completions with tools=[...], tool_choice="auto" — function/tool calling
- /v1/embeddings — implied by Qwen3-Embedding-8B launch (endpoint path not explicitly documented)
Not documented or not offered: /v1/completions (legacy), /v1/models on data plane, audio, image, batch API, fine-tuning API.
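The documented variants map onto request bodies like the following (shapes follow the OpenAI API convention; model ids are from the catalog, and the get_weather tool is hypothetical):

```python
import json

# JSON mode: constrain output to a valid JSON object
json_mode = {
    "model": "mistral-small-3",
    "messages": [{"role": "user", "content": "Return {\"ok\": true}."}],
    "response_format": {"type": "json_object"},
}

# Function/tool calling: get_weather is a hypothetical tool definition
tool_calling = {
    "model": "llama-3-3-70b-turbo",
    "messages": [{"role": "user", "content": "Weather in Helsinki?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    "tool_choice": "auto",
}

# Streaming: server responds with SSE chunks instead of one body
streaming = {
    "model": "llama-3-3-70b-turbo",
    "messages": [{"role": "user", "content": "Count to three."}],
    "stream": True,
}

for body in (json_mode, tool_calling, streaming):
    json.dumps(body)  # all three serialize as valid request payloads
```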
Model Catalog (Confirmed Public)
| Model ID | Family | Role | First seen |
| llama-3-3-70b-turbo | Llama 3.3 70B | Chat | Feb 2025 |
| mistral-small-3 | Mistral Small 3 (24B) | Chat | Feb 2025 |
| ds-r1-qwen-32b | DeepSeek-R1-Distill-Qwen-32B | Reasoning | Feb–Mar 2025 |
| DeepSeek V3 | DeepSeek V3 | Chat | Jun 2025 |
| DeepSeek R1 | DeepSeek R1 | Reasoning | Jun 2025 |
| Qwen2.5-VL-72B | Qwen 2.5 VL | Vision-language | Jul 2025 |
| Qwen3 VL | Qwen 3 VL | Vision-language | Oct 2025 |
| Qwen3-Embedding-8B | Qwen 3 Embedding | Text embeddings | Mar 2026 |
| Llama Guard 3 | Meta | Moderation (EuroGPT) | Nov 2024 |
| Llama 3.1 405B | Meta | Chat (EuroGPT) | Nov 2024 |
Open-weight only. No Llama 4 in public materials. Context windows and quantization not published per-model.
Public Pricing
Per-token pricing is dashboard-only for most chat models. Only two models are publicly priced on the web:
| Model | Price | Notes |
| Qwen2.5-VL-72B | $0.80 / M tokens (input + output) | Jul 2025 |
| Qwen3-Embedding-8B | $0.05 / M input tokens | Mar 2026 |
| Free tier | 500k tokens / month | Feb 2025 |
March 2026 addition: new GET /project/{id}/inference-endpoint API returns full price table programmatically with separate per_million_prompt_tokens and per_million_completion_tokens.
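A sketch of how a client might turn those two fields into a blended cost (the two field names come from the changelog note above; the row shape and helper are our assumptions):

```python
def cost_usd(prompt_toks: int, completion_toks: int, price_row: dict) -> float:
    """Blended request cost from the per-million price fields the
    March 2026 inference-endpoint API exposes. Only the two field
    names are sourced; the surrounding shape is an assumption."""
    return (prompt_toks / 1e6) * price_row["per_million_prompt_tokens"] \
         + (completion_toks / 1e6) * price_row["per_million_completion_tokens"]

# Hypothetical row shaped like the public Qwen2.5-VL price ($0.80/M, in+out)
row = {"per_million_prompt_tokens": 0.80, "per_million_completion_tokens": 0.80}
print(round(cost_usd(250_000, 50_000, row), 2))  # 0.24
```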
Positioning claims (Ubicloud-authored): "3–10x lower than comparable offerings" for cloud overall; "3x lower than US alternatives" for EuroGPT. "10x cheaper than OpenAI" is NOT a Ubicloud claim — that phrasing came from third-party research.
Hardware Stack
| GPU | Status | First public mention |
| NVIDIA A100 | Preview (Germany) | May 2025 |
| NVIDIA H100 | Production (prior GPU VMs) | — |
| NVIDIA HGX B200 | Production (Türkiye Istanbul, on request) | Oct 2025 |
| NVIDIA RTX PRO 6000 | On request | Dec 2025 |
Not offered in public materials: H200, L40S, MI300X.
B200 partitioning via Shared NVSwitch Multitenancy
| Partition size | When added |
| 1-GPU, 2-GPU | Oct 2025 launch |
| 4-GPU, 8-GPU | Nov 2025 |
Inside a partition: full NVLink/NVSwitch bandwidth. Across partitions: isolated. Fabric Manager enforces routing.
B200 Virtualization — Signature Tech Work
Ubicloud wrote the "missing manual" on open-source virtualization of NVIDIA HGX B200. Stack:
- QEMU 10.1+ (not Cloud Hypervisor) — B200 needs multi-level PCIe topology that Cloud Hypervisor's flat topology can't produce; 10.1 added BAR-mapping optimizations critical for B200's 256 GB Region 2 BAR per GPU
- VFIO-PCI passthrough — vfio-pci.ids=10de:2901, intel_iommu=on iommu=pt; blacklist nouveau/nvidia/nvidia_drm
- nvidia-open driver on guest (proprietary stack can't drive B200)
- NVIDIA Fabric Manager in FABRIC_MODE=1 (Shared NVSwitch Multitenancy) on host; fmpm CLI for partition management
- Host/guest driver versions must match exactly (e.g., 580.95.05)
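The host-side settings above boil down to a little boot configuration; a sketch (the parameter values are the ones quoted in the bullets; the file paths are typical Debian/Ubuntu defaults and may differ per distro):

```shell
# /etc/default/grub -- bind the B200s (PCI ID 10de:2901) to vfio-pci
# and enable IOMMU passthrough mode, per the parameters quoted above
GRUB_CMDLINE_LINUX="vfio-pci.ids=10de:2901 intel_iommu=on iommu=pt"

# /etc/modprobe.d/blacklist-nvidia.conf -- keep host GPU drivers off
# the passed-through devices
blacklist nouveau
blacklist nvidia
blacklist nvidia_drm
```

After editing, regenerate the bootloader config and rebuild the initramfs for the changes to take effect at next boot.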
Competitive point: entire stack is open source; operators can replicate it. Reached HN front page Dec 15, 2025.
vLLM V1 Internals
Production runtime is vLLM V1. Three main components:
- AsyncLLM — async wrapper for tokenization/detokenization; talks to the engine via IPC, sidestepping the Python GIL
- EngineCore — busy loop: pull from input queue, run scheduler + one forward pass per step
- Scheduler — continuous batching via max_num_batched_tokens; all requests finish prefill before decode
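The scheduling rule can be illustrated with a toy token-budget step (deliberately simplified; this is not vLLM's actual scheduler code):

```python
def schedule_step(waiting, running, max_num_batched_tokens=2048):
    """Toy continuous-batching step.

    Each running request contributes 1 decode token per step; the
    remaining budget admits waiting requests' full prefills, mirroring
    the V1 rule that a request finishes prefill before it decodes.
    """
    budget = max_num_batched_tokens - len(running)  # 1 token per decode
    admitted = []
    for req_tokens in list(waiting):                # iterate over a copy
        if req_tokens <= budget:
            budget -= req_tokens
            admitted.append(req_tokens)
            waiting.remove(req_tokens)
    return admitted, budget

waiting = [1500, 800, 100]       # prefill sizes of queued requests
running = ["a", "b"]             # two requests already decoding
admitted, left = schedule_step(waiting, running)
print(admitted, waiting, left)   # [1500, 100] [800] 446
```

The 800-token request waits for the next step: admitting it would blow the per-step token budget that keeps forward-pass latency bounded.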
Optimization layer
- FlashAttention-3 for forward passes
- FlashInfer (integrated Feb 2025) as high-performance kernel generator
- PagedAttention-lineage block-based KV cache, dynamically allocated
- Speculative decoding on DeepSeek R1 32B (Mar 2025)
- Prefix caching referenced in Dewey.py deep-research demo
Not covered publicly: multi-worker load balancing, health checks, auto-restart, model hot-swap.
EuroGPT Enterprise
The consumer/SaaS face of Ubicloud AI. Available at eurogpt.ubicloud.com.
- €19 per user per month — framed as 3x cheaper than ChatGPT Enterprise / Copilot
- LLM: Meta Llama 3.1 405B (open weights)
- Moderation: Llama Guard 3 (optional, input + output)
- Embeddings: E5-Mistral-7B for RAG with private knowledge base
- Web search: DuckDuckGo (privacy-preserving)
- Data residency: "Data remains in Germany, including all GPU processing"
- Training: "No customer data or metadata used for training purposes"
- Security: encryption in transit + envelope encryption at rest, key rotation, file upload
- SSO: OIDC at platform level (Jul 2025); EuroGPT-specific SSO not explicitly documented
Not disclosed: quantization of the 405B deployment. Not offered: private API for EuroGPT — raw API consumers use Inference Endpoints directly.
Strategic Pivot: GPU Rentals → Inference PaaS
Before (2024)
Offered raw GPU rentals (RTX 4000 Ada / H100) as GitHub Actions runners and GPU VMs.
Inflection (2025)
Recognized the CapEx-heavy raw-GPU race against CoreWeave, Lambda, AWS P5, Azure NDv5 as structurally unviable for a seed-stage company. Moved up-stack to managed inference PaaS + dedicated enterprise GPU (private locations).
After (Dec 31, 2025)
- GPU GitHub Actions runners deprecated
- GPU VMs repositioned as private/enterprise deployments (B200, RTX PRO 6000 on request)
- Open-weight inference endpoints become the primary AI front door
- EuroGPT Enterprise becomes the productized SaaS face
Implication: Ubicloud is no longer competing on GPU-hours; it is competing on tokens and on the quality of the managed inference stack.
Positioning
| Competitor class | Examples | Ubicloud's angle |
| Closed-model LLM vendors | OpenAI, Anthropic | Open-weight only; lower price; EU residency; no training use |
| Fast-inference specialists | Groq, Together, Fireworks, DeepInfra | Same model class; adds full IaaS underneath + EuroGPT SaaS on top |
| GPU clouds | CoreWeave, Lambda, AWS P5 | Open-source B200 virtualization; control plane on GitHub; BYOC option |
| GPU-on-demand | RunPod, Vast.ai | Managed-first; GDPR-native; EuroGPT SaaS |
| European sovereign AI | Mistral-La Plateforme, Aleph Alpha | Broader IaaS (compute + K8s + Postgres) beyond just models |
Differentiators actually claimable
- End-to-end AGPL-3.0 stack (hypervisor → vLLM → UI)
- Proven B200 virtualization (with public technical writeup)
- Germany-resident EuroGPT turnkey product
- Strong Postgres heritage → good RAG / vector story when paired with managed Postgres
Gaps & What's Missing
- No public SLA, rate limits, latency, or throughput numbers for inference endpoints
- No public per-token pricing for chat/reasoning models (only Qwen2.5-VL and Qwen3-Embedding priced on web) — dashboard-only
- Not offered: batch inference API, fine-tuning / LoRA, image generation, audio (Whisper/TTS), multimodal beyond vision-language input
- No public EU AI Act role classification (provider vs deployer) despite operating EuroGPT and open inference
- No named AI customers in public materials; no case studies beyond Ubicloud's own Dewey.py deep-research demo
- No benchmarks vs CoreWeave / Lambda / AWS P5 on B200 workloads; vs OpenAI / Groq / Together on inference throughput or latency
- Istanbul B200 hosting provider not publicly named — framed as "Private Location" / on-request