Back to blog
Blog June 9, 2026 8 min read

Not every AI belongs in the cloud.

Local, self-hosted, EU SaaS or US cloud? Four ways to run AI in your company – compared by cost, data sovereignty and performance. With a clear recommendation on which data class belongs where.

Roberto Schlatter
Roberto Schlatter Founder & AI Consultant, Moro Vision GmbH

«Let's just do it with ChatGPT.» For many companies, the US cloud is the reflex as soon as AI comes up: fast, powerful, cheap to start. But not every task – and above all not every data class – belongs in a US provider's cloud.

There isn't one right way to run AI, but four common ones – from your own machine to a direct US subscription. Each is a trade-off between three dimensions: cost, data sovereignty and model performance. This overview classifies them so the choice fits the requirement – not the hype.

Performance and sovereignty pull against each other

As a rule of thumb: the closer the data stays to home, the higher the control – and the more limited or expensive the available top-tier performance. Whoever wants maximum answer quality quickly ends up with the big US models; whoever needs maximum data sovereignty, with local hardware. The interesting options are where the two can be balanced.

The decisive lever is rarely the model. GPT, Claude and Gemini are reachable via several routes. What differs is where the data is processed, who can access it and what it costs. That is exactly what makes the infrastructure the real decision.

The four ways compared

Four setups cover the range – from maximum sovereignty to maximum convenience. The traffic light shows the respective strength per dimension (green = good, orange = medium, red = critical/high).

Setup Data sovereignty Model performance Running cost
LM StudioLocal Maximum Medium Very low
Open WebUIServer / cloud tenant High Frontier Low / variable
LangdockEU SaaS High (EU) Frontier Medium
Frontier LabsUS cloud Low Frontier+ Up to high

«Frontier Labs» here means the direct services of OpenAI, Anthropic and Google (ChatGPT, Claude, Gemini) via a US subscription.

The four setups in detail

01LM Studio – maximum data sovereignty, local

A desktop application that runs open-weight models (Llama, Qwen, Mistral, DeepSeek) directly on your own machine. In pure local mode the data never leaves the device – an air-gapped setup is possible. Via LM Link you can additionally connect your own self-hosted cloud models (for example in your own Azure tenant); the data then does travel to those models, but stays within your own, controlled infrastructure.

The software is free. On an existing machine there are practically no running costs – just electricity. Only larger models require a one-off hardware investment (a GPU or a Mac with lots of RAM). The price for this sovereignty: top performance stays behind true frontier models. Ideal for sensitive data, offline use, single seat and prototyping.

02Open WebUI – frontier from your own tenant

Open WebUI is not a model but a self-hosted frontend (Docker) – a gateway that connects any OpenAI-compatible backend. That makes it the most flexible option: local or open-weight models (via vLLM or Ollama) as well as frontier models deployed in your own Azure tenant (Azure OpenAI) – for example in the Switzerland North region.

This gives you top performance with your own data governance and a region of your choice. Frontend hosting is cheap; model costs depend on the backend (tokens with Azure, GPU rental with open weights). The price: you have to run the frontend and backends yourself, and via a hyperscaler a true air-gap is not possible (residual Cloud Act risk). Ideal for companies with their own cloud tenant that want frontier performance with their own governance.

03Langdock – frontier, EU-hosted

A GDPR-compliant SaaS platform with a data centre in the EU. It bundles leading models (GPT, Claude, Gemini, Mistral) behind an EU endpoint – with a data processing agreement and no training on customer data. Frontier performance with EU data residency, ready instantly, centrally manageable.

Costs are predictable as a licence per user per month (order of CHF 20–40). They stay medium even under heavy use, because no token-hungry coding agent is involved. The price: SaaS dependency, no air-gap, model providers in the background (EU-encapsulated). Ideal for SMEs that want top performance and EU compliance without hosting it themselves.

04Frontier Labs – direct US subscription

Direct access to the native services of OpenAI, Anthropic and Google. Instantly the newest and strongest models, zero operational effort, cheap entry. The catch: data processing takes place in the USA, under US jurisdiction (Cloud Act); consumer plans partly train on the inputs. For personal data or trade secrets this is risky.

For running costs a distinction pays off – they depend on the billing model, not the tool:

Medium / predictable

Capped in a plan

Claude Code on a Claude Max plan, Codex in ChatGPT Pro/Plus, Gemini via Google AI/Workspace – seat licence, predictable.

Up to high

Raw API

Uncapped automation or coding agents directly via the API: billed per use – quickly high under heavy use.

Ideal for non-critical data, fast results, the latest features, research and drafts.

The recommendation: tier AI by data class

There is no «best» setup – only the right one. In practice, not an either/or works best but a tiered model: a data classification that assigns each task the right setup. This creates a tiered system – highest sovereignty where it matters, highest performance where it can be used without risk.

Data classRecommended setup
Intellectual property, contracts, finance data, strategyLM Studio (local) or Open WebUI with open weights
Customer data, HR data, regulated & internal documentsOpen WebUI + your own Azure tenant
Personal data (EU), customer communication, marketing dataLangdock (EU SaaS)
Public data, research, drafts, marketing copyUS subscription

The infrastructure is a strategic decision, not a tool purchase. Once you have cleanly classified what is how sensitive, all further AI decisions become faster and safer.

Frequently asked questions

Which AI infrastructure is best for sensitive or confidential data?
For confidential data such as intellectual property, contracts or finance data, local models (e.g. with LM Studio) or a self-hosted solution like Open WebUI with open-weight models on your own server are best. The data never leaves your own house, an air-gapped setup is possible – the highest level of data sovereignty.
Are ChatGPT, Claude and Gemini GDPR or FADP compliant?
With direct US subscriptions, data is processed in the USA and subject to US jurisdiction (Cloud Act). Consumer plans partly train on the entered data. For personal data or trade secrets this is critical. It becomes compliant via EU-hosted routes: Azure OpenAI in your own EU/CH tenant, or an EU SaaS platform like Langdock with a data processing agreement.
Can I use frontier models like GPT-4o in a privacy-compliant way?
Yes. Two routes keep the data in the EU or Switzerland: first, frontier models deployed in your own Azure tenant (Azure OpenAI) in a region like Switzerland North and connected via Open WebUI; second, an EU-hosted SaaS platform like Langdock. Both offer frontier performance with your own or EU-compliant data governance and no training on customer data.
What does running AI with Open WebUI cost?
Frontend hosting (a small container or VPS) is cheap. Model costs depend on the backend: token usage with Azure OpenAI (predictable via Provisioned Throughput) or GPU rental with open-weight models. No additional SaaS licence is required.
Do coding agents like Claude Code or Codex drive AI costs up?
It depends on the billing model. Within a plan – e.g. Claude Code on the Claude Max plan or Codex in ChatGPT Pro/Plus – the agents are capped and predictable (medium cost). Only via the raw, unlimited API do token costs quickly rise to high under heavy use.
Is local AI with LM Studio worth it for SMEs?
Yes, if data sovereignty or offline use matter. The software is free and runs locally; on an existing machine there are practically no running costs – just electricity. Small to mid-size models run on a modern laptop; only larger models require a one-off hardware investment. But top performance stays behind the frontier models.

Which data class belongs where in your company?

We classify your AI use cases and set up the right, privacy-compliant infrastructure – from local to EU cloud.

Book a free consultation → More about Moro Vision →