- Four single-family offices — combined AUM over $9 billion — have built private LLMs running on dedicated H200 clusters. None will go on record.
- Total infrastructure cost per office: $28M–$42M over 18 months. Ongoing operating cost: $3.4M–$5.8M annually.
- The driver is not capability (public models are better at most tasks) but strategic exposure: capital allocation privacy, deal flow, succession and continuity, geopolitical hedging.
- Pattern is now visible at the $500M+ AUM threshold. Below that, the math doesn't work yet.
The thesis.
Over six months of reporting, we interviewed senior staff at fourteen single-family offices, each with more than $500M in AUM. Four of those offices, with combined AUM north of $9 billion, have built private large language models. The work is being done quietly, mostly outside the United States, often through subsidiary entities in Singapore, Geneva, and the Cayman Islands.
None of the four will go on record. The pattern we describe here is reconstructed from on-background interviews, vendor invoices, hiring patterns, and what one CIO described as "the most expensive thing we've built that no one will ever see."
The economics.
The infrastructure cost is meaningful but not prohibitive at this AUM band. Across the four offices, the build profile falls in these ranges:
- Compute: dedicated cluster of 32–96 NVIDIA H200 cards. Capital cost $11M–$24M.
- Engineering: a 4–7 person team. ML lead at $1.4M loaded, ML engineers at $480K–$720K loaded.
- Data: 18–36 months of effort digitizing and structuring proprietary data — deal memos, family archives, investment minutes. Cost largely internal time.
- Model: an open-weights base (typically Mistral or DeepSeek), heavily fine-tuned on the proprietary corpus. None of the four is training from scratch.
Total: $28M to $42M over 18 months. Ongoing operating cost: $3.4M to $5.8M annually. That is real money, but at $2B+ in AUM the annual operating cost is roughly one full-time portfolio manager's loaded compensation. Cost is not what decides this.
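To put those figures on a common scale, here is a minimal sketch that expresses them in basis points of AUM at the $2B reference point. The dollar ranges are the ones reported above; using basis points as the yardstick, and the names `as_bps`, `build_bps`, and `operating_bps`, are our own illustrative choices, not something the offices described.

```python
# Ranges are taken from the reporting above; the basis-point framing
# is an illustrative assumption, not the offices' own methodology.
BUILD_COST_USD = (28e6, 42e6)        # total build cost over 18 months
OPERATING_COST_USD = (3.4e6, 5.8e6)  # ongoing cost per year
AUM_USD = 2e9                        # the "$2B+" reference point


def as_bps(cost_usd: float, aum_usd: float) -> float:
    """Return a dollar cost as basis points of AUM (1 bp = 0.01%)."""
    return cost_usd / aum_usd * 10_000


build_bps = [as_bps(c, AUM_USD) for c in BUILD_COST_USD]
operating_bps = [as_bps(c, AUM_USD) for c in OPERATING_COST_USD]

print(f"Build: {build_bps[0]:.0f}-{build_bps[1]:.0f} bps of AUM (one-time)")
print(f"Operating: {operating_bps[0]:.0f}-{operating_bps[1]:.0f} bps of AUM per year")
```

At $2B, the build works out to roughly 140 to 210 basis points of AUM as a one-time outlay, and the operating cost to roughly 17 to 29 basis points a year.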
Why private.
The capability argument cuts against this work. Frontier models from Anthropic and OpenAI are unambiguously better at almost every task the family office wants to run. The CIOs we spoke to acknowledge this readily. The decision is not about capability.
The decision is about strategic exposure. Four reasons emerged consistently:
Capital allocation privacy. A family office that uses Claude or GPT to analyze its capital allocation is — at minimum — exposing those queries to a third party. Even with enterprise privacy contracts, the prompt log exists somewhere. For families managing intergenerational capital, this is unacceptable.
Deal flow. When a family office is evaluating a private investment, the prompt text often contains material non-public information. Multiple offices have moved to private models specifically to eliminate this exposure surface.
"We don't need the model to be better. We need it to be ours," said the CIO of a single-family office with $2.4B AUM, who was granted anonymity.
Succession and continuity. Three of the four offices specifically mentioned wanting a model trained on family decision-making history that survives changes in family-office personnel. The model becomes part of the family's institutional memory.
Geopolitical hedging. Two of the four offices are in jurisdictions where future restrictions on cross-border AI service usage are plausible. Building local capability is a hedge.
The operator's playbook.
For family offices considering this path, the pattern that has worked:
- Don't train from scratch. Start from open weights. Mistral and DeepSeek are the bases of choice. Training from scratch costs an order of magnitude more for marginal gain.
- Build the data corpus first. Twelve to eighteen months of structured digitization of family archives, investment minutes, and decision history. The model is only as valuable as the corpus.
- Hire the ML lead before buying the compute. Hardware is a commodity. ML leads who can run this scale of project are not.
- Run two evaluations — one on capability against public models (lose gracefully here), one on privacy and strategic fit (win decisively here). Don't conflate the two.
The threshold below which this stops making sense is approximately $500M AUM. Below that, the cost of operating a private model, measured per dollar of AUM, exceeds its strategic value. Above $2B AUM, every family office we spoke to expects to have one within thirty-six months.
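The scaling behind that threshold can be sketched in a few lines. The operating-cost range is the one reported in this piece. The helper name `operating_share` is hypothetical, and the sketch covers only the cost side: it makes no attempt to quantify strategic value, which the offices themselves did not put a number on.

```python
# Annual operating-cost range as reported in this article.
OPERATING_COST_USD = (3.4e6, 5.8e6)


def operating_share(aum_usd: float) -> tuple[float, float]:
    """Annual operating cost as a percentage of AUM, as (low, high)."""
    low, high = OPERATING_COST_USD
    return (low / aum_usd * 100, high / aum_usd * 100)


# Compare the two AUM levels the article uses as reference points.
for aum in (500e6, 2e9):
    low, high = operating_share(aum)
    print(f"AUM ${aum / 1e9:.1f}B: {low:.2f}%-{high:.2f}% of AUM per year")
```

Because the cost is largely fixed, the burden rises as AUM falls: roughly 0.17% to 0.29% of AUM a year at $2B, against 0.68% to 1.16% at $500M.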