DeepSeek V3 and DeepSeek R1 arrived in late 2024 and early 2025 respectively as the most disruptive development in the AI model landscape in years. They delivered frontier-level performance at a fraction of the cost, shipped with weights under the MIT licence, and forced every enterprise AI strategy team to reassess its model choices. This guide gives enterprise technology leaders the objective assessment they need: where DeepSeek genuinely excels, where it falls short, how to evaluate data residency risks, and when it is the right choice for enterprise workloads.
The DeepSeek Model Family
| Model | Architecture | Parameters | Strength | Licence |
|---|---|---|---|---|
| DeepSeek V3 | Mixture-of-Experts (MoE) | 671B total, 37B active | Coding, general reasoning, long context | MIT (weights) |
| DeepSeek R1 | MoE reasoning model with chain-of-thought, built on the V3 base | 671B total, 37B active | Complex mathematical and logical reasoning | MIT (weights) |
| DeepSeek R1 Distil (1.5B/7B/8B/14B/32B/70B) | Distilled from R1 into smaller dense Qwen and Llama models | 1.5B–70B | Reasoning capability in deployable sizes | MIT (weights; the Qwen or Llama base-model licence terms also apply) |
| DeepSeek Coder V2 | MoE, coding specialist | 236B total, 21B active | Code generation and completion | DeepSeek Licence |
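
The total and active parameter counts in the table explain much of the cost story. The short Python sketch below computes the share of parameters used per token for the two MoE models whose figures appear above. The function name `active_fraction` is purely illustrative. Note that the ratio approximates per-token compute only: all parameters still have to be held in GPU memory.

```python
def active_fraction(total_b: float, active_b: float) -> float:
    """Share of a MoE model's parameters that are used for each token."""
    return active_b / total_b


# (total, active) parameter counts in billions, taken from the table above.
MOE_MODELS = {
    "DeepSeek V3": (671, 37),
    "DeepSeek Coder V2": (236, 21),
}

for name, (total_b, active_b) in MOE_MODELS.items():
    share = active_fraction(total_b, active_b)
    print(f"{name}: {share:.1%} of parameters active per token")
```

On these figures, V3 activates roughly 5.5% of its parameters per token and Coder V2 roughly 8.9%.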
Performance: Where DeepSeek Genuinely Excels
- Code generation: HumanEval scores match or exceed GPT-4o
- Mathematical reasoning: R1 is reported to match or narrowly beat OpenAI o1 on maths benchmarks such as AIME 2024 and MATH-500
- Long-context document processing: 128K token context
- Cost-sensitive high-volume tasks: classification, extraction, summarisation at scale
Where DeepSeek Falls Short
- Safety alignment: on safety benchmarks it refuses fewer harmful requests than Claude or GPT-4
- Multilingual quality: weaker than dedicated multilingual models outside English and Chinese
- Instruction following for complex, nuanced tasks: Anthropic and OpenAI models are better
- Public API data residency: the hosted API runs on server infrastructure in China
The Data Residency Risk: What Enterprises Must Assess
DeepSeek's public API (api.deepseek.com) routes data through servers in China. That makes it inappropriate for enterprises handling data regulated under HIPAA, PCI-DSS, FedRAMP, or ITAR, for those bound by data residency requirements in contracts or regulations, and for those with intellectual property sensitivity concerns. The solution is not to avoid DeepSeek entirely. Instead, self-host the MIT-licensed weights on your own infrastructure, or use a cloud-hosted version (AWS Bedrock, Azure AI Studio) where DeepSeek runs within your jurisdiction.
| Deployment Option | Data Residency | Cost | Suitable For |
|---|---|---|---|
| DeepSeek public API | China — NOT suitable for regulated data | $0.07/M tokens | Non-sensitive, non-regulated workloads only |
| AWS Bedrock (DeepSeek) | Your AWS region | ~$0.15/M tokens | Regulated data in AWS environments |
| Azure AI Studio (DeepSeek) | Your Azure region | ~$0.15/M tokens | Regulated data in Azure environments |
| Self-hosted (MIT weights) | Your infrastructure | GPU compute only (~$0.02/M) | Maximum control, highest volume, full sovereignty |
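
The table can be turned into a simple screening rule. The sketch below is illustrative only: the `Deployment` class and the `eligible` and `monthly_cost` helpers are hypothetical names, and the prices are copied from the table as indicative figures, not quoted rates. The self-hosted figure covers GPU compute only and leaves out staffing and operations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    name: str
    in_your_jurisdiction: bool       # False for the public API (servers in China)
    usd_per_million_tokens: float    # indicative figure from the table above


OPTIONS = [
    Deployment("DeepSeek public API", False, 0.07),
    Deployment("AWS Bedrock (DeepSeek)", True, 0.15),
    Deployment("Azure AI Studio (DeepSeek)", True, 0.15),
    Deployment("Self-hosted (MIT weights)", True, 0.02),
]


def eligible(options, regulated: bool):
    """Drop options that move data outside your jurisdiction when data is regulated."""
    return [o for o in options if o.in_your_jurisdiction or not regulated]


def monthly_cost(option: Deployment, tokens_per_month: int) -> float:
    """Indicative monthly spend in USD for a given token volume."""
    return option.usd_per_million_tokens * tokens_per_month / 1_000_000


# Example: a regulated workload processing 2 billion tokens per month.
for option in eligible(OPTIONS, regulated=True):
    print(f"{option.name}: ${monthly_cost(option, 2_000_000_000):,.0f} per month")
```

Setting `regulated=False` brings the public API back into the comparison for non-sensitive workloads.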
Self-Hosting DeepSeek V3: Hardware Requirements
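DeepSeek V3 (671B total, 37B active MoE): every expert must sit in GPU memory, so sizing follows the 671B total, not the 37B active. FP8 weights alone come to roughly 671 GB, more than the 640 GB pooled across 8× H100 80GB, so production FP8 serving realistically needs 8× H200 141GB or two 8× H100 nodes. INT4 quantised weights (AWQ/GPTQ) are roughly 336 GB, which also exceeds the 320 GB of 4× A100 80GB; plan for 8× A100 80GB at acceptable quality. DeepSeek R1 Distil 70B: 4× A100 80GB in FP16. R1 Distil 7B/8B: a single A100 or RTX 4090. Deploy with vLLM (which supports MoE models) or SGLang (especially efficient for R1's chain-of-thought generation). Our DevOps and ML teams manage GPU infrastructure for DeepSeek deployments.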
DeepSeek R1 Distil 32B is the sweet spot for most enterprise deployments: it retains 90%+ of R1's reasoning capability at a deployable size (2× A100 80GB in FP16, or a single A100 with INT4). It outperforms GPT-4o on mathematical and logical reasoning tasks, and it is MIT-licensed and fully self-hostable. It is the best choice for financial modelling, legal contract analysis, complex document reasoning, and coding assistance at scale. Our ML team deploys and optimises R1 Distil for enterprise production.
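A weights-only memory estimate is a useful sanity check on any GPU sizing. The sketch below assumes 2 bytes per parameter for FP16, 1 for FP8 and 0.5 for INT4, plus a 20% headroom for KV cache and activations. That headroom is an illustrative assumption, not a measured value, and `weight_memory_gb` and `min_gpus` are hypothetical helper names. The result is a memory floor, not a throughput recommendation.

```python
import math

# Approximate bytes per parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billion: float, precision: str) -> float:
    """Weights-only footprint in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[precision]


def min_gpus(params_billion: float, precision: str,
             gpu_gb: float = 80, headroom: float = 0.2) -> int:
    """Smallest GPU count whose pooled memory holds the weights plus headroom.

    The 20% default headroom is an assumption; real needs depend on
    context length, batch size and the serving engine.
    """
    needed_gb = weight_memory_gb(params_billion, precision) * (1 + headroom)
    return math.ceil(needed_gb / gpu_gb)


# For MoE models, size by total parameters: every expert stays in memory.
for label, params_b, precision in [
    ("DeepSeek V3 / R1", 671, "fp8"),
    ("DeepSeek V3 / R1", 671, "int4"),
    ("R1 Distil 70B", 70, "fp16"),
    ("R1 Distil 32B", 32, "fp16"),
    ("R1 Distil 32B", 32, "int4"),
]:
    gb = weight_memory_gb(params_b, precision)
    print(f"{label} {precision}: ~{gb:.0f} GB weights, "
          f"at least {min_gpus(params_b, precision)} x 80GB GPUs")
```

By this arithmetic, 671B parameters at FP8 need about 671 GB for weights alone, which is more than 8 × 80 GB. In practice GPU counts are usually rounded up to a power of two for tensor parallelism, and extra GPUs are often added for throughput even when the weights would fit on fewer.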
Our machine learning development and DevOps teams deploy DeepSeek V3 and R1 on enterprise GPU infrastructure — with full data sovereignty, vLLM serving, fine-tuning pipelines, and observability. Book a free advisory session to scope your DeepSeek deployment.