Skip to main content
News Jul 02, 2026 5 min read 6 views

AWS Brings NVIDIA Nemotron and OpenAI Open-Weight Models to GovCloud Bedrock

AWS GovCloud Amazon Bedrock NVIDIA Nemotron OpenAI GPT OSS AI compliance data residency government AI regulated industries
AWS Brings NVIDIA Nemotron and OpenAI Open-Weight Models to GovCloud Bedrock
AWS brings NVIDIA Nemotron and OpenAI open-weight GPT models (20B, 120B) to Amazon Bedrock in GovCloud (US). Learn how regulated industries can now us

AWS Expands Frontier AI Access in Regulated Environments

Amazon Web Services has officially launched support for NVIDIA Nemotron and OpenAI's open-weight GPT models on Amazon Bedrock within AWS GovCloud (US), marking the first time these frontier open-weight models are available in a US government-dedicated cloud region. According to an AWS Machine Learning blog post, the release includes OpenAI's GPT OSS models at 20B and 120B parameters along with NVIDIA's Nemotron family spanning Nano 9B v2, Nano 12B v2, Nano 30B, and Super 120B variants.

This move directly addresses a critical gap in the enterprise AI landscape: organizations with strict data residency requirements — including federal agencies, defense contractors, and regulated industries — have largely been unable to leverage cutting-edge open-weight models due to sovereignty and compliance constraints. By colocating these models in GovCloud, AWS eliminates the need to transfer sensitive data across public internet boundaries during inference.

What the Model Lineup Means for Developers

Developers working on classified or sensitive workloads now have access to a spectrum of model sizes that were previously only available in commercial AWS regions. The OpenAI GPT OSS 120B model offers the highest reasoning capability in the lineup, ideal for complex document analysis, code generation, and multilingual tasks. The 20B variant provides a faster, more cost-effective option for simpler queries and real-time applications.

NVIDIA's Nemotron models bring specialized strengths: the Nano series (9B, 12B, and 30B) is optimized for edge and low-latency scenarios, while the Super 120B targets enterprise-grade synthetic data generation and reinforcement learning pipelines. For government use cases like intelligence report summarization or logistics optimization, the 9B and 12B models could run directly on tactical systems without sacrificing performance.

Data Residency and Compliance Implications

AWS GovCloud (US) is designed exclusively for US government customers and their partners, adhering to FedRAMP High, ITAR, and other regulatory frameworks. By hosting inference endpoints within this environment, organizations ensure that model inputs and outputs never leave US soil. This is a significant upgrade over previous workarounds — such as running on-premises models or using third-party gateways — which introduced latency and operational overhead.

The blog post details multiple service tiers for inference, including on-demand and provisioned throughput options. This flexibility allows agencies to scale from prototype to production without renegotiating data governance policies. For contractors building AI applications under contracts with strict data localization clauses, this removes a major barrier to adopting state-of-the-art open-weight models.

Benchmarking and Performance Considerations

While AWS did not release specific GovCloud benchmark scores in the announcement, developers can expect comparable performance to the same models hosted in commercial AWS regions. The Nemotron Super 120B, for instance, has demonstrated strong results on reasoning benchmarks like GSM8K and MATH in third-party evaluations, though actual latency will depend on the chosen instance type and throughput tier.

A key consideration for developers: the open-weight nature of these models allows for fine-tuning and customization, albeit within GovCloud's controlled environment. AWS is likely to extend its SageMaker integration to these models in the coming months, enabling full training workflows without leaving the compliance boundary.

Getting Started and Practical Next Steps

To begin, AWS customers with GovCloud access can navigate to the Amazon Bedrock console and select from the new model catalog. The process mirrors standard Bedrock usage: create an inference profile, choose the model (e.g., nemotron-super-120b or gpt-oss-120b), and configure the endpoint. For developers migrating existing applications, API compatibility with the OpenAI GPT OSS models should minimize code changes.

One notable omission: pricing details are not explicitly listed in the announcement, but given GovCloud's premium pricing model, organizations should budget for higher per-token costs compared to commercial regions. AWS recommends starting with the smaller model variants for proof-of-concept work and only scaling to larger models after validating accuracy requirements.

What This Means for the AI Ecosystem

This release signals a broader trend: cloud providers are no longer treating regulated environments as afterthoughts. By actively porting frontier open-weight models to GovCloud, AWS is effectively creating a parallel AI supply chain for government and defense. This could accelerate adoption of AI in national security, healthcare compliance, and critical infrastructure — sectors that have lagged behind commercial deployments due to trust and control issues.

For developers, the takeaway is clear: building AI applications with compliance requirements no longer means sacrificing model quality. The availability of both OpenAI and NVIDIA models in a single, secure inference service gives teams the flexibility to choose the best tool for each task, whether that's Nemotron for synthetic data generation or GPT OSS for natural language understanding.

As AWS continues to expand its model catalog in GovCloud, expect similar announcements for other open-weight leaders like Meta's Llama and Mistral AI. The next frontier for enterprise AI is not just capability — it's compliance.

Related: AWS Unveils Agentic AI Healthcare Claims Pipeline: Bedrock and HealthLake Integration Cuts Manual Processing

Source: AWS Machine Learning. This article was produced with AI assistance and reviewed for accuracy. Editorial standards.

Avatar photo of Eric Samuels, contributing writer at AI Herald

About Eric Samuels

Eric Samuels is a Software Engineering graduate, certified Python Associate Developer, and founder of AI Herald. He has 5+ years of hands-on experience building production applications with large language models, AI agents, and Flask. He personally tests every AI model he writes about and publishes in-depth guides so developers and businesses can ship reliable AI products.

Related articles