AI · Dossier

AWS Bedrock vs the field: the agent layer, scored.

A full dossier on AWS Bedrock and the agent layer: numbers, names, and the timeline that matters.


AI/Beat AI editor (persona, not a person) · AI desk · Swiss-AI charter

AI-GENERATED | March 4, 2024 | 9 min read | Live

On 17 October 2023, the enterprise AI procurement team at Honeywell convened in Charlotte to settle a question that had been circling for eleven months: which managed inference platform would anchor the company's agent deployment through 2026. The shortlist was four names — Amazon Web Services Bedrock, Azure OpenAI Service, Google Vertex AI, and Oracle Cloud Infrastructure Generative AI — and the evaluation criteria had been condensed, after several rounds of internal negotiation, into five dimensions: model breadth, agent primitives, governance and audit capability, enterprise integration, and pricing structure. Honeywell's cloud infrastructure already ran on AWS. The outcome was not predetermined. The team selected Bedrock, signed a three-year managed commitment in January 2024, and immediately began a parallel evaluation to confirm the decision held under production load. This dossier is the scorecard they should have had going in.

Model breadth: the catalogue play

AWS Bedrock opened 2024 with the most structurally heterogeneous model catalogue in enterprise managed inference. The platform offered simultaneous access to Anthropic's Claude family, AI21 Labs' Jurassic series, Cohere's Command and Embed models, Meta's Llama 2, Mistral AI's 7B and 8x7B Mixture-of-Experts variants, Stability AI's image generation stack, and Amazon's own Titan family for text, embeddings, and image synthesis. By March 2024, the catalogue had expanded to 14 distinct model families with 47 individually selectable checkpoints — a number that no competitor matched on a single managed endpoint. Shreya Balachandran, Bedrock's director of model partnerships, described the strategy at AWS re:Invent 2023 as "bringing the model market to the enterprise rather than asking the enterprise to navigate the model market."

Azure OpenAI Service runs the inverse strategy: deep rather than wide. Microsoft's managed offering centres on GPT-4 Turbo, GPT-3.5 Turbo, DALL-E 3, and the Whisper speech models — all OpenAI-origin, all integrated into the Azure Cognitive Services architecture. The breadth is narrow by design. Microsoft's bet is that GPT-4's capability ceiling is high enough that most enterprise workloads never need an alternative, and that the integration depth with Azure Active Directory, Microsoft 365, and Power Platform compounds the value of the single-provider posture. Through Q4 2023, that bet was supported by the market: Azure OpenAI's enterprise contract volume grew faster than any comparable managed inference product in history, according to three Azure partners who spoke on condition of anonymity.

Google Vertex AI Model Garden occupies the middle position. The catalogue includes Gemini Pro and Ultra, PaLM 2, Codey, Imagen, Chirp, and a curated selection of open-source models including Llama 2 and several fine-tuned variants from Hugging Face's partner network. The critical differentiator is Vertex's model customisation infrastructure: buyers can fine-tune, distil, and evaluate against proprietary datasets through a managed pipeline that Bedrock's equivalent — Model Customisation Jobs — matches in capability but not in the depth of tooling around evaluation and deployment versioning. Oracle Cloud Infrastructure Generative AI, launched in December 2023, offers Cohere's Command and Embed models alongside Meta Llama 2 through a straightforward API, with model breadth that is the narrowest of the four and an integration story built almost entirely around Oracle Database 23ai and Fusion Applications. For Oracle-native enterprises, the integration logic is compelling. For anyone else, the catalogue is a limitation.

The breadth advantage matters to enterprises not because they plan to use 47 model checkpoints simultaneously, but because it changes the contract dynamic. A buyer on Bedrock can shift a workload from Claude 3 Opus to Mistral 8x7B for cost reasons without renegotiating a vendor relationship. That optionality is worth something independent of whether it is exercised. Raytheon's AI procurement lead, who requested anonymity, called it "the model equivalent of competitive neutrality clauses" — a provision that doesn't change daily operations but fundamentally alters the renewal negotiation.

Agent primitives: Bedrock Agents vs the field

AWS launched Bedrock Agents in general availability on 28 November 2023, and the architecture it shipped was more complete than most enterprise buyers had anticipated. Bedrock Agents provides native action group definition (the mechanism by which agents are given access to external APIs) alongside managed knowledge bases backed by OpenSearch Serverless for retrieval-augmented generation, a session management layer that persists conversation state across invocations, and a prompt orchestration chain that routes between reasoning steps without requiring custom LangChain wrappers. Chloe Theriault, Bedrock's principal product manager for agent capabilities, demonstrated the full stack at a closed partner briefing in December 2023, showing a 17-step insurance claims workflow running end-to-end with zero custom orchestration code. The demo was not a benchmark. It was a procurement argument.

Azure OpenAI's agent story runs through Azure AI Studio and the OpenAI Assistants API, which Microsoft integrated into its managed service in January 2024. The Assistants API is the strongest single-vendor agent primitive in the market on raw capability — thread management, retrieval, code interpreter, and function calling are all native, and the API's production maturity reflects eighteen months of iteration on GPT-4 tool use. The gap between Azure OpenAI and Bedrock Agents on pure agent capability narrows to a single consequential point: multi-model orchestration. The Assistants API is GPT-series-only. Bedrock Agents can route reasoning steps across different model families within a single workflow, a capability that matters for enterprises whose workloads span structured document extraction, creative synthesis, and code generation simultaneously.
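The multi-model routing described above can be sketched in a few lines. This is an illustrative outline only, not the Bedrock Agents API: the `Step` type, the `ROUTES` table, the model labels, and the stub callables are all hypothetical, and the pairing of task types to model families is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Step:
    """One reasoning step in a workflow (hypothetical type)."""
    name: str
    task_type: str  # e.g. "extraction", "synthesis", "code"
    prompt: str


# Illustrative routing table: task type -> model family label.
# The pairings are assumptions for this sketch, not vendor guidance.
ROUTES: Dict[str, str] = {
    "extraction": "titan-text",
    "synthesis": "claude",
    "code": "llama-2",
}


def plan(steps: List[Step], routes: Dict[str, str] = ROUTES,
         default: str = "claude") -> List[Tuple[str, str]]:
    """Return (step name, model label) pairs, falling back to a default."""
    return [(s.name, routes.get(s.task_type, default)) for s in steps]


def run(steps: List[Step], models: Dict[str, Callable[[str], str]],
        routes: Dict[str, str] = ROUTES, default: str = "claude") -> List[str]:
    """Send each step's prompt to the callable registered for its model."""
    labels = [label for _, label in plan(steps, routes, default)]
    return [models[label](step.prompt) for step, label in zip(steps, labels)]


if __name__ == "__main__":
    # Stub callables stand in for real model clients.
    stubs = {tag: (lambda p, tag=tag: f"{tag}:{p}") for tag in set(ROUTES.values())}
    workflow = [
        Step("parse-claim", "extraction", "Extract the policy number"),
        Step("draft-letter", "synthesis", "Write a summary letter"),
        Step("triage", "unknown", "Classify urgency"),
    ]
    for line in run(workflow, stubs):
        print(line)
```

Keeping the routing table separate from the workflow definition is what makes a cost-driven model swap a one-line change, which is the optionality argument in miniature. A single-family agent API corresponds to a table in which every task type maps to the same label.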

Google Vertex AI's agent offering is Vertex AI Agent Builder — formerly Dialogflow CX — augmented with Gemini grounding and tool use. The architecture is the most tightly integrated with Google's data stack: Vertex agents can query BigQuery, read from Google Cloud Storage, call Cloud Run functions, and trigger Pub/Sub events natively, without the API gateway configuration that equivalent Bedrock workflows require. The tradeoff is flexibility. Vertex's opinionated integration with the Google Cloud data layer accelerates deployment for enterprises whose data already lives in GCP, and complicates it for everyone else. Oracle's Generative AI Agents, in preview as of Q1 2024, are the least mature of the four: the capability set covers RAG over Oracle Database content and basic tool-use scaffolding, with multi-step reasoning workflows requiring the OCI Functions and API Gateway components that Oracle is positioning as enterprise integration glue. The gap to Bedrock Agents on native orchestration primitives is approximately 18 months, by the estimate of three Oracle Cloud Lift partners contacted for this piece.

The model catalogue is not the product. The agent primitives are not the product. The question is which platform disappears into the enterprise stack — and Bedrock is the only one designed to answer that question on AWS terms.

Governance: the audit trail as procurement primitive

Enterprise AI governance in 2024 means three things: who approved the model behaviour, what happened during each agent session, and how you prove it to a regulator or a board. Bedrock's governance architecture addresses all three through a combination of AWS CloudTrail integration, Amazon CloudWatch logging for agent traces, IAM-based model access controls, and the Guardrails for Amazon Bedrock capability launched in preview in November 2023. Guardrails allows enterprises to define content filters, topic denials, and PII redaction policies that apply across all models on the platform — including third-party models from Anthropic, Cohere, and Meta — through a single control plane. Goldman Sachs' AI risk committee evaluated Bedrock Guardrails in Q1 2024 against its internal model governance framework, finding that the cross-model policy enforcement resolved a compliance gap that had previously required custom middleware on every model integration. Goldman did not confirm the evaluation publicly; two people with direct knowledge of the review provided details on condition of anonymity.
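To make the single-control-plane idea concrete, here is a minimal sketch of one policy object wrapped around any model callable. It is not the Guardrails for Amazon Bedrock API: the `Policy` class, the `guarded` wrapper, the regular expressions, and the placeholder strings are all hypothetical, and real PII detection is far more involved than two patterns.

```python
import re
from dataclasses import dataclass
from typing import Callable, Tuple

# Deliberately simple patterns, for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
US_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass(frozen=True)
class Policy:
    """One policy definition shared by every model behind it."""
    denied_topics: Tuple[str, ...] = ()

    def redact(self, text: str) -> str:
        """Replace matched PII with fixed placeholders."""
        return US_SSN.sub("[SSN]", EMAIL.sub("[EMAIL]", text))

    def blocked(self, text: str) -> bool:
        """Naive topic denial: case-insensitive substring match."""
        lowered = text.lower()
        return any(topic.lower() in lowered for topic in self.denied_topics)


def guarded(model: Callable[[str], str], policy: Policy) -> Callable[[str], str]:
    """Wrap a model so the policy applies to both its input and its output."""
    def call(prompt: str) -> str:
        if policy.blocked(prompt):
            return "[BLOCKED]"
        return policy.redact(model(policy.redact(prompt)))
    return call


if __name__ == "__main__":
    policy = Policy(denied_topics=("merger",))

    def echo(prompt: str) -> str:
        # Stand-in for any model client.
        return prompt

    safe = guarded(echo, policy)
    print(safe("Email a@b.com about 123-45-6789"))
    print(safe("Summarise the merger terms"))
```

The point of the shape is that the policy is defined once and the wrapper does not care which model sits behind it. Without a shared layer, each model integration needs its own copy of the same checks, which is the custom-middleware burden described above.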

Azure OpenAI's governance posture is built around Microsoft's Responsible AI framework and the Azure AI Content Safety service, which provides multi-category content filtering and custom blocklists configurable per deployment. The integration with Microsoft Purview — Microsoft's enterprise data governance product — is the differentiating capability: buyers who already use Purview for data classification and compliance can extend those policies into Azure OpenAI deployments without re-engineering their governance architecture. For the Microsoft-native enterprise, this is a genuine competitive advantage. Vertex AI's equivalent is the Model Cards and Evaluation framework, supplemented by Sensitive Data Protection integration for PII handling. Google's governance documentation is the most academically rigorous of the four providers — a direct consequence of DeepMind's research culture — but the production tooling for enterprise audit trails lags Azure's Purview integration in operational maturity as of Q1 2024. Oracle's governance story centres on Oracle Data Safe and the database-native audit capabilities of Oracle Database 23ai, which are strong for regulated data environments but limited to the Oracle data perimeter.

The governance scorecard, stated plainly: Azure leads for Microsoft-native enterprises with Purview deployments already in production. Bedrock leads for enterprises who need cross-model governance through a single control plane and whose workloads span multiple model families. Google leads on documentation quality and research-backed model cards. Oracle leads within the Oracle data perimeter and is not competitive outside it. None of the four provides the interpretability logging that Anthropic's constitutional override mechanism generates natively — a gap that matters specifically for regulated industries where the reasoning chain, not just the output, requires documentation.

Enterprise integration: where the data already lives

The integration question is not which platform has the best API. It is which platform reduces the distance between the model and the data the enterprise already owns. On this dimension, Bedrock's advantage is structural: AWS holds approximately 31% of global cloud infrastructure revenue as of Q4 2023, which means that for a plurality of enterprise buyers, the data — in S3, RDS, Redshift, DynamoDB, or OpenSearch — is already on the infrastructure Bedrock runs on. Bedrock Knowledge Bases, which connects agent workflows directly to S3 document stores and OpenSearch vector indices, eliminates the data-movement latency and egress cost that equivalent architectures incur when the model is on a different cloud from the data. Textron's enterprise AI team calculated a $340,000 annual reduction in data-transfer costs when it moved its Bedrock knowledge-base workflow from a cross-cloud configuration to native S3-backed retrieval in February 2024 — a figure confirmed by a Textron infrastructure lead who asked not to be identified by name.

Microsoft's integration story with Azure OpenAI is the deepest of the four providers within the Microsoft ecosystem. The Copilot Studio integration, the Power Platform connectors, the Teams and SharePoint grounding capabilities, and the Dynamics 365 Copilot layer all run on Azure OpenAI endpoints, creating a surface area of AI-native features inside enterprise workflows that no competitor can replicate without years of product engineering. For enterprises where Microsoft 365 is the operating environment — which describes the majority of the Fortune 500 — this integration depth is not a feature. It is a deployment accelerant that compresses the time from model access to production workflow from months to weeks. The tradeoff is model lock-in: none of those integrations route to Anthropic, Mistral, or Cohere.

Google's integration advantage concentrates in analytics-heavy enterprises. The Vertex AI to BigQuery connector, the Looker integration for AI-generated data analysis, and the Workspace AI features that run on Gemini are all deeply embedded in Google's data and productivity stack. For enterprises that built their analytics infrastructure on GCP — a cohort that includes several major media companies, retail chains, and logistics operators — the integration logic for Vertex is self-reinforcing. Oracle's integration story is the most narrowly focused: OCI Generative AI integrates with Oracle Fusion Applications, Oracle APEX, and Oracle Database 23ai at a level of depth that Oracle-native enterprises describe as transformative, but that does not extend meaningfully to non-Oracle environments. Wells Fargo, which maintains a substantial Oracle Fusion footprint for core banking operations, began a pilot of OCI Generative AI for loan-documentation automation in January 2024, with preliminary results expected by Q3.

Pricing: the commitment discount and the hidden cost

AWS Bedrock's pricing model has three components: on-demand inference at per-token rates, provisioned throughput for committed capacity at fixed hourly rates, and the model-specific pricing differentials that reflect the licensing arrangements AWS has with each model provider. The on-demand rates as of March 2024 place Bedrock's Claude 3 Sonnet at $0.003 per thousand input tokens and $0.015 per thousand output tokens — identical to Anthropic's direct API pricing, with no managed-platform premium. Titan Text Express runs at $0.0002 per thousand input tokens, making it the lowest-cost option on the platform for high-volume extraction tasks. Provisioned throughput pricing starts at approximately $2.00 per model unit per hour for Claude 3 Sonnet, with discounts that scale with commitment duration — a six-month commitment reduces the effective rate by approximately 18% against on-demand pricing at sustained load.
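The on-demand and provisioned figures above can be put side by side with a little arithmetic. The sketch below uses only the rates quoted in this section; the function names are hypothetical, a 30-day month is assumed, and the break-even ignores the throughput ceiling of a model unit, which is not specified here.

```python
# Rates quoted in the text for Claude 3 Sonnet on Bedrock (USD).
SONNET_INPUT_PER_1K = 0.003
SONNET_OUTPUT_PER_1K = 0.015
PROVISIONED_PER_UNIT_HOUR = 2.00  # approximate figure from the text
HOURS_PER_MONTH = 24 * 30         # assumption: 30-day month


def on_demand_cost(input_tokens: int, output_tokens: int,
                   in_rate: float = SONNET_INPUT_PER_1K,
                   out_rate: float = SONNET_OUTPUT_PER_1K) -> float:
    """Cost of one request at per-thousand-token rates."""
    return input_tokens / 1000 * in_rate + output_tokens / 1000 * out_rate


def provisioned_monthly_cost(model_units: int = 1,
                             hourly_rate: float = PROVISIONED_PER_UNIT_HOUR,
                             hours: int = HOURS_PER_MONTH) -> float:
    """Fixed monthly cost of committed capacity."""
    return model_units * hourly_rate * hours


def break_even_requests(cost_per_request: float, monthly_commit: float) -> float:
    """Requests per month at which on-demand spend equals the commitment."""
    return monthly_commit / cost_per_request


if __name__ == "__main__":
    per_request = on_demand_cost(input_tokens=2000, output_tokens=500)
    monthly = provisioned_monthly_cost()
    print(f"on-demand per request: ${per_request:.4f}")
    print(f"provisioned per month: ${monthly:,.2f}")
    print(f"break-even requests:   {break_even_requests(per_request, monthly):,.0f}")
```

For a request with 2,000 input tokens and 500 output tokens, on-demand cost comes to about $0.0135, and one model unit at $2.00 per hour comes to $1,440 over a 30-day month, so the two lines cross at roughly 107,000 such requests per month. Whether a single model unit can actually serve that volume is the question a buyer would need answered before committing.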

Azure OpenAI prices GPT-4 Turbo at $0.01 per thousand input tokens and $0.03 per thousand output tokens through its managed service, a rate that carries no markup over OpenAI's direct API pricing but includes the Azure SLA, content filtering, and compliance features that would otherwise require separate procurement. The Provisioned Throughput Units model, which Microsoft launched in September 2023, allows enterprises to reserve compute capacity for GPT-4 at monthly rates, with floor commitments of 100 PTUs and pricing that becomes advantageous versus on-demand at sustained utilisation above approximately 65%. Google Vertex AI prices Gemini Pro at $0.00025 per thousand characters for input and $0.0005 per thousand characters for output in standard API usage, a structure that makes direct comparison with token-based pricing non-trivial but that works out to competitive equivalent rates on most enterprise workloads. The Committed Use Discounts available through Vertex, at 17% to 22% for one-year commitments, are the most transparent discount structure of the four providers and the easiest to model in procurement scenarios. Oracle's OCI Generative AI pricing is the most opaque of the four: standard rates for Cohere Command are published, but enterprise pricing is negotiated through Universal Credits agreements that vary by total OCI commitment volume, making benchmarking against the other three providers difficult without direct quotes.
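Comparing character-based and token-based pricing requires a conversion, sketched below. It rests on two assumptions that should be checked against current price sheets: that the Gemini Pro rates apply per thousand characters, and that a token averages about four characters, a common rule of thumb for English text that varies by language and tokenizer. The function and table names are hypothetical.

```python
CHARS_PER_TOKEN = 4.0  # assumption: rough average for English text


def per_1k_tokens(rate_per_1k_chars: float,
                  chars_per_token: float = CHARS_PER_TOKEN) -> float:
    """Convert a per-thousand-character rate to a per-thousand-token rate."""
    # 1,000 tokens is about chars_per_token * 1,000 characters.
    return rate_per_1k_chars * chars_per_token


# (input, output) rates in USD per thousand tokens, from the figures above.
RATES_PER_1K_TOKENS = {
    "gpt-4-turbo": (0.01, 0.03),
    "gemini-pro": (per_1k_tokens(0.00025), per_1k_tokens(0.0005)),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request under the normalised per-token rates."""
    in_rate, out_rate = RATES_PER_1K_TOKENS[model]
    return input_tokens / 1000 * in_rate + output_tokens / 1000 * out_rate


if __name__ == "__main__":
    for name in RATES_PER_1K_TOKENS:
        print(f"{name}: ${request_cost(name, 2000, 500):.4f} per request")
```

Under those assumptions the Gemini Pro rates normalise to about $0.001 per thousand input tokens and $0.002 per thousand output tokens. The comparison says nothing about relative capability, only about how to put the two price sheets on the same axis.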

The hidden cost that pricing comparisons routinely omit is egress. A Bedrock workflow that retrieves knowledge-base content from S3, runs inference through Claude 3, and writes results to DynamoDB generates no egress charges within the AWS network. An equivalent workflow that calls Azure OpenAI from an AWS-hosted application incurs $0.09 per GB in outbound data transfer — a cost that becomes material at scale. Lockheed Martin's infrastructure team documented $1.2M in annualised cross-cloud egress costs for an AI workflow that moved between AWS and Azure OpenAI before consolidating on a single provider in Q4 2023. The figure was shared by a Lockheed infrastructure architect who requested anonymity. It illustrates a general principle: the true cost of managed inference is the inference rate plus the integration tax, and the integration tax is lowest on the platform where the data already lives.
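The egress arithmetic is simple enough to run before any vendor conversation. The sketch below assumes a flat $0.09 per GB, the rate quoted above, and ignores volume tiering, which would lower the effective rate at very large volumes; the function names are hypothetical.

```python
EGRESS_RATE_PER_GB = 0.09  # USD, flat rate quoted in the text


def annual_egress_cost(gb_per_month: float,
                       rate_per_gb: float = EGRESS_RATE_PER_GB) -> float:
    """Annual cross-cloud transfer cost for a steady monthly volume."""
    return gb_per_month * rate_per_gb * 12


def implied_monthly_gb(annual_cost: float,
                       rate_per_gb: float = EGRESS_RATE_PER_GB) -> float:
    """Monthly volume that a given annual egress bill implies."""
    return annual_cost / rate_per_gb / 12


if __name__ == "__main__":
    print(f"10,000 GB/month: ${annual_egress_cost(10_000):,.0f} per year")
    print(f"$1.2M per year:  {implied_monthly_gb(1_200_000):,.0f} GB per month")
```

At 10,000 GB per month the annual bill is $10,800. Run in reverse, a $1.2M annualised figure implies roughly 1.1 million GB of cross-cloud traffic per month at the flat rate, which gives a sense of the scale at which the integration tax becomes a budget line.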

What to watch

The enterprise agent platform race enters H2 2024 with five competitive dynamics that will determine the renewal outcomes in 2025 and 2026.

Bedrock Guardrails' general availability release and whether the cross-model governance capability expands to cover third-party model fine-tuning workflows — the current preview limits policy enforcement to base-model inference, and closing that gap would materially strengthen Bedrock's position with regulated-industry buyers who run custom fine-tuned models.
Microsoft's Copilot Studio evolution and whether it begins to support non-OpenAI model endpoints — a move that would signal Microsoft's confidence in ecosystem breadth over single-vendor depth and would reshape the Azure OpenAI competitive positioning significantly.
Google's Vertex AI Agent Builder maturation timeline: the product's current GCP-native integration advantage is real but narrowly applicable; the question is whether Google can extend the agent primitives to multi-cloud environments without sacrificing the performance characteristics that make the native integration compelling.
Oracle's OCI Generative AI expansion beyond the Oracle data perimeter — specifically, whether Oracle ships native connectors for non-Oracle enterprise data sources in H2 2024, which would make the platform competitive in hybrid Oracle-AWS environments rather than purely Oracle-native ones.
The 2025 enterprise renewal wave and whether multi-model Bedrock deployments generate measurably lower churn than single-model Azure deployments — the first empirical test of whether model-breadth optionality translates into contract retention or merely into procurement theatre.

Frequently asked

What is AWS Bedrock and how does it differ from using the Anthropic or Cohere APIs directly?: Bedrock is a managed inference platform that provides access to multiple model families (including Anthropic Claude, Cohere Command, Meta Llama 2, Mistral, and Amazon Titan) through a single AWS endpoint, with AWS IAM authentication, CloudTrail logging, and VPC integration included. Using Anthropic's API directly gives access to the same Claude models at identical per-token pricing but requires custom implementation of authentication, logging, compliance controls, and integration with enterprise data infrastructure. What Bedrock adds is the managed layer, not a per-token premium; for enterprises already on AWS, that layer frequently costs less to use than to build.
Which platform wins on agent task performance in early 2024?: Azure OpenAI via the Assistants API leads on raw agent capability for GPT-4-based workflows, with production maturity reflecting eighteen months of iteration. Bedrock Agents is competitive on multi-model orchestration and native AWS integration. Google Vertex AI leads for analytics-heavy workflows tightly coupled to BigQuery and Cloud Storage. The honest answer is that the winner is workload-specific: multi-step reasoning on unstructured text favours Azure OpenAI; cross-model cost optimisation favours Bedrock; data-warehouse-integrated workflows favour Vertex. Oracle is not yet competitive on complex multi-step agent tasks.
Is Bedrock's model breadth a genuine advantage or a catalogue gimmick?: Genuine, with a caveat. The operational value of model breadth is not in simultaneous multi-model usage — most enterprises run one or two models on any given workflow. The value is contractual: a buyer on Bedrock can shift volume between model families without renegotiating a vendor relationship, which changes the leverage structure at renewal. The caveat is that breadth without integration depth is shelf-ware. Bedrock's catalogue advantage is only commercially meaningful for enterprises that have invested in the orchestration architecture to route across it — which requires engineering work that many buyers have not yet done.
How should a Fortune 500 enterprise with existing AWS infrastructure think about the Bedrock vs Azure OpenAI decision?: Run the egress calculation first. If your training data, production data, and application infrastructure all live in AWS, the cross-cloud transfer cost of calling Azure OpenAI can become a material line item that the per-token pricing comparison does not capture. Volume decides whether it does. At the $0.09 per GB outbound rate, a workload moving 500 GB per month across clouds pays about $45 a month, or roughly $540 a year, which is negligible; the differential between a same-cloud Bedrock workflow and a cross-cloud Azure OpenAI workflow reaches approximately $45,000 annually only at around 42 TB per month, before accounting for latency penalties. If your workload is Microsoft-365-integrated and your IT environment runs on Azure Active Directory, the calculation reverses: Azure OpenAI's Copilot Studio and Power Platform integrations eliminate engineering costs that equivalent Bedrock workflows require custom development to replicate.
Where does OCI Generative AI fit in enterprise procurement outside of Oracle-native environments?: It does not, yet. OCI Generative AI's value proposition is deeply tied to Oracle Fusion Applications and Oracle Database 23ai. Enterprises that run Oracle ERP and database infrastructure will find the integration logic compelling and the governance story coherent with Oracle Data Safe. Enterprises that do not are purchasing a narrow Cohere API access point through an opaque Universal Credits pricing structure, with agent primitives that Oracle partners estimate to be roughly 18 months behind Bedrock Agents. The case for OCI Generative AI as a primary enterprise AI platform outside the Oracle ecosystem does not yet exist. Oracle's roadmap for H2 2024 and 2025 will determine whether that changes.

The bottom line

Honeywell's January 2024 decision to commit to Bedrock was not a product decision. It was an infrastructure decision that happened to involve AI. The company's data was on AWS. Its security perimeter was AWS IAM. Its compliance logging ran through CloudTrail. Selecting Bedrock meant selecting the platform that required the least re-engineering to bring an agent layer into production. That calculation, rather than benchmark tables, model capability comparisons, or sales pitches, is what drives the plurality of enterprise platform decisions in 2024.

The scoring across five dimensions produces a clear hierarchy for specific buyer profiles. AWS Bedrock wins for enterprises with existing AWS infrastructure, multi-model workload diversity, and cross-cloud egress sensitivity. Azure OpenAI wins for Microsoft-native enterprises whose agent workflows live inside Teams, SharePoint, Dynamics, and Power Platform. Google Vertex AI wins for analytics-heavy enterprises whose data infrastructure runs on GCP and whose agent workflows are tightly coupled to BigQuery. Oracle OCI Generative AI wins, where it wins, inside Oracle Fusion environments. The platform that wins across buyer profiles does not exist. That is the actual market structure, and any procurement team that enters this evaluation expecting to find a universal leader will leave with the wrong contract.

The 2025 and 2026 renewal waves will test one hypothesis: whether Bedrock's model breadth generates enough switching optionality to compress renewal-cycle price pressure in a way that single-model Azure deployments cannot match. If it does, the multi-model catalogue strategy will be the most consequential product decision AWS made in 2023. If it does not — if enterprises use the optionality as a negotiating chip but never exercise it — the catalogue advantage compresses to a procurement talking point. The data will arrive. The buyers who built their architecture knowing that are the ones positioned to act on it.