HeroPath

Agentic stack review.

An architecture advisory for Savvy Loans.

v1 pilot · 7th June 2026

01 the brief

What we reviewed, and against what.

We read the architecture Ramee sent over, set it against the conversations we've had with the Savvy team, and built our recommendation from that and what we've learned.

architecture.pdf The logical engine: three agents on one shared loop, human‑in‑the‑loop by default.

system-diagram.pdf Build → ship → run: GitHub Actions, Terraform, a single ECS Fargate task in the AWS data account.

technical-spec.pdf 15 pages, candid: the runtime, the API, and a self‑reported security & roadmap section.

Principle · resident in AWS

Data and inference stay inside AWS. Non‑negotiable for a lender.

Principle · go fast

Be pragmatic and ship value quickly. It's what Bishara, the founder, asked for.

02 the bottom line

You've built the loop.
You're still building the harness.

It's a thoughtful, honest pilot. It's also roughly 3,000 lines of bespoke harness, plus a roadmap that is almost all undifferentiated infrastructure: sandboxing, audit logs, dedupe, a spend guard, auth, least‑privilege IAM.

Our recommendation: don't hand‑build that roadmap. Use AWS‑managed agent infrastructure for sandboxing, isolation, identity, memory and audit, and route the model through Bedrock so inference stays in AWS.

The swappable seam they already have makes this an adoption, not a rewrite.

03 credit where due

What's genuinely strong. Keep these.

✓ The swappable seam is real. Tools and prompts have zero dependency on the loop, proven by a live CLI to LangGraph migration.
✓ Honesty about risk. The spec lists verified issues, not a sales pitch. Rare, and it earns trust.
✓ Human in the loop, to start. The right posture for a lender today, with a clear path to dial it down to the minimum as trust is earned.
✓ Structured completion. finish and submit_triage separate what the agent decided from whether it's done.
✓ Real tests. The test suite drives the actual graph, not mocks.
✓ Clean AWS hygiene. Terraform, OIDC with no static keys, internal‑only ALB, Secrets Manager.

04 risk register

Five things we'd push on.

01 · No evals · high
The tests check the wiring, not the judgement. Nothing measures whether an agent makes the right call, so quality can't be tracked or trusted.

02 · Thin observability · high
Transcripts are discarded and there's no run audit log. When an agent does the wrong thing, there's little to debug with, and nothing to show a regulator.

03 · Single point of failure · high
One task, no autoscaling. Every deploy is downtime and any crash stops all work.

04 · Building harness, not agents · strategic
Concurrency, memory, dedupe, audit, scaling. Real work, but none of it differentiates Savvy. It's the team's time going into plumbing.

05 · Unbounded cost · cost
The top model runs on every poll, with no spend guard and no model tiering. Spend grows with traffic, uncapped.
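To make risk 01 concrete, here is a minimal sketch of what a first eval could look like: a handful of labelled cases, scored against the agent's decision rather than its wiring. Everything named here is an assumption for illustration. `EvalCase`, `run_evals`, the labels `urgent` and `routine`, and `stub_triage` (a keyword stand-in for the real agent) are hypothetical and do not come from the Savvy codebase.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    """One labelled example: an input and the decision we expect."""
    name: str
    email_body: str
    expected_label: str


def run_evals(agent: Callable[[str], str], cases: list[EvalCase]) -> dict:
    """Score the agent's decisions against the labels and collect failures."""
    failures = []
    for case in cases:
        got = agent(case.email_body)
        if got != case.expected_label:
            failures.append((case.name, case.expected_label, got))
    passed = len(cases) - len(failures)
    accuracy = passed / len(cases) if cases else 0.0
    return {
        "total": len(cases),
        "passed": passed,
        "accuracy": accuracy,
        "failures": failures,
    }


def stub_triage(body: str) -> str:
    """Hypothetical stand-in for the real triage agent (keyword match only)."""
    return "urgent" if "overdue" in body.lower() else "routine"


# Illustrative cases only; a real set would be drawn from reviewed history.
CASES = [
    EvalCase("overdue-payment", "Payment overdue by 30 days", "urgent"),
    EvalCase("address-change", "Please update my address", "routine"),
    EvalCase("suspected-fraud", "I think someone accessed my account", "urgent"),
]

REPORT = run_evals(stub_triage, CASES)
```

Tracked per commit, the accuracy figure and the failure list give the team a quality trend to watch, which the current tests cannot provide. The stub deliberately misses the fraud case to show how a judgement error surfaces.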

05 a worked example

Input you don't control, tools you do.

Untrusted input · Email or Jira content from outside enters the normal data path

→ Powerful tool · An agent can run a shell, with little to stop it

→ Broad access · That shell holds wide credentials and tokens

This isn't a flaw in what they've built today. It's the failure mode to design against. As autonomy grows, the trust boundary has to move from the host to the tool. Worth keeping front of mind while building, and a big reason to put the agents inside a managed, isolated runtime.
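One way to picture the trust boundary moving to the tool: the agent no longer gets a shell, it gets a narrow command tool that rejects anything off an allowlist and runs with a stripped environment. This is a sketch under stated assumptions, not the Savvy implementation. The allowlist entries, `parse_tool_command`, `run_tool_command` and `ToolCallRejected` are all hypothetical names.

```python
import shlex
import subprocess

# Hypothetical allowlist of (binary, subcommand) pairs the agent may run.
ALLOWED = {
    ("git", "status"),
    ("git", "diff"),
    ("git", "log"),
}


class ToolCallRejected(ValueError):
    """Raised when a requested command is not on the allowlist."""


def parse_tool_command(command: str) -> list[str]:
    """Split the command without a shell and check it against the allowlist."""
    try:
        argv = shlex.split(command)
    except ValueError as exc:  # e.g. unbalanced quotes
        raise ToolCallRejected(f"unparseable command: {command!r}") from exc
    if len(argv) < 2 or (argv[0], argv[1]) not in ALLOWED:
        raise ToolCallRejected(f"not on the allowlist: {command!r}")
    return argv


def run_tool_command(command: str) -> str:
    """Run an allowlisted command with no shell and a minimal environment."""
    argv = parse_tool_command(command)
    completed = subprocess.run(
        argv,
        env={"PATH": "/usr/bin:/bin"},  # no inherited tokens or credentials
        capture_output=True,
        text=True,
        timeout=10,
        check=False,
    )
    return completed.stdout
```

Because the command is never handed to a shell, text such as `git status; rm -rf /` arriving in an email is parsed as the pair `("git", "status;")` and rejected. An allowlist like this narrows the tool but does not replace isolation, which is why the managed runtime still matters.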

06 how we got here

The options, and our recommendation.

1. Took Ramee's framing: build on LangGraph, or wrap a coding tool.

2. Widened it: AgentCore is now GA, so the runtime no longer ties you to a framework.

3. Judged each one on the principles: in AWS, fast to ship, auditable.

Option	Stays in AWS	Harness to own	Isolation & audit	Speed to ship
Status quo (LangGraph + direct API, on ECS)	No	High	DIY	Built, but stuck
Just move to Bedrock (model layer only)	Yes	High	DIY	Fast, partial
Bedrock + AgentCore (lift the agents onto the managed runtime)	Yes	Low	Managed	Fast, durable
Re-platform from scratch (new framework, new infra)	Depends	High	Varies	Slow

07 the framework question

LangGraph, or something more native?

what they have · LangGraph
Now 1.0 and stable, no breaking changes until 2.0. Explicit, auditable control flow and first-class Bedrock support. Strong where steps must be deterministic.

AWS-native · Strands Agents
AWS's open framework. Lowest ceremony, model-driven, one-command deploy to AgentCore. Less code to maintain, less explicit control.

Anthropic-native · Claude Agent SDK
The loop behind Claude Code. Runs on Bedrock unchanged, with context compaction, subagents and audit hooks. AWS already publishes it on AgentCore.

08 what we're proposing

The target stack, layer by layer.

model: Claude on Bedrock

Opus 4.8, Sonnet 4.6, Haiku 4.5 in-region under an AWS BAA. Inference never leaves AWS. Tier the models: Haiku for triage, Opus for engineering.
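Tiering pairs naturally with a spend guard, the two gaps behind the unbounded-cost risk. The sketch below shows the shape only. The tier labels are placeholders, not real Bedrock model identifiers. The `product_manager` mapping, the default tier, and the names `pick_tier`, `SpendGuard` and `BudgetExceeded` are our assumptions for illustration.

```python
from dataclasses import dataclass

# Placeholder tier labels, NOT real Bedrock model identifiers.
TIER_BY_AGENT = {
    "email_triage": "haiku",        # from the proposal: Haiku for triage
    "principal_engineer": "opus",   # from the proposal: Opus for engineering
    "product_manager": "sonnet",    # assumption: the proposal does not say
}
DEFAULT_TIER = "haiku"  # assumption: unknown agents get the cheapest tier


def pick_tier(agent_name: str) -> str:
    """Choose a model tier by agent instead of sending everything to the top model."""
    return TIER_BY_AGENT.get(agent_name, DEFAULT_TIER)


class BudgetExceeded(RuntimeError):
    """Raised when a call would take spend past the daily cap."""


@dataclass
class SpendGuard:
    """Tracks estimated spend and refuses calls that would exceed the cap."""
    daily_limit_usd: float
    spent_usd: float = 0.0

    def charge(self, estimated_cost_usd: float) -> None:
        if self.spent_usd + estimated_cost_usd > self.daily_limit_usd:
            raise BudgetExceeded(
                f"cap {self.daily_limit_usd} USD would be exceeded "
                f"(spent {self.spent_usd}, requested {estimated_cost_usd})"
            )
        self.spent_usd += estimated_cost_usd
```

Calling `charge` before each model invocation turns uncapped spend into a hard failure the team can alert on. A rejected call leaves `spent_usd` unchanged, so the guard never counts work that did not run.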

runtime: AgentCore Runtime

A microVM per session plus a sandboxed code interpreter. Consumption pricing, per second, nothing during I/O wait. It autoscales, so the singleton goes away.

governance: Memory · Gateway · Identity · Policy

Managed memory and audit, MCP tools, scoped per-tool credentials, and Cedar guardrails on every tool call, authored outside the code. The lender's controls, off the critical path.
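The idea of guardrails authored outside the code can be sketched in a few lines: rules live in a data document, and every tool call is checked against them before it runs. To be clear about what this is not: it is not Cedar syntax and not the AgentCore Policy API. It only mirrors two Cedar behaviours we are sure of, default deny and forbid overriding permit. The tool names and `is_permitted` are hypothetical.

```python
import json

# Hypothetical rules, kept as data so they can change without a code deploy.
POLICY_DOCUMENT = """
[
  {"effect": "permit", "agent": "email_triage", "tool": "gmail.read"},
  {"effect": "permit", "agent": "principal_engineer", "tool": "github.open_pull_request"},
  {"effect": "permit", "agent": "principal_engineer", "tool": "github.merge_pull_request"},
  {"effect": "forbid", "agent": "*", "tool": "github.merge_pull_request"}
]
"""


def is_permitted(agent: str, tool: str, policies: list[dict]) -> bool:
    """Default deny; any matching forbid wins over any matching permit."""
    matches = [
        rule for rule in policies
        if rule["agent"] in ("*", agent) and rule["tool"] in ("*", tool)
    ]
    if any(rule["effect"] == "forbid" for rule in matches):
        return False
    return any(rule["effect"] == "permit" for rule in matches)


POLICIES = json.loads(POLICY_DOCUMENT)
```

In this example `principal_engineer` holds a permit for merging, yet the blanket forbid still blocks it. That is the property a lender wants: a control that a later, more generous rule cannot silently undo.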

ingestion: Thin event plane

Keep a small listener for Jira, Gmail and the webhook. It only submits work to the runtime, which now does the scaling. Traces flow to CloudWatch.
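A thin listener of this kind can be very small. The sketch below dedupes on an event id and hands everything else to a `submit` callable, which stands in for whatever call starts a runtime session. `Event`, `EventPlane` and `submit` are hypothetical names, and the in-memory `seen` set is an assumption for brevity; in practice that watermark would live in a durable store.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Event:
    source: str     # "jira", "gmail" or "webhook"
    event_id: str   # id assigned by the source system
    payload: str


@dataclass
class EventPlane:
    """Dedupes incoming events and submits the rest; it does no agent work."""
    submit: Callable[[Event], None]  # stand-in for starting a runtime session
    seen: set[tuple[str, str]] = field(default_factory=set)

    def handle(self, event: Event) -> bool:
        """Return True if the event was submitted, False if it was a duplicate."""
        key = (event.source, event.event_id)
        if key in self.seen:
            return False
        self.submit(event)
        # Mark as seen only after a successful submit, so failures are retried.
        self.seen.add(key)
        return True
```

Marking the event as seen after the submit means a failed submission is picked up again on the next poll. The trade-off is at-least-once delivery, so the agent side should tolerate an occasional repeat.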

09 target architecture

One event, into a live agent.

Legend: our code · AWS managed · model · external SaaS

AWS · data account · VPC

ingestion: Jira · Gmail pollers
webhook: /emails/inbound

→

AgentCore Runtime: an isolated session per task
principal_engineer · product_manager · email_triage

→

model: Claude on Bedrock
memory: watermarks · audit
identity · gateway: scoped tools
policy: Cedar guardrails
observability: CloudWatch · OTel

egress · external SaaS · outbound only: Jira Cloud · Gmail · Slack · GitHub

10 the path

Staged, low-risk, no rewrite to start.

1 · Lift · first
Containerise the existing LangGraph agents and run them on AgentCore, unchanged. Point the model at Bedrock. Residency solved, isolation gained, the singleton retired.

2 · Wire · next
Move state to AgentCore Memory, tools behind the Gateway, guardrails into Policy, traces to CloudWatch. Most of the roadmap becomes config.

3 · Evolve · later
Where a framework switch earns its keep, rewrite that agent to Strands or the Claude Agent SDK. Per agent, measured, never big-bang.

The honest counter-argument: LangGraph 1.0 is stable, and its explicit, auditable control flow is something a regulated lender may prefer. AgentCore also deepens the AWS commitment. Neither undermines the recommendation: the first step keeps LangGraph as it is, residency and speed are the brief, and the runtime keeps the framework choice reversible. Switch on evidence, not fashion.

HeroPath

Claude on Bedrock.
Agents on AgentCore.
The rest is yours to build.
