HeroPath  ·  advisory  ·  savvy‑agents review
HeroPath

Agentic
stack review.

An architecture advisory for Savvy Loans.

v1 pilot·7th June 2026

01 the brief

What we reviewed, and against what.

We read the architecture Ramee sent over, set it against the conversations we've had with the Savvy team, and built our recommendation from that and what we've learned.

architecture.pdf The logical engine: three agents on one shared loop, human‑in‑the‑loop by default.
system-diagram.pdf Build → ship → run: GitHub Actions, Terraform, a single ECS Fargate task in the AWS data account.
technical-spec.pdf 15 pages, candid: the runtime, the API, and a self‑reported security & roadmap section.
Principle · resident in AWS
Data and inference stay inside AWS. Non‑negotiable for a lender.
Principle · go fast
Be pragmatic and ship value quickly. It's what Bishara, the founder, asked for.

02 the bottom line

You've built the loop.
You're still building the harness.

It's a thoughtful, honest pilot. It's also roughly 3,000 lines of bespoke harness, plus a roadmap that is almost all undifferentiated infrastructure: sandboxing, audit logs, dedupe, a spend guard, auth, least‑privilege IAM.

Our recommendation: don't hand‑build that roadmap. Use AWS‑managed agent infrastructure for sandboxing, isolation, identity, memory and audit, and route the model through Bedrock so inference stays in AWS.

03 credit where due

What's genuinely strong. Keep these.

  • The swappable seam is real.Tools and prompts have zero dependency on the loop, proven by a live CLI to LangGraph migration.
  • Honesty about risk.The spec lists verified issues, not a sales pitch. Rare, and it earns trust.
  • Human in the loop, to start.The right posture for a lender today, with a clear path to dial it down to the minimum as trust is earned.
  • Structured completion.finish and submit_triage separate what the agent decided from whether it's done.
  • Real tests.The test suite drives the actual graph, not mocks.
  • Clean AWS hygiene.Terraform, OIDC with no static keys, internal‑only ALB, Secrets Manager.

04 risk register

Five things we'd push on.

  • 01

    No evals

    The tests check the wiring, not the judgement. Nothing measures whether an agent makes the right call, so quality can't be tracked or trusted.

    high
  • 02

    Thin observability

    Transcripts are discarded and there's no run audit log. When an agent does the wrong thing, there's little to debug with, and nothing to show a regulator.

    high
  • 03

    Single point of failure

    One task, no autoscaling. Every deploy is downtime and any crash stops all work.

    high
  • 04

    Building harness, not agents

    Concurrency, memory, dedupe, audit, scaling. Real work, but none of it differentiates Savvy. It's the team's time going into plumbing.

    strategic
  • 05

    Unbounded cost

    The top model runs on every poll, with no spend guard and no model tiering. Spend grows with traffic, uncapped.

    cost

05 a worked example

Input you don't control, tools you do.

Untrusted input
Email or Jira content from outside enters the normal data path
Powerful tool
An agent can run a shell, with little to stop it
Broad access
That shell holds wide credentials and tokens

This isn't a flaw in what they've built today. It's the failure mode to design against. As autonomy grows, the trust boundary has to move from the host to the tool. Worth keeping front of mind while building, and a big reason to put the agents inside a managed, isolated runtime.

06 how we got here

The options, and our recommendation.

1Took Ramee's framing: build on LangGraph, or wrap a coding tool.
2Widened it: AgentCore is now GA, so the runtime no longer ties you to a framework.
3Judged each one on the principles: in AWS, fast to ship, auditable.
OptionStays in AWSHarness to ownIsolation & auditSpeed to ship
Status quoLangGraph + direct API, on ECSNoHighDIYBuilt, but stuck
Just move to Bedrockmodel layer onlyYesHighDIYFast, partial
Bedrock + AgentCorelift the agents onto the managed runtimeYesLowManagedFast, durable
Re-platform from scratchnew framework, new infraDependsHighVariesSlow

07 the framework question

LangGraph, or something more native?

what they have LangGraph Now 1.0 and stable, no breaking changes until 2.0. Explicit, auditable control flow and first-class Bedrock support. Strong where steps must be deterministic.
AWS-native Strands Agents AWS's open framework. Lowest ceremony, model-driven, one-command deploy to AgentCore. Less code to maintain, less explicit control.
Anthropic-native Claude Agent SDK The loop behind Claude Code. Runs on Bedrock unchanged, with context compaction, subagents and audit hooks. AWS already publishes it on AgentCore.

You don't have to choose now. AgentCore runs all three unchanged, so the framework is a reversible, per-agent decision. Lift LangGraph as-is first, switch the high-churn agents later, on evidence.

08 what we're proposing

The target stack, layer by layer.

modelClaude on Bedrock
Opus 4.8, Sonnet 4.6, Haiku 4.5 in-region under an AWS BAA. Inference never leaves AWS. Tier the models: Haiku for triage, Opus for engineering.
runtimeAgentCore Runtime
A microVM per session plus a sandboxed code interpreter. Consumption pricing, per second, nothing during I/O wait. It autoscales, so the singleton goes away.
governanceMemory · Gateway · Identity · Policy
Managed memory and audit, MCP tools, scoped per-tool credentials, and Cedar guardrails on every tool call, authored outside the code. The lender's controls, off the critical path.
ingestionThin event plane
Keep a small listener for Jira, Gmail and the webhook. It only submits work to the runtime, which now does the scaling. Traces flow to CloudWatch.

09 target architecture

One event, into a live agent.

our code AWS managed model external SaaS
AWS · data account · VPC
ingestionJira · Gmail pollers
webhook/emails/inbound
AgentCore Runtimean isolated session per task
principal_engineerproduct_manageremail_triage
modelClaude on Bedrock
memorywatermarks · audit
identity · gatewayscoped tools
policyCedar guardrails
observabilityCloudWatch · OTel
egress · external saas · outbound onlyJira Cloud · Gmail · Slack · GitHub

10 the path

Staged, low-risk, no rewrite to start.

  • 1

    Lift

    Containerise the existing LangGraph agents and run them on AgentCore, unchanged. Point the model at Bedrock. Residency solved, isolation gained, the singleton retired.

    first
  • 2

    Wire

    Move state to AgentCore Memory, tools behind the Gateway, guardrails into Policy, traces to CloudWatch. Most of the roadmap becomes config.

    next
  • 3

    Evolve

    Where a framework switch earns its keep, rewrite that agent to Strands or the Claude Agent SDK. Per agent, measured, never big-bang.

    later

The honest counter-argument: LangGraph 1.0 is stable, and its explicit, auditable control flow is something a regulated lender may prefer. AgentCore also deepens the AWS commitment. Both are fine here: residency and speed are the brief, and the runtime keeps the framework reversible. Switch on evidence, not fashion.

HeroPath

Claude on Bedrock.
Agents on AgentCore.
The rest is yours to build.

~/savvy-loans/agentic-stack/2026

navigate  ·  space next  ·  f fullscreen