~/jerry-ngo $ whoami

Senior DevOps / Site Reliability Engineer

I build reliable cloud platforms on AWS and help teams ship safely at scale.

~/about $ cat README.md

About Me

DevOps/SRE with 3 years of experience maintaining system architecture and meeting a 99.99% SLO. Focused on system design, reliability, and continuous improvement.

For me, system design is only the start — real value comes from tracking production behavior, learning from incidents, and continuously improving availability, performance, and cost.

~/expertise $ cat capabilities.md

What I Do

Reliability & SLO-driven operations

SLO/SLI definition, on-call support, incident response, postmortems, DR automation

Platform Engineering on AWS

ECS, EC2, RDS, Lambda, DynamoDB, API Gateway, S3, CloudFront, multi-tenant Landing Zone with cross-region automation

Observability at scale (Datadog)

Platform patterns for effortless monitoring, standardized agent rollout, streaming optimization

Kafka / MSK

Design and operate self-managed Kafka and MSK for high availability; build day-2 operations tooling

CI/CD Pipeline Migration

Migrate pipelines from GitLab to GitHub Actions with AI-driven standardization, creating repeatable workflows and automation

~/impact $ grep -r "achievements"

Featured Impact

99.99%: HA SLO for Kafka/MSK with 100% data reliability
100%: services monitored via the Datadog platform
70%+: reduction in Datadog log streaming costs
95%: services following automated deployment standards
AI-driven: standardization applied to the GitLab → GitHub migration, creating repeatable workflows and reducing repo-by-repo migration effort
~/toolbox $ ls -la

Tech Stack

AWS
Kubernetes
Datadog
Terraform
Docker
Kafka
~/contact $ echo "Let's connect"

Get In Touch

Open to Site Reliability Engineer / DevOps Engineer roles. If you're building systems that need to be reliable, observable, and continuously improved — let's connect.