September 2025 - Present
DevOps and Cloud Infrastructure Consultant for an AI-based Fraud Detection and AML Monitoring Startup
- Collaborating on performance optimization for workflows running on Netflix Conductor deployed on GKE.
- Automated software delivery for multi-tenant architectures using Terraform, reducing new environment setup time from 1 week to 1 day.
- Fixed CI/CD pipelines, reducing rollback time from 3 minutes to 30 seconds for GKE and GCE workloads.
- Achieved monthly savings of $4,000 by tuning Kubernetes resource requests and limits to prevent node overprovisioning.
- Implemented Zero Trust architecture with RBAC using Cloudflare Zero Trust.
- Migrated publicly exposed back-office portals to private access through Cloudflare Zero Trust.
- Led the organization through a successful SOC 2 Type II compliance process.
- Currently working with enterprise customers to deploy our SaaS product into their own AWS/on-prem environments, including network architecture design, secure IAM setup, and delivering Helm/Terraform-based installation and upgrade flows.
April 2025 - September 2025
Site Reliability Engineer for a Mid-sized Fintech (Payments Platform similar to Paytm)
- Owned end-to-end reliability for 45+ services across 6 teams and 5 AWS environments; led migration from monolith to microservices for core payment functions.
- Provided production support for services built in Ruby on Rails, Spring Boot, and Go.
- Contributed to the design and rollout of an SLI/SLO framework for critical services and ETL pipelines.
- Implemented canary deployments for 27 services, improving release safety and reducing incident risk.
- Collaborated on decoupling Terraform IaC from CI/CD, supporting migration from AWS CodePipeline and reducing technical debt.
- Built Python CLI tools and GitHub Actions for configuration drift detection, cutting pre-deployment checks from 60 minutes to 10 seconds and enabling self-service for deployment owners.
- Reduced alert volume from 100 alerts per day to 4, easing alert fatigue.
- Created incident runbooks and Datadog synthetic monitors for OTP and SMS reliability.
- Automated database lock resolution in Aurora PostgreSQL using cron jobs.
- Developed custom Terraform PR automation with GitHub Actions.

