Building Reliable Infrastructure at Scale

Senior DevOps & Site Reliability Engineer specializing in AWS cloud environments, observability tooling, and infrastructure automation.

8+ Years Experience
AWS Cloud Expert
24/7 On-Call Ready
$ docker ps --format "table {{.Names}}\t{{.Status}}"
NAMES         STATUS
api-gateway   Up 3 days
redis-cache   Up 3 days
prometheus    Up 3 days
$ terraform plan
No changes. Infrastructure is up-to-date.
$ _

Skills & Technologies

Comprehensive expertise across the full DevOps and SRE stack, from infrastructure provisioning to observability.

Cloud & Infrastructure

Designing and managing large-scale cloud environments with a focus on reliability and cost optimization.

AWS, EKS, ECS, Lambda, S3, CloudFront, RDS, Route 53, VPC, Azure, GCP

Observability & Monitoring

Building comprehensive monitoring solutions for proactive incident detection and performance optimization.

Splunk, Prometheus, Grafana, Nagios, Enhanced Logging, Alerting

Infrastructure as Code

Automating infrastructure provisioning and configuration with modern IaC tools and best practices.

Terraform, Ansible, CloudFormation, Python, Bash Scripting

CI/CD & Containerization

Implementing robust deployment pipelines and container orchestration for seamless delivery.

Jenkins, GitLab CI/CD, GitHub Actions, Kubernetes, Docker

SRE Practices

Applying SRE principles to improve system reliability, reduce toil, and manage incidents effectively.

Incident Management, Root Cause Analysis, Post-Mortems, ServiceNow, Jira

Security & DevSecOps

Integrating security best practices throughout the development lifecycle and infrastructure.

Security Observability, DevSecOps, Compliance, Secure Deployments

What I Do

Helping teams build and maintain reliable, scalable infrastructure.

Cloud Architecture

Design and implement scalable AWS infrastructure with a focus on high availability, cost optimization, and security best practices.

Observability Setup

Build comprehensive monitoring solutions with Prometheus, Grafana, and Splunk. Create dashboards that surface actionable insights.

Infrastructure as Code

Automate infrastructure provisioning with Terraform and Ansible. Version-controlled, repeatable, and auditable deployments.

CI/CD Pipelines

Design and implement deployment pipelines with Jenkins, GitLab CI/CD, or GitHub Actions. Faster releases with confidence.

Incident Response

Establish on-call processes, runbooks, and post-incident review workflows. Reduce MTTR and prevent recurring issues.
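
As a rough illustration of how MTTR can be tracked, here is a minimal Python sketch. The incident timestamps and the `mean_time_to_recovery` helper are hypothetical, and measuring from detection to resolution is only one common convention.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average of (resolved - detected) over a list of incident pairs."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum(
        (resolved - detected for detected, resolved in incidents),
        timedelta(),  # start value so timedeltas can be summed
    )
    return total / len(incidents)


# Hypothetical incident log: (detected, resolved) timestamps.
incidents = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 45)),
    (datetime(2024, 2, 11, 22, 30), datetime(2024, 2, 11, 23, 45)),
]

print(mean_time_to_recovery(incidents))
```

Comparing this figure across quarters is one simple way to check whether runbooks and post-incident reviews are shortening recovery.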

Platform Reliability

Improve system uptime through SRE practices, capacity planning, chaos engineering, and proactive performance tuning.

About Me

With over 8 years dedicated to designing and optimizing large-scale AWS cloud environments, I've built a proven track record of implementing and enhancing SRE frameworks that improve system reliability, scalability, and performance. My expertise spans the full observability stack, with particular depth in Splunk, and I drive automation through Python and scripting to eliminate toil and improve efficiency.

I thrive in high-pressure environments, participating in 24/7 on-call rotations and leading incident response with thorough root cause analysis. Working across diverse industries has given me a deep understanding of what it takes to keep mission-critical systems running at scale.

When I'm not ensuring systems are running smoothly, you'll find me watching aviation content (there's something fascinating about airport operations and aircraft), exploring new places, or connecting with people from around the world. I believe the best engineers are also lifelong learners, and I'm always eager to pick up new technologies and perspectives.

Location: Toronto, Canada
Focus: Platform Reliability
Specialization: AWS Cloud & Observability
Interests: Aviation, Travel, Technology

Get In Touch

Looking for a reliable SRE to strengthen your platform? Let's talk.