Welcome to my Website
Principal Data Engineer

Rajeev Jasti

I architect low-latency distributed systems and massive-scale ETL pipelines, building the robust infrastructure that powers real-time analytics for global enterprises.

10+ Years Experience
50+ Projects Delivered
1M+ Events/Second
30% Cost Reduction
40% Latency Improvement
99.9% Uptime

Technical Skills

Data Processing

Apache Spark, Kafka, Apache Airflow, Apache Flink, dbt

Cloud & Infrastructure

AWS, Kubernetes, Terraform, Docker, GCP

Data Warehouse

Snowflake, Amazon Redshift, BigQuery, Delta Lake

Languages

Python, SQL, Scala, Bash

Databases

PostgreSQL, MongoDB, Redis, Cassandra

Observability

Prometheus, Grafana, Datadog, OpenTelemetry

Experience

2022 — Present Remote

Principal Data Engineer @ Tech Corp

Architected a multi-region data mesh serving 50+ internal teams. Reduced cloud infrastructure costs by 30%.

  • Designed and deployed Kafka-to-Snowflake CDC pipelines handling 1M+ events/sec
  • Led migration from monolithic ETL to modular, domain-owned data products
  • Reduced P95 query latency by 40% through Spark optimization and partition tuning
  • Mentored a team of 6 engineers across 3 time zones
Kafka, Spark, Snowflake, Kubernetes, Terraform

2019 — 2022 Hybrid

Senior Data Engineer @ DataFlow Inc.

Built and owned the company's core data platform, enabling self-serve analytics for 20+ business teams.

  • Delivered Airflow-orchestrated ETL pipelines processing 500GB+ nightly
  • Migrated a legacy Oracle data warehouse to Snowflake, cutting storage costs by 40%
  • Implemented a data quality framework that reduced production incidents by 60%
Airflow, Snowflake, Python, dbt, AWS

2016 — 2019 On-site

Data Engineer @ Analytics Co.

Developed ETL pipelines and data models supporting product analytics for 5M+ daily active users.

  • Built real-time dashboards using Spark Streaming and Kafka
  • Designed dimensional data models in PostgreSQL and Redshift
  • Automated pipeline monitoring, reducing MTTR from hours to minutes
Spark, Kafka, PostgreSQL, Redshift, Python

Projects

Real-time Analytics Platform

Kafka and Spark Streaming pipeline processing 1M+ events/sec with fault-tolerant event ingestion.

Cloud ETL Modernization

Airflow-orchestrated Snowflake pipelines that reduced nightly batch runtime by 45%.

CDC Data Mesh Enablement

Built Kafka-to-Snowflake CDC templates for domain teams to self-serve compliant datasets.

Let's talk.

Have a project in mind or want to connect? Send me a message.