Senior Data Engineer | Cloud & Big Data Specialist | AWS | Azure | Snowflake | Hadoop | Spark
Senior Data Engineer with 9 years of experience in big data, cloud computing, and ETL pipeline development. Strong expertise in the Hadoop ecosystem (HDFS, MapReduce, Spark, Hive, Kafka) and cloud platforms (AWS, Azure, Snowflake). Proficient in real-time data streaming, data warehousing, and performance optimization for large-scale datasets.
Skilled in building and automating batch and real-time data pipelines using Spark, Python, and SQL. Experienced with AWS (Glue, Lambda, Redshift, S3) and Azure (Data Factory, Synapse Analytics, Data Lake, Databricks) for data migration, ETL workflows, and analytics. Adept at integrating NoSQL databases (Cassandra, MongoDB, HBase) with Hadoop clusters to process structured and unstructured data.
Hands-on expertise in DevOps tools such as Kubernetes, Jenkins, Terraform, and Docker to support CI/CD workflows. Passionate about designing scalable, high-performance data architectures and driving cloud transformations. Strong collaborator in Agile environments, with experience in stakeholder engagement and cross-functional teamwork.
Committed to leveraging cutting-edge data engineering technologies to deliver actionable insights and enhance business decision-making. Always eager to explore emerging trends in AI-driven analytics and cloud data platforms.
Spencer Health Solutions - (2023 - Present)
Developed complex Azure Databricks and Azure Data Factory (ADF) data pipelines.
Implemented real-time streaming using Kafka and stored data in HDFS.
Migrated data from on-premises systems to Snowflake.
Built Spark-based ETL processes to transform structured and semi-structured data.
Automated workflows using Apache Airflow.
Developed Power BI dashboards for data insights.
Homesite Insurance - (2022)
Designed Azure Data Lake solutions with Azure Data Factory for ETL workflows.
Built streaming pipelines with Kafka and Azure Stream Analytics.
Created Power BI and SQL-based dashboards.
Developed real-time data ingestion frameworks for processing large datasets.
Migrated databases from SQL Server to Azure SQL.
LendingTree - (2019 - 2021)
Developed ETL pipelines using PySpark, Hive, and Presto.
Worked with AWS Glue and Lambda for serverless data processing.
Built real-time data processing pipelines using Spark Streaming and Kafka.
Created data visualizations with Tableau and automated CI/CD with GitHub and Maven.
IBM - (2016 - 2018)
Built scalable Hadoop clusters with Hortonworks Data Platform.
Implemented data ingestion using Kafka and Spark Streaming.
Developed Hive-based data warehousing solutions.