
Harsh Yadav

Available

Last update: 22.09.2024

Data Engineer | AWS, Snowflake, PySpark, SQL, Python, ETL, Stored Procedures

Graduation: B.Tech in Computer Science
Languages: English (Full Professional)

Keywords

Amazon Web Services, SQL Databases, Snowflake, Automation, Role-Based Access Control, Data Validation, Data Migration, Workflows, Apache Spark, PySpark, and 22 more

Attachments

Harsh-Latest-5_220924.pdf

Skills

Cloud Data Engineering: Extensive experience with AWS services including Glue, Lambda, DynamoDB, S3, API Gateway, and Step Functions; I automate workloads and orchestrate complex data workflows to streamline business operations.

Data Pipeline Optimization: Proven ability to optimize SQL on very large datasets (up to 4.5 billion rows) and to move processes from monthly to weekly runs, reducing processing time by up to 50%.

Data Migration and Integration: Expert in migrating complex systems to the cloud using AWS Database Migration Service, Snowpipe, and Snowflake; I have transitioned hundreds of tables and critical business modules while ensuring data validation and integrity.

Automation and CI/CD: Skilled in automating data workflows with Bitbucket, Jenkins, and Apache Airflow, including dynamic DAG creation for new business-logic deployments (see the sketch after this list).

Big Data and Spark: Developed robust Spark applications and wrote complex business logic in PySpark to handle large-scale data processing efficiently.

Role-Based Access Control (RBAC): Created custom RBAC models in Snowflake for secure, compliant data access management tailored to client needs.

Advanced SQL and Data Validation: Expertise in writing and validating SQL for data transformation, including CDC processes, and in ensuring data consistency across platforms such as Oracle, Snowflake, and AWS.
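As an illustration of the dynamic DAG creation mentioned above, here is a minimal sketch of the common pattern of generating Airflow DAGs from configuration. It assumes Airflow 2.4+ (for the schedule argument); the MODULE_CONFIGS mapping and the run_module callable are hypothetical placeholders, not code from any project described here.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    # Hypothetical per-module configs; in practice these might come
    # from a YAML file or a metadata table.
    MODULE_CONFIGS = {
        "loans_weekly": {"schedule": "@weekly"},
        "deposits_daily": {"schedule": "@daily"},
    }


    def run_module(module_name: str) -> None:
        """Placeholder for the module's business logic (e.g. a PySpark job)."""
        print(f"Running business logic for {module_name}")


    # One DAG is generated per config entry; registering each DAG in
    # globals() makes it discoverable by the Airflow scheduler.
    for name, cfg in MODULE_CONFIGS.items():
        with DAG(
            dag_id=f"{name}_pipeline",
            start_date=datetime(2024, 1, 1),
            schedule=cfg["schedule"],
            catchup=False,
        ) as dag:
            PythonOperator(
                task_id=f"run_{name}",
                python_callable=run_module,
                op_kwargs={"module_name": name},
            )
        globals()[f"{name}_pipeline"] = dag

The appeal of this pattern is that deploying new business logic becomes a config change rather than a new DAG file, which is what makes it useful in CI/CD-driven deployments.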

Project history

02/2022 - Present
Data Engineer
Lumiq (Banks and financial services, 250-500 employees)

I am a Data Engineer with expertise in converting SAS business logic to SQL optimized for Apache Hive and Impala. I developed Spark applications, automated CI/CD pipelines using Bitbucket and Jenkins, and cut loan processing time by 50% for a major financial institution. I am skilled in AWS services such as Database Migration Service, Glue, Lambda, DynamoDB, API Gateway, and Step Functions for data loading and automation. I have built complex PySpark logic, established RBAC in Snowflake, migrated 100+ tables via Snowpipe, and transitioned modules to dbt on Snowflake with thorough validation. I optimized SQL handling 4.5 billion rows, automated Airflow DAG creation, and validated data with Snowpark. I also recommended AWS SageMaker for customer insights. My technical skills include Python, SQL, PySpark, dbt, Airflow, AWS, and Snowflake, alongside strong teamwork, time management, and communication skills.
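To give a concrete flavor of the table-level validation described above, below is a minimal PySpark sketch that compares a source and a target table by row count plus an order-independent content hash. The Parquet paths and the customers table are illustrative assumptions; the actual project validated via Snowpark and Oracle/Snowflake connectors rather than the file reads shown here.

    from pyspark.sql import DataFrame, SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("migration-validation").getOrCreate()


    def table_fingerprint(df: DataFrame) -> tuple[int, int]:
        """Return (row count, order-independent content hash) for a table."""
        row_count = df.count()
        # Hash each row, then aggregate with sum() so row order is irrelevant.
        hashed = df.select(F.hash(*df.columns).cast("long").alias("h"))
        content_sum = hashed.agg(F.sum("h")).collect()[0][0] or 0
        return row_count, content_sum


    # Hypothetical source/target reads standing in for the real connectors.
    source_df = spark.read.parquet("s3://bucket/exports/customers/")
    target_df = spark.read.parquet("s3://bucket/snowflake-unload/customers/")

    if table_fingerprint(source_df) == table_fingerprint(target_df):
        print("customers: source and target match")
    else:
        print("customers: mismatch, investigate further")

A check like this catches dropped, duplicated, or altered rows in one pass, which is why count-plus-hash comparisons are a common first gate after a bulk migration.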

Certifications

AWS Certified Developer - Associate
2023

Local Availability

Based in Noida; only available for remote work.