Overview
Responsibilities:
- Assist in developing scalable data pipelines using GCP tools such as Dataflow, BigQuery, Cloud Composer, and Pub/Sub
- Write and maintain SQL and Python scripts for data ingestion, cleaning, and transformation
- Support the creation and maintenance of Airflow DAGs in Cloud Composer for orchestration (a minimal sketch follows this list)
- Collaborate with senior data engineers and data scientists to implement data validation and monitoring checks
- Participate in code reviews, sprint planning, and cross-functional team meetings
- Help with documentation and knowledge base creation for data workflows and pipeline logic
- Gain exposure to medallion architecture, data lake design, and performance tuning on BigQuery
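To give a flavor of the orchestration work described above, here is a minimal sketch of the kind of Airflow DAG typically deployed to Cloud Composer. The DAG name, schedule, and task callables are hypothetical placeholders for illustration, not part of any existing pipeline.

```python
"""Minimal Airflow DAG sketch for Cloud Composer (hypothetical names and schedule)."""
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_raw_data(**context):
    # Placeholder: pull the day's records from a source system or Pub/Sub subscription.
    ...


def transform_and_load(**context):
    # Placeholder: clean the extracted records and load them into BigQuery.
    ...


with DAG(
    dag_id="daily_ingest_pipeline",      # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_raw_data", python_callable=extract_raw_data)
    load = PythonOperator(task_id="transform_and_load", python_callable=transform_and_load)

    # Simple linear dependency: extraction must finish before the load step runs.
    extract >> load
```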
Requirements:
- 2-4 years of relevant experience in data engineering, backend development, or analytics engineering
- Strong knowledge of SQL and working-level proficiency in Python
- Exposure to cloud platforms (GCP preferred; AWS/Azure acceptable)
- Familiarity with data pipeline concepts, version control (Git), and basic workflow orchestration
- Strong communication and documentation skills
- Eagerness to learn, take feedback, and grow under mentorship
Bonus Skills:
- Hands-on experience with GCP tools like BigQuery, Dataflow, or Cloud Composer
- Experience with dbt, Dataform, or Apache Beam
- Exposure to CI/CD pipelines, Terraform, or containerization (Docker)
- Knowledge of basic data modeling and schema design concepts