Job Details

Data Engineer II

Job Description

We are looking for a skilled Data Engineer II to join our Data Platform team. You will
play a key role in building and optimizing our next-generation data infrastructure.
Operating at the scale of Flipkart (Petabytes of data), you will design, develop, and
maintain high-throughput distributed systems, bridging traditional big data
engineering with modern cloud-native and AI-driven workflows.
Key Responsibilities
Data Pipeline Development & Optimization
● Build Scalable Pipelines: Design, develop, and maintain robust ETL/ELT
pipelines using Scala and Apache Spark/Flink (Core, SQL, Streaming) to process
massive datasets with low latency.
● Performance Tuning: Optimize Spark jobs and SQL queries for efficiency,
resource utilization, and speed.
● Lakehouse Implementation: Implement and manage data tables using modern
Lakehouse formats like Apache Iceberg, Hudi, or Delta Lake, ensuring efficient
storage and retrieval.

Data Management & Quality
● Data Modeling: Apply Medallion Architecture principles (Bronze/Silver/Gold) to
structure data effectively for downstream analytics and ML use cases.
● Data Quality: Implement data validation checks and automated testing using
frameworks (e.g., Deequ, Great Expectations) to ensure data accuracy and
reliability.
● Observability: Integrate pipelines with observability tools to monitor data health,
freshness, and lineage.
Cloud-Native Engineering
● Cloud Infrastructure: Deploy and manage workloads on GCP DataProc and
Kubernetes (K8s), leveraging containerization for scalable processing.
● Infrastructure as Code: Contribute to infrastructure automation and deployment
scripts.

Collaboration & Innovation
● GenAI Integration: Explore and implement GenAI and agentic workflows to
automate data discovery and optimize engineering processes.
● Agile Delivery: Work closely with architects and product teams in an
Agile/Scrum environment to deliver features iteratively.
● Code Reviews: Participate in code reviews to maintain code quality, standards,
and best practices.

Required Qualifications
● Experience: 3–5 years of hands-on experience in Data Engineering.
● Primary Tech Stack:
○ Strong proficiency in Scala and Apache Spark (Batch & Streaming).
○ Solid understanding of SQL and distributed computing concepts.
○ Experience with GCP (DataProc, GCS, BigQuery) or equivalent cloud
platforms (AWS/Azure).
○ Hands-on experience with Kubernetes and Docker.
● Architecture & Storage:
○ Experience with Lakehouse table formats (Iceberg, Hudi, or Delta).
○ Understanding of data warehousing and modeling concepts (Star schema,
Snowflake schema).
● Soft Skills:
○ Strong problem-solving skills and ability to work independently.
○ Good communication skills to collaborate with cross-functional teams.

Education Qualification
● Bachelor’s or Master’s degree in Computer Science, Information Technology,
Engineering, or a related quantitative field.

Preferred Qualifications
● Machine Learning Background: Familiarity with ML concepts, feature
engineering, or experience building data pipelines for ML models is highly
preferred.
● Experience with workflow orchestration tools (Airflow, Azkaban, etc.).
● Familiarity with real-time analytics databases (Druid, ClickHouse, HBase).
● Experience with CI/CD pipelines for data applications.
Requirement Details:

Notice Period: Immediate to 20 Days
Work Location: Client Location (Bangalore)
Experience: 3–6 Years
Role Type: Full-Time
Work Mode: Work from Office

Why Join Us?
● Work on petabyte-scale challenges that define the industry standard.
● Collaborate with top-tier engineers in a high-growth environment.
● Opportunity to work with cutting-edge technologies like Iceberg, K8s, and GenAI.

Thank you for your interest in this role. Please also send your CV to Vedika@lsarecruit.co.uk; if your profile is suitable, we will get in touch with you to discuss further.
