- Develop and maintain data pipelines using Java, PySpark, and SQL
- Design and optimize data architectures and data warehousing solutions
- Work hands-on with Big Data technologies: Spark, Hive, and Hadoop
- Use GCP managed services, especially BigQuery and Dataflow
- Use Git for version control and support CI/CD pipelines
- Apply strong problem-solving skills with a focus on scalable, efficient data processing