Job Description
Seamless.AI is seeking a Freelance Principal Data Engineer to design, develop, and maintain scalable ETL pipelines. The ideal candidate will have expertise in Python, Spark, AWS Glue, and other ETL technologies. They should have a proven track record in data acquisition and transformation, as well as experience working with large data sets and applying data matching and aggregation methodologies.
Responsibilities:
- Design, develop, and maintain scalable ETL pipelines.
- Work with stakeholders to understand data requirements.
- Implement data transformation logic using Python.
- Utilize AWS Glue to create and manage ETL jobs.
- Optimize ETL processes for performance and scalability.
- Apply data matching, deduplication, and aggregation techniques (see the sketch after this list).
- Ensure compliance with data governance, security, and privacy best practices.
- Provide recommendations on emerging technologies.
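To give candidates a concrete sense of the deduplication and aggregation work described above, here is a minimal PySpark sketch of such an ETL step. It is illustrative only: the S3 paths and column names (`company_id`, `domain`, `contact_id`, `updated_at`) are hypothetical assumptions, not actual Seamless.AI schemas, and a production AWS Glue job would additionally wire in a GlueContext, job parameters, and bookmarking.

```python
# Minimal PySpark sketch: deduplicate records and aggregate counts.
# Paths and column names are illustrative assumptions, not real schemas.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Load raw records (format and location are assumptions).
raw = spark.read.parquet("s3://example-bucket/raw/companies/")

# Deduplicate: keep the most recently updated row per (company_id, domain).
w = Window.partitionBy("company_id", "domain").orderBy(F.col("updated_at").desc())
deduped = (
    raw.withColumn("rn", F.row_number().over(w))
       .filter(F.col("rn") == 1)
       .drop("rn")
)

# Aggregate: count distinct contacts per company.
summary = deduped.groupBy("company_id").agg(
    F.countDistinct("contact_id").alias("contact_count")
)

# Write the curated result back out (destination is an assumption).
summary.write.mode("overwrite").parquet("s3://example-bucket/curated/company_summary/")
```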
Requirements:
- Strong proficiency in Python.
- Hands-on experience with AWS Glue or similar ETL tools.
- Solid understanding of data modeling and data warehousing principles.
- Expertise in working with large data sets and distributed computing frameworks.
- Strong proficiency in SQL.
- Familiarity with data matching, deduplication, and aggregation methodologies.
- Excellent communication and collaboration skills.
- Fluency in English and Spanish.
- Bachelor's degree in Computer Science or related field.
- 7+ years of experience as a Data Engineer.
- Professional experience with Spark and AWS pipeline development.
Seamless.AI offers:
- A 6-month or 12-month contract with the possibility of renewal.
- Compensation based on project milestones, deliverables, or a fixed contractual rate.