We are seeking a Senior Python Data Engineer to design, develop, and support scalable cloud-native data solutions. The ideal candidate will have strong experience with Python, PySpark, SQL, and AWS, and a track record of building high-performance data pipelines and distributed processing systems in enterprise environments.
Key Responsibilities
- Design and develop scalable data pipelines and processing solutions.
- Build batch and real-time data workflows using Python and PySpark.
- Develop AWS-based applications and services using serverless and containerized architectures.
- Build APIs and backend services to support data integrations.
- Optimize performance, reliability, and scalability of applications and data platforms.
- Collaborate with cross-functional teams to deliver business solutions.
- Support testing, deployment, monitoring, and production operations.
Required Qualifications
- 5–10 years of software development or data engineering experience.
- Strong hands-on Python development experience.
- Experience with PySpark and distributed data processing.
- Advanced SQL skills and experience working with large datasets.
- Hands-on experience with AWS services, including:
  - S3, Redshift, RDS
  - Lambda, EC2, ECS, EMR, AWS Batch
- Experience building scalable data pipelines and ETL/ELT solutions.
Preferred Qualifications
- Experience with NumPy, Pandas, or Polars.
- Experience building REST APIs and microservices.
- Knowledge of cloud-native and event-driven architectures.
- Experience with Agile and CI/CD practices.
- Experience working with enterprise-scale data platforms.
Technical Environment
Python | PySpark | SQL | AWS (S3, Redshift, RDS, Lambda, EC2, ECS, EMR, AWS Batch)