About the Role
Responsibilities
Data Pipeline Development: Design, develop, and maintain scalable, optimized ETL pipelines using PySpark on the Cloudera Data Platform (CDP), ensuring data integrity and accuracy.
Data Ingestion: Implement and manage data ingestion processes that load data from a variety of sources (e.g., relational databases, APIs, file systems) into the data lake or data warehouse on CDP.
Data Transformation and Processing: Use PySpark to process, cleanse, and transform large datasets into meaningful formats that support analytical needs and business requirements.
Performance Optimization: Conduct performance