Cloudera Data Engineer - Remote

Jobgether
United States
On-site
Full-time
Posted 18 days ago

Job Description

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a remote Cloudera Data Engineer in the United States.

We are seeking a skilled Cloudera Data Engineer to lead the migration and ongoing operation of a Medicaid Data Warehouse within an AWS environment. In this role, you will ensure the seamless transfer of Cloudera/Hive/Scala-based data pipelines between AWS accounts while maintaining operational reliability and data integrity. You will collaborate closely with the infrastructure and project teams to optimize cluster performance, validate data, and maintain scheduling and job dependencies. This is a hands-on role that offers the chance to work on complex data engineering tasks, enhance system efficiency, and support enterprise-scale data operations in a dynamic, collaborative environment.

Accountabilities:

·         Replicate, configure, and optimize Cloudera clusters (HDFS, YARN, Hive, Spark) in new AWS environments.

·         Reconfigure cluster connectivity, job dependencies, and metadata stores for seamless migration.

·         Deploy, test, and operate Hive and Spark (Scala) jobs post-migration.

·         Monitor job performance, troubleshoot failures, and implement recovery/alerting mechanisms.

·         Manage user roles and access, and maintain cluster security within the Cloudera environment.

·         Implement routine data housekeeping, archiving, and operational maintenance processes.

·         Document configurations and migration steps, and maintain detailed operational runbooks.

Requirements

·         Bachelor’s degree in Computer Science, Information Systems, or a related field.

·         7+ years of experience in data engineering or big data development.

·         4+ years of experience with the Cloudera platform (HDFS, YARN, Hive, Spark, Oozie).

·         Hands-on experience deploying and managing Cloudera workloads on AWS (EC2, S3, IAM, CloudWatch).

·         Strong programming skills in Scala, Java, and HiveQL; Python or Bash scripting preferred.

·         Proficiency in Apache Spark for data processing and transformation.

·         Experience implementing business-rules processing using Drools.

·         Ability to collaborate with infrastructure, DevOps, and data governance teams.

·         Preferred: Cloudera certification (CDP Data Engineer or Administrator), experience with Cloudera upgrades or AWS-to-AWS migrations, and public-sector or large enterprise data environments.
