AI Infrastructure Engineer (Remote - US)

Jobgether
United States
On-site
Full-time
Posted 19 days ago

Job Description

This position is posted by Jobgether on behalf of a partner company. We are currently looking for an AI Infrastructure Engineer in the United States.

In this role, you will be instrumental in designing and scaling the infrastructure that powers advanced machine learning initiatives. You’ll collaborate closely with cross-functional teams to build reliable, high-performance systems that support large-scale AI training and deployment. This position offers the opportunity to shape the foundations of an innovative AI ecosystem, enabling data scientists and ML engineers to move faster, automate processes, and ensure seamless integration from research to production. It’s an ideal environment for someone who thrives on solving complex infrastructure challenges in a fast-paced, distributed, and mission-driven organization.

Accountabilities

  • Design, implement, and maintain scalable, high-performance ML training and inference platforms.
  • Develop and enhance MLOps tools to enable rapid model deployment and iteration.
  • Optimize system performance to ensure reliability, cost-efficiency, and scalability of AI workloads.
  • Collaborate with research and product teams to transition experimental models into production environments.
  • Establish and enforce best practices for model versioning, testing, monitoring, and governance.
  • Automate data and model pipelines to accelerate experimentation and reduce manual intervention.

Requirements

  • 3+ years of experience in software engineering or ML infrastructure roles, focusing on distributed systems or MLOps.
  • Proficiency in Python (or similar), Docker, Kubernetes, and CI/CD pipelines.
  • Strong experience with major cloud platforms (AWS, GCP, or Azure) and infrastructure-as-code tools (Terraform, CloudFormation).
  • Familiarity with ML frameworks such as TensorFlow or PyTorch, and with orchestration tools such as Kubeflow, Airflow, or MLflow.
  • Deep understanding of model deployment, real-time inference, and scalable data pipelines.
  • Strong collaboration and communication skills across technical and non-technical teams.
  • Bonus: Experience with GPU optimization, vector databases, feature stores, or real-time audio/streaming systems.
