MLOps Engineer

PEOPLE

Salary: $5000 - $7000 gross
Type: Full time

Tags: DevOps Virtualization Docker Continuous Deployment

Role Summary
We are looking for an experienced MLOps Engineer with a strong background in implementing robust infrastructure to support the full lifecycle of machine learning models. This role is key to ensuring that models developed by data science teams can be trained, deployed, monitored, and maintained in production in a secure, scalable, and automated manner.

Send CV through getonbrd.com.

Responsibilities

  1. Design and implement training, validation, and deployment pipelines for machine learning models (CI/CD for models).
  2. Manage and optimize infrastructure for distributed training, parallel processing, and efficient storage.
  3. Automate model versioning, dataset management, environment configuration, and dependency management.
  4. Collaborate with Data Scientists to operationalize experimental notebooks and turn them into production-ready services.
  5. Integrate models with APIs and backend architectures, ensuring performance and security.
  6. Establish processes for monitoring data drift and model performance in production.
  7. Define standards for reproducibility, validation, and automated testing throughout the ML lifecycle.

Profile Requirements

  1. At least 4 years of experience as an SRE, DevOps, or Platform Engineer in ML projects.
  2. Knowledge of model monitoring frameworks such as Evidently, Arize AI, WhyLabs, or similar.
  3. Proficiency with tools like Prometheus, Grafana, ELK/EFK, OpenTelemetry, or Datadog.
  4. Experience with orchestration tools like Airflow or Kubeflow, or with experiment tracking platforms (MLflow, Weights & Biases).
  5. Strong skills in Kubernetes, Docker, Helm, and infrastructure-as-code tools (Terraform, Pulumi).
  6. Experience with CI/CD for ML pipelines (testing, validation, rollback).

Nice-to-Haves (not mandatory)

  1. Experience running ML models on Alibaba Cloud and configuring observability within that environment.
  2. Familiarity with canary deployment, shadow testing, and controlled experimentation strategies.
  3. Knowledge of explainable AI frameworks and model auditing practices.
  4. Previous experience in high-transaction environments such as banking, payroll, accounting, or logistics.

Conditions

Work modality: Remote
Project duration: 1 year
Salary: To be negotiated

Computer provided: PEOPLE provides a computer for your work.

Source: GetOnBoard | Main Category: Other