Calibrate Ventures

Lead AI/ML Infrastructure Engineer

aiXplain

Software Engineering, Other Engineering, Data Science
Dubai - United Arab Emirates
Posted on Thursday, January 11, 2024

Come join a team of industry and science leaders to achieve a vision of empowering innovation through state-of-the-art artificial intelligence and machine learning. We are addressing exciting challenges for our customers at the intersection of AI/ML and cutting-edge cloud infrastructure, with ML being both a core enabler for, and a major feature of, our platform.

We are looking for candidates adept at researching and implementing AI/ML engineering and infrastructure engineering capabilities.

Key responsibilities

  • AI/ML infrastructure management: Architect, deploy, and maintain scalable AI/ML infrastructure leveraging Kubernetes and KFserve for model hosting and management.
  • Model deployment and optimization: Implement efficient deployment pipelines for AI/ML models, focusing on optimization, scalability, and reliability.
  • Performance monitoring and tuning: Monitor model performance metrics, identify bottlenecks, and implement improvements to enhance efficiency and accuracy.
  • Team leadership and collaboration: Lead a small team of engineers, fostering a collaborative environment and ensuring effective communication and knowledge sharing.
  • Cross-functional collaboration: Work closely with data scientists, software engineers, and other stakeholders to understand requirements, translate them into scalable solutions, and ensure successful deployment.
  • Continuous integration / Continuous deployment (CI/CD): Implement and maintain CI/CD pipelines for AI/ML models to ensure rapid and reliable model updates and releases.
  • Documentation and best practices: Develop and maintain documentation, best practices, and standard operating procedures related to ML infrastructure and deployment processes.

Required skills and qualifications

  • Proficiency in KFserve, Large Language Models (LLMs), Kubernetes, and Flyte for AI/ML model deployment and management.
  • Strong background in managing AI/ML infrastructure at scale.
  • Experience with CI/CD pipelines for AI/ML models.
  • Proven ability to lead and manage small teams effectively.
  • Excellent problem-solving skills with a focus on scalability and reliability.
  • Strong communication and collaboration skills, and the ability to work effectively in cross-functional teams.
  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field (or equivalent experience).

Preferred qualifications

  • Experience with additional ML frameworks and tools beyond KFserve and Flyte.
  • Certifications in Kubernetes or related technologies.
  • Previous experience in deploying and managing Large Language Models (LLMs).
  • Familiarity with cloud platforms (AWS, GCP, Azure) for ML model hosting.