Ref. no.: LP/MLOPSE/POZ/05
MLOps Engineer - to support and enhance the delivery of a Machine Learning Platform. 100% remote
Technology stack:
- ML
  - Seldon technologies
  - MLflow
  - Python
- Public cloud
  - Azure – preferred
  - (nice to have) Google Cloud Platform
- Containers
  - Kubernetes
  - Istio or similar
  - Prometheus/Grafana/Alertmanager/Elastic Stack
- Infrastructure as Code:
  - (nice to have) Terraform
Automation and other tools:
- Python, Bash
- Helm
- Azure DevOps
- Argo CD or similar
- GitHub, Jira & Confluence or similar
Project description, typical tasks and duties:
- Discover, incubate and showcase MLOps tools, frameworks, and platforms to enable our AI/ML and Data Science teams
- Deliver and maintain production-grade machine learning platforms
- Collaborate with R&D to bring a DevOps mindset to AI/ML work
- Write code to integrate solutions into seamless workflows: CI/CD and, potentially, some workflow systems/solutions
- Work with System Engineers and DevOps Engineers to fine-tune and improve the infrastructure
- Proactively look for ways and solutions to improve the effectiveness of your own work
- Deliver solutions with integrated "safety switches" that reduce the human-error factor, provide traceability and rollback capabilities, and reduce IT security risk (a.k.a. shift-left security)
- Task assignment and tracking in Jira
- Write documentation in Atlassian Confluence - we need to document our work to submit it for approval and to be able to turn our deliverables into "live" systems.
- No on-call duties, no weekend work
- Salary: 160 - 190 PLN per hour (B2B)
Requirements:
- Regular or Senior level, 3+ years of experience
- Able to work autonomously for most of the day, creating solutions (code) that are part of a larger project
- Able not only to follow a proposed design but also to present options, optimize and improve it
- Extensive Kubernetes experience and familiarity with Cloud Native tools and the CNCF landscape
- Comfortable with DevOps processes and automated deployments, Infrastructure as Code, declarative deployments/GitFlow
- Good understanding of AI/ML concepts, including ML experimentation and inference processes
- Experience maintaining/deploying ML models in production
- Experience managing enterprise-wide software deployments and Day-2 Ops
- System architecture knowledge/experience (nice to have)
- Self-learner eager to do deep dives into unknown areas of technology
- Able to document their work before and after implementation and to create high-level diagrams
- English at B2-C1 level