Senior DevOps Engineer
As a Senior DevOps Engineer, you will be deeply involved in high-impact projects across the company. This typically means hands-on work on systems such as in-house CI/CD, on-prem ML data center infrastructure, and solutions built on AWS, Azure, and Google Cloud. You'll take part in selecting and configuring servers and tools for ML and production projects, and you'll work on the infrastructure that powers our ML research, from on-prem GPU clusters to multi-cloud training pipelines.
What you'll do
- Design architectures and lead server selection and on-prem setup
- Administer Linux servers, networking, and storage across our on-prem infrastructure
- Diagnose and resolve intermittent or hard-to-reproduce issues with on-prem tower servers, especially those that vendor technicians are unable to replicate
- Troubleshoot and fix issues across cloud infrastructure
- Administer identity and device management platforms
- Work closely with our software architects to develop and maintain cloud architectures across projects
- Own cloud workloads end-to-end (deployment, scaling, high availability, and monitoring) across platforms such as ECS and EKS
- Design, implement, and maintain CI/CD pipelines (Jenkins, GitHub Actions)
- Develop scripts to automate routine tasks
- Manage patching and security across systems, and integrate security tools into our infrastructure
What we are looking for
Must-have:
- 5+ years of hands-on DevOps experience across on-prem and cloud infrastructure
- Strong Linux system administration skills
- Experience designing CI/CD pipelines with Jenkins and GitHub Actions
- Proficiency with scripting languages (bash, Python, or similar) for automation
- Hands-on experience with Infrastructure-as-Code tools (Terraform and/or Ansible)
- Production experience with containers and orchestration (Docker, Kubernetes)
- Working experience with at least one major cloud provider (AWS, Azure, or GCP)
- Solid fundamentals in networking (TCP/IP, DNS, subnetting, VPNs, firewalls). Experience with MP-BGP, EVPN, VXLAN, or similar technologies is a strong plus.
- Strong analytical and troubleshooting skills, with the ability to communicate clearly in English
- Strong proficiency with Claude Code and AI-assisted coding workflows
Nice-to-have:
- Knowledge of HPC and job schedulers (Run:ai, Determined, etc.)
- Familiarity with ML infrastructure or supporting ML / data science teams
- Experience with server hardware selection and procurement
- Experience hosting and managing application testing services (Citrix, VMware Horizon, Squid Proxy, etc.)
- Security tooling integration and patch management at scale
- BSc / MSc in Computer Science or a related field
How to apply
All interested candidates are encouraged to apply through this form.
We appreciate every application; however, only shortlisted candidates will be contacted for the next stages.
Krisp is an Equal Opportunity Employer:
All applicants are considered regardless of race, color, religion, national origin, age, sex, marital status, ancestry, physical or mental disability, veteran status, gender identity, or sexual orientation. We do not tolerate discrimination or harassment of any kind. All employees and contractors of Krisp treat each other with respect and empathy.