Drive the development of new capabilities and processes to support delivery, and ensure that the tools and approaches used are effective. Specific duties include:
1. Demonstrate an acute drive to automate: constantly replace manual operations with automated solutions and enhance automation around configuration management, tooling, and continuous delivery of software.
2. Design, code, test and deliver automation tools for production applications deployment, maintenance and high availability.
3. Develop Terraform modules and CloudFormation templates for infrastructure creation and develop Ansible roles and playbooks for infrastructure provisioning and configuration changes.
4. Improve existing monitors developed by SDETs and add new monitors to track the health of the backend systems, along with Datadog synthetic tests to monitor the reliability and availability of critical infrastructure.
5. Migrate Docker Compose and Docker Swarm applications to Kubernetes using Helm Charts, Ansible, SOPS, and shell scripting.
6. Design and develop applications using Java, Spring Boot, Datadog, Elasticsearch and Kubernetes technologies.
7. Design highly available, auto-scalable, secure, reliable, resilient, efficient and cost-optimized solutions.
8. Clearly and precisely communicate day-to-day operations to ensure thorough hand-off to regional SRE teams.
9. Build a mature platform organization to help the rest of IT in the Observability/SRE journey.
10. Develop monitoring capabilities, strategy and roadmaps in partnership with other technology groups.
11. Migrate applications and Elasticsearch, Postgres, MongoDB and MySQL databases between cloud platforms, from AWS to Azure, OCI, Rackspace and GCP, effectively reducing operating costs.
12. Troubleshoot priority incidents, facilitate blameless post-mortems and ensure permanent closure of incidents while reducing incident response time and maintaining the highest availability of the system.
13. Engage with development and extended teams throughout the life cycle to help develop reliable and scalable software, ensuring minimal refactoring or changes.
14. Participate in architecture review meetings along with security and audit reviews.
Bachelor’s degree (or foreign equivalent degree) in Computer Science, Engineering, or a related field plus 4 years of post-baccalaureate experience in software design, development and testing.
Experience must include the following, which may be gained concurrently:
1) 4 years of experience with Java, HTML, Gradle or Maven, Spring or Spring Boot for development, testing, monitoring, and deploying enterprise applications.
2) 4 years of experience developing web services and RESTful APIs and working with Swagger, Kong dashboard, Kafka, RabbitMQ, Postgres, MySQL, Kibana, Splunk, and MongoDB or Elasticsearch.
3) 4 years of experience working on identifying issues and performing root cause analysis to identify performance bottlenecks and application bugs.
4) 4 years of experience working with test automation tools including Selenium, Postman, and Appium.
5) 2 years of experience working with Kubernetes, Docker, Terraform, Ansible, Helm, JFrog, shell scripting, and creating CI/CD pipelines using TeamCity.
6) 2 years of experience working with Internet protocols (HTTP, FTP), network and security protocols such as SSL/TLS, TCP/IP, UDP and ICMP, and certificates and security infrastructure.
7) 2 years of experience working with low-level security features like SELinux, with security and container scanning solutions, and integrating vulnerability scanners including SonarQube and Checkmarx.
8) 2 years of experience gathering and analyzing metrics using Datadog and creating monitors, synthetics and dashboards for production applications and systems maintenance.
M-F, 9 a.m. to 5 p.m.; after-hours work may be required.
Salary $139,050. Apply online at Zimperium.com, or mail to Staffing, Zimperium, Inc., 4055 Valley View Lane, Suite 300, Dallas, TX 75244. Reference 047550-037 in reply.
This position is part of Zimperium Inc.’s employee referral program and is eligible for an employee referral incentive.