Kafka Site Reliability Engineer (Data Infrastructure)
Who We Are:
Bandwidth lives for innovation! Our technology powers brands like Google, Microsoft, GoDaddy, Arlo, Netgear, Zoom, Rover and more of the most exciting leaders in technology. Our intelligent voice, messaging, 9-1-1 access, and phone number services, all backed by Bandwidth’s own nationwide, all-IP voice network, allow us to power the way people communicate, connect, and do business.
At Bandwidth, your music matters when you are part of the BAND. We celebrate differences and encourage BANDmates to be their authentic selves. #jointheband
What We Are Looking For:
We are looking for a Kafka Site Reliability Engineer (Data Infrastructure) to work as a senior member of the Data Infrastructure team, taking a leading role on our streaming data platform. You will work directly with many of the core components of our nationwide voice and messaging network and with the operational internals of major OSS and commercial applications. You will solve difficult challenges, think about architecture, and manage supporting infrastructure applications.
What You’ll Do:
- Design, build and maintain a stable and secure Kafka infrastructure, on-prem and in the cloud.
- Be instrumental in finding the right solutions to critical needs, often involving massive amounts of data and the need to scale.
- Bake appropriate metrics, analysis, and alerting into every solution.
- Develop and refine our standards and processes to continue to grow our ability to develop world-class products.
- Take an architectural role, with administration and on-call responsibilities for critical application infrastructure.
- Design efficient processes, automation, etc., to support Bandwidth and a large developer community.
- Understand the world of data at rest and in motion, with particular attention to security, SOX, and other requirements along with developer needs within that world.
- Work with a high-functioning agile development team in sprints.
What You Need:
- Bachelor’s Degree in computer science or a related field
- 4+ years of experience running large-scale data platforms, specifically Kafka clusters (Apache Kafka and/or Confluent Platform).
- Significant experience in infrastructure/application observability technologies such as Datadog, Elasticsearch and time series metrics and monitoring.
- 4+ years of experience in software engineering, particularly with Java and automation technologies such as Ansible and Terraform; strong understanding of Kubernetes and Docker.
- 3+ years of significant responsibility in the architecture or design of software and infrastructure solutions.
- Experience with the architecture, configuration, administration, and development of Kafka, Kafka Connect, ksqlDB, and other Kafka-related technologies.
- Experience with metrics and monitoring applications such as Datadog, Elasticsearch, and LogicMonitor. Significant experience with automation technologies such as Ansible and Terraform.
- Strong knowledge of Linux environments, Docker, Kubernetes, and OpenShift.
- Experience with source control, automation, and deployment tools (GitHub, Jenkins).
- Considerable experience in deploying infrastructure applications to on-prem and cloud environments, especially AWS.
- Ability to code primarily in Java, using Spring / Spring Boot, with advanced Java skills and familiarity with numerous libraries and supporting technologies.
- Experience working with other languages, frameworks, and scripting tools, specifically React, Python, Perl, and Bash.
Are you excited about the position and its responsibilities, but not sure if you’re 100% qualified? Do you feel you can work to help us crush the mission? If you answered ‘yes’ to both of these questions, we encourage you to apply! You won’t want to miss the opportunity to be a part of the BAND.
Little Extra For You
At Bandwidth, we’re pretty proud of our corporate culture, which is rooted in our “Whole Person Promise.” We promise all employees that they can have meaningful work AND a full life, and we provide a work environment geared toward enriching your body, mind, and spirit. How do we do that? Well…
- You’re guaranteed to be given “work that matters.” If you ain’t feelin’ that, let us know. We’ll fix it.
- With our Hacker Hours program, you get protected time to learn, experiment, and build with the technology you want! You can even submit projects to our yearly Hack-of-the-Year competition, complete with awards and prizes.
- Your health insurance premiums are 100% paid by Bandwidth. So are your dental insurance premiums. So are your life insurance premiums. It’s pretty nice.
- When you’re on vacation, nobody is allowed to contact you. No calls, texts, emails… Enjoy your time off.
- We provide you with a gym membership and a lunchtime shuttle to the gym, and we give you 90 minutes for lunch if you work out.
- Not enough time to hit the gym? There’s a locker room and cardio equipment on the first