Technical Support Engineer for Tier 2/3 Kafka

Nottinghamshire - Onsite

Description

The Support Engineer is responsible for supporting and managing the applications, tools, analytics and data platforms, and other software related to the Kafka platform as part of a Meter Data Management Solution (MDMS).

The Support Engineer will be responsible for managing tickets, requests, incidents and other issues related to the MDMS application. The Support Engineer will work closely with all of the other teams in Managed Applications and Services to provide robust solutions, maintain a stable and optimal environment, and coordinate the different platforms, the data, the related applications and the tools used.

You will have the opportunity to influence data-driven decisions for our utility customers, affecting their business decisions and outcomes.

This position is based in Nuremberg, Germany or Nottingham, UK; however, remote applicants are welcome and will be considered.

Our standard shift is Monday to Friday, 8:30 am - 5:00 pm (CEST), but the position will require standby and planned work outside the standard hours.

To be successful in this role, you must have experience with Kafka data streaming technologies and with at least one mainstream distributed system (e.g. Hadoop database technologies).

Tier 2/3 Kafka Support Engineer responsibilities:

- Administration of Kafka Platform deployed in various customer environments

- Design, implement and manage monitoring solutions and baseline statistics reporting

- Monitoring Kafka Platform services using Control Center, JMX, Prometheus, Grafana, ELK

- Perform regular health checks of the Kafka Platform

- Perform daily tasks on generated tickets, incidents, changes and problems

- Triage customer-reported issues and respond to them via the ticketing system, phone or remote sessions

- Provide operational support (diagnose, reproduce, and resolve customer issues)

- Monitor and manage performance and capacity, perform optimization and tuning where possible

- Ensure application availability and reliability as per customer SLAs

- Monitor the different communication channels for critical customer enquiries, and perform initial assessment and validation as per the response SLA

- Working with customers to resolve a wide range of issues with their Kafka deployments

- Respond to user-reported issues in adherence to established Service Level Agreements for Level 3 application support services

- Perform advanced troubleshooting at the application level and OS level, using your knowledge and relevant expertise

- Facilitate root cause investigations and manage the implementation of corrective and preventative measures

- Identify the area of fault (code, environment, or configuration) and work with the appropriate team(s) implementing the fix

- Resolve problems independently and understand the correct escalation procedure

- Communicating with our core engineering team to share real-time experience from customer deployments

- Improving product documentation and creation/update of knowledge base articles

- Maintain and update the Kafka configuration, keep records in CMDB

- Work on architecture tasks for improving and developing existing Kafka platform services and components

- Liaise with other global support teams to get tickets resolved within the SLAs

- Provide timely feedback into the development process on customer-reported product problems

- Document actions to effectively communicate information internally and to customers

- Execute backup and (disaster) recovery procedures if needed

- Maintenance and archiving of old data and logs

- Testing and validating of bug fixes, updates, upgrades, enhancements and/or configuration updates

- Deploy/Perform Kafka bug fixes, updates, upgrades, enhancements and/or configuration updates

- Starting and stopping of Kafka services as required

- Backup and restore of Kafka services as required

- Identify security issues and improve policies and security measures in relation to the Kafka Platform

- User guide development and training overviews for further global support teams

- Support for development teams that are utilizing the platform

- Provide troubleshooting and best practices methodology for development teams

- Further development of policies, processes and customer procedures

Required skills and experience:

- Contributing to process development - we're a small team, so we're looking for people who want to help us lay the foundation for growing efficiently and with a best-in-class culture

- Enthusiasm for learning about streaming data and being the domain expert in Apache Kafka

- Experience troubleshooting applications running on Linux through diagnosis and reproduction of issues (resource contention, network bottlenecks, etc.)

- Desire to make customers successful through direct interaction

- Strong operational knowledge/experience in Kafka

- Operational knowledge/experience of at least one mainstream distributed system (e.g. Kafka, Hadoop, Cassandra, etc.) and of Java applications (jstack, jmap, etc.)

- Solid scripting and programming skills (Python, JavaScript, PowerShell, VBScript, PerlScript, C#, T-SQL, XML) advantageous

- Experience with Cloud Technologies (Azure, AWS, Google) preferred

- Minimum 5 years of relevant experience in a similar role

- Able to troubleshoot technical issues with a structured approach; strong problem-solving and analytical skills

- Cyber security knowledge and experience

- Attention to detail, fast learner and excellent communication skills

- Excellent customer service skills

- Self-managed and team-oriented

Main interfaces:

- Internal global Managed Applications & Services organization

- Customer, Customer representatives

- Product development and support

- System integrators

Location:

- Nuremberg, Germany or Nottingham, UK (remote applicants are welcome and considered)

Working hours:

- Monday to Friday, 8:30 am - 5:00 pm (CEST), with standby and planned work required outside the standard hours

Contract term:

- 1 year with the possibility of further extensions, starting 1st January 2021

Start date: 01/01/2021
Duration: 12 months
From: Boss Professional Services
Published at: 15.10.2020
Project ID: 1983139
Contract type: Freelance