Description
Infra Engineer (Hadoop Admin)
One of my leading Retail clients is urgently looking for an Infra Engineer (Hadoop Admin) for a 6-month contract, with possible extension, in Stockholm, Sweden.
The Infra Engineer (Hadoop Admin) Must-Have:
- Completed BS or MS in Computer Science/Electronics or equivalent
- 4 to 6 years of Hadoop Admin experience
- Cluster maintenance, including adding and removing nodes, using tools such as Cloudera Manager Enterprise
- Performance monitoring of Hadoop jobs and Hadoop MapReduce routines
- Manage and review Hadoop log files, and provide support during Hadoop patch updates and version upgrades
- Monitor Hadoop cluster connectivity and security, and manage the Hadoop Distributed File System (HDFS)
- Troubleshooting Linux and Solaris hosts deployed in production
- Develop Bash/shell/Python scripts to automate processes where appropriate
- Review and monitor cluster sizing and recommend changes as needed
- Create and/or upgrade clusters, or add new nodes as needed
- Create/update documentation for new/existing processes, installation or maintenance
- Provide database technology consultation and support to the engineering team
- Excellent troubleshooting skills and scripting knowledge
- Monitoring software such as Nagios, NmSysy, ServiceNow, etc.
Good-to-Have:
- Provide SRE L3 support for all storage services
- Be a good team player, able to provide accurate technical feedback to developers to help identify and resolve complex issues
- Provide recommendations and implement solutions for scalability and performance improvement
- Communicate effectively with the data center teams to get hardware fixed or replaced
- Communicate effectively with all stakeholders, backed by accurate data and analysis
- Good understanding of production support SLAs and their priorities
Role Description/Expectations:
- Provide 24x7 L3 support for the client's infrastructure within the agreed SLA timelines
- Update tickets periodically and follow up on issues until closure (take ownership of resolution)
- Report ticket status and metrics to the management team
- Work closely with L2, SRE L3, data center, operations, and project management teams to resolve issues, report status, and highlight risks
- Adhere to ITIL processes to keep production environments stable
- Update and maintain the SOPs, run book, and other documents in the client repository
- Participate in and contribute to the weekly VCON with clients
Apply now for immediate interviews!