Description
Hardware Validation, Edge - Responsibilities
Example projects:
- Server Component Validation
- Validation of NICs, Flash drives, PSUs, Storage backpanes, etc.
- Work with Vendors and OEMs on RCA, writing FA reports.
- Firmware Validation
- Validation of new versions of BIOS, and other firmware types
- Customization of test scenarios for new components.
- System debug and RCA
- Investigation and troubleshooting of hardware failures in prod
- Work with Vendors and OEMs on RCA, writing FA reports.
- Problem escalation to vendor/manufacturer and resolution
- Reporting discovered new types of failures to Vendors and OEMs
- Work on reproduction scenarios in prod/test environment.
- Communication with vendor/manufacturer.
Example activities:
- Making changes to automation infrastructure to monitor and support fleet health at scale.
- Conducting investigations and creating tooling to detect and diagnose hardware health issues.
- Working with partner teams on identification, validation, and implementation of remediations across the software and hardware stack.
- Publishing updates on resolutions and communicating findings internally.
- Troubleshooting, diagnosing, and root causing system failures while isolating components and failure scenarios, working with internal and external stakeholders.
- Employing data visualization techniques to highlight issues and implementing systemic solutions to hardware health issues.
Required Skills
- hands-on experience of debugging and resolving hardware and systems issues
- facility with Python or equivalent development/Scripting language
- troubleshooting and analytical skills
- strong experience working with Linux systems
- experience with server GPU solutions
- experience with HPE server (x86) solutions