Description
Role and task description -
Demonstrate good practice in ETL and in product development standards
Skills and experience (must have) -
Pentaho experience
Working in a mixed-supplier team, ingesting data into a Cloudera Hadoop cluster, utilising Pentaho Data Integration (PDI) for ETL.
Tasks -
Work within defined standards and job frameworks
Ensure a clear understanding of requirements
Work with Architects and Lead Developers to gain a high-level understanding of the solution architecture
Actively participate in stand-ups and sprint meetings
Troubleshoot the Pentaho Data Integration server, including platform and tooling issues
Unit test their own work, and peer review others' work where required, to ensure accurate completion of development tasks
Use a Git source code repository for code version management and branching
Use PDI with relational databases (a short sketch follows this list)
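As an indication of the PDI-with-relational-database work referred to above, the following is a minimal sketch of a wrapper script that runs a PDI transformation through Pan, PDI's command-line transformation runner. The PDI_HOME default, the .ktr path and the DB_HOST/DB_NAME parameter names are hypothetical; the defined job frameworks mentioned above would set their own conventions.

#!/usr/bin/env python3
"""Sketch: run a PDI transformation against a relational source via Pan."""
import os
import subprocess
import sys

PDI_HOME = os.environ.get("PDI_HOME", "/opt/pentaho/data-integration")  # hypothetical install path
TRANSFORMATION = "/etl/transformations/load_customers.ktr"  # hypothetical .ktr path

def run_transformation(ktr_path: str, db_host: str, db_name: str) -> int:
    """Invoke Pan and return its exit code (non-zero signals failure)."""
    cmd = [
        os.path.join(PDI_HOME, "pan.sh"),
        f"-file={ktr_path}",
        "-level=Basic",               # Pan log level
        f"-param:DB_HOST={db_host}",  # named parameters consumed inside the .ktr
        f"-param:DB_NAME={db_name}",
    ]
    return subprocess.run(cmd).returncode

if __name__ == "__main__":
    sys.exit(run_transformation(TRANSFORMATION, "db01.example.internal", "sales"))

In practice a scheduler such as the JobScheduler listed below would drive calls like this, and connection secrets would more likely come from a store such as Vault than from hard-coded parameters.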
Technologies -
PDI
AWS (S3)
General ETL knowledge
Cloudera Apache Hadoop, Hive, Impala, HDFS, etc.
SOS Berlin JobScheduler
Vault
Jenkins
Ansible
General scripting (a short sketch follows this list)
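As a flavour of the general scripting involved around the Cloudera stack, below is a minimal sketch that stages a file into HDFS and runs a sanity check through impala-shell. The hostnames, paths and table names are illustrative only, and a real environment would also need to handle Kerberos authentication and error reporting to the platform standards.

#!/usr/bin/env python3
"""Sketch: stage a file into HDFS, then verify it via Impala."""
import subprocess

LOCAL_FILE = "/data/incoming/customers.csv"   # hypothetical landing file
HDFS_DIR = "/data/raw/customers/"             # hypothetical HDFS target
IMPALA_HOST = "impala01.example.internal"     # hypothetical impalad host

# Copy the file into HDFS, overwriting any previous version.
subprocess.run(["hdfs", "dfs", "-put", "-f", LOCAL_FILE, HDFS_DIR], check=True)

# Refresh Impala's metadata for the table, then run a sanity-check count.
for query in ("REFRESH raw.customers", "SELECT COUNT(*) FROM raw.customers"):
    subprocess.run(["impala-shell", "-i", IMPALA_HOST, "-q", query], check=True)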
Mandatory technical skills -
Experience working with Cloudera Hadoop platforms (e.g. EDH)
Knowledge of the data acquisition/ingestion pipeline (at least a good awareness and understanding of the stages the data goes through, so as to be able to pick up and understand how the spreadsheets work)
Good knowledge of Pentaho and strong Pentaho Data Integration development skills; Pentaho experience is essential
Experience using PDI with relational databases; Oracle and MySQL desirable
Familiarity with Git for source code version management and branching
Operational support of system components
Software configuration management/Version control
Software release management, including the release of service improvements
Candidates must be SC eligible or SC cleared