Description
My client is a global IT consultancy, onsite with one of the largest banks in Ireland, looking for a Big Data Engineer to join them on an initial 6-month contract.
Skills:
- Hadoop-L3 (Mandatory)
- Scala programming-L3
- Apache Spark-L3
Good understanding of:
- Data formats: Parquet, Avro, JSON, CSV, XML, ORC
- Hadoop architecture and distributed parallel processing concepts
- Semi-structured and unstructured data handling
- Real-time processing
- NoSQL and data lake creation
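As a rough illustration of working across these formats, the sketch below reads a CSV file with Spark and rewrites it as Parquet, Avro and JSON. This is a minimal, hypothetical example: the `accounts.csv` path, output paths and local-mode session are all invented, and the Avro and XML formats additionally assume the external `spark-avro`/`spark-xml` packages are on the classpath.

```scala
import org.apache.spark.sql.SparkSession

object FormatConversion {

  // Pure helper: pick the Spark source format from a file extension.
  def formatFor(path: String): String =
    path.split('.').last.toLowerCase match {
      case "csv"  => "csv"
      case "json" => "json"
      case "xml"  => "xml"  // requires the external spark-xml package
      case "avro" => "avro" // requires the external spark-avro package
      case _      => "parquet"
    }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("format-conversion")
      .master("local[*]") // local mode for illustration; use YARN on the cluster
      .getOrCreate()

    // Read a CSV with a header row, letting Spark infer column types.
    val df = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .format(formatFor("accounts.csv"))
      .load("accounts.csv") // hypothetical input path

    // Rewrite the same data in columnar and row-oriented formats.
    df.write.mode("overwrite").parquet("out/accounts_parquet")
    df.write.mode("overwrite").format("avro").save("out/accounts_avro")
    df.write.mode("overwrite").json("out/accounts_json")

    spark.stop()
  }
}
```

Parquet and ORC are columnar and compress well for analytical scans, while Avro and JSON are row-oriented and better suited to record-at-a-time pipelines such as Kafka ingestion.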
Hands-on technical skill set:
- Java, Python, Sqoop, Scala, DevOps, Jenkins, Bitbucket, SonarQube, Spark, Kafka, Hive, HBase
The key accountabilities/responsibilities are as follows:
- General programming (regex, functions, loops, data structures).
- Big Data architecture mindset, e.g. data can be de-normalised on Hadoop; this is not a problem because storage is cheap.
- Data formats and Big Data formats (text file, Avro, JSON, XML, etc.).
- Spark with Scala (could equally be Java or Python).
- Bash/shell scripting (job scheduling) - this could be Python, which is OS-agnostic and so might be the better choice.
- Hadoop File System shell (file movement).
- Understanding of JavaScript (host job configuration details).
- ETH (Jenkins/NEXUS/Maven/GIT/JIRA).
- High-level Groovy scripting (used primarily for Jenkins deployments; our pipeline is still in design and will be completed in the next few weeks).
- Hive/Big SQL (these differ from Teradata), including the storage formats TEXTFILE, SEQUENCEFILE, RCFILE, AVRO, ORCFILE and PARQUET.
- SQL and HBase (will be needed in real-time solutions).
- Kafka (will be needed in real-time solutions).
- CyberArk (security).
- Remedy (tickets must be raised before changes go live).
- Python/notebooks (the current data profiling solution).
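To make the first responsibility above concrete, here is a short plain-Scala sketch covering regex, functions, loops and data structures. The sort-code format and all names are hypothetical, chosen only to illustrate the level of general programming expected.

```scala
object GeneralProgramming {

  // Regex for a hypothetical bank sort code such as "93-11-62".
  private val SortCode = raw"(\d{2})-(\d{2})-(\d{2})".r

  // Function: validate a sort code and normalise it to six digits.
  def normaliseSortCode(input: String): Option[String] =
    input.trim match {
      case SortCode(a, b, c) => Some(s"$a$b$c") // full match required
      case _                 => None
    }

  // Loop + data structure: count occurrences of each word in a line.
  def wordCounts(line: String): Map[String, Int] = {
    var counts = Map.empty[String, Int]
    for (word <- line.toLowerCase.split("\\s+") if word.nonEmpty)
      counts = counts.updated(word, counts.getOrElse(word, 0) + 1)
    counts
  }
}
```

The same pattern-matching and immutable-collection idioms carry over directly to Spark transformations written in Scala.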