Data Engineer with Hadoop and ETL Banking Brussels

Brussels (onsite)


Keywords

Databases, Linux, Support, Python, SQL, Big Data, Hadoop, Analysis, ETL, Design

Description

NK088 Data Engineer with Hadoop and ETL Banking Brussels

Background

The department ensures the Bank's competitiveness by delivering reliable and sustainable IT solutions for the financial securities markets.
Our technical teams deliver new IT solutions and improve existing applications for both internal and external clients. We deploy changes into the production environment in a controlled and structured way that does not compromise production stability, and we provide application support in production.

Within the department, the Big Data Analytics team supports the advanced analytics needs of all entities of the banking group. As a competency centre for analytics, the team helps to transform data into insight using techniques such as text mining, process mining, network analytics and predictive modelling.

The team is currently looking for a Data Engineer whose core objectives will be:
Collect, clean, prepare and load the necessary data, structured or unstructured, onto Hadoop, our Big Data analytics platform, so that data scientists can use it to create insights and answer business challenges
Act as a liaison between the team and other stakeholders, and help support the Hadoop cluster and the compatibility of the different software that runs on the platform (Spark, R, Python)
Experiment with new tools and technologies related to data extraction, exploration or processing (e.g. OCR engines)
Depending on his/her skills, the new data engineer may also be involved in the analytical aspects of data science projects

The role is to:
Identify the most appropriate data sources to use for a given purpose and understand their structures and contents, if necessary with the help of SMEs
Extract structured and unstructured data from the source systems (relational databases, data warehouses, document repositories, file systems), prepare that data (cleanse, restructure, aggregate) and load it onto Hadoop.
Actively support data scientists in the data exploration and data preparation phases. Where data quality issues are detected, liaise with the data supplier to do root cause analysis
Where a use case is meant to become a production application, contribute to the design, build and launch activities
Ensure the maintenance and support of production applications (watch duty)
Liaise with teams to address infrastructure issues and to ensure that the components and software used on the platform are all consistent
Where the skills allow for it, perform advanced data analysis on a selection of business use cases, supported by data scientists

Skills required:

Experience with understanding and creating data flows, with data architecture, with ETL/ELT development (MS SQL Server SSIS, DataStage) and with processing structured and unstructured data
Proven experience with using data stored in RDBMSs and experience or good understanding of NoSQL databases
Ability to write performant SQL statements
Understanding of the Hadoop ecosystem including Hadoop file formats like Parquet and ORC
Experience with open source technologies used in Big Data analytics, such as Spark, Pig, Hive, HBase and Kafka
Ability to write MapReduce & Spark jobs
Knowledge of Cloudera
Ability to analyze data, to identify issues like gaps and inconsistencies and to do root cause analysis
Knowledge of Java
Experience with Red Hat Linux and Linux scripting
Experience delivering scripts
Experience in working with customers to identify and clarify requirements
Ability to design solutions that are fit for purpose whilst keeping options open for future needs
Strong verbal and written communication skills, good customer relationship skills

The following will be considered assets:
Knowledge of R, Python and Scala
Knowledge of IBM Mainframe
Knowledge of or experience in classic and new/emerging Business Intelligence methodologies
Knowledge of statistics, data mining, machine learning and predictive modelling, data visualization and information discovery techniques

Ref:NK088

Location: Brussels

Language: English

Rate: 400 euros per day

Duration: 6 months +

Start date: ASAP
Duration: 6 months
From: Computer Recruitment Services
Published at: 12.08.2016
Project ID: 1184797
Contract type: Freelance
