Description
Data Engineer - Hadoop/ETL/SQL/Spark/Scala
For a prestigious customer, we are currently searching for a Big Data Engineer to join their cutting-edge environment in Brussels, Belgium, on a year-long contract with the possibility of extension!
Main Objectives
- Collect, clean, prepare and load the necessary data onto Hadoop
- Act as a liaison between the team and other stakeholders, and help support the Hadoop cluster and the compatibility of the different software that runs on the platform (Spark, R, Python)
- Experiment with new tools and technologies related to data extraction, exploration or processing (e.g. OCR engines)
Job Description
- Identify the appropriate data sources to use and understand their structures and contents
- Extract structured and unstructured data from the source systems, prepare the data, and load it onto Hadoop
- Actively support data scientists in the data exploration and data preparation phases; assist with data quality issues and perform root cause analysis
- Where a use case is meant to become a production application, contribute to the design, build and launch activities
- Ensure the maintenance and support of production applications
- Ability to write MapReduce & Spark jobs
- Ability to analyze data, identify issues such as gaps and inconsistencies, and perform root cause analysis
- Experience in working with customers to identify and clarify requirements
- Strong verbal and written communication skills
- English-speaking site
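
To give a flavour of the map/reduce pattern behind the MapReduce and Spark jobs mentioned above, here is a minimal illustrative sketch in plain Python (one of the languages in the stack below). It is a toy word count over hypothetical sample lines, not actual Hadoop or Spark code:

```python
from collections import Counter

def map_phase(line: str) -> list[tuple[str, int]]:
    """Map step: emit a (word, 1) pair for each word in a line."""
    return [(word.lower(), 1) for word in line.split()]

def reduce_phase(pairs: list[tuple[str, int]]) -> dict[str, int]:
    """Reduce step: sum the counts per key."""
    counts: Counter = Counter()
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

# Hypothetical sample input, standing in for data extracted from a source system.
lines = ["big data on Hadoop", "data pipelines on Spark"]
mapped = [pair for line in lines for pair in map_phase(line)]
word_counts = reduce_phase(mapped)
```

In a real cluster job the map and reduce phases would run distributed across nodes (e.g. as Spark transformations), but the shape of the computation is the same.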
Tech Stack
- Hadoop
- Spark
- Kafka
- Python
- Scala
- ETL
- SQL
- HBase
Beneficial, but not critical:
- Knowledge of Cloudera
- Experience with Linux and Shell Scripting
- Knowledge of Java
- Knowledge of statistics, data mining, machine learning, predictive modelling and data visualization
If you have the right skill set and are interested in applying, please send over your up-to-date CV for immediate consideration.