Description
Required skills:
- Experience analysing and building data pipelines, with data architecture and ETL/ELT development, and with processing structured and unstructured data
- Proven experience working with data stored in RDBMSs, and experience with or a good understanding of NoSQL databases
- Ability to write performant Scala code and SQL statements
- Ability to design with focus on solutions that are fit for purpose whilst keeping options open for future needs
- Ability to analyze data, identify issues (e.g. gaps, inconsistencies) and troubleshoot them
- A true agile mindset: capable of and willing to take on tasks outside their core competencies to help the team
- Experience in working with customers to identify and clarify requirements
- Strong verbal and written communication skills, good customer relationship skills
- Strong interest in the financial industry and related data
The following will be considered assets:
- Knowledge of Python and Spark
- Understanding of the Hadoop ecosystem including Hadoop file formats like Parquet and ORC
- Experience with open source technologies used in Data Analytics, such as Spark, Pig, Hive, HBase and Kafka
- Ability to write MapReduce & Spark jobs
- Knowledge of Cloudera
- Knowledge of IBM Mainframe
- Knowledge of Agile development methods such as Scrum
Job description:
- Identify the most appropriate data sources to use for a given purpose and understand their structures and contents, in collaboration with subject matter experts.
- Extract structured and unstructured data from the source systems (e.g. relational databases, data warehouses, document repositories, file systems), prepare the data (e.g. cleanse, restructure, aggregate) and load it onto Hadoop.
- Actively support the reporting teams in the data exploration and data preparation phases.
- Implement data quality controls and, where data quality issues are detected, liaise with the data supplier for joint root cause analysis.
- Autonomously design and develop data pipelines, and prepare their launch activities.
- Properly document your code, and share your knowledge with the rest of the team to ensure a smooth transition into maintenance and support of production applications.
- Liaise with IT infrastructure teams to address infrastructure issues and to ensure that the components and software used on the platform are all consistent.