Python Developer - Python Data Engineer

London - Onsite
This project has been archived and is not accepting more applications.

Description

Core Consultants is a fresh, innovative start-up consultancy taking a different approach to how consulting is sourced and delivered. Our Core is made up of former management consultants from some of the world's leading data consultancies, brought together by a shared frustration that egos, the bottom line and outright stupidity were stopping them from delivering real innovation and creativity to clients using data.

Our offering is simple: we are committed to delivering the most value to our clients.

Having won several new engagements, we are looking to grow our associate team aggressively. We are therefore recruiting two experienced Python Developers/Data Engineers to join one of our project teams, working with one of the UK's foremost Data Innovation teams at a leading Global Services Company.

Overview of the role

We are looking for an experienced Python developer/data engineer to stand up fully configurable (i.e. database-driven) data ingestion pipelines orchestrated by Apache Airflow. Each pipeline takes any input source (file stream, external API, S3 bucket), runs validations against the source and loads the cleaned output to a defined target endpoint (SQL Server, network share, data streaming broker).
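For illustration only, the source, validate, load flow described above can be sketched in plain Python. Everything here is a hypothetical assumption rather than the client's actual design: the `PipelineConfig` fields, the registry names and the in-memory source and target are stand-ins. In practice the configuration would be read from a database and each stage would typically run as an Airflow task.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Row = Dict[str, str]

# Hypothetical inline source standing in for a file stream, API or S3 object.
SAMPLE_CSV = "client_id,amount\nA1,10\n,5\nB2,abc\nC3,7\n"


def read_sample_csv() -> Iterable[Row]:
    """Yield one dict per CSV record."""
    return csv.DictReader(io.StringIO(SAMPLE_CSV))


def has_client_id(row: Row) -> bool:
    return bool(row["client_id"].strip())


def amount_is_numeric(row: Row) -> bool:
    return row["amount"].strip().isdigit()


# Hypothetical in-memory target standing in for SQL Server, a share or a broker.
MEMORY_TARGET: List[Row] = []


def load_to_memory(rows: List[Row]) -> None:
    MEMORY_TARGET.extend(rows)


# Registries map names stored in configuration to the code that implements them.
SOURCES: Dict[str, Callable[[], Iterable[Row]]] = {"sample_csv": read_sample_csv}
VALIDATIONS: Dict[str, Callable[[Row], bool]] = {
    "has_client_id": has_client_id,
    "amount_is_numeric": amount_is_numeric,
}
TARGETS: Dict[str, Callable[[List[Row]], None]] = {"memory": load_to_memory}


@dataclass
class PipelineConfig:
    """One pipeline definition, as it might be stored in a config table."""
    source: str
    target: str
    validations: List[str] = field(default_factory=list)


def run_pipeline(config: PipelineConfig) -> int:
    """Read the source, keep rows passing every validation, load them."""
    checks = [VALIDATIONS[name] for name in config.validations]
    rows = SOURCES[config.source]()
    clean = [dict(row) for row in rows if all(check(row) for check in checks)]
    TARGETS[config.target](clean)
    return len(clean)
```

Because the pipeline is described by data rather than code, adding a new source or validation means registering one function and inserting one configuration row, with no change to `run_pipeline`.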

You will catalogue and tag data, i.e. create a metadata store, to better guide data readiness and business decisions on client-submitted data for a given time period (year, quarter, month).
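As a rough sketch of what such a metadata store might look like, the example below uses SQLite from the Python standard library. The table name, columns and helper functions are all hypothetical assumptions, not a specification from the client. Note that the catalogue records facts about each submission (who, which period, how many rows, which tags), never the submitted rows themselves.

```python
import sqlite3
from typing import Iterable, List


def create_catalogue(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) catalogue table if it does not exist."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS dataset_catalogue (
            dataset   TEXT NOT NULL,
            client    TEXT NOT NULL,
            period    TEXT NOT NULL,   -- e.g. '2021', '2021-Q4', '2021-12'
            row_count INTEGER NOT NULL,
            tags      TEXT NOT NULL,   -- comma-separated, sorted
            PRIMARY KEY (dataset, client, period)
        )
        """
    )


def register_dataset(
    conn: sqlite3.Connection,
    dataset: str,
    client: str,
    period: str,
    rows: List[dict],
    tags: Iterable[str],
) -> None:
    """Store metadata about a submission; the rows themselves are not saved."""
    conn.execute(
        "INSERT OR REPLACE INTO dataset_catalogue VALUES (?, ?, ?, ?, ?)",
        (dataset, client, period, len(rows), ",".join(sorted(tags))),
    )


def periods_ready(conn: sqlite3.Connection, client: str) -> List[str]:
    """Return the periods for which a client has submitted non-empty data."""
    cursor = conn.execute(
        "SELECT DISTINCT period FROM dataset_catalogue "
        "WHERE client = ? AND row_count > 0 ORDER BY period",
        (client,),
    )
    return [record[0] for record in cursor]
```

A readiness question such as "which quarters has this client actually submitted?" can then be answered from the catalogue alone, without touching the underlying data.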

You will be a self-starter: someone comfortable with problem solving and capable of finding the answer to a problem by researching and reading around the subject. You will focus on the abstraction of the problem and, most importantly, understand the difference between data and metadata.

Experience

- 3+ years Python, including pytest, unittest, NumPy and pandas

- Understanding of functional/modular development

- TDD

- Schema/Database design

- Pipelines: ETL/ELT (and understanding when to use one over the other)

- Storage: Relational (RDBMS) vs Non-Relational (NoSQL)

- Formats: XML, JSON, Parquet, Excel, CSV, TXT

- Git/source control

Desirable/Awareness of

- Airflow

- DevOps CI/CD - using, not deploying

- Docker

- Previous ETL experience, e.g. SSIS

- Spark

- RabbitMQ

Start date: ASAP
Duration: 12 months
From: Core Consultants
Published at: 14.01.2022
Project ID: 2290878
Contract type: Freelance