Data Manager/Data Wrangler

London (Onsite)

This project has been archived and is not accepting more applications.

Keywords

Support, Python, Analysis, Documentation, Mathematics, Git, Manager

Data Manager/Data Wrangler - Databricks, R/RStudio, Python, Data Management, Data Manipulation - Contract, remote until normal working resumes.

Responsibilities:

Curate data from multiple datasets and prepare it for analysis by others via a dashboard
Organise a working research structure within the TRE service environment for practical and easy use, supporting users undertaking research
Carry out technical validation checks on the linked data sources (e.g. duplicates, linkage errors)
Identify appropriate existing code lists and algorithms and apply them to derive a set of priority variables from the linked datasets
Write, organise and curate support documentation for the linked data resources (e.g. data dictionaries, variable mapping tables, data access process documentation, Git repositories)
Anticipate, communicate and resolve problems that may arise with data curation across research projects and use cases
Be the point of contact for researchers and clinicians to address queries about how to work with the linked data resources

Key Skills:

Strong experience with Databricks, R/RStudio and Python.
Experience working with very large datasets and data flows.
Data management and manipulation expertise with a background in one of bioinformatics, biostatistics, computer science, mathematics or statistics.
Knowledge of commonly used terminologies in health data, such as ICD-10 and SNOMED.
Experience preparing data extracts for analysis by others and working closely with end users.
