Description
Data Manager/Data Wrangler - Databricks, R/R-Studio, Python, Data Management, Data Manipulation - Contract, Remote until normal working resumes.
Responsibilities:
- Curate data from multiple datasets and prepare it for analysis by others via a dashboard presentation
- Organise a working research structure within the TRE service environment for practical and easy use, supporting users undertaking research
- Carry out technical validation checks on the linked data sources (e.g. duplicates, linkage errors)
- Identify appropriate existing code lists and algorithms and apply them to derive a set of priority variables from the linked datasets
- Write, organise and curate support documentation for the linked data resources (e.g. data dictionaries, variable mapping tables, data access process documentation, Git repositories)
- Anticipate, communicate and solve any potential problems that may arise with data curation for various research projects and use cases
- Be the point of contact for researchers and clinicians to address queries about how to work with the linked data resources
Key Skills:
- Strong experience with Databricks, R/R-Studio and Python.
- Experience working with very large datasets and data flows.
- Data management and manipulation expertise with a background in one of bioinformatics, biostatistics, computer science, mathematics or statistics.
- Knowledge of commonly used terminologies in health data, such as ICD10 and SNOMED.
- Experience in preparing data extracts for analysis by others, working closely with end users.