Profile image: David Hillmann, Data Scientist from Lauterstein

David Hillmann

available

Last update: 11.04.2024

Data Scientist

Company: DLT-Innovation GmbH
Graduation: Economics
Languages: German (Native or Bilingual) | English (Full Professional) | French (Elementary) | Czech (Native or Bilingual)

Attachments

Coursera-LLM-Certificate_261123.pdf
David-Hillmann-Project-History-2020-03_100424.pdf
DavidHillmannResume-220124_100424.pdf
davidhillmann-Azure-DS-Certificate_110424.pdf

Skills

Many years of extensive experience as a Data Science Consultant
  • External consultant in globally operating enterprises
  • Method consulting
  • Programming of prototypes
  • Programming of Data Science / Machine Learning Pipelines
  • Projects in international, English-speaking teams
Broad knowledge in Machine Learning tools and application with R and Python for many years
  • Supervised Learning: Deep theoretical knowledge of classification and regression methods and long experience in practical use
    • Decision Trees, Random Forests, Gradient Boosted Trees, Support Vector Machines, Deep Learning (ANN, CNN), logistic and linear regression, Ridge Regression, Lasso, Linear Discriminant Analysis, kNN
  • Unsupervised Learning: Extensive experience in application of clustering/pattern recognition of unstructured data
    • DBSCAN, OPTICS, k-means, hierarchical clustering
  • Rule-based Learning: Considerable experience with algorithms for learning patterns and deriving rules from unstructured data
    • apriori, eclat, FP-growth
  • Dimensionality reduction methods: Experienced with application of
    • Principal Component Analysis, Linear Discriminant Analysis, Non-negative matrix factorization
Good knowledge of Deep Learning concepts and application with TensorFlow & keras
  • Artificial Neural Networks
  • ConvNets
  • RNNs, LSTMs, GRUs
Deep knowledge in statistical methods and modeling
  • Econometric models
  • Time series models (ARMA, ETS)
     
Many years of experience developing data science pipelines, both for proofs of concept and for implementation in production systems
  • Data exporting and merging from different sources (HANA; MSSQL)
  • Data preparation, cleaning and pre-processing for final analysis
  • Data post-processing of final results and insights (automated reporting, visualization, interactive applications like Shiny or Tableau)

Project history

06/2018 – 11/2018     Inventory Health Forecasting (DAX Enterprise, Industrial sector), DLT
Goals & Questions:
  • Which materials, products or product groups have an increased risk of becoming 'unhealthy' (remaining in inventory too long)?
  • What are the main drivers of the risk of becoming unhealthy?
  • Is it possible to forecast inventory KPIs of individual products?
  • Can we achieve a satisfactory forecast accuracy?
Role:
  • Self-reliant work on proof of concept, coding and consulting on choosing the most powerful and suitable models
  • Collaboration with internal IT colleagues
  • Knowledge transfer on methods and workflow
Methods/Approach:
  • Testing different models and selecting the best performer with cross-validation: Decision Trees, Gradient Boosted Trees, Support Vector Machines and Neural Networks
  • Implementing automated hyperparameter optimization with Bayesian optimization and grid search techniques
  • Developing a memory-efficient machine learning pipeline, including automated optimization techniques, so that non-technical users can run it
Tools: R (packages: mlr, caret, xgboost, rpart), Python (libraries: sklearn), MSSQL, SAP HANA
Results:
  • Inventory risk KPIs can be forecast with excellent accuracy by tree ensemble models one and three months in advance
  • Individual driving factors can be filtered from the applied models for each product forecast (avoiding black-box problem)
  • Provided an automated machine learning pipeline that is implemented in a production system as an early warning system for products at risk
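For illustration only, the model selection step described in the Inventory Health Forecasting entry above (cross-validation combined with a grid search over hyperparameters) can be sketched as follows. This is a minimal sketch and not the project's actual pipeline, which used R and sklearn. The kNN regressor, the synthetic data and all function names (`k_fold_indices`, `knn_predict`, `cv_error`, `grid_search`) are assumptions chosen to keep the example dependency-free.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train_x, train_y, x, n_neighbors):
    """Predict the target for x as the mean target of its nearest training points."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return mean(train_y[i] for i in order[:n_neighbors])


def cv_error(xs, ys, n_neighbors, folds):
    """Mean squared error over all held-out points across the folds."""
    errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        for i in held_out:
            errors.append((knn_predict(tx, ty, xs[i], n_neighbors) - ys[i]) ** 2)
    return mean(errors)


def grid_search(xs, ys, grid, k=5):
    """Score every candidate on the same folds and return the best one."""
    folds = k_fold_indices(len(xs), k)
    scores = {g: cv_error(xs, ys, g, folds) for g in grid}
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic data (hypothetical): a noisy quadratic relationship.
rng = random.Random(1)
xs = [i / 10 for i in range(60)]
ys = [x ** 2 + rng.gauss(0, 1.0) for x in xs]

grid = [1, 3, 5, 9, 15]  # candidate numbers of neighbours
best, scores = grid_search(xs, ys, grid)
print("best n_neighbors:", best)
```

All candidates are scored on identical folds, so differences in the cross-validated error reflect the hyperparameter and not the data split.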
03/2018 – 10/2018     Inventory Management (DAX Enterprise, Industrial sector), DLT
Goals & Questions:
  • Analysis of decisions taken by inventory management planners on products that remain in stock for too long
  • Are there correlations between certain decisions, such as fire-selling products or allocating them to alternative uses, and product features or additional factors? Do patterns exist that could be exploited in predictive models?
  • Is it possible to build a predictive model that accurately predicts decisions?
Role:
  • Self-reliant work on proof of concept, coding and consulting on choosing the most powerful and suitable models
  • Collaboration with internal IT colleagues
  • Knowledge transfer on methods and workflow
Methods/Approach:
  • Testing different models and selecting the best performer with cross-validation: Artificial Neural Networks, Random Forests, Gradient Boosted Trees, Support Vector Machines
  • Implementing automated hyperparameter optimization with Bayesian optimization and grid search techniques
  • Developing a memory-efficient machine learning pipeline, including automated optimization techniques, so that non-technical users can run it
Tools: R (packages: mlr, caret, xgboost), Python (libraries: TensorFlow with keras, sklearn), MSSQL, SAP HANA
Results (on-going project):
  • A selection of ensemble models performs well in predicting decisions
  • Machine learning pipeline is being implemented in a production system
  • Automated classification models are provided to planners to help them with recommended decisions and support their decision making
07/2016 – 11/2018     Growth Finder, Recommender System follow-up project (DAX Enterprise, Industrial sector), DLT
Goals & Questions:
  • Can recommendations be further improved by more advanced machine learning methods?
  • What performance can be achieved with different models, from simple and interpretable to very advanced and rather black-box models?
  • Is it possible to improve model performance by adding additional features into more advanced models?
  • Are more performant models black-boxes or can explanation/interpretation methods be applied?
  • Generate ready-to-use R packages that can be applied to different business scopes
Role:
  • Consulting on method selection and coding embedded in a data science team
  • Close interaction with peers and communication with stakeholders and internal IT staff to share knowledge and enable them to use the methods
  • Coaching of internal IT colleagues to enable them to adopt the methods in practice and transfer knowledge to peers
Methods/Approach:
  • Testing of several machine learning algorithms to generate recommendations: Decision Trees, Gradient Boosted Trees, Convolutional Neural Networks
  • Evaluate tested models in cross validation settings and implement into existing algorithmic package
  • Developing an automated pipeline from database import through pre-processing and modeling to post-processing, with prepared recommendations and, where possible, explanation layers
  • Developing memory-efficient functions and implementing them in an R package for all pre- and post-processing steps for each implemented method
  • Further development of an algorithmic R package based on the S4 framework from recommenderlab to implement ensemble methods and neural networks
Tools: R, TensorFlow with keras (R), recommenderlab (R), xgboost (R), rpart (R), Python, SAP HANA, R Markdown
Results (project on-going):
  • Provided workflow for generating product recommendations enables training with large amounts of data, which improves performance of most models significantly
  • The developed R packages and the prepared workflow are used by internal IT department to supply recommendations to business units
  • Successful trials and positive feedback from large business units and operating regional departments
07/2017 – 03/2018     Process Mining (DAX Enterprise, Industrial sector), DLT
Goals & Questions:
  • Analysis of delivery reliability (DR) in order to discover potential reasons for delayed delivery on grouped product level
  • Is it possible to identify recurrent patterns which are associated with a low DR?
  • Evaluation of potential improvement of the DR and other KPIs
  • Which factors are associated with a drop in DR in certain periods?
Role:
  • Support an internal Data Science team to set up a data mining pipeline (on-site and remote) and enhance existing methods
  • Collaboration with peers and communication and presentation to business stakeholders from supply chain management
Methods/Approach:
  • Select appropriate machine learning methods to discover rules
  • Cluster rules into problems with similar context to ease analysis by business domain experts. Evaluation of different clustering algorithms and distance measures, like dbscan, k-means, hierarchical clustering
  • Connect existing pipeline to databases to automate data import and export of results
  • Test additional features in modeling to improve derived rules: feature selection/engineering
  • Application of causal inference algorithms (e.g. causal trees) to extract and evaluate the importance of time-dependent factors
  • Enhance an interactive platform (R Shiny) to ease its application by business experts with non-technical backgrounds
  • Proper documentation of the whole workflow
Tools: R, arules (R), dbscan (R), SAP HANA, SQL, R Shiny, R Markdown, Jupyter
Results (on-going):
  • Provide an R package that enables an easy to use pipeline to discover rules in an automated way
  • Provide an enhanced version of an interactive platform to be utilized by non-technical users
  • Successful test of the platform in one business unit
  • Evaluation of the results in other business units (on-going)
03/2015 – 06/2016     Recommender System (DAX Enterprise, Industrial sector), STAT-UP
Goals & Questions:
  • Segmentation/clustering of customers according to volume of sales, potentials and purchasing behaviour
  • Proof of concept: Is it possible to derive patterns from purchasing behaviour from historical data and to utilize them to create product recommendations?
  • If possible: How good is the accuracy of recommendations and how high are the potentials in terms of sales and contribution margins?
Role:
  • Support as external consultant for conceptual work, methods selection and coding (on-site with remote periods)
  • Close collaboration with peers from Data Science, Machine Learning and IT
  • Frequent communication of results to peers and internal management consulting colleagues in order to spread the concept in the enterprise
Methods/Approach:
  • Clustering methods to group customers on customer features, sales and purchasing patterns
  • Selection of appropriate algorithms to discover patterns of frequently purchased products (itemset mining) and to derive rules (association rule mining)
  • Post-processing and interpretation of results for stakeholders
  • Create interactive platform in Shiny to visually explore and explain purchasing patterns, rules and recommendations
Tools: R, arules + arulesViz (R), ggplot2, SAP HANA, SQL, R Shiny, R Markdown
Results:
  • High potential as measured by expected contribution margins if a fraction of the recommendations are realized
  • Test and validation of derived rules and recommendations in practice by the sales force of several business units
  • Application of the developed interactive platform to
    • Discover interesting patterns and rules
    • Rationalize and explain recommendations
    • Prioritize recommendations according to expected success and contribution margin potential
    • Present and spread idea within the enterprise in a workshop
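To illustrate the association rule mining mentioned in the Recommender System entry above, the sketch below computes support, confidence and lift for rules between pairs of items. It is a simplified, assumption-laden example and not the project's implementation, which relied on the arules package in R. The product names, thresholds and function names (`support`, `pair_rules`) are hypothetical.

```python
from itertools import combinations


def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def pair_rules(transactions, min_support, min_confidence):
    """Return rules lhs -> rhs between single items, sorted by descending lift.

    Each rule is a tuple (lhs, rhs, support, confidence, lift).
    """
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support(transactions, {a, b})
        if pair_support < min_support:
            continue  # infrequent pair: no rule is derived from it
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support(transactions, {lhs})
            lift = confidence / support(transactions, {rhs})
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence, lift))
    return sorted(rules, key=lambda r: r[4], reverse=True)


# Hypothetical purchase histories, one set of products per customer order.
transactions = [
    {"pump", "valve"},
    {"pump", "valve", "seal"},
    {"pump", "seal"},
    {"valve", "seal"},
    {"pump", "valve"},
]

rules = pair_rules(transactions, min_support=0.5, min_confidence=0.7)
for lhs, rhs, sup, conf, lift in rules:
    print(f"{lhs} -> {rhs}: support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

A lift above 1 indicates that the two items are bought together more often than expected under independence, which is one common way to rank candidate recommendations.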
06/2014 – 12/2017     Price Outlier Analysis (DAX Enterprise, Industrial sector), STAT-UP
Goals & Questions:
  • Analysis of statistical outliers to determine pricing opportunities for marketing and sales force
  • Which factors have significant influence on pricing?
  • Is it possible to determine statistical price outliers in an automated way?
  • How high is the potential in terms of contribution margins if you could avoid prices that the model identified to be too low?
Role:
  • External support for development of proof of concept, coding and consulting on choosing most powerful and suitable methods
  • Collaboration with internal peers and permanent communication with and presentations to business stakeholders
Methods/Approach:
  • Develop several linear pricing models with interaction effects on a granular level to get benchmarks and discover pricing opportunities
  • Interpretation and visualization of model estimates in business context
  • Development of an interactive platform for reporting and monitoring in marketing & sales
Tools: R, R Shiny, R Markdown, SAP HANA, SQL, ggplot2
Results:
  • Application in several business units trying to exploit willingness to pay of customers from pricing insights
  • Provision of an interactive tool for marketing & sales staff
05/2014 – 11/2014     Benchmark analysis for sales forecasts (BASF SE), STAT-UP
Goals & Questions:
  • Model-based estimation of sales forecast accuracies in order to set up benchmarks of methods in use (Demand Planning follow-up project)
  • Can we identify factors that significantly improve or deteriorate forecasting accuracies?
Role:
  • Self-reliant development of proof of concept, coding and consulting on choosing most suitable method
  • Communication and presentation of results to peers at BASF
Methods/Approach:
  • Literature research to find the most suitable method for our setting
  • Test and compare appropriate methods like PCA and linear regression
  • Test methodology on enterprise data
Tools: R, ggplot2
Results:
  • Forecast accuracies can be reliably estimated with a few extracted factors which enables creating useful benchmarks
  • Benchmarks can be applied and compared across methods, business units and planners to identify over- and underperformers in an automated way and discover room for improvement
04/2014           Hierarchical Forecasting (BASF SE), STAT-UP
Support in a project aimed at generating and visualizing hierarchical sales forecasts.
Role: Support of Data Science Colleagues in coding for reports and visualization
Tools: R, ggplot2, Excel
11/2013 - 03/2014      Demand Planning (BASF SE), STAT-UP
Goals & Questions:
  • Can sales quantities be forecast on product level with satisfactory accuracy?
  • With the induced forecasts, is it possible to reduce storage costs and improve just-in-time production goals?
  • With implementation of modern state-of-the-art methods, can forecasting accuracy be significantly improved compared to methods already in use?
Role:
  • External support for development of proof of concept, coding and consulting on choosing most powerful and suitable methods
  • Collaboration with internal technical colleagues and permanent communication with business stakeholders
Methods/Approach:
  • Estimation of time series models with historical sales data
  • Implementation and validation of statistical forecasts with several time series modeling approaches (ARMA, ETS, Croston) and development of an R package in order to automate sales forecasts
Tools: R, forecast (R package), Excel
Results:
  • Provide a ready to use R package for automated sales forecasting and visualizations
  • Provide a methodology for forecast accuracy benchmarking, enabling comparisons of methods over time and business units and show room for improvement
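Of the time series approaches listed in the Demand Planning entry above, Croston's method is the one aimed at intermittent demand. A minimal sketch of the idea follows: demand sizes and the intervals between demands are smoothed separately, and the forecast is their ratio. The initialization (first non-zero demand and its position), the smoothing parameter value and the function name `croston` are assumptions made for this example. The project itself used the forecast package in R, whose implementation may differ in detail.

```python
def croston(demand, alpha=0.1):
    """Forecast the mean demand per period for an intermittent demand series.

    Demand sizes and inter-demand intervals are updated by simple exponential
    smoothing, and only in periods where demand is non-zero.
    """
    first = next((i for i, y in enumerate(demand) if y > 0), None)
    if first is None:
        return 0.0  # no demand observed at all

    size = float(demand[first])   # smoothed non-zero demand size
    interval = float(first + 1)   # smoothed number of periods between demands
    periods_since = 1             # periods elapsed since the last demand

    for y in demand[first + 1:]:
        if y > 0:
            size = alpha * y + (1 - alpha) * size
            interval = alpha * periods_since + (1 - alpha) * interval
            periods_since = 1
        else:
            periods_since += 1

    return size / interval


# Hypothetical series: six units are demanded every third period.
history = [0, 0, 6, 0, 0, 6, 0, 0, 6]
print("forecast per period:", croston(history))
```

Because zero-demand periods do not update the estimates, the forecast stays stable between orders instead of decaying towards zero as plain exponential smoothing would.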
2013    Demographic study on data from Switzerland (CIELO-MATH AG), STAT-UP
Consulting on methods and analysis of socio-demographical factors.
Methods/Approach: Hypothesis testing, econometric models, causal inference with instrumental variables, visualization of aggregated data
Tools: R, ggplot2
2012    Portfolio evaluation with Probabilistic Utility Models, STAT-UP
Collaboration on a research project with Prof. Walter Krämer, Technische Universität Dortmund
2012 - 2014    Clinical study of cancer patients (Klinik, Düsseldorf), STAT-UP
Support of a statistical evaluation of different surgical methods, conducting statistical tests and assessment of early diagnosis and treatment effects.
Methods/Approach: survival analysis, statistical hypothesis tests, use of parametric and non-parametric methods, logistic regression, ROC, AUC
Tools: SPSS, R, Excel
2012 - 2013    Study on behalf of a German HR association: Bundesverband der Personalmanager, STAT-UP
Provide support for an analysis of potential determinants of job satisfaction of personnel managers.
Methods/Approach: econometric modeling, causal inference
Tools: R, SPSS
2012 – 2013   Study of graduate students (Landkreise Altötting und Mühldorf a. Inn), STAT-UP
Create a concept for the questionnaire, finalize a ready-to-implement version, and implement it in an online tool. Support colleagues with data analysis and interpretation and in writing a final report for a local public institution.
Methods/Approach: analysis of correlations, statistical hypothesis testing and regression analysis
Tools: R
2012    Small aircraft market research (Diamond Aircraft Industries GmbH), STAT-UP
Support in a project for generating market and sales forecasts for products of Diamond Aircraft Industries and of its relevant competitors on the basis of internal sales data and sector databases. Provide interpretation and derive recommended actions for the sales force.
Methods/Approach: time series forecasting
Tools: R, Excel
2012    Study for modeling sales in medical technology, STAT-UP
Provide support for a study analyzing the effect of economic factors and cyclical leading indicators on demand for certain medical implants with time series models allowing for epidemiological and demographic factors.
Methods/Approach: econometric modeling, time series models
Tools: R

Local Availability

Open to travel worldwide
available from 22. April 2019, 100%