* IBAN fraud detection:
1. Stakes: Several million euros in losses caused by the addition of fraudulent
IBANs, mainly as a result of phishing.
2. Objectives: Development of a Machine Learning solution to identify risky
IBANs using customer information, IBAN history, banking transactions and connection
data. The business expected 1) a decrease in the false positive rate and 2) an increase
in coverage.
3. Approach adopted: The problem was framed as a binary classification. A Random Forest
model was selected as the best-performing option.
4. Result and contribution of the project: Coverage rose from 20% to 32% and the
false positive rate fell from 40% to only 5%, which met the expectations of the business.
The solution previously in place, based on traditional IT rules, had deteriorated
over time. Our solution is based on data science and more consistent featurization,
with regular data updates and automated monitoring of performance scores to avoid
possible degradation in detection.
5. Deployment: The solution was developed in Python and delivered to the
client as a web application built with the Flask framework. The graphical
interface lets the customer consult the detection results intuitively, with a daily
data refresh.
6. The team consisted of one consultant, working fully independently on this project,
and a project manager.
7. Technologies used: Python, Jupyter, Teradata, SQL, Dash, Flask, Scikit-Learn, Pandas.
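As an illustration of the two business metrics tracked in the IBAN project, here is a
minimal sketch in plain Python. It assumes that "coverage" means the share of fraudulent
IBANs that get flagged (recall) and that the "false positive rate" means the share of
raised alerts that concern legitimate IBANs; neither definition is spelled out above,
and the function name and sample labels are hypothetical.

```python
def coverage_and_false_positive_rate(y_true, y_pred):
    """Return (coverage, false_positive_rate) for binary labels (1 = fraudulent IBAN).

    Assumed definitions (not confirmed by the project description):
      coverage            = flagged frauds / all frauds            (recall)
      false_positive_rate = alerts on legitimate IBANs / all alerts
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    coverage = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (tp + fp) if (tp + fp) else 0.0
    return coverage, false_positive_rate


# Hypothetical toy labels, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
coverage, fpr = coverage_and_false_positive_rate(y_true, y_pred)
```

Recomputing these two figures on each daily refresh and comparing them with a reference
value is one simple way to implement the kind of automated performance control described
in item 4.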
* Video analysis (confidential R&D project)
1. Objective: The project consists of analyzing the visual content of videos and
extracting the information needed to identify abnormal behavior.
2. Tasks performed: Automatic anonymization of faces in the video, motion and object
detection, estimation of the time spent by a person in front of a target, tracking, etc.
The confidentiality of the project prevents us from giving further details.
3. Models and technologies used: Pre-trained Deep Learning models were adopted
for this need, namely YOLOv3, MobileNet-SSD and ResSSD, together with OpenCV, a reference
computer vision library for Python. The algorithm was implemented in a GPU environment.
4. The team consisted of one consultant, working fully independently on this project,
and a project manager.
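The dwell-time estimation listed among the tasks can be sketched generically as follows,
without reflecting the confidential implementation. The sketch assumes that an upstream
detector and tracker already provide one bounding box per frame for a given person (or
None when the person is not detected); the function names, the rectangular zone format
and the frame rate are hypothetical.

```python
def center_in_zone(box, zone):
    """True if the center of box (x1, y1, x2, y2) lies inside zone (x1, y1, x2, y2)."""
    cx = (box[0] + box[2]) / 2
    cy = (box[1] + box[3]) / 2
    return zone[0] <= cx <= zone[2] and zone[1] <= cy <= zone[3]


def dwell_times(boxes_per_frame, zone, fps):
    """Return the duration in seconds of each continuous stay in the zone.

    boxes_per_frame holds one bounding box per frame for a single tracked
    person, or None when the person is not detected in that frame.
    """
    stays, run = [], 0
    for box in boxes_per_frame:
        if box is not None and center_in_zone(box, zone):
            run += 1  # one more consecutive frame spent in front of the target
        elif run:
            stays.append(run / fps)  # the stay just ended: convert frames to seconds
            run = 0
    if run:
        stays.append(run / fps)  # the track ended while still inside the zone
    return stays


# Hypothetical track: 1 frame outside, 50 inside, 5 undetected, 25 inside.
zone = (100, 100, 200, 200)
inside, outside = (120, 120, 160, 180), (300, 300, 340, 360)
track = [outside] + [inside] * 50 + [None] * 5 + [inside] * 25
stays = dwell_times(track, zone, fps=25)
```

In this sketch a missed detection ends the current stay; a real pipeline might instead
tolerate a few missing frames before closing a stay.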
* Robotization of the analysis of credit files
1. Project objective: Design of an automation solution for processing loan application files.
2. The team: 2 consultants and a project manager
3. My tasks:
a. Automated scraping of financial reports from the websites of the institutions
concerned.
b. Extraction of structured data using OCR.
4. Models and technologies used: Python, Beautiful Soup, OCR, Tesseract, OpenCV.
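The report-scraping task can be illustrated with a minimal sketch that collects links to
PDF files from a page. To stay dependency-free, it uses the standard library's
html.parser instead of Beautiful Soup, which the project actually used; the class name,
the example.com URLs and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ReportLinkParser(HTMLParser):
    """Collect absolute URLs of <a> links that point to PDF files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Ignore any query string when checking the file extension.
        if href and href.lower().split("?")[0].endswith(".pdf"):
            self.links.append(urljoin(self.base_url, href))


# Hypothetical page content; a real run would first download the HTML.
html = """
<ul>
  <li><a href="/reports/annual-2019.pdf">Annual report 2019</a></li>
  <li><a href="/about.html">About</a></li>
  <li><a href="https://example.com/docs/Q4.PDF?download=1">Q4 results</a></li>
</ul>
"""
parser = ReportLinkParser("https://example.com/investors/")
parser.feed(html)
report_links = parser.links
```

The collected files would then be passed to the OCR step to extract structured data.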
* Analysis of Check Images (R&D project):
1. Objective: Development of a model that automatically analyzes check scans to decide
whether or not to lift the collection hold on check remittances placed in reserve.
2. The team was made up of two consultants and a project manager.
3. My tasks:
- Recognition and extraction of handwriting on checks.
- Creation of a dataset of images of handwritten numbers and words
- Similarity tests with target images
4. Models and technologies used: Python, Azure, OCR, Deep Learning, Autoencoders,
OpenCV, TensorFlow, Keras.
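The similarity tests against target images can be illustrated with a minimal sketch.
It assumes that each image has already been encoded as a fixed-length vector (for
example by the encoder half of an autoencoder) and that cosine similarity is the
comparison measure; the project description does not state which measure was used,
and the function names and vectors below are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def best_match(query, targets):
    """Return (name, score) of the target vector most similar to query."""
    return max(
        ((name, cosine_similarity(query, vec)) for name, vec in targets.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical 4-dimensional embeddings of two target images and one query image.
targets = {"seven": [0.9, 0.1, 0.0, 0.2], "one": [0.1, 0.8, 0.3, 0.0]}
query = [0.8, 0.2, 0.1, 0.2]
name, score = best_match(query, targets)
```

A threshold on the best score could then decide whether the handwritten item is
considered to match its target.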
* E-reputation (NLP project)
1. Objective: Identification and analysis of online content referring to BPCE and its
subsidiaries
2. The team was made up of a consultant, 2 trainees and a project manager.
3. Tasks:
- Web scraping of target content from press and social media sites
- Exploration and analysis of text content
- Sentiment analysis and criticism detection
- Categorization of content into topics
4. Technologies used: Python, Azure Databricks, Jupyter Notebook, NLP