Profile image of Sahishnuta Tosh, Data Scientist from Villejuif

Sahishnuta Tosh

available

Last update: 25.05.2024

Data Scientist

Graduation: not provided
Languages: English (Full Professional) | French (Elementary)

Keywords

Software Engineering, Data Pipeline, Artificial Intelligence, Airflow, Amadeus CRS, Amazon Web Services, Information Engineering, Web Scraping, Python (Programming Language), PostgreSQL

Attachments

Sahishnuta-ENJ_250524.pdf

Skills

Airflow, Amadeus, AWS, Artificial Intelligence Systems, Data Pipelines, Deep Learning, Docker, FastAPI, Flask, Git, Heroku, Data Engineering, Jupyter Notebook, Machine Learning, Matplotlib, MongoDB, NumPy, Plotly, PostgreSQL, Power BI, Prophet, PySpark, Python, PyTorch, Random Forest, SQL, Scikit-learn, Application Development, spaCy, TensorFlow, Web Applications, Web Scraping, XGBoost

Project history

01/2023 - 07/2023
Data Science Intern
Amadeus IT Group

* Business Objective: Enhanced the existing passenger traffic forecasting
process that predicts market shares and total revenues.
* Data preprocessing and analysis using PySpark and SQL.
* Built supervised ML models such as Random Forest and CatBoost, with model
explanation using SHAP.

01/2020 - 08/2021
Data Scientist
Infosys Pvt Ltd

* Business Objective: Built a 'Price Prediction' regression model to
recommend product-wise quote prices at the line-item level.
* Data collection using web scraping.
* Data preparation, analysis, preprocessing, and visualization.
* Developed ML models such as Decision Tree, Random Forest, and boosting
algorithms, and applied hyperparameter tuning to optimize the model.
* Value Creation: Infosys won the Multi-Vendor Hackathon for the "Request for
Proposal" project.

03/2018 - 09/2019
Data Scientist
L&T InfoTech

* Business Objective: Created a 'Package Pricing' model to predict the
cost to hospitals.
* Developed a classification model backed by regression and clustering
models, using algorithms such as Decision Tree, Random Forest, KNN, SVM,
and XGBoost.

* Created a product that serves as a precautionary check on whether a data acquisition system is behaving as
expected or unusually, so that the data engineering team can be alerted to investigate and act immediately.
For this, developed a univariate time series model to predict the amount of data to be acquired for multiple
business processes.
* Deployed ML models using Docker and Flask.
* Value Creation: The client reported a 7% increase in revenue on the launch of this innovative product in
Q1 2019.

04/2015 - 03/2018
Application Development Analyst
Accenture Solution Pvt Ltd

* Business Objective: Closed-Loop Automation for efficiency improvement.
* Developed and configured an application that identifies and validates users for account creation, maintenance,
and closure in the UK.
* Maintained proper documentation for the solutions and test procedures.
* Trained and mentored summer student interns on the automation tools.
* Value Creation: Helped the clients reduce workforce effort and human errors, delivering 80% efficiency.


Projects (EPITA)

* Created a Streamlit web application that generates images of a child using Pix-to-Pix and Family GAN
* Used Car Price Prediction System:
  * Predicted used car prices and deployed the ML models
  * Created a data pipeline from scratch for preprocessing and modeling
  * Used a Random Forest Regressor to predict the prices
  * Served the model online using FastAPI, with Streamlit as the web app
  * Used MLflow for model tracking and retraining
  * Implemented a data ingestion and prediction job using Airflow
  * Used PostgreSQL to save the predictions
* Employee Performance Analysis:
  * Built a classification model to predict the performance rating from the features
  * Created data pipelines for preprocessing and model building
  * Implemented an end-to-end EDA and visualizations for the business KPIs
  * Set up MLflow for model tracking and retraining
  * Used the SHAP library for model explanation
* Anomaly detection on pump sensor data using heuristic and statistical approaches, as well as the Isolation Forest
and Prophet libraries. Also applied stationarity testing and implemented forecasting using AR, Moving Average,
ARIMA, Auto-ARIMA, and others

Local Availability

Only available for remote work