Parul S. - Data Scientist | Machine Learning | Python Developer

ABOUT ME

I am a highly motivated and hardworking Data Scientist.
Most recently, I developed an algorithm for biometric authentication with features
such as face recognition, speaker recognition, liveness detection, and gender
detection. I also completed a research project titled “Speaker Recognition”, in which I
developed a new technique to increase the accuracy of speaker recognition. In addition, I
worked on a credit underwriting model for providing loans to customers; its logistic
regression model reached 86% accuracy on the given testing and validation data.

I have a passion for data science and machine learning.
The process I follow when building machine learning models:
1) Data fetching
2) Data cleaning (checking whether the data has missing values, and whether the dataset is very diverse with a very low event rate in the case of supervised learning)
3) Feature engineering (good features are the nucleus of a good machine learning model)
4) Checking whether the generated features lead to data leakage
5) PCA
6) Visualizing data points in a 2-D plane using t-SNE
7) Plotting graphs and checking data patterns
8) Checking correlation between different features to make sure there is no data redundancy.
9) Checking the partial dependence of various features to increase the explainability of the model.
10) Trying different models and tuning the hyperparameters using a grid search approach
11) Checking model performance using a confusion matrix, gain and lift charts, AUC, and cross-validation.
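
As a minimal sketch of steps 8 and 11, the pure-Python example below flags highly correlated feature pairs and computes a confusion matrix. The feature names, toy values, the 0.95 threshold, and the helper names are illustrative assumptions and do not come from a real project; in practice a library such as pandas or scikit-learn would usually handle this.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant feature has no defined correlation
    return cov / (spread_x * spread_y)


def redundant_pairs(features, threshold=0.95):
    """Return feature-name pairs whose |correlation| meets the threshold."""
    names = list(features)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(pearson(features[a], features[b])) >= threshold
    ]


def confusion_matrix(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = event, 0 = non-event)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == 1:
            counts["tp" if truth == 1 else "fp"] += 1
        else:
            counts["tn" if truth == 0 else "fn"] += 1
    return counts


# Toy data (hypothetical): "income_x2" is an exact multiple of "income".
features = {
    "income": [10, 20, 30, 40, 50],
    "income_x2": [20, 40, 60, 80, 100],
    "age": [25, 52, 31, 47, 38],
}
pairs = redundant_pairs(features)

# Toy labels and predictions (hypothetical).
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
cm = confusion_matrix(y_true, y_pred)
accuracy = (cm["tp"] + cm["tn"]) / len(y_true)

print(pairs)
print(cm, round(accuracy, 2))
```

A pair flagged by redundant_pairs is a candidate for dropping one of the two features. Accuracy alone can mislead when the event rate is very low, which is why the confusion matrix counts are kept separately.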

For version control I use GitHub and Bitbucket (open to changes).

I like to use boosting methods to get the best accuracy without overfitting, so most of the time you will find me using XGBoost or LightGBM.

Other than these, I am also very comfortable working with the following models: Logistic Regression, PCA, SVM, Naive Bayes, Clustering, KNN, Random Forest, LSTM, RNN, DBN, RBM, AR, MA, ARMA, and ARIMA.