Data Leakage, Part I: Think You Have a Great Machine Learning Model? Think Again

Waqas Habib October 29, 2018

Got insanely good metric scores for your classification or regression model? Chances are you have data leakage.

In this post, you will learn:

- What data leakage is
- How to detect it
- How to avoid it

You were presented with a challenging problem. As a driven, gritty, aspiring data scientist, you used every tool within your reach. You gathered a reasonable amount of data. You had a considerable number of features. You were even able to come up with many additional features through feature engineering.

You used the fanciest possible machine learning model. You made sure your model didn't overfit. You properly split your dataset into training and test sets. You even used K-fold cross-validation.

You had been racking your brain for some time, and it seems that you finally had that "aha" moment. Chances are data leakage got the better of you.

You were able to get an impressive 99% AUC (Area Under the Curve) score for your classification problem. Your model has outstanding results when it comes to predicting labels for your test set: plenty of true positives and true negatives, and hardly any false positives or false negatives.
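To make the "too good to be true" AUC concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the data is synthetic, `honest_feature` and `leaky_feature` are hypothetical names, and the leak is simulated by building one feature directly from the label plus a little noise. It is not a method taken from this post, only one simple way to see how a single leaked column can produce a near-perfect score.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a random positive example
    scores higher than a random negative one (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic binary labels (assumption: roughly balanced classes).
labels = [rng.randint(0, 1) for _ in range(400)]

# A feature with no relationship to the label.
honest_feature = [rng.gauss(0.0, 1.0) for _ in labels]

# Hypothetical leaked feature: the label itself plus small noise,
# i.e. information that would not exist at prediction time.
leaky_feature = [y + rng.gauss(0.0, 0.05) for y in labels]

honest_auc = auc(honest_feature, labels)
leaky_auc = auc(leaky_feature, labels)

print(f"AUC using the honest feature alone: {honest_auc:.3f}")
print(f"AUC using the leaky feature alone:  {leaky_auc:.3f}")
```

Scoring each feature on its own like this is a cheap sanity check: a single column that nearly separates the classes by itself deserves a closer look before you trust the model built on top of it.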


This article is related to

crossvalidation,towards-data-science,machine-learning,dataleakage
