Saturday, 19 September 2015

Facebook: Find the robots



Each year, Facebook uses the Kaggle platform to find its future data scientists.


The best participants may be invited to a job interview. What makes this contest interesting is that Kaggle members could not share tricks or models, which means your ranking reflects your own work rather than a copied program or method.


The task is to determine whether a bidder is a human or a robot, using bidding data provided by Facebook.
The score is the area under the ROC curve, which quantifies the model's overall ability to discriminate between humans and robots.
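
To make the metric concrete, here is a minimal sketch of the area under the ROC curve, computed as the probability that a randomly chosen robot receives a higher score than a randomly chosen human. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration; they are not the competition data or the official scoring code.

```python
def roc_auc(labels, scores):
    """Probability that a random positive (robot, label 1) outscores
    a random negative (human, label 0); ties count for one half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example (made-up values): 1 = robot, 0 = human.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```

A score of 0.5 corresponds to random guessing and 1.0 to a perfect ranking of robots above humans. The pairwise loop is quadratic and only meant for reading; a library routine would be the practical choice on real data.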

At the end of the competition, my score was 0.92, close to the best score of 0.94. I ranked 198th out of 989, not bad.

Monday, 18 May 2015

Using Python to build a CART algorithm

In this article, I describe how to code the CART algorithm in Python. I think building your own algorithm is a good exercise for improving your coding skills and your understanding of trees. It is not the fastest implementation, but it is enough to understand CART and object-oriented programming.


CART algorithm

In 1984, Breiman et al. first described the CART algorithm, short for Classification and Regression Trees. It can be used for regression and for classification of a binary target.
It is a recursive algorithm: at each iteration it finds the split of the data that most improves the prediction of the target values.
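
The search for the best split at one iteration can be sketched as follows, for a single numeric feature and a binary target. This sketch assumes Gini impurity as the split criterion, which is the usual choice for CART classification but is not stated above; the helper names `gini` and `best_split` and the toy data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means the group is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Try every midpoint between consecutive distinct values and return
    (threshold, weighted_gini) for the split 'value <= threshold' with the
    lowest impurity. Returns (None, parent_gini) if no split improves it."""
    n = len(labels)
    best_threshold, best_score = None, gini(labels)
    candidates = sorted(set(values))
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2.0
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Toy data (made up): small values belong to class 0, large ones to class 1.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = [0, 0, 0, 1, 1, 1]
print(best_split(values, labels))  # (6.5, 0.0)
```

The recursion then applies the same search to the left and right groups until a stopping rule is met, for example when a group is pure or too small.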


Tree definition

A tree is composed of nodes. Each node has either one or two children, or is a leaf node, that is, a node with no children.

I store each Node object in a list:
Tree = [node1, node2, node3, node4, ...]
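
One possible shape for such a Node object and the list that holds it is sketched below. The attribute names (`index`, `left`, `right`, `prediction`) and the convention of referring to children by their position in the list are assumptions for illustration, not necessarily the layout used in the rest of this article.

```python
class Node:
    """One tree node; children are referenced by their index in the tree list."""

    def __init__(self, index, left=None, right=None, prediction=None):
        self.index = index            # position of this node in the tree list
        self.left = left              # index of the left child, or None
        self.right = right            # index of the right child, or None
        self.prediction = prediction  # predicted value, set on leaf nodes

    def is_leaf(self):
        # A leaf is a node with no children.
        return self.left is None and self.right is None


# A tiny tree: a root with two leaf children.
tree = [
    Node(0, left=1, right=2),
    Node(1, prediction=0),
    Node(2, prediction=1),
]
leaves = [node.index for node in tree if node.is_leaf()]
print(leaves)  # [1, 2]
```

Storing indices rather than object references keeps the list the single source of truth, so a node can always be looked up with `tree[i]`.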