The Africa Soil Property Prediction Challenge

Brian Naughton // Mon 22 September 2014 // Filed under data // Tags open data kaggle data science

(Or, How To Kaggle)

The Africa Soil Property Prediction Challenge is a Kaggle competition in which you predict soil measurements (such as calcium levels) in various parts of Africa from infrared spectroscopy readings. Seems like a worthwhile thing to do!

The Kaggle forums are a great place to pick up information on some of the practicalities of applied machine learning. In that spirit I thought I would share the (moderately successful) IPython/scikit-learn code I used.

The notebook I used to make my predictions is available in two forms: an HTML version and an IPython notebook. When I originally ran this code it scored in the top 10% of entries, but it has since dropped well down the leaderboard. In my tests the best-performing regression method, by quite a distance, was Support Vector Regression.
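For anyone who wants to try this kind of comparison, here is a minimal sketch of cross-validating Support Vector Regression against a simple linear baseline in scikit-learn. It is not the code from the notebook: the data is synthetic (random numbers standing in for spectra), and the choice of Ridge as the baseline, the hyperparameters (C, epsilon, alpha), and the helper name cv_rmse are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the competition data (NOT the real spectra):
# each row is a "sample", each column a made-up spectral feature.
rng = np.random.default_rng(0)
n_samples, n_features = 200, 50
X = rng.normal(size=(n_samples, n_features))
# A made-up nonlinear target with a little noise.
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=n_samples)

# Candidate models. Scaling lives inside the pipeline so it is re-fit
# on each training fold and never sees the held-out fold.
models = {
    "ridge": make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
    "svr": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1)),
}


def cv_rmse(model, X, y, folds=5):
    """Mean root-mean-squared error across cross-validation folds (hypothetical helper)."""
    neg_mse = cross_val_score(model, X, y, cv=folds, scoring="neg_mean_squared_error")
    return float(np.sqrt(-neg_mse).mean())


results = {name: cv_rmse(model, X, y) for name, model in models.items()}

# Print the models from best (lowest error) to worst.
for name, rmse in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name}: CV RMSE = {rmse:.3f}")
```

Which model comes out ahead depends entirely on the data and the hyperparameters, so treat the ranking on this toy data as meaningless. The useful part is the pattern: put preprocessing in a pipeline, score every candidate with the same folds and metric, and only then pick a method.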

