Kaggle Housing challenge, my take

In this article, I’m doing the Kaggle Housing challenge, which is probably the second most popular after Titanic. This was very much a “keeping track of what I’m doing for learning/my own sake” thing, but by the end I’ve gotten a ranking of 178/5419 on the public leaderboard (LB). That said, this is super long because I tried a million things and it’s kind of a full log of my workflow on this problem.

I’ve really learned a bunch from going through this very carefully. I started by trying the few techniques I already knew, and then I looked at notebooks/kernels for this challenge on Kaggle. A word on these kernels: even the top-rated ones vary immensely in quality. Some are excellently explained, and you can tell the authors experimented with different approaches to get an optimal result. Others are clearly people just trying random stuff they’ve heard of, misapplying relatively basic techniques, and even copying code from other kernels. So I viewed these as loose suggestions and guideposts for techniques.

Grouping IMDb top movies by runtime

Howdy!

This is a fun lil one. For an upcoming article, I need a list of (hopefully good) movies I haven’t yet seen, with similar runtimes. Now, I could have just scrolled down the list of IMDb.com’s top 250 movies, Ctrl+clicking on the ones I haven’t seen, and then compared them by eye, because, to be honest, I think I’ve seen many (/most?) of them (we’ll see shortly).

Dimensionality reduction via Principal Component Analysis in Python on face images

Hey there! It’s been a while since I wrote anything other than stuff about travel (oh, don’t you worry, there’s still more of that coming!), so it feels good to write about something like this.

Right now, I’m almost finished with the Andrew Ng Machine Learning course on Coursera. Maybe I’ll write about it sometime, but it’s really, really solid and I’m learning a lot. He’s pretty great at explaining concepts and the course is constructed pretty well. What I really like is that, for the assignments, he’ll take the concept from that week and demonstrate a really interesting application of it (even if it’s a little contrived and may not actually be a practical use for it). Either way, it just gets me to think about the breadth of what this stuff can be applied to.

Getting back on the horse…er, Python

As of this writing, I just defended and I’m considering various options for what I’ll do next. That’s a whole other story, but the important part for this post is that, probably for whatever I do, I’ll be coding.

I’ve coded a decent amount in my life. I started with dinky web stuff wayyy back, then picked up a now-tattered and probably outdated “C++ for Dummies” book in high school. I wrote small programs with that, as well as some silly things for crappy AVR projects. In college, I used mostly Java because that’s what the computer science classes I took asked for. Towards the end of college, though, I was working on my own research, and used C++ instead (why? I honestly don’t remember. Wait, I just did! My advisor had heard of some multiprocessor module for C++ that he wanted me to try, so that’s why I didn’t stick with Java).