Getting back on the horse…er, Python

As of this writing, I just defended and I’m considering various options for what I’ll do next. That’s a whole other story, but the important part for this post is that, probably for whatever I do, I’ll be coding.

I’ve coded a decent amount in my life. I started with dinky web stuff wayyy back, then picked up a now-tattered and probably outdated “C++ for Dummies” book in high school. I wrote small programs with that, as well as some silly things for crappy AVR projects. In college, I used mostly Java because that’s what the computer science classes I took asked for. Towards the end of college, though, I was working on my own research, and used C++ instead (why? I honestly don’t remember. Wait, I just did! My advisor had heard of some multiprocessor module for C++ that he wanted me to try, so that’s why I didn’t stick with Java).

I didn’t code a ton for my first couple years of grad school. When I began again, I don’t remember exactly how, but I got into using Mathematica (I think because I had seen a few people do what seemed like ~~magick~~ at the time during my first year of grad school; stuff I stupidly spent a while doing with pencil and paper).

Oh, Mathematica (M). What a blessing and a curse. Let me just briefly tout its blessings: it’s very fully featured. A common response I’ve gotten is “but you can do <<XYZ thing>> in any language!”, and that’s usually true — but it’s not always really easy, like it is with M. The documentation (with a few rare exceptions) is pretty excellent. What I (and I suspect most users) want most of all in a manual/doc page is examples. It drives me nuts when I go to the man page for a bash command, and it gives the command syntax in some general form; yeah, I can figure it out if I spend a few minutes, but why make me waste time parsing some command syntax? M gets this, and if you look at the doc for a function, there’s a really solid chance that the exact application you want is already one of the examples and you can just copy and paste.

The other thing is that, because it’s all part of a central program (aside from using user-generated packages, which I’ve almost never done), it follows the same syntax, is generally coherent, and works together. I’ve just been amazed time and time again when I’ve wanted to do something fairly complex, googled “Mathematica <<complex thing>>”, and found that there’s already a pretty fully featured set of functions for it: graph theory stuff, FEA stuff, 3D graphics, etc.

Here’s the thing: a lot of this is essentially just lauding the benefits of any huge, expensive piece of software. Almost all of the things I just said would apply equally well to something like Adobe Photoshop: thorough, well documented, easy to use, etc.

And this brings me to the curse of M: it is a great piece of software in many respects, but it’s proprietary and huge. The proprietary part is a big deal, because a company or school has to pay for a large number of potential users, and site licenses can be really expensive (I tried to find a number but all they have online is “request a quote”; anyone have a rough estimate?). So this eliminates a lot of potential users, like startups that want to be “lean” or whatever. Additionally, I’m guessing that for a company, having a ton of their framework depend solely on another company is something they’d like to avoid, if possible.

Briefly looking into a few career options (data science is the best example of this) and talking to people, I quickly realized how un-used Mathematica is outside of academia. I’m sure there are some users, but it’s definitely not something employers are looking for, from what I gather.

Data science seems like a very possible route to take, so I was looking into the most commonly used languages in it, and the consensus seems to be: Python and R. I went with Python for a few reasons: 1) a couple videos said that if you’re starting new to both (which I essentially am), go with Python, 2) to contradict that first point, I’m actually not starting totally fresh with Python; my experience with it is definitely minimal but I’ve used it a tiny bit, and 3) it seems like, and correct me if I’m wrong here, Python is used for lots of applications outside of data science/stats, such as application building, machine control, etc, whereas R isn’t (true? eh?).

So I’m getting back on the Python. I’m a fairly firm believer that the best method to learn a coding language (or maybe anything, really) is to just start using it. Pick a task, and try doing it. When you run into something you don’t know how to do, Google it.

(Obviously, this doesn’t work at extremes; at some level of ignorance of a subject you simply don’t know what questions you should be asking. But by now, I’ve used bits of enough languages to know concepts that tend to span all languages, to search for.)

The thing I’m starting with is good ol’ Project Euler. If you’re not familiar with it, it’s a series of coding challenges that start very simple and get harder. You can see this in the solve counts they list for each problem: the first few problems are in the several hundred thousand range; the later ones are in the ~100 range (you could argue that that’s more about most people just not being that into spending a decent amount of effort on a task with essentially no outward reward, but they actually are a lot harder). The first bunch of them are really simple, being things like string manipulation, slightly tricky sums, and little counting tasks, where you really just need to think about how you’d do it in the most naive way, and then code it (perfect for getting back into a language!)… but they quickly get devilish and far from trivial. One type they’re a fan of, when you get to the slightly trickier ones, is the problem where the naive, brute force approach is obvious, but would take an impossibly long time to calculate. However, there’s a trick involved that allows it to be calculated “on any modern computer in under a minute”, which I believe is their promise.
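To make the "obvious but slow versus trick" pattern concrete, here is a small sketch using Problem 1 (sum the multiples of 3 or 5 below a limit). That problem is easy enough to brute-force, so this only illustrates the idea on a toy scale; the function names are made up for this example.

```python
def brute_force(limit):
    """Check every number below limit: work grows linearly with limit."""
    return sum(n for n in range(limit) if n % 3 == 0 or n % 5 == 0)


def sum_of_multiples(k, limit):
    """Sum of the positive multiples of k below limit, via the arithmetic series formula."""
    m = (limit - 1) // k  # how many positive multiples of k are below limit
    return k * m * (m + 1) // 2


def closed_form(limit):
    """Inclusion-exclusion: multiples of 15 are counted twice, so subtract them once."""
    return (sum_of_multiples(3, limit)
            + sum_of_multiples(5, limit)
            - sum_of_multiples(15, limit))


# Both approaches agree; only the second stays instant for huge limits.
print(brute_force(1000), closed_form(1000))
```

The brute-force version is fine for a limit of 1000, but the closed form does the same job in a handful of arithmetic operations no matter how large the limit gets, which is the flavor of trick the harder problems tend to require.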

So I’ve done the first 25 or so problems using Python. I’m definitely going about it in a better way than I did before, trying to use neater notation (like list comprehensions rather than for loops, when I can). I think I’ve definitely benefited from my time with Mathematica, which has a strong emphasis on that type of stuff (for example, using inline functions and the Map shorthand /@).
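As a rough illustration of the difference in notation, here is a sketch of one small task that shows up in several Project Euler problems: splitting an integer into its digits. The function names are hypothetical, and all three versions are meant to return the same list.

```python
def digits_loop(n):
    """Digits of a non-negative integer, built up with an explicit for loop."""
    result = []
    for ch in str(n):
        result.append(int(ch))
    return result


def digits_comprehension(n):
    """Same thing as a list comprehension."""
    return [int(ch) for ch in str(n)]


def digits_map(n):
    """Same thing with map, which is probably the closest analogue to Mathematica's /@."""
    return list(map(int, str(n)))


print(digits_comprehension(2 ** 15))       # 2**15 is 32768 -> [3, 2, 7, 6, 8]
print(sum(digits_comprehension(2 ** 15)))  # 26
```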

Overall, it’s going pretty well and I’m liking it. I remember not liking whitespace-based syntax (or whatever it’s called), but I’m finding that with even a basic text editor like Notepad++ or Atom, it’s actually pretty neat.

But of course I have a couple complaints, so let me kvetch a bit.

First, there seems to be a dearth of simple solutions for simple problems that I’d expect to be kind of common. For example, in a few PE problems, I had a list of lists of the same length (so, a matrix), that I wanted to transpose. Now, in M, you’d literally just do Transpose@mat. However, I was having trouble finding how to do it neatly in Python. Basically, the exact problem being asked about here. Now, what I’m looking for is something nice and simple like one of the answers given:

import numpy as np

a = np.array([(1,2,3), (4,5,6)])

b = a.transpose()

But unfortunately, that answer hands you back a NumPy array rather than a plain list of lists (which I suspect is why the OP in that post didn’t choose it). To be fair, np.array will happily take square-bracketed lists as input, not just the tuples in that example, so the input form isn’t really the issue. The issue is that I’d have to convert my matrix into an np array, and then convert it back (with something like .tolist()) if I wanted a list of lists at the end. So now we’re talking about two extra operations. I guess I could have built it as an np array from the get-go, but you might not always have the option.

The solution that works for this is:

>>> lis = [[1,2,3],
... [4,5,6],
... [7,8,9]]
>>> [list(x) for x in zip(*lis)]
[[1, 4, 7], [2, 5, 8], [3, 6, 9]]

but… c’mon. I’m willing to accept, though, that this is Python’s style: very barebones by default, and if you want to do stuff like matrix manipulation, you have the choice of either doing it with the syntax of some package (like numpy), or doing it yourself in a somewhat ugly way.
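For what it’s worth, the ugliness can be hidden once behind a small helper. The sketch below wraps the zip idiom in a function (the name transpose is just a choice for this example, not a built-in), with a comment on why zip(*rows) does the job.

```python
def transpose(rows):
    """Transpose a list of equal-length lists, returning a new list of lists."""
    # zip(*rows) unpacks the rows as separate arguments, so zip groups the
    # i-th element of every row together; each group is one column (a tuple).
    return [list(col) for col in zip(*rows)]


mat = [[1, 2, 3],
       [4, 5, 6]]
print(transpose(mat))  # [[1, 4], [2, 5], [3, 6]]
```

One caveat: if the rows are not all the same length, zip silently stops at the shortest one, so this assumes a proper rectangular matrix.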

Another little annoyance was when I wanted to find the divisors of an integer. Again, maybe it’s silly to compare to M, but in M it’s… Divisors@myInt. For Python, my searching led me to this thread (and this one), which have some clever solutions, but they’re all ultimately something people had to design themselves. In this case, I’m not sure there’s even a simple function in a package like numpy you can use (is there? let me know if so!). I mean, it’s not a huge deal since I saved my favorite of those solutions and put it in a file with other helpful little functions that I can now import, but it’s still not ideal.
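A minimal sketch of the kind of helper this ends up being is below. It is not the solution from those threads, just plain trial division up to the square root, and it assumes a positive integer input. (If pulling in a dependency is acceptable, SymPy, a separate package from numpy, does provide a divisors function.)

```python
def divisors(n):
    """Return the positive divisors of n (assumes n >= 1) in ascending order."""
    small, large = [], []
    i = 1
    while i * i <= n:
        if n % i == 0:
            small.append(i)          # i divides n ...
            if i != n // i:
                large.append(n // i)  # ... and so does its partner n // i
        i += 1
    # large was collected in descending order, so reverse it before joining
    return small + large[::-1]


print(divisors(28))  # [1, 2, 4, 7, 14, 28]
```

Only checking up to the square root keeps it reasonably quick for the sizes of numbers that come up in the early problems.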

Anyway, that’s enough complaining for today. All that aside, Python is a pleasure to use and those are relatively minor annoyances. I’ll be updating again soon with some better practical uses of Python.
