This little rant is going to piss off a lot of people in my professional circles but, well, I’m known for doing that sort of thing.
So, today, I announce my support for Donald Trump for president.
KIDDING!!! Though what follows will probably be just about as popular.
I just started tweeting at the beginning of this year, and I’m still trying to figure out—along with, I’m guessing, no small part of those in the twittersphere who both blog and tweet but don’t make a living from either—when to blog and when to tweet. Though I sense that there is some threshold, probably around three or four, where multiple tweets mean I should be blogging instead.
And such was the situation this morning where I went on a tweet-rant against the political science “math camp” concept, albeit in response to an innocent posting on the utility of computer programming from John Beieler and in no small part from being stuck at a garage for an extended period of time while it was ascertained whether the tires on my Vespa would pass state inspection.  But in fact, I’d actually had a note to write an essay on this topic back in August—now that would have really been cruel, eh?—so it’s not like this hadn’t occurred to me before.
So I’ll write it now, when it will presumably be forgotten by August, though not before some subset of you, dear readers, decide among competing graduate school offers. Cue maniacal laughter, “Fools, I will destroy them all!…”
But first, two clarifications. By “math camp” I mean the hazing exercises conducted on the part of an increasing number of political science graduate programs which, in one or two weeks, purport to impart to students, many of whom have probably had little classroom instruction taught in mathematics departments since their first year in college or even AP courses in high school, a crash course in “mathematics” which, if the material I’ve seen on syllabi and texts is any indication, goes through about the level of the first year of a graduate degree in mathematics. Having completed both an undergraduate and master’s degree in mathematics I am, to put it mildly, skeptical.
Not to be totally negative, I’ll actually just give three reasons why “math camp” is a terrible idea, and then four reasons why the time would be better spent on basic computer programming, the gist of the inspirational tweet of @johnb30. Who bears no responsibility for the rant to follow and, for all I know, loved “math camp.” Though somehow I doubt this.
1. The basic “math camp” concept is ludicrous
My mathematics education—effectively in applied mathematics and mathematical statistics—involved a total of about 50 semester credit hours, half at the undergraduate level and half graduate. That’s roughly 750 classroom hours, and once beyond the introductory level, probably at least two hours of homework (often more) for every hour in the classroom, so let’s round the total to 2,000 hours, and say fully half of it wouldn’t be considered part of a “math camp” curriculum—a very conservative estimate, based on what I’ve seen—so we’re left with material that experienced instructors using a curriculum that has been refined over the past three centuries believe requires at least 1,000 hours to master.
Political science programs claim to be able to teach this same material in 40 to 80 hours. Yeah, right.
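For anyone who wants to check the back-of-the-envelope arithmetic above, here is a minimal Python sketch. The figure of 15 classroom hours per semester credit is an assumption (it is what turns 50 credits into roughly 750 hours), and all the variable names are purely illustrative.

```python
# Back-of-the-envelope estimate of the hours involved.
# Numbers marked "from the text" come from the post; the rest are assumptions.

CREDIT_HOURS = 50               # from the text: about 50 semester credit hours
CLASS_HOURS_PER_CREDIT = 15     # assumption: ~15 classroom hours per credit
HOMEWORK_PER_CLASS_HOUR = 2     # from the text: at least two hours per classroom hour

classroom_hours = CREDIT_HOURS * CLASS_HOURS_PER_CREDIT        # 750
homework_hours = classroom_hours * HOMEWORK_PER_CLASS_HOUR     # 1500
total_hours = classroom_hours + homework_hours                 # 2250

# The text rounds the total down to 2,000 (introductory courses need less
# homework) and then discards half as outside a "math camp" curriculum.
ROUNDED_TOTAL = 2000
relevant_hours = ROUNDED_TOTAL // 2                            # 1000

# From the text: "math camp" claims to cover this in 40 to 80 hours.
MATH_CAMP_HOURS = (40, 80)
compression = [relevant_hours / hours for hours in MATH_CAMP_HOURS]

print(f"classroom hours: {classroom_hours}")
print(f"total with homework: {total_hours}")
print(f"implied compression factors: {compression}")
```

Under these assumptions, "math camp" is claiming to compress the material by a factor of somewhere between 12.5 and 25.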
2. Even if you know the basics, you can’t learn the rest of mathematics on your own because it is a complex culture
No, I’m not going post-modern on you here, since that culture is subject to a highly constrained set of rules, not “Wow, that feels good, hand me another of those candies you brought back from Denver, and wow, isn’t Derrida sooooo like cool!!!” But mathematics is an intricately linked set of rules, idioms and norms which one slowly learns through a progressive sequence of purely mental exercises that has been refined over, literally, centuries. Mathematics can certainly be taught badly—alas in the secondary education system in the US, that’s virtually the only thing one encounters—but teaching it properly is a very gradual process requiring constant feedback, attention, enculturation into professional norms and hundreds of hours of intensive practice involving oftentimes intense concentration which must also be learned. After the basic level, this is done almost entirely through the mastery of mathematical proofs, which while dependent on algebra, are largely extended exercises in formal logic, the sorts of things where—yes, this actually happens, regularly—you spend hours staring at something and running around in mental circles until, finally, the step forward is completely obvious.
3. Most of what passes for “mathematics” in political science is just very bad algebraic notation.
The first couple of years as a naive assistant professor, I actually tried to write articles in a mathematical style for political science journals. Thanks to a very accommodating committee, I’d gotten away with that in my dissertation, but it went nowhere in the journals. The grounds for rejection were real subtle: one review from a four-letter journal literally just said “Too much math, no one will understand it.” So I switched first to statistical analysis (plus some field work) and eventually to mostly doing software and data development, and did okay.
Though, I suppose “no one will understand it” was an accurate appraisal, and perhaps the reviewer was doing the four-letter journal, and maybe even me, a favor. The apparent exception to this rule was the algebraic “proofs” of the “rational choice” school, which I never really warmed to because they looked like really bad social science masquerading as even worse algebra—sort of the formal equivalent of postmodernism, which is really bad social science masquerading as even worse exposition—and in subsequent years rational choice has been thoroughly dispatched back to the netherworld by the likes of Kahneman, Tversky, Thaler and now a generation or two of skilled behavioral economists. The four-letter journals would publish long proofs that were nothing more than convoluted algebraic identities, tied together with the cookbook invocation—sort of an extended shamanistic ritual, minus (I presume) the animal sacrifices, though one occasionally wondered—of a couple of complex theorems the author almost certainly could not even begin to prove on their own, and I suppose one can still get away with some of that.
[Political science involvement in statistics, on the other hand, has taken a very different route in the decades after political methodologists, initially led by Chris Achen and John Jackson, set up their own organization and established journals that could enforce a high level of standards without penalizing authors for complexity. Full development of this took a couple of decades but it has now reached a point where some of the methodological developments which either originated in or saw extensive practical development by political scientists are at the cutting edge of applied statistical work. A possible downside of this has been that individuals trained in state-of-the-art political science methods can oftentimes find far more attractive employment outside of political science, either in more methodologically-friendly academic departments or in industry. Meanwhile in mainstream political science, “Too much math, no one will understand it” lives on.]
So, from the perspective of actually learning any mathematics, you are completely wasting your time in “math camp.” That said, you will presumably get up to speed with some remedial algebra and learn a bit of new notation, though I cannot understand where anyone got the idea that these are more effectively conveyed in isolation than in the context where they are actually used. And, of course, as is the nature of hazing exercises, you will share the first of many, many WTF moments with a group of strangers who over the next decade will almost certainly become some of the most important people in your life, and perhaps this is all that “math camp” is really supposed to accomplish.
However, if you are in a quantitatively-oriented political science program, what you should be doing, per @johnb30, is learning more computer programming. For at least four reasons:
4. Contemporary quantitative political science is data science
And data science is now recognized (alas, I forget who first came out with this formulation) as pretty much equal amounts of statistics, machine learning, data wrangling via some toolkit of general purpose programming languages, and data visualization. You need all four, though probably not quite equally: I’d go lighter on the visualization and make sure your statistics training is both frequentist and Bayesian (and to the extent you can get away with it, your statistical practice is mostly Bayesian).
5. The journal referees won’t penalize you for the complexity of computational methods
Or at least there is now a set of journals, with the high impact ratings that will get you jobs, tenure, promotions, grants and happy deans and deanlets, that will not penalize you. Nowadays complex material can not only be put on the web, but due to replication standards, it will probably be required to be on the web. But as long as your code does what you say it does, you are unlikely to be penalized for the fact that your work is complicated. Or involves math. The emergence of R and Python as the open source data processing lingua franca has also helped a lot.
6. Once you’ve got the basics, you can—and will—learn more programming on your own.
This is a fundamental difference between mathematics and computer programming: mathematics is a highly formal and complex means of communication between mathematicians, whereas programming languages are a highly formal and complex means of communication between humans and machines. But it is a two-way communication: mess up a program syntactically and the machine will let you know, or (well, when the data fairy is being uncharacteristically kind) this will be evident in the results. Furthermore, unless the pace of development slows dramatically, in ten years, or certainly twenty years, you will be doing most of your formal work in a completely different system than you are using today, and you will have learned those new skills outside of any formal educational context.  You will be able to do this easily because of the vast and ever-evolving array of open resources available on the Web: work regularly with the Web as, effectively, your assistant and technical go-fer sufficiently long and you almost start to believe in this “singularity” stuff at least in some sense.
Programming is learned by writing programs, reading code, and, critically, re-writing (“re-factoring”) your own code as you become more skillful. This is a life-long process. Or should be.
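The “machine will let you know” half of that two-way communication is easy to see for yourself. What follows is only a minimal sketch in Python: `machine_feedback` is a hypothetical helper name, and the two code snippets are invented for illustration. It hands a string of source code to Python's built-in `compile()` and reports whatever complaint comes back.

```python
def machine_feedback(source):
    """Return the interpreter's complaint about `source`, or None if it parses.

    Hypothetical helper for illustration: it only checks syntax and never
    runs the code it is given.
    """
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError as err:
        return f"SyntaxError: {err.msg}"
    return None


broken = "total = sum([1, 2, 3]"    # missing the closing parenthesis
fixed = "total = sum([1, 2, 3])"

print(machine_feedback(broken))     # some SyntaxError message; wording varies by Python version
print(machine_feedback(fixed))      # None: the machine has no complaint
```

Note that this covers only the syntactic case. A program that parses cleanly but computes the wrong thing gets no such complaint, which is where the data fairy comes in.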
7. It is worth going beyond the basics
Anyone who has programmed an Excel spreadsheet has done, well, programming at a basic level. I’ve seen political science graduate students with little or no formal programming coursework developing scripts in R or Stata at very high levels of complexity, albeit needlessly high because they could have done the tasks a lot more easily in perl or Python. Web programming used to be sort of a trailer-trash backwater, suitable for the likes of UFO-worshipping suicide cults, but after a couple of decades has evolved to high levels of sophistication that require knowledge of some underlying theoretical concepts to use effectively.
The problem with just focusing on self-taught (and peer-taught) programming is that it is easy to bypass (or only partially learn) some important concepts that go well beyond little rules of thumb like making sure every line ends with a semi-colon and absolutely never writing something where white space is syntactically important. First and foremost, data structures beyond arrays, and object-oriented programming concepts. These do need to be deliberately learned, IMHO, precisely because you can program without them—for the first couple of decades of computer programming, the entire community did—but you can be far more efficient if you know how to use them. At the secondary level, learn correct idiomatic programming in your languages of choice, both because script-based systems like R and Python are implemented with common idioms in mind (that is, idiomatic code will be better optimized) and because you need to know idioms to read code, and as almost everything you will use will be open source, you will read a lot of code.
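To make “data structures beyond arrays” a little more concrete, here is a small Python sketch under loudly stated assumptions: the event data and both function names are invented for illustration and come from no real coding system. Both functions count how often each (source, target) pair appears. The first uses only lists, the second the standard library's hash-based `collections.Counter`.

```python
from collections import Counter

# Hypothetical toy data: (source, target) actor pairs from some event dataset.
events = [("USA", "RUS"), ("RUS", "USA"), ("USA", "RUS"), ("CHN", "USA")]


def count_pairs_with_lists(events):
    """Array-only approach: rescan a list of [pair, count] entries per event."""
    counts = []
    for pair in events:
        for entry in counts:
            if entry[0] == pair:
                entry[1] += 1
                break
        else:
            # No existing entry matched, so start a new one.
            counts.append([pair, 1])
    return counts


def count_pairs_with_counter(events):
    """Idiomatic approach: Counter hashes each pair instead of rescanning."""
    return Counter(events)


list_counts = count_pairs_with_lists(events)
counter_counts = count_pairs_with_counter(events)

# Same answer either way; the difference is the work done to get there.
assert {pair: n for pair, n in list_counts} == dict(counter_counts)
print(counter_counts.most_common(1))
```

The list version rescans its running tally for every event, so the work grows roughly with the number of events times the number of distinct pairs. The `Counter` version does one hash lookup per event, and it is also the idiom you will run into when reading other people's Python.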
So Phil, this sounds great—can’t you make it just a little less snarky and put it in the form of a departmental memo to the graduate curriculum committee…oh, too late for that…well, why didn’t you put it in the form of a memo and try to get it implemented at one of those graduate programs that tolerated your lack of faith and notoriously bad attitude for decades?
Been there, did that: tried to get something like this adopted for a good quarter century to no avail, and finally just slunk out into the sunset. Or something.
Leaving you poor bastards to face, alas, “math camp.”
1. If you do make a living from blogging, like Dan Drezner, Ross Douthat or Ezra Klein, the synergy with Twitter is obvious and effective.
2. They didn’t: of the myriad ways one could discover that scooter tires need replacing, an annual inspection is probably the least painful and expensive. Despite being another one of those horrible ways that damn gov-mit intrudes on our private lives! Horrible, horrible. Running over a pedestrian or pancaking into a semi would have accomplished the same thing without all that useless bureaucracy. Damn gov-mit… I digress…
3. I got very lucky to have a junior high school math teacher who pushed beyond this, though I think he only lasted in the system a few years before departing to work in the financial sector, which I somehow suspect paid better than teaching in southern Indiana. The mathematics department at Indiana University, where I did both my undergraduate and graduate work, took pedagogy very seriously, and I also had several very skilled teachers, including Daniel Maki and Maynard Thompson, pioneers in teaching about mathematical models of social processes, Pesi R. Masani, a student of Norbert Wiener who taught a decidedly rigorous year-long probability theory course, and during a one-year visiting gig after he retired from Berkeley, the inimitable statistician Henry Scheffé who, like George E.P. Box, emphasized that in statistical analysis, it is foolish to be concerned about mice when there are tigers about.
4. And kiddos, what was worse than running programs from punched cards?: typing publication-quality equations before LaTeX. Damn whippersnappers don’t know how good they’ve got it…get off my lawn…
5. This is probably an urban legend but one of the mathematicians at Northwestern who helped develop NU’s Mathematical Methods in the Social Sciences program—which when I was involved took around nine quarter-length courses to cover what a typical “math camp” tries to do in two weeks, and this for a highly selective cohort of students—was said to have been sitting watching some rational choice dude, probably a hapless job candidate, going through a “proof” that filled three blackboards (in the era when actual blackboards still existed and would cover three or four walls of mathematics classrooms). When the “proof” was finished, he got up, erased all but the first two and last two lines, wrote two more lines in the middle, and said “That’s all you need for this proof.” Though in a later era, of course, he would have said “That’s not a proof, this is a proof.”
6. If you aren’t going for quantitative training, just spend lots and lots of time in the field, whatever “the field” corresponds to in your subject domain. Get yourself somewhere people really wish you weren’t—I think that advice can apply to pretty much everything worth studying qualitatively in political science—though please don’t get yourself killed, which could happen in places like Egypt and sooner or later probably Trump rallies. But also minimize your time in seminar rooms: they are toxic.
8. Python joke[s]…we’re just bundles of laughs, programmers…
9. An instructive example: the TABARI automated coder, written largely in C, was exceedingly fast for its time (ca. 2000) because it worked quite close to the machine level, largely doing its own memory management and working almost entirely with nested pointers. When I wrote the initial version of the PETRARCH coder as the successor to TABARI in Python, which is at a higher level of abstraction, the program was quite slow. Then last summer Clayton Norris, a computer science and linguistics student working as a summer intern at Caerus Associates, re-wrote the core of PETRARCH using contemporary data structures appropriate to both Python and the Treebank parsed input it uses, and increased the speed of the program by about a factor of ten.
10. Moi?…dream on…
11. But as I tweeted, notation alone doesn’t get you very far: knowing the common notation from mathematical statistics is like saying you’ve learned Arabic because you know the alphabet, can recognize a few common words on shop signs and restaurant menus, and will toss in a few instances of “insh’allah” (“p-value”) and “yani” (“significant”) into your conversation.