So, punk, think ya can start a data science program??

This is the second part of a two-essay series addressing some of the features one might wish to include in a contemporary “data science” program using resources in existing quantitative “social science” programs. The first, a rather rambling polemic, addressed a series of more general questions about the state of the “social sciences”, concluding that, as sciences, they are quite mature, and have been so for decades: People who have played “bet your career” on the systematic study and prediction of human behavior are, shall we say, generally doing just fine.

This essay moves on to some more specific questions on how social science approaches might be adapted given the rapid developments in analytical methods that have been occurring to a large degree elsewhere, typically under the titles “machine learning” or “data science.”

These observations should be taken with, well, maybe a truckload of salt as they are based on an assortment of unsystematic primary and secondary observations of the current “data science” scene as viewed by an infinitesimally minor player located in the southernmost suburb of Washington, DC: Charlottesville, Virginia (or as we call it locally, CVille [1]). Despite my having not a lot of skin in the game, dogs in the fight, or monkeys in the circus—merely an obnoxious penchant for clichéd metaphors—I have found the rapid evolution of this “space”, as the contemporary phrasing goes, interesting to observe.

So, let’s get this show on the road, and our butt in gear. I’m generally addressing this to those who might be considering modifying one or more parts of an existing [implicitly academic] social science curriculum to be more [implicitly applied] “data science friendly.” As the number of people in that situation is rather limited—though I’m going to be talking to some in the near future—these observations may also be of some utility to the much larger number of people who are wondering “hey, what differentiates a well-trained social science statistician from a data scientist?” [Answer: mostly the title on their business card…and probably their compensation.]

As usual, seven observations, in this instance phrased as sometimes nuanced pairs moving from existing social science “statistics” approaches in the direction of a newer “data science.”  

1. “Science” vs “Engineering”

I am very much a child of the “Sputnik generation” and some of my earliest technical memories were discussions of the actual Sputnik, orbiting overhead, then watching U.S. rockets launching (and, presumably, blowing up) on our small television, then the whole space-race thing culminating in watching the first moon landing live. This was a [more civilized] period where elites, whether conservative or liberal, revered “science” and as a duly impressionable youngster, I imagined myself growing up to be “a scientist.”

Which I did, though in ways I definitely hadn’t imagined while watching rockets blow up, utilizing the fact that I was pretty good at math and was eventually able to combine this with my love of history and a good intuitive understanding of politics.  All in all, it made for a nice academic career, particularly during a period of massive developments in the field of quantitative political science.

But despite my technical credentials, I was always a bit uncomfortable with the “science” part of things, and all the more so because I don’t have an intuitive understanding of philosophy: when you hang around people who do, you quickly realize when you don’t. Sure, I can piece together an argument, much as one pieces together a grammatically correct sentence in a language one is just beginning to learn, but it’s not natural for me: I’m not fluent.

“Science” is nonetheless the prestige game in the academic world—particularly for Sputnik-generation Boomers and their slacker “Greatest Generation” overlords—and it was only after getting out of that world four years ago, and in particular into the thriving software development ecosystem in CVille, that I finally realized what I’m actually good (and intuitive) at: the problem-solving side of things. Which is to say, engineering rather than science. [2]

What’s the difference? Science is pursuing ultimate truths; engineering is solving immediate problems. I’m going to go seriously geek on you here, but consider the following two [trust me, more or less equivalent] issues:

Considering programmer time, execution time, and software maintainability, are concurrent locks or transactional memory the better solution for handling multiple threads accessing memory?

or

If I need to scale access to this database, will I screw it up?

The first is a scientific problem—mind you, probably unanswerable—and the second is an engineering problem.

Everything in an academic culture is going to point towards solving scientific problems. [3] And very slowly. Students seeking employment outside of academia need to know general scientific principles, but ultimately they are going to be hired, compensated, and promoted as engineers.

2. Statistics vs Machine Learning

As the table below shows in some detail, the statistical and machine learning (ML) approaches appear to be worlds apart, and the prospect of merging these would appear at first to be daunting:

| Feature | Statistics | Machine Learning |
|---|---|---|
| Primary objective | Determining if a variable has a “significant” effect | Classification |
| Theoretical basis | Probability theory | “Hey, I wonder if this will work??” |
| Feature space | Variable limited | Variable rich |
| Measurement | Should be careful and consistent | Whatever |
| Cases labeled? | Usually [4] | Maybe, maybe not (supervised vs unsupervised learning) |
| Heterogeneous cases? | Nooooo…. | Bring it on… |
| Explicit data generating process? | Ideally | Rarely |
| Evaluation method | Usually full sample | Usually split sample (training/test) |
| Evaluation metrics | Correlation | ROC AUC; accuracy/precision/recall |
| Importance of good fit? | Limited: objective is accurately assessing error given a model | How else will I get published? |
| Time series | Specialized models covered in 800-page books | Just another classification problem |
| Foundational result | Central limit theorem | Web scraping |
| Sainted ancestor | Carl Friedrich Gauss | Karen Spärck Jones |
| Distribution naming conventions | Long dead European males | Distributions? |
| Software naming conventions | Dysfunctionally abbreviated acronyms [7] impossible to track on Google | Annoyingly cute Millennial memes |
| Secret superpower | Client is totally dependent on us to interpret the numbers | Client doesn’t know we just downloaded the code from StackOverflow |
| Logistic regression is embarrassingly effective? | Yes | Yes |

But from another perspective, these differences may actually be a good thing: it is quite possible that ML, like a rising tide, an invasive species finding empty ecological niches, coffee spilled on a keyboard, or whatever your preferred metaphor, has simply occupied the rather substantial low ground that the more constrained and generally analytically-derived statistical approaches have left vacant. Far from being competitors, they are complements.

Thus suggesting that the answer to “statistics or machine learning?” is “both.” I think this is particularly feasible because, while ML would be an addition to a statistical curriculum that has been carefully refined over the better part of a century [8], in the applied work I’m seeing the bulk of practical ML really comes down to four methods:

  • clustering, usually with k-means
  • support vector machines
  • random forests
  • neural networks, most recently in their “deep learning” modes

These are the general methods and do not cover more specialized niches such as speech and image recognition and textual topic modeling, but the degree of focus here is truly extraordinary given the time, effort and machine cycles that have been thrown at these problems over the past fifty years. Each method has, of course, a gadzillion variations—this is the “hyperparameter” issue for which there are also ML tools—but typically just going with the defaults will get you most of the way to the best solution you can obtain with any given set of data. Which is to say, the amount of ML one absolutely has to learn to do useful work in data science is quite finite.
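To make this concrete, here is a minimal sketch of what “just going with the defaults” looks like in practice, using scikit-learn; the synthetic data, variable names, and 75/25 split are purely illustrative and not drawn from any project discussed here:

```python
# Minimal sketch: the "big four" with scikit-learn defaults on synthetic data.
# X and y are a made-up feature matrix and label vector, purely for illustration.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.cluster import KMeans
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Unsupervised: k-means never sees the labels
clusters = KMeans(n_clusters=2, n_init=10, random_state=42).fit_predict(X)

# Supervised: default hyperparameters, evaluated on the held-out split
for model in (SVC(), RandomForestClassifier(), MLPClassifier(max_iter=1000)):
    model.fit(X_train, y_train)
    print(type(model).__name__, round(model.score(X_test, y_test), 3))
```

The structure barely changes when you swap in real data, which is rather the point: the finite core really is finite.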

3. Analytical mathematics vs programming

I have [fruitlessly] addressed this topic in much greater detail elsewhere but the bottom line is that for data science applications, you certainly need to be able to comprehend algebraic notation, ideally at a point where you can rapidly and idiomatically skim equations (including matrix notation) in order to figure out where a proposed new method fits into the set of existing techniques. It certainly doesn’t hurt to have the equivalent of a couple semesters of undergraduate calculus [9], though [I think] I read somewhere that something like two-thirds of college graduates have this anyway (that figure probably includes a lot of AP credits). But beyond that, the return on investment in the standard analytical mathematical curriculum declines rapidly because most new methods are primarily developed as algorithms. [10]

The same can probably also be said for most of the formal coursework dealing with computer programming: it is very helpful to learn more than one computer language in some depth [11], learn something about data structures, including objects, and get some sense of how computers work at a level close to machine language (C is ideal for this, as well as useful more generally), but beyond that point, in the applied world, it’s really just practice, practice, practice.

While the rate of computer language and interface innovation has certainly slowed—C and Unix are now both nearing their half-century mark—it remains the case that one can be almost certain that in 2025 there will be some absolutely critical tool in wide use that only a handful (if that) of people today have even heard of. This is a completely different situation from analytical mathematics, where one could teach a perfectly acceptable calculus course using a textbook from 1850. As such, the value of intensive classroom instruction on the computer side is limited.

4. R vs Python

So, let the flame wars begin! Endlessly!

Well, maybe not. Really, you can manage just fine in data science in either R or Python (or more to the point, with the vast panoply of Python libraries for data analytics) but at least around here, the general situation is probably captured by a pitch I heard a couple weeks ago by someone informally recruiting programmers: “We’re not going to tell you what language to use, R or Python, just do good work. [long pause] Though we’re kinda transitioning to Python.”

So here’s the issue: Python is the offspring of a long and loving relationship between C and Lisp. Granted, C is the modest and dutiful daughter of the town merchant, and Lisp is the leader of a motorcycle gang, but it works for them, and we should be happy. Java, of course, is the guy who was president of student council and leaves anonymous notes on the doors of people who let their grass grow too tall. [12]

R is E.T. 

I will completely grant that R has been tremendously important in the rapid development of data analytics in the 21st century, through its sophistication, its incredible user community, and of course the fact that it is open source. The diffusion of statistical, and later machine learning, innovation made a dramatic leap when R emerged as a lingua franca to displace most proprietary systems except in legacy applications. [13]

But to most programmers, R is just plain weird, and at least I can never escape the sense that I’m writing scripts that run on top of some real programming language. [14] Whereas Python (and Java) are modern general purpose languages with all of the things you expect in a programming language—and then some. Even though, like R, both run on top of interpreters or runtimes (in Python’s case one written, once again, in C), when you are writing code it doesn’t feel like it. At least to me…and, as I watch the ever-increasing expansion of Python libraries into domains once dominated by R, I don’t think it’s just me.

Do not show any of this to any group of programmers without being aware that they will all disagree vehemently with all of it. Including the words “and” and “the.” Shall we move on?…

5. Small toy problems vs big novel problems

It’s taken me quite a while, and no small amount of frustration, to finally “get it” on this but yeah, I think I finally have: academic computer scientists like small “toy” problems (or occasionally, big “toy” problems) because those data sets are already calibrated, and the only way you can tell whether the new technique you are trying to get published is actually an improvement (or, more commonly, assess the various tradeoffs involved with the technique: rarely is anything an improvement on every dimension) is by testing it on data sets that lots of other methods already have been tested on, and which are thoroughly understood. Fair enough.

Unfortunately, that’s not what the applied world looks like—we’re back to that “science” vs “engineering” thing again—where the best opportunities, almost by definition, are likely to be those involving data that no one has looked at before and which are not well understood. If the data sets that are the focus of most computer science development and evaluation covered all of the possibilities in the novel data we’d still be okay, but almost by definition, they won’t, so we aren’t.

I realize I’m being a bit unreasonable here, as increasing the corpora of well-understood data sets is both difficult and, if computer science attitudes towards data collection are anything like those in the social sciences, largely unrewarded, but (easy for me to say…) could you at least try a bit more? Something other than irises and the emails between the long-incarcerated crooks at Enron?

6. Clean structured data vs messy unstructured data

This looks like the same problem as the above, but is a bit different, and more practical than theoretical. As everyone who has ever done applied data science work will confirm, most of one’s time (unless you’re at a shop where someone else does this for you) is spent on data preparation. Which is never exactly what you expect and, if you are dealing with data generated over an extended period of time, it probably has to be done for two or three subtly different formats (as well as coping with the two or three other format changes you missed). Very little of this can be done with standard tools, much of the work provides little or no intellectual rewards (beyond building software tools you at least imagine you might use at a later date), and it’s mostly just a long slow slog before you get to the interesting stuff.
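For a flavor of what that slog looks like, here is a minimal sketch of the “two or three subtly different formats” problem using pandas; the file names, column names, and date formats are all hypothetical, standing in for whatever your particular pile of digital offal contains:

```python
# Hypothetical illustration: the "same" data arrives in two subtly different
# CSV layouts over time, and has to be normalized before anything interesting happens.
import pandas as pd

# Older files: MM/DD/YYYY dates, location split across two columns
old = pd.read_csv("events_2012.csv", dtype=str)
old["date"] = pd.to_datetime(old["Date"], format="%m/%d/%Y")
old["location"] = old["City"].str.strip() + ", " + old["Country"].str.strip()

# Newer files: ISO dates, one combined location field, renamed columns
new = pd.read_csv("events_2016.csv", dtype=str)
new["date"] = pd.to_datetime(new["event_date"], format="%Y-%m-%d")
new["location"] = new["location_name"].str.strip()

# Coerce both into a single schema; repeat for every format change you find
# (and for the two or three you missed)
events = pd.concat([old[["date", "location"]], new[["date", "location"]]],
                   ignore_index=True).sort_values("date")
```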

This does not translate well to a classroom environment:

Welcome to my class. Before we begin, here’s a multi-gigabyte file of digital offal that is going to take a good six weeks of tedious work before you can even get it to the point where the groovy software you came here to learn can read it, and when you finally do, you’ll discover about ten assumptions you made in the initial data cleaning were incorrect, leading to two additional weeks of work, but since eight weeks is well past the drop date for this class, you’ll all be gone by that point, which is fine with me because I’d rather just be left alone in my office to work on my start-up. Enjoy.  

No, if you want to learn about technique, better to work with data you know is okay, and devote your class time to experimenting with the effects of changing the hyper-parameters.
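A minimal sketch of what that classroom exercise might look like, assuming scikit-learn and its bundled iris data (yes, those irises) standing in for “data you know is okay”:

```python
# Vary hyperparameters on a clean, well-understood data set and watch what changes.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

search = GridSearchCV(
    SVC(),
    param_grid={"C": [0.1, 1, 10, 100], "gamma": ["scale", 0.01, 0.1, 1]},
    cv=5,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```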

Again, I don’t have an obvious solution to this: Extended senior and M.A. projects with real data, possibly with teams, may be a start, though even there you are probably wasting a lot of time unless the data are already fairly clean. Or perhaps the solution is just to adopt the approach of law schools, which gradually removed all of the boring stuff to the point where what is taught in law school is all but completely irrelevant to the actual practice of law. Worked for them! Well, used to…

7. Unicorn-aspiring start-up culture vs lean and sustainable business culture

This one isn’t really an academic issue, but a more general one dealing with practical preparation for students who will be venturing into the realm of applied—which is to say, commercial—data analytics. As those who follow my tweets know, here in innocent little CVille we recently completed an annual affair I referred to as the hip hipster fest, a multi-day taxpayer-subsidized [15] celebration of those who are born on third base and going through life acting like they hit a triple. It was, to a remarkable degree, a parody of itself [16]—the square-jawed hedge fund managers holding court in invitation-only lunches, the Martha Stewart wannabee (and well on her way) arguing the future of the city lay in the creation of large numbers of seasonal, minimum wage jobs catering to the fantasies of the 1%, the venture capitalist on stage in flip-flops who couldn’t complete a paragraph without using the F-word at least once. [17] Everywhere the same monotonously stereotypical message: aim big, don’t be afraid to fail!

Yeah right. Don’t be afraid to fail, so long as you come from a family of highly educated professionals, went to a private high school, Princeton, Oxford, Harvard, and married someone who had tenure at Harvard. Under those circumstances, yeah, I’d say you probably don’t need to be afraid to fail.

Everyone else: perhaps a little more caution is in order. And oh by the way, all those people telling you not to be afraid to fail?: if you succeed, they are going to get the lion’s share of the wealth you generate. And if you do fail—and in these ventures, the odds are absolutely overwhelming this will be the outcome—they’ll toss you aside like a used kleenex and head out looking for the next wide-eyed sucker “not afraid to fail.” Welcome to the real world.

So over the past few years I’ve come around to seeing all these things—endlessly celebrated in the popular media—as more than a bit of a scam, and have formulated my alternative: train people to create businesses where, as much as possible, you can sustainably be assured that you can extract compensation roughly at the level of your marginal contribution. [18] That’s not a unicorn-aspiring start-up—leave those for the sons and daughters of the 1%—and it is certainly not the “gig economy,” where the entire business plan involves some company making sure you are getting far less than your marginal contribution, buying low (you) and selling high (them). Stay away from both and just repeat “Lean and sustainable: get compensated at the rate of your marginal contribution.”

It’s an old argument—Joe Dominguez and Vicki Robin’s 1990s best-seller Your Money or Your Life focused on exactly this principle, and in his own hypocritical fashion, it was the gist of Thomas Jefferson’s glorification of the [unenslaved] yeoman farmer as the foundation of a liberal democracy. Mind you, in point of fact Jefferson was born into wealth and married into even more wealth, and didn’t have the business acumen to run a 10-cent lemonade stand, whereas his nemesis Alexander Hamilton actually worked his way up from the utter dregs of poverty so, well, “it’s complicated,” but—as with so much of Jefferson—there is a lot useful in what he said, if not in what he actually did.

Getting back to the present, if you want to really help the data scientists you are planning to send out into the world, tell them not to get suckered into the fantasies of a fairy-tale start-up, and instead acquire the practical skills needed to create and run a business that can sustain itself—with minimum external funding, since banks and hedge funds are certainly not going to loan to the likes of you!—for a decade or more. Basic accounting, not venture capital; local marketing, not viral social networking; basic incorporation, payroll and tax law [19], not outsourcing these tasks to guys in expensive suits doing lines of cocaine. And fundamentally, finding a viable business niche that you can occupy, hold, and with the right set of skills and luck, expand over a period of years or decades, not just selling some half-baked idea to the uncle of one of your frat brothers over vodka shots in a strip club. [20]

A completely different approach than promoting start-up culture, and it’s not going to get on the front pages of business magazines sold in airline terminals but, you know, it might just be what your students actually need. [21][22]

And such an approach might also begin to put a dent in the rise of economic inequality through a process more benign than revolution, war, or economic catastrophe. That would be sorta nice as well, eh?

Footnotes

1. Or C-Ville or Cville.

2. One of the [few] interesting conversations I had at the recent CVille hip hipster fest—mocked in extended detail below—was with a young—well, young compared to me, though that’s true of most things now other than sequoia trees—African-American woman working as a mechanical engineer at a local company. Well, contractor. Well, in a SCIF, and probably developing new cruise missile engines, but this is CVille, right? Still, it’s Hidden Figures, MMXVII. Anyway, our conversation was how great CVille was because of the large community of people who work on solving complex technical problems, and how helpful it was to be surrounded by people who understood doing that for a living, even though the applied domains might vary widely. A very different sort of conversation than I would have had as an academic.

3. More of an issue for the previous essay, as the organization-which-shall-not-be-named is obsessed with this, but a somewhat related issue here is the irrelevance of “grand theory” in applied social science.

Let’s start with a little thought experiment: You’re shopping for a car. What is “the grand theory of the car” of General Motors compared to Toyota? Honda versus Volkswagen? And just how much do these “grand theories of the car” factor into your buying decision?

Not in the least, right? Because there are no grand theories of the car, but instead multiple interlocking theories of the various components and systems that go into a car. Granted, the marketing people—most of whom probably couldn’t even handle the engineering challenges involved in changing a tire—would dearly love you to believe that, at astonishingly fundamental levels, a Toyota is fantastically distinct from a Honda and this is because of the deep cultural and, yea, spiritual differences between the Toyota way—the Dao of Toyota—and the Honda way, but that’s all it is: marketing. Which is to say, crap. They’re just making cars, and cars are made of parts.

And so it is with humans and human societies, which have evolved with a wide variety of components to solve a wide variety of problems. Some of these solutions are convergent—somewhere in one of Steven Pinker’s books, I think The Language Instinct, he has a list some anthropologist put together of characteristics of human societies that appear to be almost universal, and it goes for about four pages—and in other cases quite divergent (e.g. the dominant Eastern and Western religious traditions are diametrically opposed on both the existence of a single omnipotent deity and whether eternal life is something to be sought after or escaped from). There are very useful theories about individual components of human behavior, just as one can more or less identify a “theory of batteries”, “theory of tires”, or even—about as close as we come to something comprehensive—a “theory of drive trains”, notably those based on electricity and those based on internal combustion engines. These various theories overlap to limited degrees, but none is a “theory of the car,” and one doesn’t need a grand “theory of society” to systematically study human behavior.

Such theories are, in fact, generally highly counter-productive compared to the domain-specific mid-level theories. The other dirty little secret which I’ve found to be almost universally shared across disciplines involved in the study of humans—the humanities as well as the social sciences—is that individuals obsessed with grand theories are typically rather pompous, but fundamentally sad, people who don’t have the intelligence and/or experience to do anything except grand theory. And as a consequence eventually—or frequently after just one or two years in grad school—they don’t get to play in any reindeer games. Maybe for a dozen people in a generation this is not true—and the ideas of only half of even those survive beyond their generation—but for the rest: losers.

There’s a saying, I’m pretty sure shared by researchers on both sides of the political divide, about studies of the Israeli-Palestinian conflict: “People come for a week and they write a book. They come for a month and they write an article. They come for a year and they can’t write anything.” Yep.

4. There’s a fascinating article to be written—how’s that for a cop-out?—on the decline of clustering methodology (or what would now be called unsupervised models) in quantitative political science. Ironically, when computer analyses first became practical in the 1960s, one actually saw a fair amount of this because extensive, and rather ingenious, methods had been developed in the 1930s and 1940s in the fields of psychology (notably Cattell) and educational testing in order to determine latent traits based on a large number of indicators (typically test questions). These techniques were ready and waiting to be applied in new fields once computers reduced the vast amount of labor involved, and for example some of the earliest work involving clustering nation-states based on their characteristics and interactions by the late Rudolph Rummel—ahead of the curve and out of the box throughout his long career—would now seem perfectly at home as an unsupervised “big data” problem.

But these methods didn’t persist, and instead were almost completely shunted aside by frequentism (which, one will observe throughout these two essays, I also suspect may be involved in causing warts, fungal infections in gym shoes, and the recent proliferation of stink bugs) and by the 1990s had essentially disappeared in the U.S. Why?

I suspect a lot of this was the ein Volk, ein Reich, ein Führer approach that many of the proponents of quantitative methods adopted to get the approach into the journals, graduate curricula and eventually undergraduate curricula. This approach required—or was certainly substantially enhanced by—simplifying the message to The One True Way of doing quantitative research: frequentist null hypothesis significance testing. Unsupervised methods, despite continuing extensive developments elsewhere [5], did not fit into that model. [6]

The other issue is probably that humans are really good at clustering without any assistance from machines—this seems to be one of the core features of our social cognition. As I noted in the previous essay, Aristotle’s empirically-based typology of governance structures holds up pretty well even 2,400 years later, and considerably better than Aristotle’s observations on mechanics and physiology. Whereas human cognition is generally terrible at probabilistic inference, so in this domain systematic methods can provide substantial added value.

5. In 2000, I went to a large international quantitative sociology meeting in Cologne and was amazed to discover—along with the fact that evening cruises on the Rhine past endless chemical plants are fun provided you drink enough beer—a huge research tradition around correspondence analysis (CA), which is a clustering and reduction-of-dimensionality method originally developed in linguistics. It was almost like being in one of those fictional mirror worlds, where everything is almost the same except for a couple key differences, and in this case all of the sophistication, specialized variations and so forth that I was seeing in North America around regression-based methods were instead being seen here in the context of CA. I was actually quite familiar with CA thanks to a software development gig I’d done earlier—at the level of paying for a kitchen remodel no less—with a Chicago-based advertising firm, for which I’d written a fairly sophisticated CA system to do—of course—market segmentation, but few of the other Society for Political Methodology folks attending (as I recall, several of us were there on a quasi-diplomatic mission) had even heard of it. I never followed up on any of this, nor ever tried to publish any CA work in political science, though in my methodology classes I usually tossed in a couple examples to get across the point that there are more things in heaven and earth, Horatio, dudes, than are dreamt of in your philosophy copy of King, Keohane and Verba.

6. This may be an urban legend—though I’m guessing it is not—but factor analysis (easily the most widely used of these methods) took a hit when it was discovered that a far-too-popular routine in the very early BMDP statistical package had a bug which, effectively, allowed you to get whatever results you wanted from your data. Also making the point that the “reproducibility crisis” is not new.

7. “LDA”: latent Dirichlet allocation or linear discriminant analysis? “NLP”: natural language processing or nonlinear programming? “CA”: content analysis or correspondence analysis?

8. Even if still uniformly detested by students…hey, get over it!…and just last night I was at yet another dinner party where yet another nursing student was going into some detail as to how much she hated her required statistics class: I have a suspicion that medical personnel disproportionately prescribe unusually invasive and painful tests, with high false positive rates, to patients who self-identify as statisticians. Across the threshold of any medical facility, I’m a software developer.

9. Or more realistically, a semester of differential calculus and a semester of linear algebra, which some programs now are configured to offer.

10. The very ad hoc, but increasingly popular, t-SNE dimensionality-reduction algorithm is a good example of this transition when compared to earlier analytically-derived methods such as principal components and correspondence analysis, which accomplished the same thing.
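A minimal illustration of the contrast, assuming scikit-learn and its bundled digits data as a stand-in: PCA has a closed-form solution via eigendecomposition of the covariance matrix, while t-SNE is an iterative optimization with no comparable analytical derivation.

```python
# Same task (project to two dimensions), two very different routes to get there.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

X, _ = load_digits(return_X_y=True)

coords_pca = PCA(n_components=2).fit_transform(X)    # analytical: eigendecomposition
coords_tsne = TSNE(n_components=2, init="pca",       # algorithmic: iterative optimization
                   random_state=0).fit_transform(X)
```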

11. While not strictly needed for data science, I think there’s much to be said for getting reasonable competence in basic interface tools: currently this would be SQL, JavaScript and PHP. More generally, the fundamental split in “IT” jobs is “front-end”—user interfaces, or UX—and “back-end”, which is the analytics side that data science deals with; in most situations, databases (SQL and its successors) are the bridge between the two.

12. Encounters of this type are one of the reasons I no longer live in Pennsylvania, though the township minders, not the neighbors, were the culprit.

13. Like those government contracts where you sense that what they’d really like to require is that all computations be done in cuneiform on clay tablets. Because clay tablets have a track record for being very stable and were more than adequate for the government of Ashurbanipal.

But they don’t require cuneiform. Just MatLab.

14. Yes, sophisticated R programmers will just write critical code in C, effectively using R as a [still very weird] data management platform, but that’s basically C programming, not R.

15. Said subsidies furtively allocated by the city council in a process outside the normal review for such requests, which would have required the hip hipster fest to be audited, but that’s someone else’s story. Ideally someone doing investigative journalism.

16. Okay, some folks who clearly knew what they were doing put together an impressive 4-hour session on machine learning. Though I’m still not clear whether the relevant descriptor here was symbiosis or parasitism.

17. Dude, they’ve got drugs to treat that sort of condition now, don’t you know?

18. Caveat: I’ve now supported myself at a comfortable income level for almost four years as a software developer and data analyst—pretty much a successful business model by most reasonable definitions—but in that time I have not once managed to create a company whose capitalization exceeded a billion dollars! Alas, my perspective on business practice is undoubtedly affected by this.

19. But please do not expose students to the Pennsylvania corporate tax form[s] RCT-101  or they will immediately decide it would be way more pleasant to, say, become a street performer who chews glass light bulbs for spare change. Small business: just say no to Pennsylvania. I digress.

20. In light of the extended discussions over the past few days about just how psychologically challenging the academic world can be—teaching usually is rewarding but the rest of the job is largely one of nearly endless rejection and insult—another decided advantage of dealing with relatively short-term engineering problems—assuming, of course, that one can solve these successfully—is that one gets a lot of immediate gratification. And in the U.S. at least, running a small business has generally positive popular connotations even if, in practice, The Establishment, and both political parties (small business ≠ pay-to-play), and certainly Mr. Stiff-the-contractors President are very hostile, though probably no more so than they are towards academics. So individuals following this path are likely to be happier.

21. As it is partially paid for by tax dollars, I’d like to see the CVille hip hipster fest showcase some guy who graduated from the local community college and now runs his own welding shop, or some woman from similar circumstances who is starting up a landscaping company. I’d also like to see pigs fly, which I’m guessing will happen first.

22. A very concrete example of this problem arose about a week later when I was attending a local geek-fest whose purpose is largely, if unofficially, recruiting, and while I’m not [currently] recruiting, I had some interesting chats with some folks who expressed interest in what I’m doing (and I’ve realized that compared to what most people in data science end up doing, the tasks typically undertaken by Parus Analytics, a really, really small little operation, actually are quite interesting…), so I asked them whether they had business cards.

They didn’t have business cards.

Look, I am not going off on an extended screed about stupid Millennials, how could you not have business cards! (“Scotty, are the snark generators fully charged and engaged?” “Aye, Capt’n, and I’ve got them set at 11” “Excellent, so….GET OFF MY YARD!!!” I digress…) No, my point is that those sweet and helpful people who are telling Millennials to network, lean-in, and don’t-be-afraid-to-fail-because-you-have-a-large-trust-fund,-don’t-you? should also be advising “Don’t go to a networking/recruiting event without business cards,” and, while you’re on the topic, point out that 200 nicely individualized (from the zillion available templates) business cards cost about the same as a growler of craft beer, take five minutes to set up and order, and if you hit a sale (likely common around graduation time), run about the cost of a pint of craft beer.

But in the absence of a business card—yes, they seem very archaic but it’s hard to beat the combination of cost, bandwidth and specificity—the person you are trying to impress is far less likely to remember your name, and will probably misspell the URL of that web site you so carefully developed and instead get onto the home page of some Goth band whose leader bites off the heads of bats and whose drummer sells products for Amway.

Just saying…


Yes, Virginia, the social and data sciences are “science”

Dedicated to the memory of Will H. Moore

This is the first in a two-part series on leveraging quantitative social science programs to provide training in data science, inspired by a recent invitation to provide input on that topic at an undisclosed location outside of Trumpland. Where I may wish to seek asylum at some point, so I don’t want to mess it up.

In the process of working through my possible contributions, I realized my approach was predicated on the issue of whether these fields were sufficiently systematic and developed that the term “science” applied in the first place. They are—or at least the social science parts are—though as far as leveraging goes, as we shall see, the argument is more nuanced. [1] That’s for the next essay, however.

The “Is social science really science?” issue, on the other hand, never seems to go away, even as the arguments against it grow older and dumber with each passing year. In fact just a couple weeks ago I was out at a workshop in…well, let’s just be nice here: you know who you are…and once again had experiences that can best be compared to a situation where NASA has assembled a bunch of people to assess a forthcoming mission to a comet, and the conversation is going okay, if not great, and all of a sudden someone wearing the cap and gown of a medieval scholastic pipes up with “Excuse me, but your diagram has everything wrong: the Earth is at the center of the planetary system, not the Sun!” Harumph…thank you. Despite the interruption, the conversation continues, if more subdued, until about fifteen minutes later another distinguished gentleman—they are always male—also in cap and gown, exclaims “This is rank foolishness! Your so-called spacecraft can’t pierce the crystalline spheres carrying the planets, and it doesn’t need to: comets are an atmospheric phenomenon, and you should just use balloons!”

Protocol in these situations, of course, requires one to listen politely while others nod in agreement, all the while thinking “That is the dumbest f’ing thing I’ve heard since, well, since the last time I was in a meeting sponsored by this organization.” And hoping at least that the lunch-time sandwiches will be good. [they were]

The parade of nonsense invariably culminates in some august personage, who certainly hasn’t been exposed to anything remotely resembling philosophy of science since hearing a couple apocryphal stories in 8th grade about Galileo, woefully intoning the mantra “I’m sorry, but it is simply impossible to study humans scientifically: they are too unpredictable.”

Yeah, brilliantly original observation Sherlock! And that’s why every morning we see Sergey Brin and Mark Zuckerberg on opposite sides of a Hwy 101 freeway entrance in Palo Alto holding cardboard signs reading “Will code for food.” So sad.

So, before proceeding to my bolt-hole essay [2], let’s pursue this larger issue in some detail with, of course, seven observations. I’m going to focus most of my remarks on the scientific development of political science, which is the case I know best, but every one of these characterizations applies to all of the social sciences (in the case of economics, lagged by a good fifty years, and demography, perhaps 75 years).

I will be the first to acknowledge, of course, that “political science” created a bit of trouble for itself from the start, when the “American Political Science Association” (APSA)—now merely a real estate venture that happens to sponsor academic conferences and journals—labelled itself in 1903 (!) as a “science” rather than, say, “government” or “politics.” This designation was partially signaling a common cause with the Progressive reaction to the corrupt machine politics of the day, but mostly because in the midst of the electrical and chemical revolutions of the late 19th century, “science” was a really cool label. Such was the zeitgeist, and the nascent APSA’s branding was no different than what its contemporary Mary Baker Eddy did in the field of religion. [3]

From this admittedly dodgy start, political science did, nonetheless, gradually develop a strong scientific tradition (as economics had about fifty years earlier), notably with Charles Merriam at the University of Chicago in the 1930s—though frankly, Aristotle’s mostly-lost empirical work on governance structures appears to have been pretty decent science even in the 4th century BCE—and from the 1960s onward, surfing a veritable technological tsunami of the computer and communications revolutions during the late 20th century. The results of these changes will be outlined below. [4]

From the perspective of formally developing a philosophy of social science, however, these developments hit at a rather bad time, as the classical logical positivist agenda, dating back to the 1920s, had ground to a slow and painful halt on the insurmountable issue of the infinite regress of ancillary assumptions: turtles all the way down. That program was replaced, for better or worse (from the perspective of systematic definitions, mostly the latter) by the historical-sociological approach of Thomas Kuhn. On the positive side, the technological imperatives/opportunities were so great that the scientific parts of the field simply barged ahead anyway and—large scale human behavior not being infinitely mutable—probably ended up about where they would have had the entire enterprise been carefully designed on consistent philosophical principles.

And thus, we see the following characteristics as defining the scientific study of human social behavior:

1. A consistent set of core methodological concepts accepted by virtually all researchers and institutions using the approach.

In the case of political science, these are conveniently embodied in the still widely used—for all intents and purposes canonical—text first published in 1994: King, Keohane and Verba, Designing Social Inquiry: Scientific Inference in Qualitative Research (KKV). [5] Apply the concepts and vocabulary of KKV (which, recursively, deal with the critical importance of concepts and vocabulary) anywhere in the world where the scientific approach to the study of political behavior is used, and you will be understood. KKV certainly has its critics, including me, because it defines the discipline’s methodology in a narrow frequentist statistical framework—albeit that approach probably still characterizes about 90% of published work in the discipline—but those debates occur on the foundations provided by KKV, who had adopted these from the earlier developments starting with Merriam and his contemporaries, and with massive input from elsewhere in the social and behavioral sciences, particularly economics and psychology.

2. A set of well-understood tools that have been applied to a wide variety of problems.

Okay, as I’ve argued extensively elsewhere [unpaywalled director’s cut here], this has probably become too narrow a set of tools, but at least the [momentously horrible in far too many applications] downsides of those tools are well understood and readily critiqued. These are taught through a very [again, probably too] standardized curriculum, both at elite graduate institutions and through specialized summer programs in the US and Europe, with the quantitative analysis program of the University of Michigan-based Inter-University Consortium for Political and Social Research dating back more than half a century. These core tool sets have experienced the expected forking and specialization through the creation of new and more advanced techniques. [6]

That core is not static: while published political science research is increasingly dominated by linear and logistic regression [7], the past two decades saw the rapid introduction and application, for example, of both Bayesian model averaging for conventional numerical analysis and latent Dirichlet allocation for textual topic modeling, the gradual introduction of various standard machine learning methods [8] and, I can say with certainty, we will soon see ample applications of various “deep learning” methods, at least from researchers at institutions that can afford to estimate these.

3. Theories and methods of inference

As with the “Science” part of “APSA,” this one’s a little complicated. Or it is if you are listening to me.

On the surface—and this is the canonical KKV line—quantitative political science has a very clear model of inference, the “null hypothesis significance testing” or “frequentist” statistical approach, which is nearly universally used. Unfortunately, frequentism is a philosophical mishmash that was simply the last-thing-standing at the end of vicious methodological debates over statistical inference during the first three decades of the 20th century, is just barely logically coherent while being endlessly misinterpreted, and its default assumptions are not really applicable to most [not all] of the problems to which it is applied. Except for that, frequentism is great.

The alternative to frequentism is Bayesian inference, which is coherent, corresponds to how most people actually think about the relationship between theories and data, and in the past forty or so years has become technically feasible, which was not the case in the 1920s, when it was relegated to some intellectual netherworld by the victorious “ABBA”—anything but Bayesian analysis—faction.
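A toy contrast on the same invented data (say, 14 “successes” in 20 trials) may make the distinction concrete; this is a sketch, not a recommended workflow:

```python
# Frequentist: probability of data at least this extreme if the true rate were 0.5.
# Bayesian: a posterior distribution over the rate itself, from a flat Beta(1, 1) prior.
# (binomtest requires scipy >= 1.7.)
from scipy import stats

successes, trials = 14, 20

p_value = stats.binomtest(successes, trials, p=0.5).pvalue

posterior = stats.beta(successes + 1, trials - successes + 1)   # Beta(15, 7)
prob_rate_above_half = 1 - posterior.cdf(0.5)

print(round(p_value, 3), round(prob_rate_above_half, 3))
```

The frequentist number answers a question about hypothetical repeated data under a null; the Bayesian number answers the question most people actually ask, namely how plausible the claim is given the data in hand.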

Finally, creeping in from the side, without—to date—any real underlying philosophy beyond sheer pragmatism, are the machine learning methods. Though betting as ever on pragmatism, a yet-to-be-specified philosophical merging of Bayesian and machine learning, which will not be particularly difficult to do, is likely to develop and could well be the dominant approach in the field by, say, 2040. Just saying.

The important point here, however, is that the issue of inference is actively taught and debated, and in the case of the frequentist-Bayesian debate, this discussion goes back more than a century. There’s a lot going on here.

4. Theories and methods for assessing causality

Contrary to the incantations of the bozos who intone “correlation is not causality” at social scientists like we’ve never heard the phrase before, causation is a fabulously complex problem: The Oxford Handbook of Causation alone is 800 pages in length, Judea Pearl’s book on the subject is 500 pages, and the Oxford Handbook series, perhaps noting that a copy of Oxford Handbook of Causation is currently (19 April 2017) priced on Amazon’s secondary market at $2,592.22 [9] also offers an additional 750-page Oxford Handbook of Causal Reasoning.

Which is to say, the question of causality is complicated, and it’s always been complicated. It’s even more complicated when dealing with social behavior because one can’t even assume a strict temporal ordering of causes and effects. [10] But as with inference, we have frameworks and vocabularies for looking at the problem, about fifty years of work approaching it using a variety of different systematic empirical methods (invariably incorporating a variety of different sets of assumptions and trade-offs), and throughout the development of the discipline, an increasingly sophisticated understanding of the issues involved.

5. Experimental methodologies, including laboratory, quasi- and synthetic

Classical experimental methods generally dropped out of political science for two or three decades: the issues of generalizing from laboratory studies—particularly those with subject pools consisting of reluctant middle-class white undergraduates—to more general behaviors seemed simply too great. But a couple decades ago the method got a second look and subsequently has developed a thriving research community using a variety of methods.

But even during the nadir of classical experiments, extensive analyses were done with “natural” or “quasi-” experiments, particularly in the policy realm. Starting in the 1990s, “synthetic” experiments using artificially matched controls (in the discipline, these are often classed in the “causation” literature) have also seen extensive work.

It is certainly the case that compared to chemistry or mechanics, it’s hard to do a true experiment in political science (though with a suitably cooperative government and suitably lax institutional review board, you can get pretty close). Last time I looked, however, it was also pretty hard to do these (outside a computer) in geology, astronomy and climatology as well. Last time I looked, no one (outside Trump administration appointees) was questioning the scientific status of those fields.

6. Consistent standards for the presentation and replication of data and results

Thanks in large part to the early and persistent efforts of Harvard’s Gary King, political science has been well ahead of the curve, by a good twenty years, in combating what is now called the “reproducibility crisis.” [11] Completely solving it, no—that won’t occur until we pry people away from thinking that loading lots of closely related variables into methods with the potential for providing wildly different results based on trivial differences in their methods of matrix inversion and gradient descent is a good idea [it isn’t]—but the checks are in place and, outside the profit-seeking journals (alas, most of them), the proper institutional expectations are as well.

The critical observation here, however, is that we can even have a reproducibility crisis: for three or four decades now quantitative political science has had a sufficiently strong foundation in shared data and methodologies that it is possible to assess whether something has gone wrong. Usually these are innocent mistakes due to carelessness and/or covariance matrices that are a bit too close to being singular, but every once in a while, not so innocent. Ending up as a story in every major publication in the country, as well as Nature and Science.

From the perspective of the scientific method, if not the individuals and institutions involved, that’s a good thing. Post-modernist approaches, I can assure you, will never experience a reproducibility crisis.

7. A mature international research community grounded in major universities and with active scientific communication through professional conferences and journals [12]

Which is to say, all of the features I’ve discussed in the previous sections have been thoroughly institutionalized, following exactly the model that goes back to the establishment of the Royal Society in London in 1660. Ironically, due to predictable institutional lock-in effects, it was necessary for the quantitative side of political science to actively break off from the APSA in the 1980s—a critical insight of John Jackson and Chris Achen, who founded and shepherded the development of the now-independent Society for Political Methodology in that period—but by the early 2000s the SPM’s journal, Political Analysis, had well eclipsed the journals of the older organizations as measured by the usual impact factor scores. [13] Graduate students at [most] elite institutions can get state-of-the-art training in methodology, and go on to post-docs and, ever so occasionally, faculty positions at other elite institutions. [14] Faculty and grad students—well, prior to the outcome of the 2016 US election—move easily between institutions in North America and Europe. [15]

 

The upshot: the social “sciences” involve a complex community dealing with multiple and continually evolving approaches and debates at both the philosophical and methodological levels. Returning to our touchstone of the Oxford Handbook series, the Oxford Handbook of Political Methodology, whose three editors include two past presidents of the Society for Political Methodology, runs to 900 pages. Oh, and since we’re discussing the social sciences more generally, did I mention that the Oxford Handbooks relevant to statistical economic modeling will set you back about 3,000 additional pages?

Do you need to master every nuance of these volumes to productively participate in debates on the future applications of social science methodology? No, you don’t need to master all of it—no one does—but before making your next clever little remark generalizing upon issues whose history and complexity you are utterly clueless about, could you please at least get some of the basics down, maybe even at the level of a first-year graduate student, and let the rest of us who know the material at levels considerably above that of a first-year graduate student get some work done?

Just saying.

As promised, we will pick up on some of the more positive aspects of this issue in the near future.

Footnotes

1. So I’m not going to be using the phrase “data science” much in this essay, though in most of the “data sciences” one is interested in human individual and social behavior, so it’s the same thing.

2. Which is to say, now drawing to a close my tortured tale of the travails which I must endure to provide you, the reader, with a seemingly endless stream of self-righteously indignant snark.

3. In the domain of religion, L. Ron Hubbard would take this approach “to 11” in the 1950s, the Maharishi Mahesh Yogi would do the same in the 1970s and so on: history doesn’t repeat but it certainly rhymes. And by the way, anyone who thinks U.S. culture is relentlessly “anti-science” hasn’t spent much time looking at advertising for cosmetics and diet programs.

4. Those comments will focus on only the scientific aspects of political science, which retains a variety of distinct approaches—for example philosophical, legal-institutional, historical, observational, and, of course, in recognition of 4-20, we must acknowledge post-modernism—rather than being exclusively scientific. Let a hundred flowers bloom: My point is simply that if because of organizational imperatives (and, perhaps, the ability to get consistent reproducible results) you want a “scientific” study of politics, these methods are extensive, sophisticated, and well developed, and this has been the case for decades.

5. Two of the authors of this now canonical text were, remarkably, from Harvard, a little institution just outside of Boston you may have heard of, and yet even so the book managed to influence the entire field: imagine that! Despite the title, it’s a guide to quantitative research: the word “qualitative” is just in there to gaslight you.

6. Which the organization-which-shall-not-be-named ignores entirely, preferring instead to pay people to develop their own independent methods no one else in the field takes seriously. Ain’t it great to have untold gobs of money? Organization-which-shall-not-be-named: pretty slick title for this essay, eh?

7. All too commonly over-specified logistic models applied, often as not, to problems which could be far more robustly studied using simple analysis-of-variance methods originally developed by Laplace in the 1770s, but no longer taught as part of the methodological canon…I digress…

8. Mind you, some of us were applying machine learning methods to conflict forecasting in the 1980s—see Valerie Hudson’s edited Artificial Intelligence and International Politics (1991)—though this didn’t catch on and, given the data and computational power we had at the time, was perhaps premature.

9. Seriously, but there’s a pretty mundane causal explanation for that sort of thing and it involves computers.

10. Classic case here in political science—the buzz-term is “endogeneity”—is determining the efficacy of political campaign expenditures: In general, more money will be spent on races that are anticipated as being close, so simple correlations will show high expenditures are associated with narrower victories, the opposite of the intended effects. There are ingenious ways for getting around this using statistical models—albeit they are particularly difficult on this problem because the true effects may actually be fairly weak, as Jeb Bush during the U.S. Republican Party primaries in 2016 was only the latest of many well-funded candidates to discover—and those methods are anything but intuitive.
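A toy simulation, with entirely made-up numbers, shows how badly a naive correlation can mislead here: give spending a genuine positive effect on the vote margin, but let money flow toward races expected to be close, and the raw correlation between spending and margin comes out negative:

```python
# Endogeneity in miniature: spending helps (+0.5 points per unit), but because
# money chases close races, the simple correlation has the opposite sign.
import numpy as np

rng = np.random.default_rng(0)
n = 5000

expected_margin = rng.normal(10, 8, n)                         # forecast margin (points)
spending = 5 - 0.3 * expected_margin + rng.normal(0, 1, n)     # closer race -> more money
actual_margin = expected_margin + 0.5 * spending + rng.normal(0, 3, n)

print(np.corrcoef(spending, actual_margin)[0, 1])              # negative, despite the true +0.5 effect
```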

11. Shout-out to Charlottesville’s own Center for Open Science.

12. Okay, so too many of those journals publish lowest-common-denominator research that is five years out of date, and they all are fiercely resisting open access, but that’s just rent-seeking behavior that could be stopped overnight with a suitable collaborative assault by about twenty-five deans and five research funders. Someday. I’m a Madisonian: I study humans, not angels.

13. Meanwhile both the political science experimentalists and social network analysts have split off from the more traditionally statistical SPM, forming their own organizations, with the text analysts probably soon to follow. Which really pisses off some people in SPM but hey, things change: go for it. Also it’s not like the SPM summer meetings, heading for their 34th consecutive year, are exactly undersubscribed.

14. Most do not and get jobs at lower ranked institutions: this is true in all disciplines and there’s even a name for it, which I can’t readily locate.

15. For whatever reason, I’ve long been more comfortable professionally with the Nordic and Germanic Europeans than with my fellow Americans. Perhaps because I write stuff like this. Or because the Europeans don’t harbor the same preconceptions about Kansas as do Americans…

I would also note that I’m using the phrase “North American and European” simply to reflect where things stand at the moment: the conservatism of the well-funded Asian universities (e.g. Japan, South Korea, older Chinese institutions) and the lack of resources  (and still more academic conservatism) in pretty much everywhere else in the world limit the possibilities for innovation. But in time this will happen: generations of graduate students from around the world have been getting trained in (and have made important contributions to) these methods, but the institutional barriers in transferring them, particularly when dealing with the study of potentially sensitive political issues (that’s you, China…), are substantial.

Posted in Uncategorized | 1 Comment

Reflecting on the suicide of Will Moore

I spent most of today working on a new blog post motivated in part by a re-tweet of a teaser for same by, well, Will Moore. Who I had seen in Phoenix only six weeks ago where he introduced my talk with—and I am now so glad I told him this at the time—one of the most thoughtful and eloquent introductions I’ve ever received. This was followed by dinner at an appropriately sleazy Tex-Mex place which Will, being Will, hoped would at least begin to match the appropriately sleazy zydeco joint he took me to outside Tallahassee a few years earlier.

Then, via Twitter, the village well of the political set, the news, and then reading through his final blog post. There’s more to say than really works on Twitter, and so that other blog post is going to wait a day or two, even though I suspect Will would be particularly fond of it. In this one, three thoughts for the living:

1. It is certainly the case that our relatively small community of experts on violent political conflict—of which Will was a part—does not have the most stressful of jobs: those go to the people sent, nowadays often repeatedly, into conflict zones by "leaders" who simply think it will send "a message", who would never in a million years ask members of their own families to do this, yet mindlessly send fellow citizens into regions and cultures they don't understand and have been given little practical preparation for, and who on returning are told simply to "get over it" or, at best, to wait at the end of a really long line. People on the receiving end of these cynical and soulless political "statements" don't fare too well either. I'm not equating our academic and research experience to that.

Yet at the same time, I suspect, particularly over the long run, this work begins to take a toll. I left academia before “trigger warnings” came into vogue, but in my courses on conflict and on defense policy, there were books where every other chapter probably deserved a trigger warning. And those were just the ones I assigned: the background reading could be far worse. In the process of coding the PITF atrocities data, every month I get to read every story from anywhere in the world of journalists getting gunned down in front of their children, people desperately searching for the bodies of their wives or husbands in the debris of marketplace car bombings, and joyful wedding parties suddenly reduced to bloody carnage because some shell went astray or someone misinterpreted the video from a distant drone. You can’t take it all in, but you can’t, and shouldn’t, just ignore it either. When I’m doing this coding, my mood is invariably somber, and every once in a while, I’ll say something and realize no, most people don’t think like this.

And so, to those in this business: we’re too small a population to ever study, but this could well have effects, and I’m sure they are not positive. Be careful, eh?

2. Since the Deaton and Case study, there's been a lot of concern—though of course, little real action—over the rise in "deaths of despair" in the white middle class. Is this another such situation? Perhaps, and Will's final blog certainly points in that direction.

My grandfather and great-uncle, in a down-and-out corner of nowhere in southwestern Indiana, both killed themselves late in life: the family saying—Hoosiers, just a bundle of laughs they are—was "The Schrodt treatment for depression is drinking a pint of Drano." So the prospect of depression—which I got close enough to a couple of times much earlier in my life to get a sense of the possibilities—is always there. I self-medicate a bit—St. John's wort during the winter months, and exercising caution in my consumption of recreational depressants—and self-meditate a lot. Both seem effective at keeping the beast at bay.

If not, there are people who understand these things far better than an affected individual can as an amateur: please avail yourself of them. When it was needed—induced in my case by stresses in the academic world—I certainly did. The Hollywood/Woody Allen/New Yorker cartoon image of endless years of talking with some balding and bearded guy while on a couch: no, it's not like that at all (or at least it wasn't for me); instead, it's just a lot of focused and sensible discussions with one or more smart and empathetic people for a few weeks or months, aimed at figuring out what is setting you off and how you might change it. My grandfather and great-uncle didn't have that sort of help available, whereas I—and I would guess virtually everyone reading this blog—did have it, easily. Use it.

Your chances of needing therapy, however, will decrease if you’ve got some community. Any community: book group, Renaissance fairs, Saturday bicycling, roller derby, helping kids learn to code. Whatever: just some group of people who will occasionally inconvenience and aggravate you but which you trust could be occasionally inconvenienced and aggravated in return. And which will for a period of time get you away from staring at screens filled with pixels and wondering how you’ll find time to read (or write) yet another article. We’re social animals: we have no more evolved to be alone than we have evolved to live underwater. Bob Putnam pointed this out twenty years ago; little changed, and the outcome was Deaton and Case.

3. My final point is one I’ve made before, and is addressed specifically to aging academics: if you are feeling like you’ve done your time, as Will clearly did based on what he said in his final blog posting, get out! Voluntarily, not [just] for the sake of the next generation, but for your own sake. I did this four years ago and can quite literally say I have not for a second regretted the decision: there’s another world out there, new things to explore, new opportunities, things you never thought you’d do, go for them. If there is one thing I wish I could have said to Will on that trip in January—though I probably did, just not with sufficient effect—that would be it.

If you are happy in academia, great!—keep at it. But don't keep at it if you are not happy: you've got golden handcuffs, and they may be golden, but they are still handcuffs. In something like Will's situation, with family responsibilities in the past, the world is open to you, and you won't be able to even imagine some of those opportunities until you let go of the routines and grasp the freedom. I could go on and on (and have) about how academia, with its rigid schedules, suffocating bureaucratic complexity, repetitive debates, endless preening and hustling, and vast time horizons stretching far into the future, is not a country for old men (nor, generally, women of any age): there's more to life than another stack of blue books and another faculty meeting considering reversing policies you had endless faculty meetings putting into place fifteen years ago, those in turn reversing policies established fifteen years before that.

Really.

When we were living in Norway, we noticed a common phrase on gravestones in little rural cemeteries: Takk for alt. Fred.  Thanks for everything. Peace.

Thanks for everything, Will. Peace.

 

Posted in Higher Education, Ramblings | 1 Comment

Reflections on Uber, brogrammers, and the effectiveness of working class programmers

The toxic “brogrammer” culture has been in the news again, initially with the blog posting of engineer Susan J. Fowler’s year-long experience with sexual harassment at Uber, the reaction of Uber’s CEO [1] Travis Kalanick who was shocked, shocked to discover he is surrounded by utter assholes, then new video proof that, shock, shock, Travis Kalanick himself is an utter asshole (as well as exhibiting a charmingly naive lack of awareness of the capabilities of dash-cams), followed by still more revelations of nefarious deeds [2], reaching a point where I wouldn’t be surprised to find the Trump administration simply deciding to hire Uber’s entire management team to fill their thousands of empty appointments. Oh, and did I mention that despite all this, Uber is losing fantastic sums of money?  Hey, you go guys!

But while Uber is perhaps unusually bad given the visibility and nominal value of the company, it is hardly unique, and in fact the bad-boy programmer is a long-running cultural meme: see for example the 1993 movie Jurassic Park, where you get a guaranteed applause line when the "programmer"—subtly named "Dennis Nedry" and played as Jabba the Hutt minus table manners—becomes Dilophosaurus chow. Two books I read last summer on the current programming start-up culture—Disrupted: My Misadventure in the Start-Up Bubble by Dan Lyons and Chaos Monkeys: Obscene Fortune and Random Failure in Silicon Valley by Antonio Garcia Martinez—are essentially book-length renditions of the same story, and the racist, misogynistic culture of Silicon Valley generally is a story dating back two or three decades. [3][22]

So, what gives here? These stories are so pervasive that those outside the real world of programming probably are asking why there is even grist for a blog entry here. But at least some of those of us on the inside find it puzzling: how could people with the personality disorders it apparently takes to get ahead in these companies ever write decent code? Fundamentally, programming requires intense concentration for extended periods, a high level of attention to unforgiving detail, and a willingness to set aside your ego when trying to improve a product. Not, shall we say, exactly consistent with the nonstop Animal House bacchanalia that characterizes these companies. How are these people possibly writing decent code?

The answer, I would submit: they aren’t. And the intensity of their craziness and exclusivity is simply a smokescreen for that fact.

Or to be a bit more nuanced, what the brogrammers are riding on—and their success will almost certainly be temporary—is the fact that the contemporary programming environment allows an individual, or a relatively small group of individuals, to do absolutely extraordinary things with relatively little effort. Hence a team of brogrammers with only average skill can ship a reasonably workable product more or less on time while spending, max, perhaps 20 hours a week doing something with code while not hung-over or otherwise incapacitated [4], and spending the rest of the time on office politics, talking sports, violating every known workplace discrimination law, and ingesting every imaginable intoxicant. All the while claiming, of course, to be working 80-hour weeks. The sheer aggressiveness with which they work to exclude people from outside the brogrammer culture is born of the necessity to keep this fact from becoming common knowledge.

But, no, no, it can’t be!: our entire computer infrastructure, the very fact VCs are throwing billions  of dollars at us, is dependent on hiring arrogant misogynistic assholes and party animals! They must be tolerated, lest we be reduced to a state where our iPhones are useful only for killing rats for food! Nooooo…

Well, bunko, let me explain: it ain’t like that any more. Mind you, it probably never was like that—and I do hope that Hidden Figures, both the book [5] and the movie, begin to correct the historical misconceptions—but it certainly isn’t that way now, because getting computers to do incredibly impressive things is really easy now. [6]

This, in turn, is primarily due to two innovations which only came into play in the past decade (conveniently just beyond the learning horizon of a substantial number of people who are investing money in brogrammer teams): open-source toolkits, and web-based crowd-sourced documentation [20], in particular a site called Stack Overflow. In a nutshell, the toolkits mean that almost any general problem you need to solve, unless it is really recent—typically meaning weeks—will already have been coded, and that code is available for free. If you have any problems getting that code to work, and it’s been around more than six months or so, the solutions for those issues are on the web as well. [7] You basically just fill in the gaps relevant to your specific application, and you’re done. 
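
A throwaway illustration of the "fill in the gaps" point (the file and column names here are obviously hypothetical): summarizing a few million rows of data by group is a completely solved problem, so the application-specific part is literally just a couple of names.

```python
# Sketch: group-level summary of a large CSV. Everything except the file name
# and column names ("rides_2016.csv", "city", "fare" -- all hypothetical) is
# generic, off-the-shelf pandas.
import pandas as pd

rides = pd.read_csv("rides_2016.csv")          # a few million rows is fine
summary = rides.groupby("city")["fare"].agg(["count", "mean", "median"])
print(summary.sort_values("count", ascending=False).head(10))
```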

Really, it’s that simple. As William Gibson puts it “The future is already here; it is just unevenly distributed.”

To explore this a bit further, consider the recent story about a Nigerian programmer being stopped at the US border by the Stasi…errr, ICE…and "asked to write a function to balance a binary tree." [8][19] As some wag on Twitter noted, the correct response to that question is "I'd look it up on Stack Overflow." [10] A slightly better question would have been "Tell me the circumstances under which you would need to balance a binary tree," but even then the correct answer would be "You first tell me a situation where I could possibly justify the time involved in setting up a binary tree rather than doing equivalent tasks with existing data structures in Python." By that point, of course, anyone giving those responses would be blindfolded, in cuffs, and in the cargo hold heading back to Lagos, but still, those are the correct answers. Assuming that a working programmer needs to be able to balance a binary tree is like refusing to hire a house painter unless they know how to make their own brushes and mix their own pigments.
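
For what it's worth, here is a sketch of the Stack Overflow answer in Python: the standard library's bisect module already gives you binary search over a sorted list plus sorted insertion, and dicts and sets (hash tables) cover most other lookup needs, so hand-rolling (let alone balancing) a tree is almost never worth the time.

```python
# Sketch: keeping a sorted collection and searching it without ever touching a
# binary tree, using only the standard library.
import bisect

scores = []
for value in [42, 7, 19, 73, 3]:
    bisect.insort(scores, value)          # insert while keeping the list sorted

print(scores)                             # [3, 7, 19, 42, 73]
print(bisect.bisect_left(scores, 19))     # binary search for 19 -> index 2
```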

With decided speed [16] the world of programming has evolved into one where people with decent training but generally average skills can do absolutely extraordinary things in very short amounts of time. This is the world of self-documenting open-source software, increasingly running in effectively universal hardware environments, which is the consequence of a series of relatively recent developments about which I will have an absolutely fantastic blog entry if I can ever get around to finishing it. [11] The necessity of retaining the arrogant irreplaceable toxic genius asshole and the swarms of party animal brogrammers indulged with a “work” environment characterized by substantial investments in toys and endless keggers was a gamble with the odds stacked against it in the best of times, and is completely unnecessary now.

Which brings us to my final point, the possibility of the emergence of the working class programmer. Working class in two respects. First, recognition that programming projects can be quite competently done by suitably trained, responsible and, well, ordinary individuals. The phrase "programmer" as a generic job description was probably last relevant in the early 1970s, before the emergence of time-sharing, or at best in the early 1980s before the emergence of graphical user interfaces. [21] After those points in time, the field fragmented—in a perfectly normal and organic fashion—into ever-increasing sub-specialties which require different sets of skills [12] but, like all professional specializations, can be mastered to a level where one can produce very competent code after about 500 to 1000 hours of training and practical experience. The notion that only a tiny number of people can achieve this is the much maligned "talent myth," [13] which leads clueless managers and VCs to seek out "the best people"—who just happen to always be white, male, and great companions, particularly while downing vodka shots at strip clubs [14]—rather than putting together a calm, predictable, and competent team with the appropriate skill sets.

These jobs are also, perhaps a bit anachronistically, "working class" in the sense that the salaries required to get these people are the sort associated, with the usual adjustments for inflation, with those of workers with specialized skills in the industrial age (with health insurance, nowadays without a pension or union), which means in the $60K to $100K range, and working 40-hour weeks or something reasonably approximating those. But also with the expectation that one will follow contemporary professional standards. There are plenty of such people around.

[Shameless self-promotion alert!!] The other thing you might at least consider is getting a few people with a bit—maybe even more than a bit?—of experience, because about 90% to 100% of the stuff you need to do on any project will be routine, not the stuff that can (in theory though rarely in practice) only be accomplished by arrogant irreplaceable toxic genius assholes. And when it comes to "routine", or even some things that aren't routine, experience makes things easy. In particular, take this quote from 2017 Super Bowl champion quarterback Tom Brady:

“I have the answers to the test now,” Brady said. “You can’t surprise me on defense. I’ve seen it all. I’ve processed 261 games, I’ve played them all. It’s an incredibly hard sport, but because the processes are right and are in place, for anyone with experience in their job, it’s not as hard as it used to be.”

Now move that into the programming realm: you’ve got a brilliant new idea that you want to pitch to—or may already have been funded by—DARPA or IARPA. But I know the same thing has been tried five times over the past forty years, and based on the current state of the technology, I know which parts look easy but aren’t, which parts can in fact be efficiently solved now but couldn’t be earlier, and I’ve got a pretty good idea where things are going to take much longer than you think. I know which of the data sets you think exist in fact don’t, and the details of twenty more data sets which would actually be more useful but which you’ve never heard of. I can’t do this with every project, just as I’m guessing Tom Brady’s tennis game isn’t particularly exceptional, but in my field, I can do it with a lot, and in other fields, there are plenty of other people just like me, for whom “it’s not as hard as it used to be.”

Just sayin…

So where do you go from here? If your shop is festering with brogrammers, follow the advice of Nellie Forbush in South Pacific and “show’em what the door is for.” Though nine times out of ten—well, unless you are Uber [17]—just enforcing the rules in the employee handbook will be sufficient. Based on what I’ve seen, these brogrammers with no discernible talents beyond office politics,  partying, and harassing women and minorities would fit in pretty well as roofers—hanging dry-wall takes too much attention to detail—and will enjoy the camaraderie of following hailstorms across the Great Plains. [15]

But, but, I can’t do that! I have to lose money! Lots of money! In order to be successful and attract vast sums of investment capital, I have to make absolutely sure my start-up is horribly inefficient, misses deadlines and ships crappy code! If I don’t do that, the VCs won’t take my company seriously, and pretty soon I’ll be hanging drywall! You…just…don’t…understand!!!

Calm down, calm down…surely we can work this out. Yes, there are vast pots of dumb money floating around, and only so many good ideas, though if you knew how to assemble a competent team for programming, not just tail-gating, more good ideas than you might think. So, how about this:  just put a bunch of Syrian refugees on your payroll, tell them just to take the money and spend their time getting their lives back together and watch while their kids rapidly learn English to a level of fluency superior to that of the average Middlebury undergraduate, and tell the VCs the Syrians have brought you secret algorithms stolen from the Russians and smuggled out of Aleppo in the final days of the siege. You know, Rogue One. Key thing is they’ll be burning money, which the VCs want to see, and unlike the brogrammers, the refugees won’t be getting in the way of people actually adding value to your product. 

This, bunko, is a pathway to being hailed as a management genius and pretty soon, a TedX talk!

Footnotes

1. Chief enabling offender

2. That’s not to say Uber doesn’t have an eventually viable business model, just as sending around a bunch large guys carrying lead pipes to threaten to break the kneecaps of owners of pizza parlors is a viable business model. [18]  But recognize it for what it is, and stop pretending that it encompasses some sort of fantastic breakthrough in software and systems design: all Uber has done is implement a pretty obvious and easily duplicated idea with a reasonable level of competence. Albeit by most accounts, the sorts of operations that dispatch guys with lead pipes to engage the owners of pizza parlor owners turn a profit. Which Uber, as noted above, does not.

3. Though within the community, the Bastard [Systems] Operator from Hell—BOFH—was a long-standing meme dating to the early 1990s. Though unlike Uber managers, the BOFH was not presumed to be disproportionately harassing women, or at least that wasn't part of the memes I was hearing—it probably does exist somewhere in the genre.

4. From an outsider’s perspective, one of the absolutely weirdest management fads (more like a cult) over the past couple of decades is “pair programming,” where you hire two programmers to do the work of one. It has its very strong advocates (and detractors)—just Google it—and doubtlessly worked in some instances. But note that this is the unscrupulous manager’s dream: “Wow, I get to hire all of my otherwise unemployable party-boy friends, and pair each one with a nerd who will not only do all of the work, but who I’ll allow them to terrorize with impunity, and things will be golden forever!” Can’t imagine that this model hasn’t been deployed in more than one situation.

5. Shout-out to Charlottesville author Margot Lee Shetterly!

6. What is programming? In my mind, if what you are doing uses logical loops and branches, you’re programming. Ada Lovelace wrote the first loop: that’s when programming started. If this statement makes no sense to you, you aren’t programming. Data structures also matter; the rest is pretty much just appropriately using an ever-changing collection of libraries and knowing how to efficiently debug code. Mostly the last.
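
For the genuinely uninitiated, a minimal illustration of what that definition means in practice:

```python
# Sketch: a loop and a branch -- if your work involves constructs like these,
# you are programming; if it doesn't, you aren't.
values = [3, 1, 4, 1, 5, 9, 2, 6]
total = 0
for v in values:        # loop
    if v % 2 == 0:      # branch
        total += v
print(total)            # sum of the even entries: 12
```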

7. We’re still on the leading edge on this, but almost all serious production applications will now be configured to run in the cloud, thus standardizing hardware and to some degree, operating systems. REST interfaces to Docker containers also seem to be emerging—at least for now—as a standard for hiding diversity in the underlying software.

8. The Stasi agent allegedly said “You don’t look like a programmer.” Which is rather making my point isn’t it?: “programmers” may be white, South Asian or Chinese, but never African. To say nothing of African-American. Which is to say, in terms of racial integration we’ve actually gone backwards from the opening scene in the movie Hidden Figures, where an archetypical redneck cop in tidewater Virginia could still be persuaded that African-American women were helping us beat the Russians. [9]

9. To say nothing of the anachronism that a [now] presumably Republican southern cop would view the Russian government with suspicion, rather than embracing them as BFF. My, how things do change…

10. For those unacquainted with Stack Overflow, it provides the equivalent of the fact that one can enter the phrase “which dinosaur killed programmer in Jurassic Park” into Google and instantly get a thorough answer, bypassing the inconvenience of watching the movie.

11. Or more precisely, editing it, as I’ve written 16 pages, and these entries should be about half that. 10 pages already written on the identity/economics crisis in the Democratic Party, 35 pages on the future of Western democracy…someday…

12. Generally not including balancing binary trees.

13. "Talent myth" links: originally the concept was Malcolm Gladwell's in 2002; here's a 2015 update from the Huffington Post http://www.huffingtonpost.com/amol-sarva/talent-is-a-myth_b_6793870.html and here's a fairly influential version specifically directed at the programming community https://lwn.net/Articles/641779/ But really, any task whose effectiveness comes from summing a series of sub-tasks for which skill levels are randomly distributed, according to any distribution (well, any with finite variance, and given enough sub-tasks), will end up with the composite effectiveness having a Gaussian (bell-shaped) distribution: that's the law! (specifically, the Central Limit Theorem). Whereas it is nearly impossible to come up with a data generating process that would produce the U-shaped curves assumed by the talent myth.
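
If you want to see that law in action, here's a quick simulation sketch; the specific skill distribution below is my arbitrary choice, and the point is precisely that the bell shape of the composite doesn't depend on it.

```python
# Sketch: sub-task skills drawn from a deliberately skewed distribution still
# produce a roughly Gaussian composite (sum), per the Central Limit Theorem.
import numpy as np

rng = np.random.default_rng(42)
n_people, n_subtasks = 100_000, 50
subtask_skill = rng.exponential(scale=1.0, size=(n_people, n_subtasks))
composite = subtask_skill.sum(axis=1)

def skewness(x):
    z = (x - x.mean()) / x.std()
    return (z ** 3).mean()

print("sub-task skewness: ", round(skewness(subtask_skill[:, 0]), 2))  # ~2: very skewed
print("composite skewness:", round(skewness(composite), 2))            # ~0.3: nearly bell-shaped
```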

14. The vast amounts of money funding keggers and tolerance of behaviors that keep HR’s lawyers awake at night are proof positive that high-income tax rates are too low, not too high. Another reliable rule-of-thumb: If you are spending more than a quarter of your time dealing with personnel conflicts or office politics, you are over-staffed.

15. Of course, where they will actually end up is in finance, not roofing—again, unless under-secretary positions are still available in the Trump administration—though that apparently doesn't provide quite the number of jobs for the well-connected party-boy incompetents as it once did. Though it is unsurprising that VCs and hedge funds find this model attractive, as it is essentially the same as their own model: put up with large amounts of stuff that will fail (as no less than Warren Buffett has pointed out, hedge funds are actually a terrible deal) in the hope that one or two will pay off big time. It is true that a few arrogant genius asshole programmers actually might, in the right set of circumstances, add value to a project, and it is just possible that maybe one of them will make your company a unicorn. But the chances are low that you've actually managed to hire such an individual, whereas the chances are quite good that the ones you actually did hire will create a sufficiently dysfunctional environment that your company will fail, or at least not do as well as it would otherwise.

16. Doing my background research, I was astonished to see that Stack Overflow dates only to 2008: it feels as though it has been around forever.

17. Or Kay and Jared Jewelers

18. As it happens—I’m in meetings where this sort of thing is discussed seriously in the context of political instability—extortion, rather than dealing drugs, is actually the most straightforward way for most gangs to make money. They also don’t make nearly as much money as you probably imagine. Then again, neither does Uber.

19. Presumably the Stasi/ICE guy saw this question somewhere on LinkedIn.

20. I'm pretty sure this seemingly magical process of self-documentation accounts for the almost cult-like infatuation seen in the tech community for a forthcoming "singularity" where networked machines effectively become conscious. But no, it's just people trying to be helpful in exchange for a bit of recognition, and that is a fundamentally human thing—in fact quite possibly one of the most important things that makes us human—not a machine thing.

21. Not long after the Macintosh went on the market, I wrote a commercially successful ancillary for what became a best-selling political science textbook. "Commercial" in the sense that it was for a major textbook publisher, and I was paid reasonably well for my efforts. "Successful" in the sense that the textbook sold well—not necessarily for reasons related to my program—and we won a couple of awards for the program, and the publisher continued it into multiple editions, eventually taking the work in-house. Writing for the Mac was a real eye-opener: less than ten years earlier, Kernighan and Ritchie had defined not only the C language, but major elements of structured programming, in only 200 pages. The books showing how to program the Mac ran to four—eventually five—thick volumes, with "every one assuming you had already mastered the others." It was a completely new world.

22. I also should note that this rant is not directed at any people or projects I’ve worked with directly: it is motivated by the very consistent set of stories coming through in the autobiographical accounts of others. I’ve certainly worked on teams that had programmers who were handsomely paid while contributing absolutely nothing to the project—and am embarrassed to say that on at least a couple of occasions (none recent!!) I’ve been that person—but they’ve been pleasant about it. I’ve also run into plenty of the arrogant irreplaceable toxic genius asshole programmer types in academic settings and come to think of it, on one occasion made the very serious mistake of letting one into a project I was directing, but in my government and commercial work have successfully avoided—or perhaps more accurately, my various project managers have created teams that have avoided—encountering them. Then again, I’ve generally worked on projects that have produced pretty decent code that does what it is supposed to do, rather than merely blowing through billions of dollars of dumb money. Funny, that.

Posted in Methodology, Programming | 1 Comment

Seven Conjectures on the State of Event Data

[This essay was originally prepared as a memo in advance of the “Workshop on the Future of the CAMEO Ontology”, Washington DC, 11 October 2016, said workshop leading to the new PLOVER specification. I had intended to post the memo to the blog as well, but, obviously, neglected to do so at the time. Better late than never, and I’ve subsequently updated it a bit. It gets rather technical in places and assumes a fairly high familiarity with event data coding and methods. Which is to say, most people encountering this blog will probably want to skip past this one.]

The purpose of this somewhat unorthodox and opinionated document [1] is to put on the table an assortment of issues dealing with event data that have been floating around over the past year in various emails, discussions over beer and the like. None of these observations are definitive: please note the word “conjecture”.

1. The world according to CAMEO will look pretty much the same using any automated event coder and any global news source

The graph below shows the CAMEO frequencies across its major categories (two-digit) using three different coders, PETRARCH-1 and PETRARCH-2 [2], and Raytheon/BBN's ACCENT (from the ICEWS data available on Dataverse), for the year 2014. This also reflects two different news sources: the two PETRARCH cases are Lexis-Nexis; ICEWS/ACCENT is Factiva, though of course there's a lot of overlap between those.

[Figure: cameo_compare — CAMEO two-digit category frequencies for PETRARCH-1, PETRARCH-2, and ICEWS/ACCENT, 2014]

Basically, “CAMEO-World” looks pretty much the same whichever coder and news source you use: the between-coder variances are completely swamped by the between-category variances. What large differences we do see are probably due to changes in definitions: for example PETRARCH-2 uses a more expansive definition of “express intent to cooperate” (CAMEO 03) than PETRARCH-1; I’m guessing BBN/ACCENT did a bunch of focused development on IEDs and/or suicide bombings so has a very large spike in “Assault” (18) and they seem to have pretty much defined away the admittedly rather amorphous “Engage in material cooperation” (06).
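
(For the record, reproducing this sort of comparison takes only a few lines of pandas; the file names and column layout below are hypothetical stand-ins for whichever event data sets you are comparing.)

```python
# Sketch: compare the distribution of two-digit CAMEO cue categories across two
# event data sets. File names and the "cameo_code" column are hypothetical.
import pandas as pd

def cue_category_proportions(path, code_col="cameo_code"):
    events = pd.read_csv(path, dtype=str)
    cue = events[code_col].str[:2]                 # collapse to two-digit categories
    return cue.value_counts(normalize=True).sort_index()

petrarch = cue_category_proportions("petrarch_2014.csv")
accent = cue_category_proportions("icews_accent_2014.csv")
print(pd.DataFrame({"PETRARCH": petrarch, "ACCENT": accent}).fillna(0.0))
```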

I think this convergence is due to a combination of three factors:

  1. News source interest, particularly the tendency of news agencies (which all of the event data projects are now getting largely unfiltered) to always produce something, so if the only thing going on in some country on a given day is a sister-city cultural exchange, that will be reported (hence the preponderance of events in the low categories). Also the age-old "when it bleeds, it leads" accounts for the spike in reports of violence (CAMEO categories 17, 18, 19).
  2. In terms of the less frequent categories, the diversity of sources the event data community is using now—as opposed to the 1990s, when the only stories the KEDS and IDEA/PANDA projects coded were from Reuters, which is tightly edited—means that as you try to get more precise language models using parsing (ACCENT and PETRARCH-2), you start missing stories that are written in non-standard English that would be caught by looser systems (PETRARCH-1 and TABARI). Or at least this is true proportionally: on a case-by-case basis, ACCENT could well be getting a lot more stories than PETRARCH-2 (alas, without access to the corpus they are coding, I don't know), but for whatever reason, once you look at proportions, nothing really changes except where there is a really concentrated effort (e.g. category 18) or changes in definitions (ACCENT on category 06; PETRARCH-2 on category 03).
  3. I'm guessing (again, we'd need the ICEWS corpus to check, and that is unavailable due to the usual IP constraints) all of the systems have similar performance in not coding sports stories, wedding announcements, recipes, etc.: I know PETRARCH-1 and PETRARCH-2 have about a 95% agreement on whether a story contains an event, but a much lower agreement on exactly what the event is. The various coding systems probably also have a fairly high agreement, at least at the nation-state level, on which actors are involved.

2. There is no point in coding an indicator unless it is reproducible, has utility, and can be coded from literal text

IMHO, a lot of the apparent disagreements within the event data community about coding of specific texts, as well as the differences between the coding systems more generally stem from trying to code things that either can’t be consistently coded at all—by human or automated systems—or which will never be used. We should really not try to code anything unless it satisfies the following criteria:

  • It can be consistently categorized by human coders on multiple projects working with material from multiple sources who are guided solely by the written documentation. I.e., no project-level "coding culture" and no "I know it when I see it"; also see the discussion below on how little we know about true human coding accuracy.
  • The coded indicators are useful to someone in some model (which probably also puts a lower bound on the frequency with which a code will be found in the news texts). In particular, CAMEO has over 200 categories but I don’t think I’ve ever seen a published analysis that doesn’t either collapse these into the two-digit top-level cue categories, or more frequently the even more general “quad” or “penta” categories (“verbal cooperation” etc.), or else pick out one or two very specific categories. [3]
  • It can be derived from the literal text of a story (or, ideally, sentence): the coding of the indicators should not require background knowledge except for information explicitly embedded in the patterns, dictionaries, models or whatever ancillary information is used by the automated system. Ideally, this information should be available in open source files that can be examined by users of the data.

If an indicator satisfies those criteria, I think we usually will find we have the ability to create automated extractors/classifiers for it, and to do so without a lot of customized development: picking a number out of the air, one should be able to develop a coder/extractor using pre-existing code (and models or dictionaries, if needed) for at least 75% of the system.
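
On the utility point above: the collapse into two-digit cue categories or quad categories that published analyses actually use is, mechanically, trivial. A minimal sketch, with the quad cutpoints stated as my assumption (check them against whatever codebook you are actually using):

```python
# Sketch: collapse CAMEO codes to quad classes. The cutpoints below (01-05,
# 06-08, 09-13, 14-20) are the ones I believe are most commonly used, but
# treat them as an assumption rather than gospel.
QUAD = {}
QUAD.update({f"{i:02d}": "verbal cooperation" for i in range(1, 6)})
QUAD.update({f"{i:02d}": "material cooperation" for i in range(6, 9)})
QUAD.update({f"{i:02d}": "verbal conflict" for i in range(9, 14)})
QUAD.update({f"{i:02d}": "material conflict" for i in range(14, 21)})

def quad_class(cameo_code: str) -> str:
    """Collapse a full CAMEO code (e.g. '1823') to its quad class."""
    return QUAD.get(cameo_code[:2], "unknown")

print(quad_class("036"), "|", quad_class("1823"))   # verbal cooperation | material conflict
```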

3. There is a rapidly diminishing return on additional English-language news sources beyond the major international sources

Back in the 1990s, with the beginnings of the expansion of the availability of news sources in aggregators and on the Web, the KEDS project at the University of Kansas was finally able to start using some local English-language sources in addition to Reuters, where we'd done our initial development. We were very surprised to find that while these occasionally contributed new events, they did not do so uniformly, and in most instances the international sources (Reuters and AFP at the time) actually gave us substantially more events, and event streams more consistent with what we'd expected to see (we were coding conflicts in the former Yugoslavia, eastern Mediterranean, and West Africa). This is probably due to the following:

  1. The best “international” reporters and the best “local” reporters are typically the same people: the international agencies don’t hire some whiskey-soaked character from a Graham Greene novel to sit in the bar of a fleabag hotel near the national palace, but instead hire local “stringers” who are established journalists, often the best in the country and delighted to be paid in hard currency. [19]
  2. Even if they don’t have stringers in place, international sources will reprint salient local stories, and this is probably even more true now that most of those print sources have web pages.
  3. The local media sources are frequently owned by elites who do not like to report bad news (or spin their own alt-fact version of it), and/or are subject to explicit or implicit government censorship.
  4. Wire-service sourcing is usually anonymous, which substantially enhances the life expectancy of reporters in areas where local interests have been known to react violently to coverage they do not like.
  5. The English and reporting style in local papers often differs significantly from international style, so even when these local stories contain nuggets of relevant information, automated systems that have been trained on international sources—or are dependent on components so trained: the Stanford CoreNLP system was trained on a Wall Street Journal corpus—will not extract these correctly.

This is not to say that some selected local sources could not provide useful information, particularly if the automated extractor was explicitly trained to work with them. There is also quite a bit of evidence that in areas where a language other than English predominates, even among elites, non-English local sources may be very important: this is almost certainly true for Latin America and probably also true for parts of the Arab-speaking world. But generally “more is better” doesn’t work, or at least it doesn’t have the sort of payoff people originally expected.

4. “One-a-day” (OAD) duplicate filtering is a really bad idea, but so is the absence of any duplicate filtering

I’m happy to trash OAD filtering without fear of attack by its inventor because I invented it. To the extent it was ever invented: like most things in this field, it was “in the air” and pretty obvious in the 1990s, when we first started using it.

But for reasons I've recently become painfully aware of, and have discussed in an assortment of papers over the past eighteen months (see http://eventdata.parusanalytics.com/papers.dir/Schrodt.TAD-NYU.EventData.pdf for the most recent rendition), OAD amplifies, rather than attenuates, the inevitable coding errors found in any system, automated or manual.

Unfortunately, the alternative of not filtering duplicates carries a different set of issues. While those unfamiliar with international coverage typically assume that an article which occurs multiple times will be somehow “more important” than an article that appears only once (or a small number of times), my experience is that this is swamped by the effects of

  • The number of competing news stories on a given day: on a slow news day, even a very trivial story will get substantial replications; when there is a major competing story, events which otherwise would get lots of repetition will get limited mentions.
  • Urban and capital city bias. For example, when Boko Haram set off a car bomb in a market in Nigeria’s capital Abuja, the event generated in excess of 400 stories. Events of comparable magnitude in northeastern regional cities such as Maiduguri, Bui or Damaturu would get a dozen or so, if that. Coverage of terrorist attacks over the past year in Paris, Nice, Istanbul and Bangkok—if not Bowling Green—show similar patterns.
  • Type of event. Official meetings generate a lot of events. Car bombings generate a lot of events, particularly by sources such as Agence France Press (AFP) which broadcast frequent updates.[4] Protracted low level conflicts only generate events on slow news days and when a reporter is in the area. Low-level agreements generate very few events compared to their likely true frequency. “Routine” occurrences, by definition, generate no reports—they are not “newsworthy”—or generate these on an almost random basis.
  • Editorial policy: AFP updates very frequently; the New York Times typically summarizes events outside the US and Western Europe in a single story at the end of the day; Reuters and BBC are in between. Local sources generally are published only daily, but there are a lot of them.
  • Media fatigue: Unusual events—notably the outbreak of political instability or violence in a previously quiet area—get lots of repetitions. As the media become accustomed to the new pattern, stories drop off.[18] This probably could be modeled—it likely follows an exponential decay—but I’ve rarely seen this applied systematically.

So, what is to be done? IMHO, we need to do de-duplication at the level of the source texts, not at the level of the coded events. In fact, go beyond that and start by clustering stories, ideally run these through multiple coders—as noted earlier, I don’t think any of our existing coders are optimal for everything from a Reuters story written and edited by people trained at Oxford to a BBC radio transcript from a static-filled French radio report out of Goma, DRC and which is then quickly translated by a non-native speaker of either language—then base the coded events on those that occur frequently in that cluster of reports. Document clustering is one of the oldest applications in automated text analysis and there are methods that could be applied here.
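
For the curious, here is a toy sketch of the story-level approach using off-the-shelf pieces (TF-IDF and cosine similarity from scikit-learn); the similarity threshold and the greedy grouping pass are purely illustrative assumptions, not a production design, and at scale you would want something like MinHash/LSH rather than an all-pairs comparison.

```python
# Sketch: cluster near-duplicate news reports before coding, so events are based
# on clusters of reports rather than on every individual wire update.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def cluster_stories(stories, threshold=0.3):      # low threshold: toy example only
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(stories)
    sims = cosine_similarity(tfidf)
    clusters, assigned = [], set()
    for i in range(len(stories)):
        if i in assigned:
            continue
        members = [j for j in range(len(stories))
                   if j not in assigned and sims[i, j] >= threshold]
        assigned.update(members)
        clusters.append(members)
    return clusters

stories = [
    "Car bomb kills 12 in Abuja market, officials say",
    "Officials report twelve dead after Abuja market car bombing",
    "Leaders meet in Geneva for trade talks",
]
print(cluster_stories(stories))    # the two Abuja reports end up in one cluster
```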

5. Human inter-coder reliability is really bad on event data, and actually we don’t even know how bad it is.

We've got about fifty years of evidence that the human coding [5] on this material doesn't have a particularly high correlation when you start, for example, comparing across projects, over time, and in the more ambiguous categories.[6] While the human coding projects typically started with coders at 80% or 85% agreement at the end of their training (as measured, typically, by Cronbach's alpha) [7], no one realistically believes that was maintained over time ("coding drift") and across a large group of coders who, as the semester rolled on, were usually on the verge of quitting. [8] And that is just within a single project.

The human-coded WEIS event data project [10] started out being coded by surfers [11] at UC Santa Barbara in the 1960s. During the 1980s WEIS was coded by Johns Hopkins SAIS graduate students working for CACI, and in Rodney Tomlinson's final rendition of the project in the early 1990s [12], by undergraduate cadets at the U.S. Naval Academy. It defies belief that these disparate coding groups had 80% agreement, particularly when the canonical codebook for WEIS at the Inter-university Consortium for Political and Social Research was only about five (mimeographed) pages in length.

Cross-project correlations are probably more like 60% to 70% (if that): for example, a study of reliability on (I think [20]) some of the Uppsala (Sweden) Conflict Data Program conflict data a couple of years ago found only 40% agreement on several variables, and 25% on one of them (which, obviously, must have been poorly defined).

The real kicker here is that because there is no commonly shared annotated corpus, we have no idea what these accuracy rates actually are, nor measures of how widely they vary across event categories. The human-coded projects rarely published any figures beyond a cursory invocation of the 0.8 Cronbach's alpha for their newly-trained cohorts of human coders; the NSF-funded projects focusing on automated coding were simply not able to afford the huge cost of generating the large-scale samples of human-coded data required to get accurate measures, and various IP and corporate policy constraints have thus far precluded getting verifiable information on these measures for the proprietary coders.
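
For reference, the basic calculation is trivial once a shared annotated sample exists; below is a minimal sketch using simple percent agreement plus Cohen's kappa (standing in here for whatever chance-corrected statistic a project actually prefers), on made-up codes.

```python
# Sketch: inter-coder agreement between two coders assigning CAMEO cue
# categories to the same eight stories. The codes are invented for illustration.
from sklearn.metrics import cohen_kappa_score

coder_a = ["19", "03", "04", "18", "01", "19", "06", "03"]
coder_b = ["19", "04", "04", "17", "01", "19", "03", "03"]

agreement = sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)
kappa = cohen_kappa_score(coder_a, coder_b)      # chance-corrected agreement
print(f"percent agreement: {agreement:.2f}   Cohen's kappa: {kappa:.2f}")
```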

6. Ten possible measures of coder accuracy

This isn’t a conjecture, just a point of reference. These are from  https://asecondmouse.wordpress.com/2013/05/10/seven-guidelines-for-generating-data-using-automated-coding-1/

  1. Accuracy of the source actor code
  2. Accuracy of the source agent code
  3. Accuracy of the target actor code: note that this will likely be very different from the accuracy of the source, as the object of a complex verb phrase is more difficult to correctly identify than the subject of a sentence.
  4. Accuracy of the target agent code
  5. Accuracy of the event code
  6. Accuracy of the event quad code: verbal/material cooperation/conflict [13]
  7. Absolute deviation of the “Goldstein score” on the event code [14]
  8. False positives: event is coded when no event is actually present in the sentence
  9. False negatives: no event is coded despite one or more events in the sentence
  10. Global false negatives: an event occurs which is not coded in any of the multiple reports of the event

This list is by no means comprehensive, but it is a start.
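
A sketch of how a few of these could be computed once gold-standard and machine codings are aligned sentence by sentence; the record layout and field names here are hypothetical, and None marks a sentence for which a coder produced no event.

```python
# Sketch: measures 1, 5, 8 and 9 from the list above, computed from aligned
# gold-standard and machine-coded records (hypothetical layout).
gold  = [{"src": "USA", "tgt": "RUS", "evt": "042"},
         {"src": "SYR", "tgt": "REB", "evt": "190"},
         None]
coded = [{"src": "USA", "tgt": "RUS", "evt": "040"},
         None,
         {"src": "IGOUNO", "tgt": "SYR", "evt": "030"}]

pairs = [(g, c) for g, c in zip(gold, coded) if g and c]

def accuracy(field):
    return sum(g[field] == c[field] for g, c in pairs) / len(pairs)

false_pos = sum(1 for g, c in zip(gold, coded) if c and not g)   # measure 8
false_neg = sum(1 for g, c in zip(gold, coded) if g and not c)   # measure 9
print("source actor accuracy:", accuracy("src"))                 # measure 1
print("event code accuracy:  ", accuracy("evt"))                 # measure 5
print("false positives:", false_pos, " false negatives:", false_neg)
```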

7. If event data were a start-up, it would be poised for success

Antonio Garcia Martinez’s highly entertaining, if somewhat misogynistic, Chaos Monkeys: Obscene Fortune and Random Failure in Silicon Valley quotes a Silicon Valley rule-of-thumb that a successful start-up at the “unicorn” level—at least a temporary billion-dollar-plus valuation—can rely on only a single “miracle.” That is, a unicorn needs to solve only a single heretofore unsolved problem. So for Amazon (and Etsy), it was persuading people that nearly unlimited choice was better than being able to examine something before they bought it; for AirBNB, persuading amateurs to rent space to strangers; for DocuSign [21], realizing that signing documents was such a pain that you could attain a $3-billion valuation just by providing a credible alternative [22].  If your idea requires multiple miracles, you are doomed.[15]

In the production of event data, as of 2016, we have open source solutions—or at least can see the necessary technology in open source—to solve all of the following parts for the low-cost near-real-time provision of event data:

  • Near-real-time acquisition and coding of news reports for a global set of sources
  • Automated updating of actor dictionaries through named-entity-recognition/resolution algorithms and global sources such as Wikipedia, the European Commission’s open source JRC-Names database, CIA World Leaders and rulers.org
  • Geolocation of texts using open gazetteers, famously geonames.org and resolution systems such as the Open Event Data Alliance’s mordecai.
  • Inexpensive cloud based servers (and processors) and the lingua franca of Linux-based systems and software
  • Multiple automated coders (open source and proprietary) that probably well exceed the inter-coder agreement of multi-institution human coding teams

More generally, in the past ten years an entire open source software ecosystem has developed that is relevant to this problem (though typically in contexts far removed from event data): general-purpose parsers, named-entity-recognition/resolution systems, geolocation gazetteers and text-to-location algorithms, near-duplicate text detection methods, phrase-proximity models (word2vec etc.), and so forth.
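
As one small example of how far those off-the-shelf pieces now go, named-entity extraction from a news sentence is a few lines with spaCy (this assumes you have installed its small English model, en_core_web_sm); the event-data-specific work that remains is mapping the extracted entities onto actor codes.

```python
# Sketch: off-the-shelf named-entity recognition on a news sentence. The actor
# dictionary lookup that event coders need would sit downstream of this.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The UN Security Council condemned Tuesday's car bombing in Maiduguri.")
for ent in doc.ents:
    print(ent.text, "->", ent.label_)    # e.g. ORG, DATE, GPE spans
```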

The remaining required miracle:

  • Automated generation of event models, patterns or dictionaries: that is, generating and updating software to handle new event categories and refine the performance on existing categories.

This last would also be far easier if we had an open reference set of annotated texts, and even Garcia Martinez allows that things don't require exactly one miracle. And we don't need a unicorn (or a start-up): we just need something that is more robust and flexible than what we've got at the moment.

SO…what happened???

The main result of the workshop—which covered a lot of issues beyond those discussed here—was the decision to develop the PLOVER coding and data interchange specification, which basically simplifies CAMEO to focus on the levels of detail people actually use (the CAMEO cue categories with some minor modifications [16]), as well as providing a systematic means—"modes" and "contexts"—for accommodating politically-significant behaviors not incorporated into CAMEO such as natural disasters, legislative and electoral behavior, and cyber events. This is being coordinated by the Open Event Data Alliance and involves an assortment of stakeholders (okay, mostly the usual suspects) from academia, government and the private sector. John Beieler and I are writing a paper on PLOVER that will be presented at the European Political Science Association meetings in Milan in June, but in the meantime you can track various outputs of this project at https://github.com/openeventdata/PLOVER. A second effort, funded by the National Science Foundation, will be producing a really large—it is aiming for about 20,000 cases, in Spanish and Arabic as well as English—set of PLOVER-coded "gold standard cases" [17] which will both clearly define the coding system and also simplify the task of developing and evaluating coding programs. Exciting times. [23]

Footnotes:

1. Unorthodox and opinionated for a workshop memo. Pretty routine for a blog.

2. The blue bar shows the count of codings where PETRARCH-1 and PETRARCH-2 produce the same result; despite the common name, they are essentially two distinct coders with distinct verb phrase dictionaries.

3. Typically with no attention to whether these were really implemented in the dictionaries: I cringe when I see someone trying to use the "kidnapping" category in our data, as we never paid attention to this in our own work because it wasn't relevant to our research questions.

4. I read a lot of car bomb stories: http://eventdata.parusanalytics.com/data.dir/atrocities.html

5. When such things existed for event data: There really hasn’t been a major human coded project since Maryland’s GEDS event project shut down about 15 years ago. Also keep in mind that if one is generating on the order of two to four thousand events per day—the frequency of events in the ICEWS and Phoenix systems—human coding is completely out of the picture.

6. In some long-lost slide deck (or paper) from maybe five or ten years ago, I contrasted the requirements of human event data coding with research—this may have been out of Kahneman's Thinking, Fast and Slow—on what the human brain is naturally good at. The upshot is that it would be difficult to design a more difficult and tedious task for humans than event data coding.

7. Small specialized groups operating for a shorter period, of course, can sustain a higher agreement, but small groups cannot code large corpora.

8. In our long experience at Kansas, we found that even after the best selection and training we knew how to do, about a third of our coders—actually, people developing coding dictionaries, but that's a similar set of tasks—would quit in the first few weeks, and another sixth by the end of the semester. A project currently underway at the University of Oklahoma is finding exactly the same thing.

10. The WEIS (World Event/Interaction Survey) ontology, developed in the 1960s by Charles McClelland, formed the basis of CAMEO and was the de facto standard for DARPA projects from about 1965 to 2005.

11. Okay, “students” but at UCSB, particularly in the 1960s, that was the same thing.

12. Tomlinson actually wrote an entirely new, and more extensive, codebook for his implementation of WEIS, as well as adding a few minor categories and otherwise incrementally tweaking the system, much as we’ve been seeing happening to CAMEO. Just as CAMEO was a major re-boot of WEIS, PLOVER is intended to be a major modification of CAMEO, not merely a few incremental changes.

13. More recently, researchers have started pulling the high-frequency (and hence low-information) "Make public statement" and "Appeal" categories out of "verbal cooperation", leading to a "pentacode" system. PLOVER drops these.

14. The “Goldstein scale” actually applies to WEIS, not CAMEO: the CAMEO scale typically referred to as “Goldstein” was actually an ad hoc effort around 2002 by a University of Kansas political science grad student named Uwe Reising, with some additional little tweaks by his advisor to accommodate later changes in CAMEO. Which is to say, a process about as random as that which was used to develop the original Goldstein scale by an assistant professor and a few buddies on a Friday afternoon in the basement of the political science department at the University of Southern California. Friends don’t let friends use event scales: Event data should be treated as counts.

15. Another of my favorite aphorisms from Garcia Martinez: "If you think your idea needs an NDA, you might as well tattoo 'LOSER' on your forehead to save people the trouble of talking to you. Truly original ideas in Silicon Valley aren't copied: they require absolutely gargantuan efforts to get anyone to pay serious attention to them." I'm guessing DocuSign went through this experience: it couldn't possibly be worth billions of dollars.

16. To spare you the suspense, we eliminated the two purely verbal “comment” and “agree” categories, split “yield” into separate verbal and material categories, combined the two categories dealing with potentially lethal violence, and added a new category for various criminal behaviors. Much of the 3- and 4-digit detail is still retained in the “mode” variable, but this is optional. PLOVER also specifies an extensive JSON-based data interchange standard in hopes that we can get a common set of tools that will work across multiple data sets, rather than having to count fields in various tab-delimited formats.

17. CAMEO, in contrast, had only about 350 gold standard cases: these have been used to generate the initial cases for PLOVER and are available at the GitHub site.

18. For example, a recent UN report covering Afghanistan 2016 concluded there had been about 4,000 civilian casualties for the year. I would be very surprised if the major international news sources—which I monitor systematically for this area—got even 20% of these, and those covered were mostly major bombings in Kabul and a couple other major cities.

19. Which they may use to buy exported whiskey, but at least that's not the only thing they do.

20. Because, of course, the article is paywalled. One can buy 24-hour access for a mere $42 and 30-day access for the bargain rate of $401. Worth every penny since, in my experience, the publisher’s editing probably involved moving three commas in the bibliography, and insisting that the abstract be so abbreviated one needs to buy the article.

21. The original example here was Uber, until I read this. Which you should as well. Then #DeleteUber. This is the same company, of course, where just a couple years ago one of their senior executives was threatening a [coincidentally, of course…] female journalist. #DeleteUber. Really, people, this whole brogrammer culture has gotten totally out of control, on multiple dimensions.

Besides, conventional cabs can be, well, interesting: just last week I took a Yellow Cab from the Charlottesville airport around midnight, and the driver—from a family of twelve in Nelson County, Virginia, and sporting very impressive dreadlocks—was extolling his personal religious philosophy, which happened to coincide almost precisely with the core beliefs of 2nd-century Gnosticism. Which is apparently experiencing a revival in Nelson County: Irenaeus of Lyon would be, like, so unbelievably pissed off at this.

22. Arguably the miracle here was simply this insight, though presumably there is some really clever security technology behind the curtains. Never heard of DocuSign?: right, that’s because they not only had a good idea but they didn’t screw it up. Having purchased houses in distant cities both before and after DocuSign, I am inordinately fond of this company.

23. PLOVER isn’t the required “miracle” alluded to in item 7, but almost certainly will provide a better foundation (and motivation) for the additional work needed in order for that to occur. Like WEIS, CAMEO became a de facto “standard” by more or less accident—it was originally developed largely as an experiment in support of some quantitative studies of mediation—whereas PLOVER is explicitly intended as a sustainable (and extendible) standard. That sort of baseline should make it easier to justify the development of further general tools.


A Numerical Reflection upon the 2015-2016 APSA Placement Statistics

[Okay, this “Seven…” gimmick isn’t working for producing finished blogs—mind you, I’ve got about a dozen 50%-80% finished entries in the pipeline—and [shock!] there are things that can be said in fewer than “Seven…” witty subcategories but are still longer than an extended set of Tweets, which no one reads to the end of anyway. [1] So I may be doing some shorter blog posts for a while.]

Brethren and sistern [2], our reading this day is from the newly released American Political Science Association 2015-2016 APSA Graduate Placement Survey. And more specifically, the chapters and verses—which is to say, the entire report—dealing with the continued decline in the proportion of political science PhDs who are placed in tenure-track (TT) positions. Now down to an abysmal 35.4%.

Well, at least that simplifies the task of the Director of Graduate Placement, eh?—he or she can just address the year’s crop of candidates with a straightforward “Look to your left and to your right: only one of you is going to get the TT job you have laboriously trained for.”

Otherwise it’s insane: why, oh why, are you people allowing this to continue? Have you no shame?

Let’s put this in perspective. I’m not sure what placement rates were when I graduated (Indiana, then as now ranked around 20 nationally, so competitive) in 1976—though we were complaining bitterly that they’d dropped from a rate approaching 100% in the previous decade—but I do know that I joined a department (Northwestern) which I’m pretty sure was composed entirely of TT faculty: I didn’t even really know what an “adjunct” was at the time. Pretty much the same was true twelve years later when I moved to Kansas, though I think there we had a couple long-term adjuncts teaching specialized courses in policy and law that we couldn’t otherwise staff. Mandatory retirement at age 65 was still in place at both institutions, so someone hired into a TT slot would on average occupy it for about 35 years. [3]

And that, campers, is the source of our problem. Mandatory retirement was abolished in the US—with a few exceptions such as airline pilots and FBI agents, but not tenured academics—in 1986 [4]. Initial projections were that academics would retire anyway at age 70 or thereabouts but, well, from what I’m hearing, that isn’t happening. In fact I’m hearing quite the opposite: I talked recently with a chair who was extolling the virtues of her multiple faculty who were in their 80s.

Now, although I don’t think I really had the stamina (or cultural/emotional links to the average undergraduate student) to effectively engage a classroom using contemporary active learning methods after the age of 55 or so, these people have every legal right to do what they’re doing and I’m sure they have assured themselves this “aging” thing is just some sort of primordial myth from which they are exempted. Whatever. We’re here to look at comparative numbers.

Let’s assume the average person hired in a TT slot now occupies that position to age 80 rather than 65: 50 years rather than 35 years. In an eye-blink, we have just reduced the availability of TT slots by about 30%. Permanently: this is not a generational thing, it is a permanent structural change.
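
For anyone who wants the arithmetic spelled out, here is a minimal back-of-the-envelope sketch in Python. The 1,000-slot pool and the steady-state assumption (openings per year equal the total number of slots divided by the average years an occupant holds one) are mine, purely for illustration; the 30% figure falls out directly.

    # Steady-state sketch: openings per year = total slots / average years per occupant.
    # The 1,000-slot pool is hypothetical; only the ratio matters.
    def annual_openings(total_slots, avg_years_in_slot):
        return total_slots / avg_years_in_slot

    slots = 1000
    before = annual_openings(slots, 35)   # mandatory retirement at 65
    after = annual_openings(slots, 50)    # de facto retirement at 80

    print(f"Openings per year, before: {before:.1f}")   # ~28.6
    print(f"Openings per year, after:  {after:.1f}")    # 20.0
    print(f"Reduction: {1 - after / before:.0%}")       # 30%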

But it gets worse: not only will these people hold those positions an additional 15 years, they will almost certainly do so at their career-high salary levels. And unless you’ve got deans, associate deans, deanlets and deanlings backed by a really good set of lawyers and an HR department which really likes to make phenomenal amounts of highly public trouble for itself, those individuals will generally get roughly the average departmental salary increase, applied to the highest tier of departmental salaries. Which will pretty much suck all of the financial oxygen out of the room, and at public universities at least, that additional salary probably will not be funded by tuition increases. [5]

Hence the proliferation of adjuncts, and the further decline of TT positions: public universities simply don’t have the money to fund such positions any more.

The effects of average salary increases (and salaries as a function of age) are much harder to approximate than the effects of abolishing mandatory retirement, but I’d guess this factor forces another 10% to 20% reduction in TT slots. And this is before various other long-term trends such as student-as-customer, university-as-athletic-franchise, degree-program-as-occupational-training and campus-as-entertainment-park, which further divert resources and reduce demand for dedicated lifetime-employed positions, particularly in the social sciences.

Now, there’s an obvious [roughly] market-clearing solution in these circumstances: reduce the production of PhDs—or at least PhDs aimed at TT employment [6]—by about 40% to 50%. Not only is that the obvious solution, it’s the only solution. I’ve seen almost no evidence this is happening, and fundamentally it is a classical tragedy-of-the-commons situation which has had the classical tragedy-of-the-commons outcome.

Good luck with that.

Footnotes

  1. Or at least not my extended tweets. Though eyeballing the Twitter stats, the drop-off is a great natural example of exponential decay.
  2. Yes, “sistern” is a word. In Middle English.
  3. That figure is almost certainly high: between voluntary and involuntary attrition, 30 years would probably be closer. But we’ll stay conservative.
  4. Abolition for tenured faculty in the US was actually delayed for an additional 8 years, to 1994, which is why we start seeing these effects kicking in around the 2000s rather than earlier.
  5. At this same time, public financial support was also dropping precipitously compared to the levels during the post-WWII expansion of those systems, and tuition increases were hard-pressed to deal just with that decline in revenue. Declining public funding accounts for substantial amounts, though probably not all, of the oft-heard “tuition increases faster than inflation” complaint, at least in the 1990s and 2000s.
    Until the Great Chinese Trade War—or the Great Student Visa Crackdown—is in full swing, elite private schools appear to be able to raise tuition indefinitely. Such schools, of course, tend mostly to hire from each other (and a few top publics), though the differences reported by the APSA between the top two NRC tiers (1-19 and 20-37) are surprisingly small, except for the top tier, as expected, being more likely to place in PhD programs. Below that, things get really grim.
  6. Ironically, I think there is a substantial unmet need outside of academia for individuals with advanced training in the social sciences, both quantitative and qualitative, that certainly goes beyond the typical two to four semester M.A., and in many cases would include the independent research experience required by a dissertation. At least on the quantitative side, such opportunities are probably absorbing most if not all of those unable to find TT employment. But there’s a huge waste in the current system: time spent training people on the assumption that they will spend their lives in the classroom and writing unread[able] articles published in paywalled venues following a five-year lag could be put to much better use on more practical topics. I’ve yet to see anyone do this in political science, though one sees some initial efforts in multi-disciplinary programs.

Addenda, 8 Feb 2017

Well, people seem to be reading this, so a few additional thoughts.

1. Once again, my numbers on the impact of the end of mandatory retirement are just approximations, though probably not far off. I’m guessing most people won’t literally delay retirement until their 80s, but delaying until the mid-70s already seems to be quite common, so we’re still adding 10 years. Meanwhile I probably substantially over-estimated the average time spent prior to age 65 in a TT job, since attrition from the standard full-teaching-load TT position can occur through a variety of tracks, particularly administration (which in my experience is disproportionately drawn from the ranks of political scientists: during part of the period I was a grad student at Indiana, the president, provost and dean of liberal arts were all political scientists) or, as I did, shifting to buying out most courses with research funding. So the average time in a TT slot prior to age 65 may well be more like 25 years, though that and retirement at 75 lead you to pretty much the same numbers.

In the second half of my years at Kansas we had a highly effective provost, David Shulenburger, who developed, along with his senior data analyst, Deb Teeter, an absolutely massive database on faculty and departmental costs and [multiple measures of] productivity. So I’m certain that Shulenburger and Teeter knew to the third decimal point the values of the sorts of numbers I’m speculating about here. Most faculty, of course, assumed Shulenburger allocated resources solely on the basis of personal whim, immensely hampered by his inability to recognize their intrinsic genius.

2. Barring various ailments, you can definitely live a “life of the mind” well past the age of 55: in fact with the Internet now instantly augmenting whatever lapses one has in memory, you can have a rather fabulous time doing this. What almost no one can do as well at age 55 as at age 35 is conduct a 3-hour active learning seminar, or, for that matter, pull an all-nighter to write a brand new lecture based on three books you read the previous day. Yeah, yeah, the Boomers reading this—other than the fact they won’t be—are all figuratively jumping up with counter-examples (most of which are urban legends) and yeah, yeah, at age 39 Tom Brady brought the Patriots back from near certain defeat to win the 2017 Super Bowl. But I’m guessing the NFL isn’t planning to shift its recruiting strategy to focus on 39-year-olds.

The other thing you unquestionably lose by age 55 is any sort of intellectual networking with the people who are developing the new cool theories: by 55, most of your peers from graduate school and pre-tenure days are no longer doing any meaningful research at all (or have shifted their focus to something like UFOs…), or have become administrators, or are running focused research projects where the work occurs largely out of the limelight. Want to see what Harry Potter felt like wearing that invisibility cloak?: walk through a conference venue when you are over the age of 50. Yes, this will happen to you (maniacal laughter…and get off my yard…)

[By the way, do not try to make these arguments in an academic department because your HR department will become exceedingly upset. I can say them because I’m no longer employed in academia.]

3. The sort of qualitative work I know of that is tremendously valued outside of academia is the intensely focused language/culture/field experience that is typical of much of comparative politics. Just as with quantitative training, the specific topic isn’t all that important, it’s the fact that you (and your immune system) can do it at all. I’m guessing there are culturally-immersive equivalents in U.S. politics, and there your choice of “shots” will be whiskey vs bourbon rather than typhoid vs typhus.

This doesn’t extend indefinitely: if by “qualitative” you mean simply sitting around in seminar rooms discussing ever more esoteric theories you have pretended to read, there’s a point where that is too, well, “academic” and no one else cares. The equivalent in the quantitative realm is people who spend their entire time studying estimators on simulated data: yawn… In academia there are niches for this sort of thing, just not very many, and your chances of getting placed into one are extremely low.

4. I’m guessing one of the primary motivators for the proliferation of PhD programs which have no realistic chance of placing most students into TT positions is to provide “graduate” TAs for the fabulously lucrative introductory courses, which basically drive most of the finances of large departments (certainly political science). This is actually a terrible model, and universities would be much better off using their best senior undergraduates in these roles, but you rarely see this. Nonetheless, I’ll always remember one of the honors students at Kansas who was working for me noting “I look at people in the honors program here, and they’re going off to graduate school at places like Harvard, Michigan and Berkeley. Whereas the GTAs here are going to graduate school at, well, KU.”

Addenda, 25 April 2017

In the wake of political science professor Will Moore’s suicide last week, it is also worth noting that putting up with the apparently ever-increasing psychological stresses of this rat race may not be good—or even safe—for you. In fact a remarkable number of blogged comments about Will—beyond the observations that in his academic life he was highly productive, well-networked, and by all external appearances, unusually successful—noted this issue: Cullen Hendrix and (more nuanced) Christian Davenport are just two among these.  But completely independently of that event, a couple weeks earlier Science magazine had posted an extended discussion of the topic of mental health in academia, focusing on graduate students.

Sigh.

I’m not sure where this all ends up, but the current process, with the implicit assumption by administrators and senior faculty that they can continually raise expectations, and lower rewards, with no consequences, does not seem sustainable. Some form of three or four-year post-secondary “educational” experience, nominally devoted to learning but mostly focused on branding and drinking, seems pretty common across history and societies. Massive hyper-bureaucratized research universities, on the other hand, date back only a century or so, and even urbanized, industrialized societies appear to be able to get along just fine without them: the contributions of the universities to both the political and technological revolutions of the 18th and 19th centuries were minimal. University administrators, however, appear to think nothing can possibly displace them. So did the dinosaurs.


Seven Observations on the 2016 Election

On my day-pack I’ve got a little enamelled pin I bought several years back in a small shop in Juneau run by a guy who has, well, opinions. It shows a typewriter with the words “Write hard, die free.” [1]

So, where we at? On 8 November, a plurality of voters cast their ballots [2] for someone who has probably played a bit fast and loose with ethics [3], probably is a bit too loyal to subordinates, and unquestionably has very cozy ties with Wall Street. But thanks to the very straightforward and completely transparent arithmetic of the Electoral College, which any 7th-grader—but apparently not the strategic geniuses of the Democratic National Committee (DNC)—can calculate, we will instead be inaugurating as President a vindictive, highly insecure, misogynistic compulsive liar with authoritarian tendencies who has zero prior political experience and the attention span of a chihuahua.[4] May we live in interesting times.

My takeaways:

  1. This did not need to happen and is primarily the result of mind-boggling incompetence by the professionals of the Democratic Party.
  2. Fundamentally—consistent with pretty much everything everyone is saying—the election was lost by taking Rust Belt whites for granted [5]: without question, flip Wisconsin, Michigan, Ohio and Pennsylvania and Clinton would have been president. Everything else—Wikileaks, Comey, emails—was just gravy.
  3. The existing opinion and likely-voter models have been shown to be woefully inadequate, however much individuals making money off these will protest otherwise.

And a couple of things we should keep in mind did not happen:

  1. The country did not shift radically to the right: Trump did not even get a majority of the votes cast, much less of eligible voters.
  2. Trump is not a Republican in any conventional sense, though clearly the Republicans benefited from the Trump victory more than the Democrats did. Well, probably benefited more.
  3. There is a whole lot still to play out here.

Seven observations:

This was the victory of a populist third party with little clear ideology

Trump expertly fashioned himself to take advantage of the rising anti-elite, anti-globalization, and anti-immigrant populism we’ve been seeing in the US since the Tea Party successes in 2010, and surging in Europe for the past two or three years.[6] A couple of Republican tropes were tossed in—arch-conservatives on the Supreme Court, repeal of the Affordable Care Act and, of course, tax cuts primarily directed at the wealthy—but compared, say, to Paul Ryan, there was nothing like a coherent ideological package here. Furthermore Trump went through the election with at best tepid support from most of the Republican establishment, and fervent, vocal opposition from many in the highest levels of that establishment. Trump is the ultimate “RINO”: Republican in name only.

But now he has to govern, and here we are going to learn a lot in the next couple of months. Probably one of three scenarios will play out

  1. Bush-III: With Trump completely adrift ideologically and out of his depth, the GOP establishment under Ryan and McConnell (plus a bunch of Bush administration executive veterans, ideally minus those under suspicion of committing war crimes) controls the executive branch appointments and eventually advances quite a bit of the same legislation that, say, Jeb Bush or Marco Rubio would have been able to get through given the GOP control of the House and (barely) the Senate.[7]
  2. Gridlock-III/Chaos-I: There are sufficient Tea Party and Trumpkin votes (and, of course, Democrats) in the House to throw confusion into almost any initiative—the obvious first clash will occur on the deficit-increasing implications of infrastructure spending, defense increases, and ever-more tax cuts, followed by clashes on the “replace” part of ACA “repeal and replace”—and things will settle down into either a continuation of Obama-era gridlock or, more likely, a wild melange of initiatives going through what is essentially a three (or four) party legislature.[8] Some GOP priorities will get through, others will not, and some populist Democratic initiatives (infrastructure, definitely) will also.
  3. Hungary-II: An alt-right circle of the likes of Bannon, Giuliani, Eric Trump, Kris Kobach, and the hordes of sycophants, opportunists and scoundrels descending upon Trump Tower will actually start to implement the extreme elements promised during the campaign. The key indicator will be whether an attempt is made to prosecute Clinton.

At the time of this writing, 12 November, things are definitely heading towards Bush-III: the two lead headlines in the Washington Post at this moment are “Trump team backs off some sweeping campaign pledges” and “President-elect, aides suggest softer stances on border wall, health-care law.” But beneath that: “Meet the potential Cabinet picks most likely to make liberals squirm” and yes indeed, Cabinet positions for the wing-nuts of the right is a long GOP tradition. As my late father-in-law in Kansas would have put it, right now we’re looking at a hog on ice.

Again, it’s going to take several months to get a sense of how this will play out. The other factor which should become evident fairly soon is whether Trump expects to run for a second term: I very much doubt he will given that he will find the job exceedingly constraining, and he can also use this as a magnanimous example of voluntarily relinquishing the pursuit of power in order to serve the greater good.[9] The certainty of a one-term Trump presidency would substantially complicate the dynamics on the GOP side, at all levels.

The Democrats lost this one more than Trump won it

The breadth of the Democratic Party’s incompetence in this election is absolutely stunning. As noted above, this election was lost on traditional Democratic turf: the Rust Belt. And yet Clinton spent the final weeks trying to run up the score in Georgia (!) and Arizona, ignoring the lethal hemorrhage in Wisconsin, Michigan and Pennsylvania.

By all accounts the die had been cast long before this, first by the arrogance of the establishment elite who presumably had purchased The Party Decides in shipping-container quantities and displayed it in every office on little altars with flowers, incense and candles, and well, we’ve decided, and it will be Hillary. There was no consideration of credible alternatives, and when it was clear from the success of Sanders that Clinton had serious weaknesses, this was ignored. Because, as we all know, nothing is so attractive to younger and working class voters than a grey-haired 75-year-old socialist from Vermont, so Bernie is just a fluke. [10]  

This is professional malpractice of the highest order, and I hope Trump at least has sent you folks at the DNC a bundle of passes to his golf courses as an expression of his gratitude.

Almost all of the money given to candidates is squandered, so just stop sending it

Around the middle of October, as I was being bombarded by fund raising appeals (and the occasional phone call [11]) I watched as yet another Doctors Without Borders (MSF) [12] hospital was attacked (variously by allies of Obama and allies of Trump) and thought “This is it: no more money to these campaigns: from this point on it goes to MSF.”

Yeah, right. No, I didn’t follow my own advice (or moral compass) and still gave to candidates who, in the end, didn’t have a snowball’s chance in hell. Like in the blatantly gerrymandered VA-5, my home Congressional district, where I was constantly assured that the Democratic candidate from…duh…ultra-liberal Albemarle County “has a real chance!” She lost by 16 percentage points.[13]

I can, in fact, assert that every single candidate I contributed to—in some cases quite significant amounts—in this election cycle lost. So if you are playing the prediction markets in 2018, be sure to give me a call, since I’ve got about as good a negative correlation on these things as you are going to find anywhere. Which, trust me, is the only way anyone is going to make any money off me in Campaign-2018.

But more generally, money doesn’t make that much difference: it’s not just the clowns at the DNC, it’s everywhere, and was every bit as evident in the GOP primary as in the general. We’ve known that for decades: it was probably twenty years ago I saw the first quantitative analyses at the Political Methodology meetings which showed how weak the effect was. Everyone thought there must be an error in the analysis but here we are, in 2016, and it’s the same old thing.

Basically, the money you contribute to a campaign goes to two places. Primarily, to a huge class of ignorant hucksters whose sole concern is to use the emotions of the campaign to separate you from your cash, and who will lie to your face to do so if that’s what it takes. Second, to the media entertainment complex for advertising created and targeted based on the advice of the hucksters and their massively flawed polls and focus groups. Money doesn’t generate votes when the fundamentals are wrong.

So my suggestion: the next time a candidate asks you for money, take what you were planning to spend, convert it to small bills, buy hot dogs or marshmallows, then pile the remaining money into a grill, spray it with lighter fluid, invite the neighbors over, and talk about common concerns while you enjoy the blaze. I can assure you that will do vastly more to influence votes than handing it to the political consultant class.[14]

Next time, my maximum contribution is $20: if you can’t run a campaign on $20 contributions, don’t ask for my help. The rest goes to MSF. Which I’m still feeling guilty about: very brave people are dying in those places.

All election “news” is now merely for entertainment

I will give Jeff Bezos credit for one thing—beyond running the sort of business that fires people when they get diagnosed with cancer—he managed to get me clicking on those Washington Post stories throughout the day like a rat hitting the bar for more cocaine. And he’s got plenty of imitators. And I got totally suckered all the while knowing I was being suckered and that, folks, starts getting pretty scary.

And for what?: I’d skim these stories in the NYT and WaPo in the morning and basically knew the content of almost every one before I read them. I suppose I should give myself a bit of a break on a few rat-cocaine providers because I truly enjoy their writing—Gail Collins, Jennifer Rubin, Greg Sargent and Ross Douthat [15]. But most of this—and most certainly the “horse race” coverage of the polls—is utterly useless. [16]

Yeah, useless: there was essentially no discussion of actual policy during this entire race. No, it was all personality, gotchas, Wikileaks, the latest Trump outrages and the never-ending email story, which probably one in ten thousand voters (if that) understood at any serious level. And horse-race, horse-race, horse-race coverage, all based on countless polls which were all pretty much completely…

Wrong. Yes, the polls have systematic errors

Every methodology presentation on opinion polling I’ve attended for at least ten years has had the same theme: “We knew how to get pretty representative samples in the 1980s and maybe 1990s, but those methods no longer work, and at some point they are going to collapse. And it could happen at any time.”

Well, buckeroo, those chickens have come home to roost…

I’m not a pollster. I don’t even play one on TV.[17] But I’ve spent a great deal of time doing statistical forecasting with noisy time series, and even before the catastrophic divergence between the poll projections and the outcome was evident, a couple of things were worrying me:

  1. The individual polls were jumping around far too much to be explained by changes in voter opinions—which are generally fairly static—and certainly far too much to be explained by sampling error (which in a random sample is quite well understood; see the sketch following this list)
  2. The media were treating the confidence intervals as if the true result were uniformly distributed therein: they are in fact following a bell-shaped “normal” (or “Gaussian”) distribution [18]
  3. I was seeing a whole lot of incoherent and inconsistent excuses suggesting that the pollsters themselves were pretty worried that they weren’t getting at the Trump voters, and for multiple reasons.
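
To put a rough number on the first point, here is a minimal simulation sketch. The 48% “true” support level and the n = 1,000 sample size are assumptions for illustration, not figures from any actual 2016 poll; the point is simply that honest random samples of a static electorate should not bounce around by more than a few points.

    # Under pure sampling error, an honest simple random sample of n = 1000
    # has a standard error of about 1.6 percentage points, so two successive
    # polls of a static electorate should rarely differ by more than ~4-5 points.
    import math
    import random

    random.seed(42)
    p_true, n = 0.48, 1000   # hypothetical static support level, sample size

    se = math.sqrt(p_true * (1 - p_true) / n)
    print(f"standard error of one poll: {100 * se:.1f} points")        # ~1.6

    # standard error of the difference between two independent polls
    se_diff = math.sqrt(2) * se
    print(f"poll-to-poll swings beyond {100 * 2 * se_diff:.1f} points "
          f"are hard to blame on sampling alone")                      # ~4.5

    # sanity check by simulation: spread of many honest polls
    polls = [sum(random.random() < p_true for _ in range(n)) / n for _ in range(500)]
    mean = sum(polls) / len(polls)
    sd = math.sqrt(sum((x - mean) ** 2 for x in polls) / len(polls))
    print(f"simulated poll-to-poll SD: {100 * sd:.1f} points")         # ~1.6

Swings much larger than that have to come from somewhere other than sampling noise: house effects, weighting decisions, or the differential response the pollsters themselves were worrying about.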

This analysis could go on indefinitely, and there are people who know way more about these things than I do, and they should get a workshop together to discuss this REALLY SOON (which is to say, before people internalize all their “we got this so totally wrong but we actually got it right” excuses—and believe me, that is going to happen unless people in polling are angels [26], and is already happening) and figure out what the systematic lessons-learned should be.

In the meantime, I wholeheartedly support the sentiment expressed by Timothy Egan:

Finally, all of us in the American family should never trust anyone from the pollster industrial complex, including those at my own newspaper. Never. Read your horoscope; it’s far more likely to be accurate.

Trump will find the federal government decidedly difficult to work with

When contemplating General Eisenhower winning the Presidential election, Harry S Truman said, “He’ll sit here, and he’ll say, ‘Do this! Do that!’ And nothing will happen. Poor Ike—it won’t be a bit like the Army. He’ll find it very frustrating.”
[Source: Richard E. Neustadt, Presidential Power: The Politics of Leadership, p. 9 (1960).]

And this in reference to the [Kansan] Dwight Eisenhower, who had successfully managed the massive bureaucracy required to pull off the D-Day invasion: Trump has no experience even remotely comparable. (The Miss Universe pageant doesn’t count.)

Let us at least begin to list the ways this will happen:

  • Trump starts totally in enemy territory: an astonishing 96% of the votes in the District of Columbia were against him
  • The Federal bureaucracy is fabulously slow-moving in the best of circumstances,[20] and most of its employees are protected by the Civil Service. He’s not able to just say “You’re fired!”
  • The House GOP will remain divided, possibly now into three parts: orthodox Republicans, the remnants of the Tea Party, and now a few Trumpkin populists

The scary thing, however, is the international system, which is almost certainly going to throw some serious crisis at him in the first couple of years, even if Trump were simply to maintain the status quo. To the extent that he initiates an isolationist policy, the possibility of this will be substantially magnified as various actors move quickly to take advantage of the emerging vacuum.

Distressed?: Consider the Benedict Option

I’m guessing most of the readers of this blog are liberals and will be unfamiliar with this term, though it will be familiar to at least some conservatives. [21] Well, Google it, but in short, this is named for Benedict of Nursia (ca. 500 CE), whose response to the collapse of the Western Roman Empire was to establish communities where Christians could live true to their own morals rather than involving themselves—as many Christians had since Christianity gained secular power following the time of Constantine—in the ways of the world. The arguments are somewhat more complex than that—a lot more when you start getting into the theological nuances—but worth reading (particularly for those who think conservatism begins and ends with the likes of O’Reilly, Hannity and Limbaugh).

In the past four days, we’re already seeing a lot of this as people make it clear that if need be they intend to protect their Muslim and immigrant neighbors, as well as whoever else the alt-right has in mind (and they make those targets pretty clear as well).[22] This safety-pin thing is wonderful.[23]

Maybe circling the wagons and pulling up the drawbridges won’t be necessary: again, only about a quarter of eligible voters went to Trump, and by all accounts a substantial number of those votes were cast simply to get someone in the White House who wouldn’t veto GOP-passed legislation and who would put Scalia-II on the Supreme Court. Those voters consistently say they didn’t take the rest of Trump’s threats seriously.

They may be right. But they may also be wrong, or more likely, as with the Bush administration, we’ll see a mix of relatively innocuous people but also a few frighteningly cruel ones who will push things as far as they can. And for those, be ready to push back. Big time.

My upshot:

Come 2020 (or even 2018) I want to see candidates who understand the following realities

  1. Presidential elections are won state-by-state and the Electoral College is not fair. It wasn’t supposed to be fair—it was a concession to slave-owning states—and it isn’t.
  2. Voting suppression is also very real and until you get a Supreme Court that values small-d democracy—which obviously is not happening any time soon—you are going to have to add that into your calculations.
  3. Third parties will get votes: It’s hard to imagine weaker candidates than Johnson and Stein were this year but even they got votes.
  4. You can’t simply buy popularity: Clinton should have seen that from the experiences of Jeb Bush and Carly Fiorina.[24] You can’t get turnout without popularity and a coherent platform. The consultants, meanwhile, will be playing you like a violin.
    Have the radical idea that there’s more to winning an election than collecting money from people who can write $10,000 checks in the blink of an eye and then figuring people will go out and vote for you simply because the GOP has alienated them. And after you’ve assured them you are virtually certain to win anyway.
  5. The Roosevelt (and Bill Clinton, and classical Progressive) coalition included rural and working class whites: it wasn’t just coastal elites and minorities. If exit polls can be believed, Obama had a critically greater level of support from working class whites than Clinton attracted.
  6. A party controlling only a third of state legislatures, a third of governorships, and slightly less than half of both the House and Senate is in trouble. Even if it controls California.
  7. When you hire a pollster, ask them to explain how a confidence interval works, not to estimate the number of golf balls that will fit in a 747.

Wrapping this back to the opening key, I still think that the fundamental question facing the Democratic Party is whether they are willing to offer anything to the white working class, at least in the Rust Belt, and ideally across the country [25]. And no, this does not mean extending a Democratic big tent to David Duke, the KKK and the alt-right: they can stay with the GOP. Please.

What will it take to recreate an effective opposition? Here I think Trump (and certainly Sanders) may have done us a big favor by demonstrating just how little organization and how few resources are required to run a successful campaign in the 21st century, even at the national level. The fabulously well-heeled fund raisers and fixers, the legions of lavishly compensated consultants and pollsters, the massive media buys, the star-studded galas: all for naught. Trump maybe, just maybe, has pointed the way to another model.

But whether or not that is the right model, we definitely need another model.

Back to work on Docker containerization.

Beyond the Snark

These references are by no means comprehensive, and some of them have been added after I first posted this—again, I think my analysis is fairly close to the consensus view of those outside the DNC bubble—but they may nonetheless be useful.

One way forward (Sanders): http://www.nytimes.com/2016/11/12/opinion/bernie-sanders-where-the-democrats-go-from-here.html

Another way forward: http://www.nytimes.com/2016/11/20/opinion/sunday/the-end-of-identity-liberalism.html (I’d also note this article was at the top of the NYT “most emailed” list for a couple of days. This approach to a new liberalism would also have, shall we say, “interesting” implications for most university curricula in the humanities. Or what remains of the humanities, per George Will https://www.washingtonpost.com/opinions/higher-education-is-awash-with-hysteria-that-might-have-helped-elect-trump/2016/11/18/a589b14e-ace6-11e6-977a-1030f822fc35_story.html. )

And still more ideas, Establishment and otherwise: https://www.washingtonpost.com/opinions/cory-booker-zephyr-teachout-and-more-on-the-democrats-future/2016/11/18/5e20a65e-ace2-11e6-977a-1030f822fc35_story.html

Pretty much the same arguments I’m making with individual variations:

Debbie Dingell: https://www.washingtonpost.com/opinions/i-said-clinton-was-in-trouble-with-the-voters-i-represent-democrats-didnt-listen/2016/11/10/0e9521a6-a796-11e6-ba59-a7d93165c6d4_story.html

Frank Bruni: http://www.nytimes.com/2016/11/13/opinion/the-democrats-screwed-up.html

Thomas Edsall: http://www.nytimes.com/2016/11/10/opinion/presidential-small-ball.html

Ross Douthat on the Trump presidency, pretty much a mix between my Bush-III and Gridlock-III. http://www.nytimes.com/2016/11/13/opinion/he-made-america-feel-great-again.html (love the phrase “TrumpWorks”)

This is not good news for the GOP as we knew it: http://www.nytimes.com/2016/11/19/us/politics/never-trump-republicans.html

Polling error was systematic rather than random: http://www.nytimes.com/interactive/2016/11/13/upshot/putting-the-polling-miss-of-2016-in-perspective.html (note that “systematic” is not the same as “deliberate”: every pollster who is not a partisan hack is mortified with these results. Besides, if this was deliberate—which I very much doubt—it almost certainly hurt Clinton by reducing Democratic turnout in pivotal states: these are not the conspiratorial manipulations that were being suggested prior to 8 November.)

Another one well worth reading:
https://www.washingtonpost.com/posteverything/wp/2016/11/15/theres-a-reason-trump-supporters-believed-his-talk-about-rigged-systems/ Am I in the sort of position to get a hospital bed made available? Been there, done that. That would make an interesting future survey question, along with the “Could you get $400 in an emergency” that had surprising results.

Footnotes

1. Though in some places that can translate into “Write hard, die young”: that’s what we’re trying to avoid here in Trump’s U.S., eh?

2. My understanding is that most analysts think that once all of the absentee and mail-in ballots are counted in a week or so, Clinton will have a very substantial lead in the popular vote, not just the tenths of a percentage point reported the night of the election.

3. One of the more disconcerting moments of 2016 was when I heard a mild-mannered mother of two in Nevada, an old friend of my wife’s, say: “They say Hillary Clinton murders people? Well, I certainly hope she’s murdered people: we need someone tough like that in the White House!” She lived in a predominantly Republican area—Michael Milken has a place down the street—and I’m guessing this was a useful ploy for diverting conversations otherwise going in unpleasant directions. But I’m not entirely sure she was kidding. And as numerous people have pointed out, the fact that Anthony Weiner is still alive is proof that the Clintons don’t have people killed.

4. What could possibly go wrong…

5. But why didn’t you say that earlier?? I did: https://asecondmouse.wordpress.com/2015/11/29/seven-lessons-the-national-democratic-party-should-draw-from-the-victory-of-john-bel-edwards-1/. Meanwhile I’ll be curious to see if Thomas Frank’s previously-panned Listen, Liberal: Or, What Ever Happened to the Party of the People? gets some renewed attention.

6. Trump has done a major favor to dominant European parties by alerting them to the possibility of a major challenge within a party rather than just through the traditional parliamentary route of a third-party challenge—though European party discipline would make this much more difficult than in the US—and by reinforcing the realities of Russian meddling in elections, which clearly the US did not take seriously, and the US media actively assisted.

7. This is also Kansas-II as it will presumably lead to a soaring deficit while we wait for the Supply Side Fairy to sprinkle magic pixie dust on the budget and make it all okay. Just like with every other GOP administration since Reagan.

8. We saw this for many years in Kansas when the GOP legislators were so completely split into moderate and conservative blocs that they literally barely spoke to each other.

9. Also phrased as “When they pass around the plate of shit sandwiches, you can say ‘No thank you, I’ve already had my share.'” A single term also allows Trump to completely ignore his populist promises: those voters sure aren’t going to be showing up at his hotels and golf courses. At this moment—late afternoon on 12 November—we’re certainly watching the “So long, suckers!!” approach; this may or may not last. But more generally, Trump has himself in quite a fix now: he’s riding the tiger in the spotlight of history and there is no easy way to get off. Whereas a week ago he would have probably been a mere amusing footnote, the black swan that didn’t occur.

10. They pulled the same trick in the primaries at the senatorial level in Pennsylvania, spending millions to defeat retired Admiral Joe Sestak, who had spent six years preparing to run against Toomey but who refused to kowtow to the party elites, only some of whom are convicted felons. The establishment’s pliable political newcomer sock puppet not only paved the way for Toomey’s return to office as part of a Republican majority, but provided none of the coattails to the working class Sestak would have provided.

11. Mind you, I did have the pleasure of absolutely reaming out some poor bastard from the Democratic Senatorial Campaign Committee, the jokers who whacked Sestak: guessing I’m no longer on their list of supporters to harass for contributions.

12. Médecins Sans Frontières (MSF) International: http://www.msf.org/

13. More precisely, “Sixteen fucking percentage points”

14. Better, you could do this in a nearby state park where you might meet someone outside your bubble, though I’m guessing those who would have to work the better part of a week to make as much money as that check you’re blowing off on DNC consultants might be more than a little offended by the bonfire.

15. At the lower frequencies, Dan Drezner and Arthur Brooks; at the “where is this dude going to go next?” level, David Brooks; in small doses, Paul Krugman.

16. An unusually vivid example of this was the persistence of reporting “Two way race” polling results over the last three months. WTF?!?: were you people planning to assassinate Gary Johnson and Jill Stein before the polls opened? Did you have premonitions Johnson would be struck dead by a meteorite and Stein mauled to death by a raccoon? In almost every state, it was a four-way race: your “two-way” race was a ludicrous hypothetical completely irrelevant to the real world.

17. Gratuitous Boomer reference

18. Okay, I’m lying: I have just incorrectly described how a confidence interval works because I have a marvelous description—gratuitous Fermat’s-Last-Theorem reference—which the main narrative of this blog is too small to contain. A confidence interval is in fact the interval estimated from your sample which would contain the true value of the parameter [typically] 95% of the time were we able—welcome to the Mad Hatter’s Tea Party of frequentism—to replicate the experiment and estimate that interval from the resulting data a large number of times. Which we won’t, and in many cases, can’t. Which means…oh crap…just look it up… [19]

19. No, I can’t help myself!…this is worse than Bezos’s WaPo…but even if the final observed value falls within your confidence interval, you haven’t shown your confidence interval is “correct,” despite the excuses you are seeing now from pollsters. To do that you’d need to know the true parameter (which you don’t) and do some large number of replications of your estimator (which you won’t), and even then you would only have shown that your method of estimating the confidence interval was correct in the sense of behaving in a fashion consistent with what you expect. Though whatever these problems, confidence intervals on regression coefficients are far worse… I digress…no, I don’t…FREQUENTISM MAKES NO SENSE!  IT NEVER MADE ANY SENSE!! JUST STOP DOING IT!!!…AAAEEEIIII!!!!…
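
And since the only way to make sense of “the interval contains the true value 95% of the time” is under exactly the sort of replication described above, here is a minimal coverage-simulation sketch; the true proportion of 0.48 and n = 1,000 are, once again, arbitrary assumptions for illustration.

    # Coverage check for a 95% confidence interval: fix a known "true" value,
    # replicate the whole sample-then-estimate exercise many times, and count
    # how often the interval actually covers the truth. In a real election you
    # neither know the true value nor get the replications.
    import math
    import random

    random.seed(2016)
    p_true, n, n_reps = 0.48, 1000, 2000

    covered = 0
    for _ in range(n_reps):
        p_hat = sum(random.random() < p_true for _ in range(n)) / n
        half_width = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - half_width <= p_true <= p_hat + half_width:
            covered += 1

    print(f"empirical coverage: {covered / n_reps:.3f}")   # ~0.95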

20. I’m guessing that, ever the narcissist, The Donald is going to take a lot of standard operating procedure as a personal affront and this is going to become extremely wearisome for him over time. His more right-wing appointees will fare little better and probably will be facing active foot-dragging, particularly once it is clear—or widely assumed—Trump won’t run for re-election. And that is probably already being widely assumed.

21. Yes, I read your stuff when it is intellectually coherent, as distinct from conspiratorial rants employed to get people to tune in to watch advertisements for gold bars and ersatz tactical equipment purchased by folks who have a very high probability of dying in a comfortable hospital bed paid for by Medicare.

22. A nephew who is gay was in Washington last week, just after the election. “I’m going to visit the Holocaust Museum: want to see what is coming next.” Not sure whether he was joking.

23. Violent protests before Trump has even done anything?: less so. 24-hour drum circles: never.

24. Though one of the brightest bits of news in the past few days is the prospect of Carly Fiorina heading the RNC.

25. Which, by the way, probably also includes a significant number of Latinos and, against a less-overtly racially polarizing candidate, blacks, and all on the same economic issues: It’s class, not race, and remember that playing the race card against class is the oldest trick in the book for anti-progressive forces in the United States.

26. They certainly aren’t angels in my part of the prediction world: everybody now says they flawlessly predicted the collapse of the USSR and the Arab Spring, except for the inconveniently complete absence of evidence that this was true.
