## Tuesday, May 26, 2015

### Catch me if you can?

Last week, news spread across the political science community and the internet at large that the data behind a groundbreaking field experiment - one purporting to show long-lasting persuasion effects on individuals' attitudes towards gay marriage - had very likely been faked. These revelations have prompted quite a bit of reflection among scholars about the importance of unwritten norms of trust in scientific research and how these sorts of frauds can be detected without bringing the research process to a grinding halt.

I think Jonathan Ladd makes an important point that we should not necessarily base research norms and policy on detecting ex-ante the sorts of extremely bad-faith fabrications like LaCour (2014). If individuals are willing to lie about and obfuscate their research so brazenly, then it is likely they will be able to circumvent any such barriers. What shocked me the most about the LaCour fabrication was exactly how "bad faith" it was in its scope. Initially I had thought the manipulation was done to a previously collected experimental dataset - tweaking the means of the treated units in order to obtain the desired effect when one was not found upon first glance. Reading Broockman, Kalla, and Aronow's note outlining the irregularities in the study showed that the manipulations were far more extreme - a la Stapel, the observations were simply made up.

Not only is this sort of cheating hard to catch, it's also difficult to envision a way in which the scientific community could deter it by increasing the costs of discovery. Don Green noted in his recent interview with NYMag post-revelation:
But my puzzlement now is, if he fabricated the data, surely he must have known that when people tried to replicate his study, they would fail to do so and the truth would come out. And so why not reason backward and say, let’s do the study properly?
The punishment heaped upon Michael LaCour post-retraction has been swift and severe, and it is hard to believe that its magnitude was unanticipated. It's hard to see the decision to fabricate as a pure risk-reward trade-off - one Science publication, no matter how prestigious, isn't worth a lifetime of ostracism for data fabrication. Rather, as Stapel's reflections suggest, there is something intrinsically "thrilling" about the process of faking data itself.

So if cheating is hard to both detect and deter ex-ante, what is to be done? Ladd is right to emphasize post-publication replication and review. As we've seen in the LaCour case, it's not a question of if fraud is detected, but when. However, the "when" can be decades away, particularly if the manipulation is small, and in the meantime the costs to the scientific community in terms of "false knowledge" can be sizeable.

But there's still room for some pre-publication review strategies. If we're interested in increasing the chances that a cheater will be detected, these strategies should aim to detect small violations and thereby push cheaters into more extreme fabrications. The more lies told, the more likely one will be detected and the entire scheme will fall apart. Note that in the LaCour case, much of the argument made by Broockman and Kalla questioning LaCour's data was made possible because a difficult-to-find dataset happened to be posted in an unrelated scholar's replication archive. This one "happy" accident was what led scholars to tear down the whole facade of the study (including whether LaCour was truthful about received grants - something which almost no scholar would think to question absent serious suspicions).

One thought that came to mind to deter more subtle manipulations, particularly in experiments, is to create a way of verifying whether a dataset has been altered during the analysis phase. How do I know that the dataset released by researchers in the replication archive is the same as the dataset collected at the end of an experiment? The gap between study completion and publication often spans many years. While researchers could allay concerns by posting replication data prior to publication, it's hard to imagine any scholar being willing to post their data pre-analysis and open themselves up to being "scooped."

What researchers could very easily do is post online a checksum of their dataset immediately after the dataset has been collected. They would then conduct their data cleaning and analysis as usual and subsequently post their replication data after publication. Any scholar would be able to compute the checksum of the posted data file and compare it to the checksum uploaded prior to analysis to confirm that the original data file has not been tampered with during analysis. All modifications (data cleaning, etc...) to the file would be made transparently available in the analysis code. Because hashing algorithms are designed so that small modifications of the original file yield completely different hashes, and because the chances of a "hash collision" (two different files yielding the same checksum) are extremely low, it would be very difficult for a potential cheater to make manipulations while preserving the original hash.

I could imagine such a procedure being made a component of study pre-registration as it is a very low burden on a researcher (one command line operation) and the registering organization would serve as a trusted third party in preserving the checksum. This certainly won't prevent data manipulation, but it would decrease the amount of time available to cheaters to manipulate data and possibly help increase trust in existing experimental datasets.

Thoughts?

## Friday, May 16, 2014

### How Bad are Duplication Problems in GDELT Events Data? Very!

Edit: Neal Caren beat me to it! He comes to the same conclusion as well - GDELT appears to de-duplicate only when records are exactly identical...which misses a LOT.

Caerus Associates' Dr. Erin Simpson recently took FiveThirtyEight to task over an article about the recent kidnapping of 276 school girls in Nigeria. The problem: An appallingly poor and incredibly misleading use of the GDELT event dataset to analyze kidnapping "trends" in Nigeria. The full conversation as it unfolded on Twitter can be found preserved here.

This tweet in particular hit on an extremely important point.
The distinction that the original FiveThirtyEight article failed to take into account is that GDELT, and machine-coded event datasets in general, are not counts of events; they're counts of news reports. Assuming a one-to-one mapping between the two is extremely problematic. Increases in overall media volume or duplicate articles will inevitably give an inflated estimate of the underlying true event count.

Even more troublesome is that no good fix for the duplication problem has yet been developed. Normalizing by overall media volume doesn't help because we still don't know how many individual news reports correspond to one event. When FiveThirtyEight writes "the database records 151 kidnappings on [April 15] and 215 the next," we don't know whether those kidnapping reports are all primarily talking about the single kidnapping in Borno State that garnered so much media attention, or 366 distinct kidnapping incidents (of course, it's the former).

How problematic is duplication? I decided to take a look at the data on Nigerian kidnappings myself to figure out just what percentage of these GDELT reports are all discussing the same exact event. Here's what I found:

First, multiple GDELT entries are often sourced from the same URL. It is very likely that these are duplicates since news reports, particularly wire articles, tend to focus on single events. I removed all entries with duplicate URLs and was left with only 98 kidnapping-related articles on April 15th and 123 on April 16th.

Second, for a story as prominent as the kidnapping in Chibok, it is very likely that multiple news sources will report on the same event. Detecting this type of duplication is a major challenge and an open area of research in the event data field. One way of going about this is to look at the URLs for a hint as to the content of the article - news website URLs often contain the entire headline. Presuming that any article about Nigerian girls on April 15th and 16th that gets coded as a kidnapping is likely talking about the same incident, I searched for all URLs in the kidnapping dataset that contained "girl". Of the unique URLs, 61 on April 15th and 56 on April 16th matched the query.
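Both filtering steps amount to a few lines of code. A sketch (the record structure and URLs below are hypothetical illustrations, not the actual GDELT schema):

```python
def unique_urls(records):
    """Collapse GDELT entries that share a source URL, keeping one per URL."""
    seen, kept = set(), []
    for rec in records:
        if rec["url"] not in seen:
            seen.add(rec["url"])
            kept.append(rec)
    return kept

def split_by_keyword(records, keyword="girl"):
    """Split URL-unique records by whether the URL mentions the keyword."""
    matches = [r for r in records if keyword in r["url"].lower()]
    others = [r for r in records if keyword not in r["url"].lower()]
    return matches, others

records = [
    {"url": "http://example.com/nigeria-girls-kidnapped"},
    {"url": "http://example.com/nigeria-girls-kidnapped"},  # exact duplicate
    {"url": "http://example.com/abducted-schoolgirls-borno"},
    {"url": "http://example.com/kogi-kidnapping-arrest"},
]
deduped = unique_urls(records)                  # 3 unique URLs remain
chibok, remainder = split_by_keyword(deduped)   # 2 mention "girl", 1 does not
```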

So that left 37 potential kidnapping events on April 15th unrelated to the Chibok kidnapping. How many of them actually were?

One way of answering this question would be to train a classifier on the set of articles (and in fact, I think this is one way forward for larger-scale de-duplication tasks). However, 37 is not that many articles, so reading them would suffice for this post. All but two articles were either about the Chibok kidnapping or unrelated to Nigeria. These two articles both referenced the same event: a kidnapping in Kogi State.

151 GDELT reports of kidnappings in Nigeria on April 15th - 2 actual kidnapping events.

This is just one example of a general problem with GDELT - its inability to do more than extremely basic de-duplication. Now this would be fine if it were selling itself as a tool for roughly monitoring what the news media is writing about, but GDELT wants to be an
"initiative to construct a catalog of human societal-scale behavior and beliefs across all countries of the world, connecting every person, organization, location, count, theme, news source, and event across the planet into a single massive network that captures what's happening around the world, what its context is and who's involved, and how the world is feeling about it, every single day."
This simply isn't going to happen until we can reliably use media reports to extract distinct international events.

A good de-duplication algorithm is essential for using machine coded events data for more than just monitoring media attention. Training classifiers to detect similar phrasing and word usage in the texts coded by TABARI seems like a promising means of removing duplicates, particularly because a sizable amount of work has been done on this problem in computer science and computational linguistics. However, given that so many of the texts from which GDELT is coded are unavailable for public review due to licensing restrictions, this approach is unlikely to be feasible in the short-term. This makes GDELT in its current form a rather limited dataset (except perhaps for more recent time periods where references to the sources are included). More generally, access to the source texts is a must for any effective de-duplication method - there's just too much information lost going from the text to the event code.
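To make the general idea concrete, here is a minimal sketch of one standard technique from that literature: near-duplicate detection via Jaccard similarity over word shingles. This is an illustration of the approach, not anything GDELT or TABARI actually implements, and the similarity threshold is an arbitrary choice:

```python
def shingles(text, k=3):
    """Set of word k-grams for a document."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity between two shingle sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def deduplicate(articles, threshold=0.5):
    """Greedily keep articles that are not too similar to any kept article."""
    kept, kept_shingles = [], []
    for text in articles:
        s = shingles(text)
        if all(jaccard(s, t) < threshold for t in kept_shingles):
            kept.append(text)
            kept_shingles.append(s)
    return kept

articles = [
    "gunmen kidnapped hundreds of schoolgirls from a school in chibok borno state on monday night",
    "gunmen kidnapped hundreds of schoolgirls from a school in chibok borno state late on monday",
    "the central bank announced new interest rates at a meeting in abuja on tuesday",
]
print(deduplicate(articles))  # keeps the first and third articles only
```

Real systems would need to handle paraphrase rather than just shared phrasing, which is where trained classifiers come in - but even this crude approach requires access to the source texts.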

One last comment:

FiveThirtyEight decided to double down on GDELT and posted another article trying to map the geographic distribution of kidnapping reports in Nigeria. This still obviously suffers from many of the problems I mentioned above - these reports are all mostly talking about the same single event - but there's an additional issue introduced by using geocodes. The problem is that the geocoding algorithm works by extracting references to geographic locations, and a single article often mentions many locations. An article on the kidnapping may simply say that it occurred in Nigeria while others specify the sub-unit (Borno State). Moreover, if the article mentions a statement by the government in Abuja (or even has a dateline where the reporter happens to be stationed in the capital), the event derived from that article can easily get coded to "Abuja, Abuja Federal Capital Territory, Nigeria." This is likely why the FCT is bright red on FiveThirtyEight's map - it's the near-default location for ALL events in Nigeria (particularly when the federal government is involved).

## Monday, March 3, 2014

### Why a Nuclear Ukraine is an Empty Counterfactual

In light of Russia's military invasion of Crimea, an action that is in complete violation of its security assurances to Ukraine under the Budapest memorandum, a number of commentators have unearthed John Mearsheimer's 1993 article in Foreign Affairs arguing that Ukraine would have been better off keeping a nuclear deterrent after the fall of the Soviet Union in order to check Russian expansion.

As Walter Russell Mead succinctly put it:
If President Obama does this, however, and Ukraine ends up losing chunks of territory to Russia, it is pretty much the end of a rational case for non-proliferation in many countries around the world. If Ukraine still had its nukes, it would probably still have Crimea. It gave up its nukes, got worthless paper guarantees, and also got an invasion from a more powerful and nuclear neighbor.
Indeed, this directly echoes Mearsheimer's argument for why Ukraine should have kept its arsenal:
A nuclear Ukraine makes sense for two reasons. First, it is imperative to maintain peace between Russia and Ukraine. That means ensuring that the Russians, who have a history of bad relations with Ukraine, do not move to reconquer it. Ukraine cannot defend itself against a nuclear-armed Russia with conventional weapons, and no state, including the United States, is going to extend to it a meaningful security guarantee. Ukrainian nuclear weapons are the only reliable deterrent to Russian aggression. If the U.S. aim is to enhance stability in Europe, the case against a nuclear-armed Ukraine is unpersuasive.
Second, it is unlikely that Ukraine will transfer its remaining nuclear weapons to Russia, the state it fears most. The United States and its European allies can complain bitterly about this decision, but they are not in a position to force Ukraine to go nonnuclear. Moreover, pursuing a confrontation with Ukraine over the nuclear issue raises the risks of war by making the Russians more daring, the Ukrainians more fearful, and the Americans less able to defuse a crisis between them.
But in retrospect, Mearsheimer's second point was clearly wrong. This is precisely what Ukraine did in 1994 in exchange for weakly enforceable negative security assurances from Russia, the U.S., and the U.K. The puzzle then is why Ukraine failed to listen to Mearsheimer's sage advice and did the unthinkable - transferred its nuclear weapons to Russia.

In order to ask what would have happened had Ukraine opted to retain its arsenal, it is important to think through the entire counterfactual. The problem with the "if only Ukraine had nukes" line of argument is that it assumes that Russia would have tolerated a nuclear weapons state on its border in the first place. If we were to hold the world in 2014 constant and by magic turn Ukraine into a stable nuclear power, then perhaps Russia would have been deterred from occupying Crimea. But this is not the counterfactual we're interested in.

What would have happened had Ukraine decided not to return its nuclear weapons arsenal to Russia in the 1990s? Rather than allow a nuclear Ukraine on its doorstep, it is much more likely that Russia would have chosen to preempt Ukraine and secure its arsenal by force. We would have seen something very similar to the current situation in Crimea, but on a much grander scale. The bargain struck by Ukraine and Russia in 1994 was a way of avoiding such an outcome.

The logic behind this conclusion follows from a variation on Robert Powell's famous bargaining model of war analyzed in In the Shadow of Power.

To summarize, consider a scenario between two states with varying levels of power (ability to win in a war) bargaining over the division of some good. So long as the distribution of power between states is constant and war is costly, a mutually beneficial bargain should always be reached that reflects the underlying distribution of power. If one state is dissatisfied with the distribution of the good, then the satisfied state still finds it beneficial to make a concession such that the dissatisfied state is indifferent between peace and war. Assuming that striking a bargain is cheaper than fighting a war (a reasonable assumption), states will make a deal instead of going to war. This is the conventional "inefficiency puzzle" of war - why does war occur if states incur costs in fighting and can reach a Pareto-optimal bargain that reflects the post-war outcome without having to fight?

One potential source of war that Powell identifies is a commitment problem that arises when the distribution of power between states shifts rapidly over time. Suppose that "now," state A is much stronger than state B, but in the "future," state B's power will increase relative to A. A knows that in the future, in order to avoid war with B, it will have to concede much more than it does now (since future B is stronger and in equilibrium, the distribution of goods reflects the distribution of power). If the size of the power shift is sufficiently large, A may be better off choosing to fight B when it is weaker and risk claiming the good through war, preventing the shift in power that would force it into a weaker bargaining position in the future. The cause of this war is the commitment problem facing B. B would certainly be better off preventing A from going to war in the "now," since in its weak state, it will likely lose. If it could credibly constrain itself from using its future bargaining leverage against A, then both B and A would be better off (avoiding war). However, in the absence of some third party mechanism, any promises to not exploit its future bargaining leverage made by B "now," are irrelevant once it gains power in the "future." A and B face a situation akin to a prisoner's dilemma. They are jointly better off when A does not go to war and B does not exploit its bargaining leverage, but if A does not go to war, B is best off "defecting" and using its newly acquired bargaining power to extract more concessions from A. Knowing that B will do this, A's best response is to fight and prevent B from rising.
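The preventive-war logic above reduces to a simple inequality. Normalizing the good to 1, letting p_now and p_future be A's share under the current and future distributions of power, and c be A's cost of fighting, A prefers preventive war whenever p_now - c > p_future, i.e. whenever the power shift exceeds the cost of war. A toy calculation (the numbers are purely illustrative, not estimates of the actual Russia-Ukraine balance):

```python
def prefers_preventive_war(p_now, p_future, cost):
    """A stylized two-period commitment problem: state A fights now iff its
    expected war payoff (p_now - cost) beats the best peaceful deal it can
    get once B's power has risen (a share reflecting p_future)."""
    return p_now - cost > p_future

# A large, rapid shift in power makes war rational despite its cost...
print(prefers_preventive_war(p_now=0.8, p_future=0.4, cost=0.1))   # True
# ...while a small shift does not: the cost of fighting outweighs the gain.
print(prefers_preventive_war(p_now=0.8, p_future=0.75, cost=0.1))  # False
```

On this reading, a credible way for B to shrink the anticipated shift (p_now - p_future) below A's cost of fighting dissolves the commitment problem entirely.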

So how does this apply to Russia and Ukraine? In the post-Soviet period, Ukraine was rising relative to Russia. It had gone from a constituent part of the Russian-dominated Soviet Union to a sovereign state in its own right. However, in the early 90s, it still remained comparatively weaker (both in size and in military capability). It had yet to fully consolidate its "inherited" military capability - and what precisely it would inherit remained up for debate. Ukraine was in a peculiar situation where it could, to some extent, "choose" how much power it would have for itself - that is, how much of the Soviet military remaining on its territory would be returned to Russia.

Ukraine's choice to give up its nuclear arsenal can be understood in the context of the model as an attempt to credibly commit to not exploit future bargaining leverage by "smoothing" its rise relative to Russia. By giving up some of its future power (and future access to benefits), Ukraine made it more likely that Russia and Ukraine could reach the Pareto-optimal "no war" outcome in the 90s. Russia was better off choosing not to fight Ukraine and Ukraine was better off not being invaded. Given the option to credibly commit to constrain its rise, Ukraine chose to do so. Additionally, because of substantially diminishing marginal returns to nuclear capabilities, Ukraine's "self-constraining" choice to give up its nuclear capability did not appreciably increase Russia's leverage over Ukraine. A Ukraine with 1,000 nuclear weapons is much more threatening to Russia than one with zero. Conversely, a Russia with 7,000 nuclear weapons is relatively comparable to one with 8,000.

Under the Powell model, had Ukraine chosen to not give up its nuclear weapons, Russia would have been much more likely at the time to take preemptive action to secure its arsenal by force. The deterrence argument is moot. If nuclear weapons had any meaningful deterrent effect on Russia, then Russia would likely have acted militarily in the 90s to prevent a nuclear Ukraine rather than let Ukraine wield its leverage in the future. This might have taken on a much larger character than the current action in Crimea, particularly as the loyalties of the formerly Soviet "Ukrainian" military forces at the time were much less clear than they are now. Given the relative "newness" of an independent Ukrainian state, an occupation would not be out of the question. While Mearsheimer's original article briefly considered the possibility of a preemptive Russian attack, it dismissed it much too quickly and easily. Although it argued that a preemptive war between Ukraine and Russia in 1993 would be risky, it failed to run the clock back even further and consider why Russia chose not to preempt Ukraine at an even earlier time (when Ukrainian command and control was less established) unless it was convinced of Ukrainian denuclearization. Perhaps Ukrainian nuclear weapons were simply not a sufficient cause for preemption in this counterfactual 1990s. But if this is the case, then it is unlikely that they would be a credible constraint on Russia now, particularly for the type of war we are seeing right now in Crimea (as opposed to a full occupation). The Kargil crisis between India and Pakistan illustrates that a conventional, limited conflict between two nuclear powers over a disputed territory is a distinct possibility - an example of the classic "stability-instability paradox." Either way, the case for a Ukrainian deterrent falls flat.

Denuclearization was a near-necessity for state survival. It is unlikely that Ukraine could have retained its nuclear weapons arsenal in a world where it refused to bargain with Russia over their removal. This explains why Ukraine settled for a relatively toothless negative security assurance in exchange for transferring its nuclear arsenal - it would not have been able to keep the weapons either way. The weakness of its security assurance reflected the bargaining facts on the ground. Russia simply did not have to offer Ukraine much to secure its arsenal since giving up its nukes was a Pareto-improving move. At the time, the nuclear arsenal was more of a curse than a blessing for Ukraine.

I certainly condemn Russia's actions and think the rest of the world should do what it can (which is admittedly not much at the moment) to return the situation to the status quo ante. But in our rush to figure out how the Crimean invasion could have been prevented, we should not imagine that a simple reversal of a 20-year-old decision could have so easily solved the current crisis. It is better to understand the reasons why states may have chosen to behave the way they did rather than attributing foreign policy decisions to "error." Nor should we conclude that the non-proliferation agenda is somehow doomed because leaders will learn from Ukraine's example. States are already fully cognizant of the utility (and, in many cases, disutility) of nuclear weapons - another case isn't going to magically change their minds. The question we should be asking is why states still refrain from proliferating despite cases like Libya and Ukraine. And indeed, what the full story of Ukraine's denuclearization tells us is that Ukraine had very logical reasons for giving up its inherited deterrent. The scenario of a nuclear Ukraine in the 1990s would have been much, much worse - both for Ukraine and likely the world.

Edit: Phil Arena reminded me that this argument is very similar to a point William Spaniel makes in his dissertation. You can find a paper version of that article here ("The Theory of Butter-for-Bombs Agreements")

## Tuesday, July 16, 2013

### Some more evidence that Florida's 'Stand Your Ground' law increased firearm homicide rates

The acquittal of George Zimmerman in the shooting death of Trayvon Martin has rightly turned attention to the permissiveness of Florida's self-defense laws. Although the state's 2005 "Stand Your Ground" law was not used by the defense, it nevertheless framed the Zimmerman case from the very beginning. References to "Stand Your Ground" and self-defense were included in the judge's instructions to the jury, and in a post-verdict interview, one of the jurors admitted that the law factored into their decision.

Under the common law "Castle doctrine" principle, individuals facing an imminent threat of death or bodily harm do not have a duty to retreat and may respond with force when in their own homes. Stand Your Ground laws (SYG) generally extend this principle to any location where a person has a legal right to be and allow the use of deadly force in self-defense when an individual is presumed to have a "reasonable fear" of death or severe bodily injury. Since the passage of Florida's law in 2005, over thirty states have followed suit and adopted similar expansions of the Castle doctrine.

By definition, SYG laws make homicide less costly by providing the attacker with an additional legal defense. Indeed, as expected, these laws are associated with greater numbers of homicides that are ruled "justifiable." More troubling is that determinations of "justifiability" exhibit a stark racial bias in both SYG and non-SYG states - white-on-black killings are the most likely to be ruled justifiable, while black-on-white killings are the least likely.

Defenders of SYG laws argue that, although homicides are more likely to be ruled justified, SYG can be expected to reduce the overall rate of homicide and violent crime. By permitting persons being attacked to retaliate in full force rather than retreating, SYG laws theoretically increase the costs of committing a violent offense. Even if justifiable homicides increase, defenders would argue that these homicides substitute for otherwise non-justifiable homicides. The net homicide and violent crime rates, in the presence of SYG laws, should decrease.

Two recent studies find the opposite. Far from deterring homicide, SYG laws increase its incidence. Moreover, the laws have no appreciable deterrent effect on violent crime. Analyzing data from the FBI's Uniform Crime Reporting system, Cheng and Hoekstra find that SYG laws lead to roughly an 8% increase in reported murders and non-negligent manslaughters. McClellan and Tekin find a similar effect on firearm homicides and firearm accidents using monthly data from the CDC. These findings are consistent with a different understanding of the incentives generated by Stand Your Ground. Rather than increase the costs of violence, SYG laws decrease them by expanding the range of legal defenses available to an attacker. Because of the vagueness of the "presumption of reasonable fear," and the absence of many third-party witnesses, SYG laws stack the deck in favor of an assailant by raising the prosecution's evidentiary burden (as was made clear in the Zimmerman trial).

Whether SYG laws increase or reduce murder rates is an important policy question deserving of further study. The Cheng/Hoekstra and McClellan/Tekin papers provide convincing evidence, but it is always valuable to re-examine any scientific finding using different approaches and methods. Both of these studies use standard panel regression techniques to estimate the causal effect of Stand Your Ground laws while controlling for other potential confounders. While parametric regression is a ubiquitous and powerful tool for causal inference, it is a very model-dependent approach. This can sometimes lead to misleading conclusions when the model strays too far from the data.

Any approach to figuring out whether some "treatment" T causes Y relies on comparing the factual (what actually happened) to the counterfactual (what would have happened had T been different). The fundamental problem of causal inference is that we can never observe the counterfactual - we only see what happened. Statistical approaches to determining causality rely on estimating an appropriate counterfactual from the data. The ideal counterfactual is a case that is identical to the "factual" one on all relevant characteristics except for T. However, such cases are often lacking. There is no exact copy of Florida somewhere in the U.S. that did not pass Stand Your Ground. Ideally we would like to pick the closest case possible, but even then such a case may be nonexistent, particularly when potential confounding variables for which we would like to control are highly correlated with our treatment. The counterfactual may be a case that has never been seen before.

When regression techniques are used to estimate these "extreme" counterfactuals, they rely on extrapolation outside of the scope of the observed data. As Gary King and Langche Zeng show, such extrapolations are highly dependent on often indefensible modeling assumptions that become more and more tenuous as one gets further and further away from the data. Slight alterations to the model can yield drastically different results. Moreover, typical ways of presenting regression results (tables of coefficients) rarely make the counterfactual apparent. It is very difficult to get a sense of the extent to which the results in an empirical paper are based on extrapolation. While robustness checks help, basic regression papers often obscure the factual/counterfactual comparison on which a causal claim is based. This is not to say that regression is useless or that the Cheng/Hoekstra and McClellan/Tekin results are fundamentally flawed. However, it is worthwhile to see whether the finding holds when using a different approach to causal inference.

Instead of regression, I use the Synthetic Control method developed by Abadie, Diamond and Hainmueller to estimate the effect of Florida's 2005 Stand Your Ground law on firearm homicide rates. This method has been used to evaluate comparable state-level interventions. Abadie and Gardeazabal (2003) use it to measure the effect of terrorism on economic growth in the Basque Country while Abadie et al. (2010) assess the impact of California's Proposition 99 on cigarette sales. Synthetic control methods compare the factual time series of the outcome variable in a unit exposed to the treatment (Florida) with a "synthetic" counterfactual constructed by weighting a set of "donor" units not exposed to the treatment (states without SYG) such that the synthetic control matches the factual unit as closely as possible on potential confounding variables and pre-treatment outcomes. By forcing the weights to be positive and sum to one, this method ensures that the estimated counterfactual stays within the bounds of the data, thereby guarding against extrapolation. The intuition is that a combination of control states can approximate the counterfactual of "Florida without Stand Your Ground" better than any one state. The "synthetic" Florida provides a baseline for comparing homicide rates after SYG was implemented in 2005. It would certainly be possible to use the synthetic control approach to evaluate the effect of SYG in other states. However, I focus here on Florida because it was the earliest to enact such a law and has the most years for which the effects of SYG can be observed.
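At its core, the method solves a constrained least-squares problem: find donor weights w (nonnegative, summing to one) so that the weighted combination of donors matches the treated unit's pre-treatment characteristics. A bare-bones sketch using projected gradient descent - note the full Abadie/Diamond/Hainmueller implementation also optimizes over covariate-importance weights, which I omit here:

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css - 1)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0)

def synth_weights(X0, x1, iters=5000):
    """Minimize ||x1 - X0 @ w||^2 over the simplex.
    X0: (pre-periods + covariates) x donors matrix; x1: treated unit's column."""
    n = X0.shape[1]
    w = np.full(n, 1.0 / n)
    step = 1.0 / (2 * np.linalg.norm(X0, 2) ** 2)
    for _ in range(iters):
        grad = -2 * X0.T @ (x1 - X0 @ w)
        w = project_to_simplex(w - step * grad)
    return w

# Toy check: the "treated" unit is an exact mixture of two donors,
# so the recovered weights should be close to (0.6, 0.4, 0.0).
rng = np.random.default_rng(0)
X0 = rng.normal(size=(10, 3))          # 10 pre-period values, 3 donors
x1 = 0.6 * X0[:, 0] + 0.4 * X0[:, 1]   # treated unit
w = synth_weights(X0, x1)
```

The simplex constraint is what keeps the counterfactual an interpolation of observed donors - drop it and the fit improves, but at the cost of exactly the kind of extrapolation the method is designed to avoid.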

I use state-level mortality data from the CDC's Wonder database to construct a measure of per-capita firearm homicides for each state in years 2000 to 2010. Following the lists in Cheng/Hoekstra and McClellan/Tekin, I also obtain a set of state-level covariates from Census, BLS and DOJ data sources related to age and racial composition of the population, poverty, median income, urbanization, unemployment, incarceration, and federal police presence. All of the covariates are measured in 2000 - prior to the start of the time-series.

The rapid adoption of SYG laws after 2005 unfortunately limits the set of "donor" states available for constructing the synthetic control. Only 22 states do not have a "Stand Your Ground"-equivalent law in force during the 2000-2010 period: Arkansas, California, Colorado, Connecticut, Delaware, Hawaii, Iowa, Maine, Maryland, Massachusetts, Minnesota, Nebraska, Nevada, New Jersey, New Mexico, New York, North Carolina, Oregon, Pennsylvania, Rhode Island, Vermont, Wisconsin. Nevada, North Carolina and Pennsylvania passed SYG laws in 2011. Additionally, because of data privacy concerns, the CDC does not report data for regions where a sufficiently small number of events occurred, which further constrains the total set of viable donor states. Nevertheless, the pool of donors is able to provide a reasonable synthetic counterfactual for Florida.

Florida's SYG law took effect in October 2005, meaning that it realistically only affected years 2006 onward. Matching Florida to the pool of controls on the set of covariates and on firearm homicide rates from 2000 to 2005 yields a synthetic counterfactual that reasonably approximates Florida's pre-SYG homicide patterns. The figure below plots the actual trajectory of Florida's firearm homicide rate relative to the path followed by the synthetic Florida sans-SYG. Homicide rates in actual and synthetic Florida match up rather well in the 2000-2005 period. However, from 2006-2010, the factual and counterfactual diverge dramatically. Florida's firearm homicide rate sees a huge increase from 2006 to 2007, while the synthetic rate begins to decline. Although rates drop from 2007 to 2010, they remain significantly higher than they would have been had SYG not been in place. The results suggest that Florida experienced roughly 1 to 1.5 more firearm homicides per 100,000 residents annually from 2006-2010 than it would have had Stand Your Ground not been implemented.

 Firearm homicide rates in Florida - Actual vs. Synthetic Control

As with all statistical techniques, it's important to evaluate how unlikely it is that the observed pattern was generated purely by randomness. That is, how significant is this result? Although there are no specific parameters and standard errors to estimate, one can get a sense of the "statistical significance" of the apparent effect of SYG using placebo tests on the donor pool. A placebo test applies the same synthetic control techniques to cases known to be unaffected by the treatment. The resulting distribution of "placebo effects" gives a sense of the types of patterns that we would see under the hypothesis of no effect - that is, pure randomness. If the pattern exhibited by Florida appears unusual relative to this placebo distribution, then one can be relatively confident that it is not due to chance.
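In code, the placebo exercise is just the same fitting procedure looped over donor states. This is again a hedged sketch with made-up data rather than the actual analysis: each donor is in turn treated as if it had received the treatment, a synthetic control is built from the remaining donors, and the post-period gap is recorded.

```python
import numpy as np
from scipy.optimize import minimize

def synth_weights(X, x):
    # Nonnegative weights summing to one, minimizing pre-period discrepancy
    k = X.shape[1]
    res = minimize(lambda w: np.sum((X @ w - x) ** 2),
                   x0=np.full(k, 1.0 / k), method="SLSQP",
                   bounds=[(0.0, 1.0)] * k,
                   constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0})
    return res.x

def placebo_gaps(Y_pre, Y_post):
    """Treat each donor in turn as if it were treated: fit a synthetic
    control from the remaining donors and record its post-period gap."""
    gaps = []
    for j in range(Y_pre.shape[1]):
        others = [i for i in range(Y_pre.shape[1]) if i != j]
        w = synth_weights(Y_pre[:, others], Y_pre[:, j])
        gaps.append(Y_post[:, j] - Y_post[:, others] @ w)
    return np.array(gaps)  # one "placebo effect" series per donor

# Made-up data: 6 pre-treatment years, 5 post-treatment years, 5 donor states
rng = np.random.default_rng(1)
Y_pre = rng.normal(5.0, 0.5, size=(6, 5))
Y_post = rng.normal(5.0, 0.5, size=(5, 5))
gaps = placebo_gaps(Y_pre, Y_post)
```

Plotting the treated unit's gap against this distribution of placebo gaps is what produces the spaghetti-plot comparison shown below.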

 Firearm homicide rates in Florida - Placebo tests (Discards states with pre-2006 MSPE five times higher than Florida's)

The figure above plots the gap in firearm homicide rates between the actual time series and the estimated synthetic control for Florida and for each of the control states. Relative to the distribution of relevant placebos, the Florida effect stands out post-2006. Florida's is the most unusual line in the set and from 2007-2010 shows a positive deviation from the control greater than any of the placebo tests. Although the pool of control states is somewhat small, limiting the number of possible placebo tests, the trajectory of Florida's homicide rate is certainly unusual and difficult to attribute to pure chance.

Supporters of Florida's law point to reductions in the violent crime rate since 2005 as evidence that the law's deterrent effect is working. However, a trend alone is not evidence of causation - to assign causality, one needs to make a comparison with some counterfactual case. Violent crime rates in Florida have been declining overall since 2000, so the downward trend would most likely have continued even had SYG not been passed.

Unfortunately, it is difficult to evaluate whether SYG reduced violent crime rates using a synthetic control approach because no good counterfactual exists in the data. Florida generally has some of the highest violent crime rates in the country and they are consistently higher than those of any of the states in the donor pool (New Mexico is close, but still lower). As a consequence, it is impossible to find any combination of control states that consistently match Florida's pre-2005 trend. Any counterfactual for Florida's overall violent crime rate would rely heavily on extrapolation outside of the data.

While these results are certainly not definitive (the relative novelty of SYG laws limits the number of periods under observation), they corroborate existing findings. Florida's Stand Your Ground law did not have a deterrent effect on homicide, and may in fact have increased the state's murder rate. This and other evidence strongly suggests that state governments should re-think their approach to self-defense laws. While politically appealing from a "tough on crime" perspective, Stand Your Ground laws likely do much more harm than good.

Edit 7/17 - Fixed broken links

## Sunday, June 23, 2013

### I guessed wrong (kind of)

In a recent post, I argued that Edward Snowden's extradition from Hong Kong was likely. Now this happens:
US whistle-blower Edward Snowden has left Hong Kong and is on a commercial flight to Russia, but Moscow will not be his final destination.
The fugitive whistle-blower boarded the Moscow-bound flight earlier on Sunday and would continue on to another country, possibly Cuba then Venezuela, according to media reports.
The Hong Kong government said in a statement that Snowden had departed "on his own accord for a third country through a lawful and normal channel".
The Hong Kong government played the politics of this case very well. From their press release
The US Government earlier on made a request to the HKSAR Government for the issue of a provisional warrant of arrest against Mr Snowden. Since the documents provided by the US Government did not fully comply with the legal requirements under Hong Kong law, the HKSAR Government has requested the US Government to provide additional information so that the Department of Justice could consider whether the US Government's request can meet the relevant legal conditions. As the HKSAR Government has yet to have sufficient information to process the request for provisional warrant of arrest, there is no legal basis to restrict Mr Snowden from leaving Hong Kong.
Provisions that allow for requests of "additional information" are common in many extradition treaties. Certainly it's not known what was in the documents provided by the United States government and precisely how they failed to comply with Hong Kong law, but it is very clear that this was the easiest way to deny extradition without explicitly refusing it. Snowden's case is a particularly challenging one, given that the U.S. chose to indict Snowden under the Espionage Act. The Hong Kong government may have a strong argument that the initial documents were insufficient, even if it's unlikely that the United States will believe it. The novelty of this case makes a request for additional information perfectly legitimate, even if convenient given Snowden's subsequent departure. While H.K.-U.S. legal cooperation may have been somewhat strained, the HK government's decision is unlikely to affect its relationships with other states since it held to the letter and intent of the treaty and, more importantly, China did not appear to overtly intervene.

While it is impossible to know what would have happened had Snowden stayed in HK, his flight does suggest that he did not believe that his defense against extradition would have been successful. The HKSAR's hands are more tied than are Venezuela's. All things being equal, Snowden would certainly have preferred to stay in HK rather than Venezuela. However, the Hong Kong government seemed to have made it clear that it could not hold out against extradition for much longer without putting its legal arrangements into much more serious jeopardy. That China chose not to explicitly intervene at the outset does illustrate that international law does operate as a constraint, even if states can strategically use it to their advantage. Delaying extradition while offloading Snowden to a less constrained third party was an inexpensive way of satisfying the Chinese government's preference against extradition while minimizing damage to Hong Kong's international legal standing.

The closing paragraph of the press release is also absolutely perfect from a political standpoint
Meanwhile, the HKSAR Government has formally written to the US Government requesting clarification on earlier reports about the hacking of computer systems in Hong Kong by US government agencies. The HKSAR Government will continue to follow up on the matter so as to protect the legal rights of the people of Hong Kong.
Translation: "I am altering the deal, pray I don't alter it any further"

## Monday, June 17, 2013

### Marginal Effect Plots for Interaction Models in R

Political scientists often want to test hypotheses regarding interactive relationships. Typically, a theory might imply that the effect of one variable on another depends on the value of some third quantity. For example, political structures like institutional rules might mediate the effect of individual preferences on political behavior. Scholars using regression to test these types of hypotheses will include interaction terms in their models. These models take on the basic form (in the linear case)

$$y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_1x_2 + \epsilon$$

where $\beta_3$ is the coefficient on the "interaction term" $x_1x_2$. However, interaction terms are often tricky to work with. Bear Braumoeller's 2004 article in International Organization illustrated how published quantitative papers often made basic mistakes in interpreting interaction models. Scholars frequently misinterpreted the lower-order coefficients, $\beta_1$ and $\beta_2$. Published research articles would argue that a significant coefficient on $x_1$ suggests a meaningful relationship between $x_1$ and $y$. In a model with an interaction term, this is not necessarily the case. The marginal effect of $x_1$ on $y$ in a linear model is not equal to $\beta_1$; it is actually $\beta_1 + \beta_3x_2$. That is, $\beta_1$ is the relationship between $x_1$ and $y$ only when $x_2$ is zero. Often this is a meaningless quantity since some variables (for example, the age of a registered voter) cannot possibly equal zero.

The correct way to interpret an interaction model is to plot out the relationship between $x_1$ and $y$ for the possible values of $x_2$. It's not a matter of simply looking at a single coefficient and declaring a positive or negative effect. Even if the interaction coefficient $\beta_3$ is significant, the actual meaning of the interaction can differ. One interpretation may be that $x_1$ is always positively related to $y$, but the effect is greater for some values of $x_2$. Another is that $x_1$ is sometimes positively and sometimes negatively associated with $y$, depending on the value of $x_2$. Looking only at the coefficients does not distinguish between these two types of relationships.

Luckily, figuring out the marginal effect of $x_1$ on $y$ is rather easy. In a linear model, the point estimate for how much $y$ increases when $x_1$ is increased by 1, $\hat{\delta_1}$, is equal to

$$\hat{\delta_1} = \hat{\beta_1} + \hat{\beta_3}x_2$$

The variance of the estimator $\hat{\delta_1}$ is

$$Var(\hat{\delta_1}) = Var(\hat{\beta_1} + \hat{\beta_3}x_2)$$
$$Var(\hat{\delta_1}) = Var(\hat{\beta_1}) + Var(\hat{\beta_3}x_2) + 2Cov(\hat{\beta_1}, \hat{\beta_3}x_2)$$
$$Var(\hat{\delta_1}) = Var(\hat{\beta_1}) + x_2^2Var(\hat{\beta_3}) + 2x_2Cov(\hat{\beta_1}, \hat{\beta_3})$$

Note that when $x_2 = 0$, $\hat{\delta_1} = \hat{\beta_1}$ and $Var(\hat{\delta_1}) = Var(\hat{\beta_1})$. The standard deviation or standard error of $\hat{\delta_1}$ is equal to the square root of this variance. Extending these formulae to the non-linear case is easy - the coefficient estimates and variances are computed the same way, and from there one can simulate relevant quantities of interest (probabilities, predicted counts).
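Translating the formulas above into code is straightforward. The helper below is a minimal sketch: in practice the coefficient estimates, variances and covariance would be pulled from a fitted model's coefficient vector and variance-covariance matrix, but the numbers used here are hypothetical.

```python
import numpy as np

def marginal_effect(beta1, beta3, var1, var3, cov13, x2):
    """Point estimate and standard error of the marginal effect of x1 at a
    given value of the moderator x2, following the formulas above."""
    delta = beta1 + beta3 * x2
    var = var1 + x2 ** 2 * var3 + 2 * x2 * cov13
    return delta, np.sqrt(var)

# At x2 = 0 the marginal effect collapses to beta1 with beta1's variance
d0, se0 = marginal_effect(beta1=2.0, beta3=-0.5, var1=0.04, var3=0.01,
                          cov13=-0.005, x2=0.0)
# At other values of x2, all three variance terms contribute
d2, se2 = marginal_effect(beta1=2.0, beta3=-0.5, var1=0.04, var3=0.01,
                          cov13=-0.005, x2=2.0)
```

Evaluating this function over a grid of $x_2$ values, with confidence bands at $\hat{\delta_1} \pm 1.96 \cdot se$, is all a marginal effect plot requires.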

An even simpler way to calculate the marginal effect of $x_1$ for an arbitrary value of $x_2$ is to re-center $x_2$ by subtracting from it some value $k$ and re-estimating the regression model. The coefficient and standard error on $x_1$ will then be the marginal effect of $x_1$ on $y$ when $x_2 = k$. A handy trick is to mean-center $x_2$ (subtract the mean of $x_2$ from each value of $x_2$). Then, the coefficient on $x_1$ (in a linear model) is equal to the average effect of $x_1$ on $y$ over all of the values of $x_2$.*
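The re-centering trick is easy to verify on simulated data. The sketch below uses a hypothetical data-generating process; the identity holds exactly in OLS because the re-centered model is just a reparameterization of the original one.

```python
import numpy as np

# Hypothetical data-generating process with a known interaction
rng = np.random.default_rng(2)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(loc=3.0, size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + 0.75 * x1 * x2 + rng.normal(scale=0.1, size=n)

def ols_coefs(x1, x2, y):
    """OLS coefficients for y ~ 1 + x1 + x2 + x1:x2."""
    X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
    return np.linalg.lstsq(X, y, rcond=None)[0]

k = 3.0
b = ols_coefs(x1, x2, y)        # original parameterization
b_c = ols_coefs(x1, x2 - k, y)  # x2 re-centered at k
# The coefficient on x1 in the re-centered model equals beta1 + beta3 * k,
# i.e., the marginal effect of x1 when x2 = k
```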

Braumoeller's article came with Stata code to make interaction plots (though I can't seem to find it online anymore). In 2011, Stata 12 added the marginsplot command, making these sorts of figures even easier to create. Quantitative political scientists appear to have taken notice. I could not find a single article in the 2012/2013 issues of the American Political Science Review, American Journal of Political Science, and International Organization that used an interaction model without including a corresponding marginal effects plot. Correctly interpreting interaction effects is now about as easy as running the regression itself.

This is all well and good for Stata users, but what about R? Coding up these sorts of plots from scratch can get a little tedious, and no canned function (to my knowledge) exists on CRAN. Moreover, the availability of easy-to-use functions for statistical methods seems to encourage wider use among applied quantitative researchers.

So here's my code for quickly making decent-looking two-variable interaction plots in R. The first function, interaction_plot_continuous(), plots the estimated marginal effect for one variable that is interacted with a continuous "moderator." In simple terms, it plots $\hat{\delta_1}$ for the range of values of $x_2$.

Below is an example of the output. For the sake of demonstration, I took the built-in R dataset airquality, which contains air quality measurements taken in New York in 1973, and regressed maximum daily temperature on ozone content, wind speed and an interaction of ozone and wind. The plot below shows the marginal effect of wind speed moderated by ozone content:

Note that interpreting only the lower-order coefficient on wind speed (its marginal effect when ozone equals zero) gives a misleading picture of the actual relationship. At 0 parts per billion of ozone, wind speed is negatively associated with temperature. But for higher values of ozone content, wind speed becomes positively associated with temperature (I have no idea why this is the case, or why there would even be an interaction - my guess is there's some omitted variable). For the average value of ozone concentration (the red-dashed line), wind speed is not significantly associated with temperature.

Sometimes the moderating variable is a binary indicator. In these cases, a continuous interaction plot like the one above is probably less useful - we just want the effect when the moderator is "0" and when it's "1". The second function in the file, interaction_plot_binary(), handles this case. Again to demonstrate, I took the classic LaLonde job training experiment dataset and fitted a simple (and very much wrong) regression model. The model predicted real 1978 wages using assignment to a job training program (treatment), marital status and an interaction of the two. I then estimated the marginal "effect" of treatment assignment on wages for each of the two marital status levels. In this case, the interaction was not statistically significant.

So hopefully these two functions will save R users some time. Note that these functions also work perfectly fine with non-linear models, but the quantity plotted will be the regression coefficient and not necessarily something with substantive meaning. Unlike simple OLS, the coefficients of most non-linear models do not have a clear interpretation. You'll have to do a little bit of work to convert the coefficient estimates into something actually meaningful.

Feel free to copy and share this code, and let me know if there are any bugs. If there's enough demand, I might clean it up more and put together an R package (time permitting of course).

* I'm using the word "effect" here loosely and as shorthand for "relationship between." Assigning a causal relationship between two variables requires further conditional independence assumptions that may or may not hold.

Edit: Thanks to Patrick Lam for pointing out a typo in the variance formula (missed the 2) - fixed above and in the code.

Edit 6/18: Forgot to also include a link to Brambor, Clark, and Golder's excellent 2006 paper in Political Analysis which discussed similar issues regarding interpretation of interaction terms.

## Monday, June 10, 2013

### Will Edward Snowden be Extradited?

On Sunday, 29-year old Booz Allen Hamilton employee Edward Snowden was revealed to be the source behind the recent disclosure of highly classified documents describing top secret National Security Agency surveillance programs of telephone and internet data. Snowden disclosed his identity to the Guardian in an interview from Hong Kong, where he currently remains. His choice to leave the U.S. for Hong Kong was, in his words, driven by the belief that the Chinese-administered region has a "spirited commitment to free speech and the right of political dissent." But Snowden is not completely free from the reach of U.S. law. The United States and Hong Kong have in force an extradition treaty under which the U.S. could obtain his return for prosecution.

Why did Snowden escape to Hong Kong, likely knowing that he could be extradited? My sense is that it isn't because he was poorly informed, but rather because his no-extradition alternatives were not particularly great. The overlap between countries with "spirited commitments" to freedom and countries with which the United States does not have an extradition treaty is virtually nil. Even Iceland, Snowden's asylum target, has an extradition treaty with the U.S. that has been in force since the early 1900s.

 Countries with which the US has extradition treaties (light blue; the US shown in dark blue). From Wikimedia Commons
But will Snowden be extradited back to the United States? I would argue that yes, extradition is probable*, but the process will be very tedious simply because the offense is so different from previous extradition cases. Whereas most extradition requests concern explicitly criminal matters, Snowden's may be a "political offense" for which extradition is not permitted. Working out this question will take time, but Hong Kong's track record of typically approving requests suggests that the odds are in the U.S. government's favor. Moreover, I do not think that Beijing will be able to tip the legal scales towards rejection if it so desires, despite its influence in Hong Kong.

China's preferences in this matter are certainly relevant, but the law is much more of a constraint in this process than some commentators have suggested. The security stakes are actually rather low. Snowden is not particularly useful as an intelligence asset. He no longer has access to NSA databases - all he has are whatever documents or files he could bring with him to Hong Kong, documents that were apparently selectively chosen. This is not a Cablegate-style data-dump and PRISM is hardly China's greatest intelligence fear. Certainly Beijing might want to obtain anything that Snowden still has in his possession, but it's unclear how cooperative he would be given his political leanings.

Influencing the extradition proceedings is also not costless for Beijing. Despite China's sovereignty over Hong Kong, the SAR has significant political autonomy in most areas. Indeed, it can and has concluded a number of extradition and mutual legal assistance treaties with other states. Hong Kong is a global financial hub and devotes significant legal and administrative resources to combating money laundering, financial crimes, trafficking and other such offenses. The high cross-border mobility of these types of offenders gives the Government of Hong Kong significant incentives to maintain the integrity of its extradition agreements in order to prosecute financial criminals who flee its territory. Undue influence by Beijing in the process might jeopardize the credibility of Hong Kong's other agreements. While China's ability to weigh in on extradition is clear, the decision to refuse extradition ultimately lies with the Chief Executive of Hong Kong.

Moreover, Hong Kong is also much more limited in its ability to reject extradition than some news reports have suggested. This South China Morning Post article, quoted by Doug Mataconis at OTB, gives the impression that China has de-facto veto power over any extradition request. This is not the case.

The SCMP article states that, according to the 1996 treaty, "Hong Kong has the "right of refusal when surrender implicates the 'defense, foreign affairs or essential public interest or policy'." However, this provision is irrelevant to the Snowden case since, according to Article 3, it only applies when the subject of the extradition request is a national of the PRC.
...(3) The executive authority of the Government of Hong Kong reserves the right to refuse the surrender of nationals of the State whose government is responsible for the foreign affairs relating to Hong Kong in cases in which:
(a) The requested surrender relates to the defence, foreign affairs or essential public interest or policy of the State whose government is responsible for the foreign affairs relating to Hong Kong, or...
The article also notes that Hong Kong could reject a request if it determines that the extradition is "politically motivated." However, the article omits the fact that this determination is to be made by the "competent authority of the requested Party" which, according to the committee report that accompanied the treaty's ratification, is interpreted as the judiciary and not the executive of Hong Kong.
...Notwithstanding the terms of paragraph (2) of this Article, surrender shall not be granted if the competent authority of the requested Party, which for the United States shall be the executive authority, determines:
(a) that the request was politically motivated...
Meddling with Hong Kong's independent judiciary would be politically costly for Beijing. In fact, it could jeopardize the extradition agreement itself. When the U.S. Senate ratified the treaty, it attached an understanding emphasizing the continued independence of Hong Kong's judiciary
"Any attempt by the Government of Hong Kong or the Government of the People's Republic of China to curtail the jurisdiction and power of final adjudication of the Hong Kong courts may be considered grounds for withdrawal from the Agreement."
It's unlikely that Snowden could win a political persecution argument. Regardless of whether his actions are justified, they very clearly violated U.S. statute - he is not being arbitrarily singled out. A more potentially persuasive case for refusal might be made in light of Bradley Manning's treatment in pre-trial detention. Article 7 permits the refusal of extradition "when such surrender is likely to entail exceptionally serious consequences related to age or health." However, concerns over treatment are typically resolved by bilateral legal assurances that the extradited person will not be mistreated. Indeed, U.S. authorities have strong incentives to not mistreat Snowden if he is extradited as such actions would likely jeopardize future legal cooperation with Hong Kong.

This is not to say that extradition is a sure thing, but the challenges facing Snowden's extradition are currently more legal than political. Pretty much all previous extradition proceedings between the U.S. and Hong Kong have concerned clear-cut criminal offenses - violent and white-collar crimes in particular. Hong Kong has consistently accepted U.S. requests for extradition in these cases. But the leaking of classified information might fall under the category of "political offenses" for which extradition is prohibited. Extradition treaties have for centuries contained provisions that refuse extradition for "offenses of a political character." The exception emerged in treaties during the 1800s as a way of limiting the ability of states to pursue dissenters and political opponents. However, what constitutes a "political offense" has always been ambiguous and open to interpretation. Fearing that a vague interpretation of "political offense" would unduly burden states, most treaties since then have included provisions delineating offenses that cannot be considered "political." Compared to most modern treaties, the U.S.-Hong Kong treaty has relatively few political offense exceptions: murder or other crimes committed against a head of state are exempt, as are offenses criminalized by a multilateral international agreement. This leaves a lot of grey area.

Most treaties signed in the last half-century do not explicitly enumerate offenses for which extradition is granted, typically defining an extraditable offense as anything criminalized under the laws of both parties. However, the Hong Kong treaty has both a list and a provision for extradition for "dual criminality." A request for extradition under the Espionage Act would likely fall under the scope of extraditable offenses since Hong Kong's Official Secrets Ordinance has similar provisions criminalizing the release of classified information, but could potentially be ruled a political offense. Law professor Julian Ku suggests that the U.S. might choose to pursue an alternative route and seek extradition under an offense explicitly enumerated in the treaty.
The Snowden leaks have now been referred to the Justice Department, and U.S. prosecutors have several options available to them. According to Ku, prosecutors may avoid charging Snowden under the Espionage Act -- which could be considered a political prosecution by courts in Hong Kong -- and indict him under a different statute.
Among the crimes listed on the U.S.-Hong Kong agreement as within the bounds of extradition, one offense in particular stands out: "the unlawful use of computers."
Requesting extradition for an explicitly enumerated offense might help support an argument that the offense is not a political one. The patchwork of U.S. classification law also appears to have provisions relating to the "unlawful use of computers" - the Computer Fraud and Abuse Act.  According to a CRS report,
18 U.S.C. Section 1030(a)(1) punishes the willful retention, communication, or transmission, etc., of classified information retrieved by means of knowingly accessing a computer without (or in excess of) authorization, with reason to believe that such information “could be used to the injury of the United States, or to the advantage of any foreign nation.”...The provision imposes a fine or imprisonment for not more than 10 years, or both, in the case of a first offense or attempted violation. Repeat offenses or attempts can incur a prison sentence of up to 20 years.
Moreover, the sentences under the Computer Fraud and Abuse Act are comparable to those in the relevant Espionage Act provisions. Assuming U.S. prosecutors are strategic, I would expect the forthcoming request  to be for offenses under the CFAA and not the Espionage Act.

The base rate for extradition approval is high so the easiest bet is that Snowden will likely be extradited (assuming he does not successfully flee Hong Kong as well). But this case is markedly different from previous ones. There is not much "data" for extradition when the offense involves the leaking of classified information. It has certainly occurred in other contexts, but is entirely new ground for the U.S.-Hong Kong legal cooperation relationship (I'd love it if anyone could point me to more examples). Snowden may have some basis for challenging a request and extradition proceedings are likely to drag on for some time. Indeed, it's not entirely impossible for Hong Kong to deny extradition, though it would generate significant political friction with the U.S. The legal deck is stacked in the United States' favor, despite the novelty of this case. So be wary of pop-Realist foreign policy commentaries that claim it's all about China - Chinese influence in Hong Kong's affairs is neither infinite, nor cost-free.

*Sadly, I don't have a model for making a more precise claim/prediction here, though an interesting project might involve using GDELT to identify successful/failed extradition cases and model P(approval) using a combination of political and legal variables. Certainly more than a blog post can carry.