* If you have questions you can reach me (Kris Hult) at: khult@uchicago.edu

/*****************************************************************************************
*****************************************************************************************
**                                                                                     **
**  pset.do: Problem Set 1                                                             **
**                                                                                     **
**  Project(s): Introduction To Doing Empirical Research                               **
**                                                                                     **
**  Date Started: 1/30/2008 * hint [ctrl]+d types the date                             **
**  Last Update: 1/30/2008                                                             **
**                                                                                     **
*****************************************************************************************
*****************************************************************************************/

* Quick note: you will notice that I can make comments in the do file by either typing
* "*" at the beginning of the line or by typing "/*" to start a block comment and "*/" to end it.

/****************************************************************************************
* FILE LOCATION
*****************************************************************************************/

* Create a directory:
* Using a local (which is a macro that stores information) define the directory where
* your project is stored. It is beneficial to store the directory in a local in case you
* change computers or share a project with other people. That way they only have to change
* one line to run the do file.

local dir "C:/Documents and Settings/khult/My Documents/Levitt Class"
local nameoffile "pset1"

/* Program that creates logs and do file backup */
* As I mentioned in session, it is a lot easier to have Stata automatically create logs
* and backups. This program will create logs and back up the do file.

capture program drop logger
program define logger
    clear
    * clear the memory
    set more off
    * Tell Stata to not pause for --more-- messages
    local dir = "`2'"
    * Since locals are stored locally they are not available inside of programs.
    * Therefore, we have to redefine the directory, but this is easy because we are going to
    * tell the program what directory we are using (see this below).
    capture log close
    * capture is a command that suppresses errors. If there is no log open, Stata
    * would normally stop running the do file and give me an error telling me that there is
    * no log open, but now it will continue running.
    * Hint: capture can be a very helpful command for two reasons:
    *   1. It stores the error as _rc
    *   2. It allows you to skip over lines of code (very useful in a loop)

    * type "creturn list" to see all the c-class returns
    * For instance, date and time are c-class returns, so we can use them to time-stamp our logs
    * i.e. c(current_time) = "11:45:44" and c(current_date) = "30 Jan 2008", so:
    local hour = substr("`c(current_time)'", 1,2)
    local minute = substr("`c(current_time)'", 4,2)
    local day = substr("`c(current_date)'", 1,2)
    * the month name starts at character 4 (character 3 is the space after the day)
    local month = substr("`c(current_date)'", 4,3)
    local year = substr("`c(current_date)'", 8,4)
    * Hint: if you want to have a numeric month you can use c(Mons), type "di c(Mons)" to see why
    * substr() is a function that returns a substring, type "help substr" for the syntax
    * This is a simplified program so you can edit it however you like to make it easier to use.

    * Hint: If you do not already have a logs folder type:
    capture mkdir "`dir'/logs"
    log using "`dir'/logs/log_`year'`month'`day'_`hour'`minute'.log"

    * back up your do file:
    cap copy "`dir'/do files/ps1.do" "`dir'/do files/archive/`1'_`year'_`month'_`day'_`hour'_`minute'.do"
    * Stata on Windows does not care if you use "/" or "\", but if you use Stata on a server,
    * such as Athens, you have to use "/" so it is easier to always be consistent
end

* You will notice in the program I used `1' and `2' as locals without defining them.
* This is because, when I tell Stata to use the program, I am going to tell it:
*     logger "`nameoffile'" "`dir'"
* The `1' holds the first argument (`nameoffile') and the `2' holds the second (`dir')
logger "`nameoffile'" "`dir'"

/****************************************************************************************
* Setup
*****************************************************************************************/

set mem 200m
* Change amount of memory allocated to Stata
* set maxvar 10000
* Set the maximum number of variables in Stata/MP and Stata/SE
set matsize 1000
* Set the maximum number of variables in a model

/****************************************************************************************
* Switches
*****************************************************************************************/
* Switches are a great way to keep only one do file for an entire project but not
* have to run the entire do file each time. I only put switches on a couple of questions
* because it is cumbersome to have too many switches in code this short.

local question1 = 0
local question2 = 0
local question3 = 1
local question4 = 1
local question5 = 1
local question6 = 1
local question7 = 1
local question8 = 1
local question9 = 1

* if a switch = 1 then this means I will run that section of the code,
* if a switch = 0 then I skip that section of code
* for this do file, I have created switches for each question, so I can work on different
* questions without running the whole do-file.
* Remember to write your code so that turning off a switch won't affect the code
* in the other switches (i.e.
* it won't create important locals or variables that you use later)

/****************************************************************************************
* Analysis
*****************************************************************************************/

use "`dir'/pset1.dta", clear

* switch for question 3
if `question3' == 1 {

    /* Question Three: */
    * Divide states into three groups:
    * (A) those with no concealed weapons law by 1992
    * (B) states which had a concealed weapons law prior to 1977, and
    * (C) states which did not have a concealed weapons law in 1977 but did by 1992.

    /* Loops */
    * type "help foreach" or "help forvalues" for help on loops
    * this loop creates the log of each crime rate (lvio, ..., laut)
    foreach v of varlist vio-aut {
        gen l`v'=ln(`v')
    }

    /* Label variables */
    * most of the variables are already labeled but if you want to label them so they are easier to use type:
    label variable state "State"

    /* Create dummies */
    * You can create dummies by either writing a logical test that returns 1 when true, or 0 when false:
    gen prior =(yradopt==77) if year<93

    * Or if necessary, generate a variable equal to 1 when you want it to be, then replace:
    gen never = 1 if (yradopt==0 | yradopt>92) & year<93
    replace never = 0 if never==. & year<93

    gen law_change = (yradopt!=0 & yradopt!=77 & yradopt<93) if year<93

    /* 3a) calculate the average violent crime rates for each group in 1977 and 1992. */
    * You can type out each command individually,
    summ lvio if prior==1
    summ lvio if never==1
    summ lvio if law_change==1

    * Or you can use a loop:
    foreach v of varlist prior never law_change {
        summ lvio if `v'==1
    }

    /* 3b) calculate the difference between violent crime rates per 100,000 for each group
    in the years 1977 and 1992. */
    * You can do each command individually, then calculate the differences off the screen:
    summ lvio if prior==1 & year==77
    summ lvio if prior==1 & year==92
    *...
    * Or you can add another loop to the loop above, and calculate the differences off the screen:
    foreach v of varlist prior never law_change {
        foreach y of numlist 77 92 {
            summ lvio if `v'==1 & year==`y'
        }
    }

    * Even better is to automate the entire process, so you do not have to redo the arithmetic
    * every time you change something or redesign, etc. We use local macros to do this.
    * Type "return list" after summarize to see the results it stores, such as r(mean).
    foreach v of varlist prior never law_change {
        foreach y in "77" "92" {
            summ lvio if `v'==1 & year==`y'
            local lvio`v'`y' = r(mean)
        }
        local lviodiff`v'=`lvio`v'92'-`lvio`v'77'
        display "The change from 1977 to 1992 for states with gun law `v' was `lviodiff`v''"
    }
    * Notice that you can embed locals within other locals, as we do with `lvio`v'92'

    /* 3c) calculate the simple difference in difference estimate between states that change
    their laws and states that had laws prior to 1977. */
    * If you did everything above by hand, you just subtract the differences from each other now by hand.
    * Using our local macros above:
    local d2law_changeprior=`lviodifflaw_change'-`lviodiffprior'
    display "The difference in difference from 1977-1992 for states that changed versus those with existing laws was `d2law_changeprior'"

    /* 3d) calculate the difference in difference estimate between states that change their
    laws and states that did not have a concealed weapons law by 1992. */
    * We follow the same procedure from 3c:
    local d2law_changenever=`lviodifflaw_change'-`lviodiffnever'
    display "The difference in difference from 1977-1992 for states that changed versus those who never had a gun law was `d2law_changenever'"

    /* For 3c and 3d, you can also use a regression framework for the differences-in-differences.
    This has the advantage of giving you standard errors for your diff-in-diffs.
    */
    * First, we create a group dummy (1 = states that changed their law, 0 = comparison group):
    gen diff3c=1 if law_change==1
    replace diff3c=0 if prior==1
    /* Create Dummy Variables */
    gen interact3c=diff3c*(year==92)
    xi: reg lvio interact3c diff3c i.year if year==77 | year==92

    gen diff3d=1 if law_change==1
    replace diff3d=0 if never==1
    gen interact3d=diff3d*(year==92)
    xi: reg lvio interact3d diff3d i.year if year==77 | year==92

    * If you want to output regressions into Word, Excel, or LaTeX (pdf) you should use the command estout.
    * There is a very good tutorial on estout at http://fmwww.bc.edu/repec/bocode/e/estout/
    * To install estout, type "ssc inst estout"

} // end of question 3 switch

* switch for question 4
if `question4' == 1 {

    /* Question Four:
    Replicate the Lott & Mustard Table 4 results from 1977 - 1992 at the state level. */
    * For the year and state dummies, you can either use "tab year, gen(iyear)" or the "xi: reg" command.
    * The benefit of tab, gen is that you can manually choose which year you want to omit.
    * xi picks the omitted category for you (by default, the smallest value).
    tab year, gen(iyear)
    tab state, gen(istate)

    reg lvio shall aovio_n densitym rpcpi rpcui rpcim stpop pb* pw* pn* iyear* istate* [aweight=stpop] if year<93
    * Stars (*) are wildcards in Stata, which means that if you have lots of variables that start with istate then instead
    * of typing istate1 istate2 ... you can type istate* and it will use all variables that start with istate.
    * You can also use stars at the beginning or in the middle of variable names.

} // end of question 4 switch

/* Question Five:
Replicate the Lott & Mustard results from 1977 - 1992 at the state level except add in a dummy
variable which is equal to one in the year before the law change happens and zero everywhere else.
What is the coefficient on this new dummy? How does it compare to the coefficient on the
law change variable? Is it statistically different from the shall-issue coefficient?
What does this mean?*/

gen nextyear=(year-yradopt==-1)
gen yearof=(year==yradopt & year!=77)
* Another way to find the year before is to use underscore variables.
* Type "help _n" to find out how to use them

reg lvio shall nextyear aovio_n densitym rpcpi rpcui rpcim stpop pb* pw* pn* iyear* istate* [aweight=stpop] if year<93
testparm shall nextyear, equal

/* Question Six:
Now run the same regression as 4), except exclude the state of Georgia from the analysis.*/
reg lvio shall aovio_n densitym rpcpi rpcui rpcim stpop pb* pw* pn* iyear* istate* [aweight=stpop] if state!="Georgia" & year<93

/* Question Seven:
Now run the same regression as 4), but cluster the standard errors by state. */
reg lvio shall aovio_n densitym rpcpi rpcui rpcim stpop pb* pw* pn* iyear* istate* [aweight=stpop] if year<93, cluster(state)

/* Question Eight:
How many states were there in the original Lott & Mustard dataset in which */
* a) the law changed
* (unique is a user-written command; install it with "ssc install unique")
unique state if law_change==1 & year<93
* or, if you don't like the unique command, because, for example, you hate innovation, you can do this:
sort state year
count if law_change==1 & year<93 & state!=state[_n-1]

* b) the law was in place prior to 1977
unique state if prior==1

* c) when we expand the dataset to years after 1992, how many states do we get for part a)?
gen law_change_ever=(yradopt!=0 & yradopt!=77)
gen never_ever=(yradopt==0)
* This is redundant except that we created missing values for years after 92 above.
gen prior_ever=(yradopt==77)
unique state if law_change_ever==1

/* Question Nine:
Now use all of the years of data available (1977 - 1999) and:
a) replicate the two difference in difference estimators from 3) */
* We stick with the regression framework:
gen diff3c6=1 if law_change_ever==1
replace diff3c6=0 if prior_ever==1
gen interact3c6=diff3c6*(year==99)
xi: reg lvio interact3c6 diff3c6 i.year if year==77 | year==99

gen diff3d6=1 if law_change_ever==1
replace diff3d6=0 if never_ever==1
gen interact3d6=diff3d6*(year==99)
xi: reg lvio interact3d6 diff3d6 i.year if year==77 | year==99

* b) the Lott & Mustard specification from 4)
reg lvio shall aovio_n densitym rpcpi rpcui rpcim stpop pb* pw* pn* iyear* istate* [aweight=stpop]

* replicate the Duggan specification from 5)
reg lvio shall nextyear aovio_n densitym rpcpi rpcui rpcim stpop pb* pw* pn* iyear* istate* [aweight=stpop]
testparm shall nextyear, equal

* Exclude Georgia:
reg lvio shall aovio_n densitym rpcpi rpcui rpcim stpop pb* pw* pn* iyear* istate* [aweight=stpop] if state!="Georgia"

* Cluster standard errors on state:
reg lvio shall aovio_n densitym rpcpi rpcui rpcim stpop pb* pw* pn* iyear* istate* [aweight=stpop], cluster(state)

/* c) Now run a single regression including the full period (1977-1999) of log of violent crime rate
on the shall-issue variable with all of the covariates from 4), along with the dummy created in 5),
in which you omit Georgia and cluster standard errors by state. */
reg lvio shall nextyear aovio_n densitym rpcpi rpcui rpcim stpop pb* pw* pn* iyear* istate* [aweight=stpop] if state!="Georgia", cluster(state)
testparm shall nextyear, equal

log close

* Final comment: if you have a coding error and you do not know why, type "set trace on" and run the code again.
* Trace provides a more detailed view of what is happening with your code and makes it easier to solve errors.
* When you have solved the problem make sure to "set trace off" or it will significantly slow down your code.
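
* A minimal sketch of that trace workflow, assuming the command you are debugging is the
* short regression below (a stand-in chosen only for illustration; substitute your own line).
* "capture noisily" still shows the output and any error message, but lets the do file
* continue, so trace is always switched off again even when the command fails.
set trace on
capture noisily reg lvio shall if year<93
* save the return code right away (0 means the command ran without error)
local rc = _rc
set trace off
if `rc' != 0 {
    display as error "the command failed with return code `rc'"
}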