Hello everyone, I have an excerpt of my data as the following: DATA have ;
input id year action shock;
DATALINES;
1055 1981 0 .
1055 1982 0 .
1055 1983 0 .
1055 1984 0 .
1055 1985 1 1
1055 1986 1 .
1055 1987 1 .
1055 1988 1 .
1055 1989 0 .
1055 1990 0 .
1055 1991 0 .
1055 1992 0 .
1055 1993 0 .
1085 1981 0 .
1085 1982 1 1
1085 1983 0 .
1085 1984 0 .
1085 1985 0 .
1085 1986 0 .
1085 1987 0 .
1085 1988 0 .
1085 1989 0 .
1085 1990 0 .
1085 1991 1 1
1085 1992 1 .
1085 1993 1 .
1212 1981 0 .
1212 1982 0 .
1212 1983 1 1
1212 1984 1 .
1212 1985 1 .
1212 1986 0 .
1212 1987 0 .
1212 1988 1 1
1212 1989 1 .
1212 1990 1 .
1212 1991 0 .
1212 1992 0 .
1212 1993 0 .
1842 1981 0 .
1842 1982 0 .
1842 1983 0 .
1842 1984 0 .
1842 1985 0 .
1842 1986 0 .
1842 1987 0 .
1842 1988 0 .
1842 1989 0 .
1842 1990 0 .
1842 1991 0 .
1842 1992 0 .
1842 1993 0 .
2913 1981 0 .
2913 1982 0 .
2913 1983 0 .
2913 1984 0 .
2913 1985 1 1
2913 1986 1 .
2913 1987 1 .
2913 1988 0 .
2913 1989 0 .
2913 1990 0 .
2913 1991 1 1
2913 1992 1 .
2913 1993 1 .
;
RUN; I want to create the following dataset: DATA want ;
input id event_year treat;
DATALINES;
1055 1985 1
1085 1991 1
1212 1983 1
1212 1988 1
2913 1985 1
2913 1991 1
1085 1985 0
1842 1985 0
1055 1991 0
1842 1991 0
1842 1983 0
1085 1988 0
1842 1988 0
;
RUN; The want dataset is created as follows: For each id, consider only shock = 1. If the corresponding value of action equals 1 for the corresponding year and the following two years and equals 0 for the previous two years, then in the want dataset, enter an observation for the id, the corresponding year (named as event_year) and set treat = 1. As an example, consider id = 1055, year = 1985 in have. Here shock = 1, and action = 0 for years 1983 and 1984 (the previous two years), and equals to 1 for years 1985 (the current year corresponding to shock = 1), 1986 and 1987 (the following two years), so in the want dataset, there is an entry for id = 1055, event_year = 1985 with treat = 1. Once all the rows with treat = 1 are created, then to create the rows with treat = 0, we do the following: For each treat = 1, we look at all the id's in the have dataset with year equal to the event_year but with action = 0 in that given year. If action also equals to 0 for the two years before and two years after event_year, then enter an observation for the id, the corresponding event_year and set treat = 0 in want. As an example, consider id = 1055, event_year = 1985, treat = 1 in want. In the have dataset, id = 1085 has action = 0 for year = 1985. Action also takes the value of 0 two years before and two years after 1985, so create the observation id = 1085, event_year = 1985 and treat = 0 in want. Similarly, in the have dataset, id = 1842 also has action = 0 for year = 1985 as well as for the two years before and two years after. So create the observation id = 1842, event_year = 1985 and treat = 0 in want. There is no need to output duplicates in want, so for example, for id = 1055, event_year = 1985 and treat = 1, we already have rows for id = 1085, event_year = 1985, treat = 0, and id = 1842, event_year = 1985, treat = 0. So when we look at id = 2913, event_year = 1985 and treat = 1, this corresponds to exactly the same two rows, so there is no need to output this again in want. Also, can the code be made flexible so that in the above rule, instead of being two years before and two years after, it can be any arbitrary window around the event_year? For example, one year before to two years after, or 3 years before to 3 years after.
... View more