Hello everyone,
I have the following dataset. Here, year indicates financial year, id indicates a specific firm, and treatment=1 indicates the financial year the company got merged with another company. I want to remove the previous 4 years of each merger year. But If another merger year fall in the previous 4 years, I want to keep the merger year. In the following example, company 1 has 1 merger in 1998 as indicated by treatment=1, so I want to remove the data of id 1 from the year 1994 to 1997. Now the same company 1 had another merger in the year 2000, so I want to remove the year 1999, 1997, and 1996. I am keeping 1998 because it is a merger year. I have included my expected output file just below my dataset.
This is just a sample example. My original file has 30000 US companies and 45 financial years for each company. I will really appreciate your help.
My dataset:
year id treatment
1994 1 0
1995 1 0
1996 1 0
1997 1 0
1998 1 1
1999 1 0
2000 1 1
2001 1 0
2002 1 0
1996 2 0
1997 2 0
1998 2 0
1999 2 0
2000 2 0
2001 2 0
2002 2 0
Expected result:
year id treatment
1998 1 1
2000 1 1
2001 1 0
2002 1 0
1996 2 0
1997 2 0
1998 2 0
1999 2 0
2000 2 0
2001 2 0
2002 2 0
I think that the following does what you want to achieve:
data have; input year id treatment; cards; 1994 1 0 1995 1 0 1996 1 0 1997 1 0 1998 1 1 1999 1 0 2000 1 1 2001 1 0 2002 1 0 1996 2 0 1997 2 0 1998 2 0 1999 2 0 2000 2 0 2001 2 0 2002 2 0 ; proc sort data=have; by id year; run; data want; do until (last.id); set have; by id; array drops(45) _temporary_; array nodrops(45) _temporary_; if first.id then do; counter=0; nocounter=0; call missing(of drops(*)); call missing(of nodrops(*)); end; if treatment eq 1 then do; nocounter+1; nodrops(nocounter)=year; do i=1 to 4; counter+1; drops(counter)=year-i; end; end; end; do until (last.id); set have; by id; if year in nodrops or year not in drops then output; end; run;
Art, CEO, AnalystFinder.com
I think that the following does what you want to achieve:
data have; input year id treatment; cards; 1994 1 0 1995 1 0 1996 1 0 1997 1 0 1998 1 1 1999 1 0 2000 1 1 2001 1 0 2002 1 0 1996 2 0 1997 2 0 1998 2 0 1999 2 0 2000 2 0 2001 2 0 2002 2 0 ; proc sort data=have; by id year; run; data want; do until (last.id); set have; by id; array drops(45) _temporary_; array nodrops(45) _temporary_; if first.id then do; counter=0; nocounter=0; call missing(of drops(*)); call missing(of nodrops(*)); end; if treatment eq 1 then do; nocounter+1; nodrops(nocounter)=year; do i=1 to 4; counter+1; drops(counter)=year-i; end; end; end; do until (last.id); set have; by id; if year in nodrops or year not in drops then output; end; run;
Art, CEO, AnalystFinder.com
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.