Hello everyone,
I have the following dataset. Here, year indicates financial year, id indicates a specific firm, and treatment=1 indicates the financial year the company got merged with another company. I want to remove the previous 4 years of each merger year. But If another merger year fall in the previous 4 years, I want to keep the merger year. In the following example, company 1 has 1 merger in 1998 as indicated by treatment=1, so I want to remove the data of id 1 from the year 1994 to 1997. Now the same company 1 had another merger in the year 2000, so I want to remove the year 1999, 1997, and 1996. I am keeping 1998 because it is a merger year. I have included my expected output file just below my dataset.
This is just a sample example. My original file has 30000 US companies and 45 financial years for each company. I will really appreciate your help.
My dataset:
year id treatment
1994 1 0
1995 1 0
1996 1 0
1997 1 0
1998 1 1
1999 1 0
2000 1 1
2001 1 0
2002 1 0
1996 2 0
1997 2 0
1998 2 0
1999 2 0
2000 2 0
2001 2 0
2002 2 0
Expected result:
year id treatment
1998 1 1
2000 1 1
2001 1 0
2002 1 0
1996 2 0
1997 2 0
1998 2 0
1999 2 0
2000 2 0
2001 2 0
2002 2 0
I think that the following does what you want to achieve:
data have; input year id treatment; cards; 1994 1 0 1995 1 0 1996 1 0 1997 1 0 1998 1 1 1999 1 0 2000 1 1 2001 1 0 2002 1 0 1996 2 0 1997 2 0 1998 2 0 1999 2 0 2000 2 0 2001 2 0 2002 2 0 ; proc sort data=have; by id year; run; data want; do until (last.id); set have; by id; array drops(45) _temporary_; array nodrops(45) _temporary_; if first.id then do; counter=0; nocounter=0; call missing(of drops(*)); call missing(of nodrops(*)); end; if treatment eq 1 then do; nocounter+1; nodrops(nocounter)=year; do i=1 to 4; counter+1; drops(counter)=year-i; end; end; end; do until (last.id); set have; by id; if year in nodrops or year not in drops then output; end; run;
Art, CEO, AnalystFinder.com
I think that the following does what you want to achieve:
data have; input year id treatment; cards; 1994 1 0 1995 1 0 1996 1 0 1997 1 0 1998 1 1 1999 1 0 2000 1 1 2001 1 0 2002 1 0 1996 2 0 1997 2 0 1998 2 0 1999 2 0 2000 2 0 2001 2 0 2002 2 0 ; proc sort data=have; by id year; run; data want; do until (last.id); set have; by id; array drops(45) _temporary_; array nodrops(45) _temporary_; if first.id then do; counter=0; nocounter=0; call missing(of drops(*)); call missing(of nodrops(*)); end; if treatment eq 1 then do; nocounter+1; nodrops(nocounter)=year; do i=1 to 4; counter+1; drops(counter)=year-i; end; end; end; do until (last.id); set have; by id; if year in nodrops or year not in drops then output; end; run;
Art, CEO, AnalystFinder.com
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.