BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
nazmul
Quartz | Level 8

Hello everyone,

 

I have the following dataset. Here, year indicates financial year, id indicates a specific firm,  and treatment=1 indicates the financial year the company got merged with another company. I want to remove the previous 4 years of each merger year. But If another merger year fall in the previous 4 years, I want to keep the merger year. In the following example, company 1 has 1 merger in 1998 as indicated by treatment=1, so I want to remove the data of id 1 from the year 1994 to 1997. Now the same company 1 had another merger in the year 2000, so I want to remove the year 1999, 1997, and 1996. I am keeping 1998 because it is a merger year. I have included my expected output file just below my dataset.

 

This is just a sample example. My original file has 30000 US companies and 45 financial years for each company. I will really appreciate your help.

 

My dataset:

 

year id treatment
1994 1 0
1995 1 0
1996 1 0
1997 1 0
1998 1 1
1999 1 0
2000 1 1
2001 1 0
2002 1 0

1996 2 0
1997 2 0
1998 2 0
1999 2 0
2000 2 0
2001 2 0
2002 2 0

 

Expected result:

year id treatment
1998 1 1
2000 1 1
2001 1 0
2002 1 0

1996 2 0
1997 2 0
1998 2 0
1999 2 0
2000 2 0
2001 2 0
2002 2 0

 

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

I think that the following does what you want to achieve:

data have;
  input year id treatment;
  cards;
1994 1 0
1995 1 0
1996 1 0
1997 1 0
1998 1 1
1999 1 0
2000 1 1
2001 1 0
2002 1 0
1996 2 0
1997 2 0
1998 2 0
1999 2 0
2000 2 0
2001 2 0
2002 2 0
;
proc sort data=have;
  by id year;
run;

data want;
  do until (last.id);
    set have;
    by id;
    array drops(45) _temporary_;
    array nodrops(45) _temporary_;
    if first.id then do;
      counter=0;
      nocounter=0;
      call missing(of drops(*));
      call missing(of nodrops(*));
    end;
    if treatment eq 1 then do;
      nocounter+1;
      nodrops(nocounter)=year;
      do i=1 to 4;
        counter+1;
        drops(counter)=year-i;
      end;
    end;
  end;
  do until (last.id);
    set have;
    by id;
    if year in nodrops or year not in drops then output;
  end;
run;

Art, CEO, AnalystFinder.com

View solution in original post

2 REPLIES 2
art297
Opal | Level 21

I think that the following does what you want to achieve:

data have;
  input year id treatment;
  cards;
1994 1 0
1995 1 0
1996 1 0
1997 1 0
1998 1 1
1999 1 0
2000 1 1
2001 1 0
2002 1 0
1996 2 0
1997 2 0
1998 2 0
1999 2 0
2000 2 0
2001 2 0
2002 2 0
;
proc sort data=have;
  by id year;
run;

data want;
  do until (last.id);
    set have;
    by id;
    array drops(45) _temporary_;
    array nodrops(45) _temporary_;
    if first.id then do;
      counter=0;
      nocounter=0;
      call missing(of drops(*));
      call missing(of nodrops(*));
    end;
    if treatment eq 1 then do;
      nocounter+1;
      nodrops(nocounter)=year;
      do i=1 to 4;
        counter+1;
        drops(counter)=year-i;
      end;
    end;
  end;
  do until (last.id);
    set have;
    by id;
    if year in nodrops or year not in drops then output;
  end;
run;

Art, CEO, AnalystFinder.com

nazmul
Quartz | Level 8
Thank you so much sir!!
It has perfectly solved my problem.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 773 views
  • 0 likes
  • 2 in conversation