how to keep observations with value at t=0, no value at t+1 thru +5, and no value at t-1 thru -3

Reply
Occasional Contributor
Posts: 7

how to keep observations with value at t=0, no value at t+1 thru +5, and no value at t-1 thru -3

Dear SAS community,

 

I have an unbalanced panel data with firm ID, year, and proceeds. I want to keep observations that meet the following criteria. Everything else should be deleted.

1/ proceeds>0

2/ proceeds have no values (just a blank cell) at each of the three years preceding the year where it is positive

3/ proceeds have no values (just a blank cell) at each of the five years following the year where it is positive

In the sample below, only ID = 1055 from year 1988 through 1996 are kept. All other obs are deleted. In year 1991, ID 1055 has proceeds equal 49.5, and in the three years before and five years around 1991 for that ID, proceeds are equal to a dot. Therefore, it meets all the criteria.

I'd greatly appreciate your help.

Thanks,

Hinh

ID     year PROCEEDS
1054 1981 
1054 1982 23.6
1054 1983 
1054 1984 
1054 1985 34.4
1054 1986 
1054 1987 
1054 1988 
1054 1989 
1054 1990 
1054 1991 
1054 1992 
1054 1993 
1055 1984 
1055 1985 23.8
1055 1986 
1055 1987 
1055 1988 
1055 1989 
1055 1990 
1055 1991 49.5
1055 1992 
1055 1993 
1055 1994 
1055 1995 
1055 1996 
1056 1979 
1056 1980 
1056 1981 
1056 1982 
1056 1983 
1056 1984 
1056 1985 
1056 1986 
1056 1987 
1056 1988 
1056 1989 
1056 1990 
1056 1991 
1056 1992 
1056 1993 
1056 1994 
1056 1995 
1056 1996 
1056 1997 
1056 1998 32.5
1056 1999 
1056 2000 94.3
1056 2001 
1056 2002 
1056 2003 
1056 2004 96.3
1056 2005 
1056 2006 

Grand Advisor
Posts: 9,571

Re: how to keep observations with value at t=0, no value at t+1 thru +5, and no value at t-1 thru -3

Is there a gap between two obs ? Does obs always increase one  one by one ?

 

data have;
infile cards truncover;
input ID     year PROCEEDS;
cards;
1054 1981 
1054 1982 23.6
1054 1983 
1054 1984 
1054 1985 34.4
1054 1986 
1054 1987 
1054 1988 
1054 1989 
1054 1990 
1054 1991 
1054 1992 
1054 1993 
1055 1984 
1055 1985 23.8
1055 1986 
1055 1987 
1055 1988 
1055 1989 
1055 1990 
1055 1991 49.5
1055 1992 
1055 1993 
1055 1994 
1055 1995 
1055 1996 
1056 1979 
1056 1980 
1056 1981 
1056 1982 
1056 1983 
1056 1984 
1056 1985 
1056 1986 
1056 1987 
1056 1988 
1056 1989 
1056 1990 
1056 1991 
1056 1992 
1056 1993 
1056 1994 
1056 1995 
1056 1996 
1056 1997 
1056 1998 32.5
1056 1999 
1056 2000 94.3
1056 2001 
1056 2002 
1056 2003 
1056 2004 96.3
1056 2005 
1056 2006 
;
run;
proc sql;
 create table temp as
  select id,year
   from have as a
    where PROCEEDS gt 0 and 
     (select sum(PROCEEDS) from have where id=a.id and year between a.year-3 and a.year-1)=. and
     (select count(*) from have where id=a.id and year between a.year-3 and a.year-1)=3 and
     (select sum(PROCEEDS) from have where id=a.id and year between a.year+1 and a.year+5)=. and
     (select count(*) from have where id=a.id and year between a.year+1 and a.year+5)=5 ;
quit;
data key;
 set temp;
 _year=year;
 do year=_year-3 to _year+5;
  output;
 end;
 drop _year;
run;
data want;
 if _n_ eq 1 then do;
  if 0 then set key;
  declare hash h(dataset:'key');
  h.definekey('id','year');
  h.definedone();
 end;
 set have;
 if h.check()=0;
run;
Respected Advisor
Posts: 4,606

Re: how to keep observations with value at t=0, no value at t+1 thru +5, and no value at t-1 thru -3

[ Edited ]

Do it with a single SQL query :

 

proc sql;

create table want as
select a.*
from 
    have as a inner join
    have as b on a.id=b.id
where a.proceeds > 0 and b.proceeds is missing
group by a.id, a.year, a.proceeds
having sum(b.year between a.year-3 and a.year+5) = 3+5;

select * from want;

quit;

it will include strictly years for which there is data for the three previous years and the five following years. 

PG
Occasional Contributor
Posts: 7

Re: how to keep observations with value at t=0, no value at t+1 thru +5, and no value at t-1 thru -3

PG,

Thanks so much for your prompt help. I will try and let you know.

Regards,
Khieu
Occasional Contributor
Posts: 7

Re: how to keep observations with value at t=0, no value at t+1 thru +5, and no value at t-1 thru -3

Dear PG,

Thanks again for your help with the two sets of codes. Unfortunately, however, the output was not what I wanted. The code gives me only ID with positive proceeds. What I need was 3 years before and 5 years after the positive proceeds too. That is, for every id that has positive proceeds, I need 9 observations (3 before, 1 at time when proceeds is positive, and 5 after).

I'd appreciate further help.

Regards,
Khieu
Respected Advisor
Posts: 4,606

Re: how to keep observations with value at t=0, no value at t+1 thru +5, and no value at t-1 thru -3

If you want to include years with incomplete information, i.e. assume that there are no proceeds for years not mentioned in the list :

 

proc sql;

create table want as
select a.*
from 
    have as a left join
    have as b on a.id=b.id
where a.proceeds > 0 and b.proceeds is not missing
group by a.id, a.year, a.proceeds
having sum(b.year between a.year-3 and a.year+5) = 1;

select * from want;
quit;
PG
Occasional Contributor
Posts: 7

Re: how to keep observations with value at t=0, no value at t+1 thru +5, and no value at t-1 thru -3

Dear KSharp,

Thanks so much for your prompt help. I will try and let you know.

Regards,
Khieu
Occasional Contributor
Posts: 7

Re: how to keep observations with value at t=0, no value at t+1 thru +5, and no value at t-1 thru -3

Dear Xia Keshan,

Thanks for your help again. Unfortunately, the code is not correct. It appears to get only observations that have positive proceeds. In my original question, the final answer should be only ID 1055 starting from year 1988 and ending in year 1966 for a total of 9 rows. The rest should be deleted. The code does not generate that subset.

I'd appreciate any further help.

Regards,
Hinh
Grand Advisor
Posts: 9,571

Re: how to keep observations with value at t=0, no value at t+1 thru +5, and no value at t-1 thru -3

[ Edited ]

"the final answer should be only ID 1055 starting from year 1988 and ending in year 1966 for a total of 9 rows."

Did you run my code ? That is what I am trying to do.

While PG's code only gave you "only observations that have positive proceeds".

 

One more thought, If you care about efficiency then I propose to use array. See the following code.

 

 

data have;
infile cards truncover;
input ID     year PROCEEDS;
cards;
1054 1981 
1054 1982 23.6
1054 1983 
1054 1984 
1054 1985 34.4
1054 1986 
1054 1987 
1054 1988 
1054 1989 
1054 1990 
1054 1991 
1054 1992 
1054 1993 
1055 1984 
1055 1985 23.8
1055 1986 
1055 1987 
1055 1988 
1055 1989 
1055 1990 
1055 1991 49.5
1055 1992 
1055 1993 
1055 1994 
1055 1995 
1055 1996 
1056 1979 
1056 1980 
1056 1981 
1056 1982 
1056 1983 
1056 1984 
1056 1985 
1056 1986 
1056 1987 
1056 1988 
1056 1989 
1056 1990 
1056 1991 
1056 1992 
1056 1993 
1056 1994 
1056 1995 
1056 1996 
1056 1997 
1056 1998 32.5
1056 1999 
1056 2000 94.3
1056 2001 
1056 2002 
1056 2003 
1056 2004 96.3
1056 2005 
1056 2006 
;
run;
data temp;
 set have;
 by id;
 array x{999999} _temporary_;
 array y{999999} _temporary_;
 if first.id then do;call missing(of x{*} y{*});n=0;end;
 n+1;
 x{n}=PROCEEDS;
 y{n}=year;
 if last.id then do;
  do i=4 to n-5;
   if x{i} gt 0 and x{i-1}=. and x{i-2}=. and x{i-3}=. and
    x{i+1}=. and x{i+2}=. and x{i+3}=. and x{i+4}=. and x{i+5}=. then do;
    year=y{i};output;
   end;
  end;
 end;
run;

data key;
 set temp;
 _year=year;
 do year=_year-3 to _year+5;
  output;
 end;
 drop _year;
run;
data want;
 if _n_ eq 1 then do;
  if 0 then set key;
  declare hash h(dataset:'key');
  h.definekey('id','year');
  h.definedone();
 end;
 set have;
 if h.check()=0;
run;

 

 

 

 

 

 

Occasional Contributor
Posts: 9

Re: how to keep observations with value at t=0, no value at t+1 thru +5, and no value at t-1 thru -3

Dear Xia,

Thanks for your code. I have not had time to try it yet. I will definitely let you know whether it works.

Regards,
Hinh
Occasional Contributor
Posts: 9

Re: how to keep observations with value at t=0, no value at t+1 thru +5, and no value at t-1 thru -3

Dear Xia,
I have tried your code and it does not work. I have no idea what dataset it outputs. It does not give me 9 observations as I described. I'd greatly appreciate any further help.
Regards,
Hinh
Occasional Contributor
Posts: 9

Re: how to keep observations with value at t=0, no value at t+1 thru +5, and no value at t-1 thru -3

Dear Xia,
Your code seems to ouput only the last observations of the id.
According to the small sample I posted, only the following observations should be output for id, year, and proceeds.
1055 1988
1055 1989
1055 1990
1055 1991 49.5
1055 1992
1055 1993
1055 1994
1055 1995
1055 1996
Grand Advisor
Posts: 9,571

Re: how to keep observations with value at t=0, no value at t+1 thru +5, and no value at t-1 thru -3

That is very kind of weird. If you open WANT dataset after running my code, you will see the exact thing you want.

Occasional Contributor
Posts: 9

Re: how to keep observations with value at t=0, no value at t+1 thru +5, and no value at t-1 thru -3

Dear Xia,

Yes, your code works now. Sorry it was my mistake. Thank you so much for your help.

Best,

Hinh

Ask a Question
Discussion stats
  • 13 replies
  • 661 views
  • 1 like
  • 4 in conversation