07-05-2011 04:13 PM
The above link addresses the question, "How can I create an enumeration variable by groups?" The last paragraph states that it is not difficult to create an enumeration variable by groups with multiple layers, but I am having problems doing so.
My data is organized by an ID variable, a time variable (year), and then an intra-time variable (quarter). But the data contains gaps. For example, it may be like the following:
I want to count how many consecutive quarterly observations I have. I tried following the sample code in the link above:
count + 1;
by id year quarter;
if first.id or first.year or first.quarter then count = 1;
But that did not do it for me. I think a problem I have is that there could be a consecutive string from the fourth quarter of year t - 1 to the first quarter of year t. My output was wrong regardless.
07-05-2011 06:26 PM
Art, that's what I have in "mind." I want to count how many consecutive quarterly observations I have per ID. My data is organized by ID, year, and quarter, but there may be gaps.
07-05-2011 07:16 PM
I don't have access to SAS at the moment, thus can only write pseudocode, that is totally untested and probably wrong code. That said, my general approach would be to create a pseudo date, and then use the intck function with a lag to see whether to increment the desired counter. E.g.
if first.id then count = 1;
if intck('qtr',lastdate,pseudodate) eq 1 then count+1;
Hopefully, that will give you enough direction to actually solve your problem.
07-08-2011 03:01 AM
data temp; infile datalines expandtabs ; input ID Year Quarter ; datalines; 1 2000 1 1 2000 2 1 2000 3 1 2000 4 1 2001 1 1 2004 2 2 1999 4 2 2000 1 ; run; proc sql noprint; create table all as select * from (select distinct id from temp), (select distinct year from temp), (select distinct quarter from temp) ; quit; proc sort data=temp ; by id year quarter; run; proc sort data=all; by id year quarter; run; data op; merge all temp(in=in_temp); by id year quarter; if in_temp then flag=1; run; data want(where=(flag is not missing)); set op; if missing(flag) or id ne lag(id) then count=0; if not missing(flag) then count+1; run;
Message was edited by: xia keshan
07-08-2011 06:49 AM
You can use Lag function to get the last period in the curret row, and with by statments to get the the new column count for your needs.
input ID Year Quarter ;
format date ddmmyy10.;
1 2000 1
1 2000 2
1 2000 3
1 2000 4
1 2001 1
1 2004 2
2 1999 4
2 2000 1
proc sort data=temp;
by id date;
set temp end=eof;
by id date;
if first.id then do;
if mdy(month(intnx('quarter',date_prev,1,'end')),01,year(intnx('quarter',date_prev,1,'end')))=date then count+1;
format date_prev ddmmyy10.;