Enumeration Variable by Groups?

Occasional Contributor
Posts: 5

SAS FAQ (UCLA)

The above link addresses the question, "How can I create an enumeration variable by groups?" The last paragraph states that it is not difficult to create an enumeration variable by groups with multiple layers, but I am having problems doing so.

My data is organized by an ID variable, a time variable (year), and then an intra-time variable (quarter). But the data contains gaps. For example, it may be like the following:

ID  Year  Quarter
1   2000  1
1   2000  2
1   2004  1

I want to count how many consecutive quarterly observations I have. I tried following the sample code in the link above:

data two1;
  set two;
  count + 1;
  by id year quarter;
  if first.id or first.year or first.quarter then count = 1;
run;

But that did not do it for me. I think a problem I have is that there could be a consecutive string from the fourth quarter of year t - 1 to the first quarter of year t. My output was wrong regardless.

PROC Star
Posts: 7,364

Enumeration Variable by Groups?

Can you show us what you want/expect count to look like?

Art

Occasional Contributor
Posts: 5

Re: Enumeration Variable by Groups?

ID  Year  Quarter  Count
1   2000  1        1
1   2000  2        2
1   2000  3        3
1   2000  4        4
1   2001  1        5
1   2004  2        1
2   1999  4        1
2   2000  1        2

Art, that's what I have in "mind." I want to count how many consecutive quarterly observations I have per ID. My data is organized by ID, year, and quarter, but there may be gaps.

PROC Star
Posts: 7,364

Re: Enumeration Variable by Groups?

I don't have access to SAS at the moment, so I can only offer untested code that may well contain errors. That said, my general approach would be to create a pseudo date from year and quarter, then use the intck function with a lag to decide whether to increment the counter or reset it. E.g.

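data two1;
  set two;                /* assumed to be sorted by id, year, quarter */
  by id;
  /* pseudo date: first day of the last month of the quarter */
  pseudodate = mdy(quarter*3, 1, year);
  /* call lag on every row so its queue always holds the previous row's date */
  lastdate = lag(pseudodate);
  if first.id then count = 1;
  /* exactly one quarter boundary since the previous row: extend the run */
  else if intck('qtr', lastdate, pseudodate) eq 1 then count + 1;
  /* otherwise there is a gap, so restart the count */
  else count = 1;
run;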


Hopefully, that will give you enough direction to actually solve your problem.
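
The same idea can also be sketched without dates, by turning year and quarter into a running quarter index and checking whether it moves by exactly 1 from one row to the next. This is only a minimal sketch: the helper variables qidx and step are names made up for illustration, and it assumes the data set two is sorted by id, year, and quarter.

```sas
/* Sample rows from the thread, already in id/year/quarter order */
data two;
  input id year quarter;
datalines;
1 2000 1
1 2000 2
1 2000 3
1 2000 4
1 2001 1
1 2004 2
2 1999 4
2 2000 1
;
run;

data two1;
  set two;
  by id;
  /* running quarter index (hypothetical helper): consecutive quarters differ by 1 */
  qidx = year*4 + quarter;
  /* dif() is called on every row, so it always compares with the previous row */
  step = dif(qidx);
  /* new id, or anything other than a one-quarter step: restart the count */
  if first.id or step ne 1 then count = 1;
  else count + 1;
  drop qidx step;
run;
```

For the sample rows this should give counts 1 to 5 for ID 1 up to 2001 Q1, a restart at 1 for 2004 Q2, and 1, 2 for ID 2. Because the index runs across year boundaries, a run from the fourth quarter of one year into the first quarter of the next counts as consecutive.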

Art

Super User
Posts: 9,687

Re: Enumeration Variable by Groups?

How about:

data temp;
infile datalines expandtabs ;
input ID     Year     Quarter     ;
datalines;
1     2000     1     
1     2000     2     
1     2000     3     
1     2000     4     
1     2001     1     
1     2004     2     
2     1999     4     
2     2000     1
;
run;
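/* Build every id x year x quarter combination. Only years and quarters
   that occur somewhere in temp are included, so a year or quarter that is
   absent for every id will not show up as a gap. */
proc sql noprint;
 create table all as
  select *
   from (select distinct id from temp),
        (select distinct year from temp),
        (select distinct quarter from temp)
        ;
quit;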
proc sort data=temp ;
 by id year quarter;
run;
proc sort data=all;
 by id year quarter;
run;
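/* Flag the combinations that are actually observed */
data op;
 merge all temp(in=in_temp);
 by id year quarter;
 if in_temp then flag=1;
run;
/* Restart the count at each gap or new id, then keep only observed rows */
data want(where=(flag is not missing));
 set op;
 if missing(flag) or id ne lag(id)  then count=0;
 if not missing(flag) then count+1;
run;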
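
The PROC SQL grid is built only from years and quarters that appear somewhere in temp, so a year missing for every ID would never show up as a gap. One possible way around that, sketched here and untested, is to build the grid over the full range from the earliest to the latest year. The macro variable names first_year and last_year are made up for illustration, and the sketch reuses the temp data set created at the top of this post.

```sas
/* Earliest and latest year in the data (macro variable names are illustrative) */
proc sql noprint;
  select min(year), max(year) into :first_year, :last_year
  from temp;
quit;

/* BY-group processing below needs temp in id order */
proc sort data=temp;
  by id year quarter;
run;

/* One row per id, year, and quarter across the whole year range */
data all;
  set temp(keep=id);
  by id;
  if first.id;              /* keep one row per id, then expand it */
  do year = &first_year to &last_year;
    do quarter = 1 to 4;
      output;
    end;
  end;
run;
```

This would take the place of the PROC SQL step only; the later sort, merge, and counting steps would stay as they are.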

Ksharp

SAS Employee
Posts: 105

Re: Enumeration Variable by Groups?

Hi Yanagi,

You can use the LAG function to bring the previous period onto the current row, and a BY statement to build the count column you need.

data temp;
input ID  Year Quarter  ;
date=mdy(quarter*3,01,year);
format date ddmmyy10.;
cards;
1     2000     1    
1     2000     2    
1     2000     3    
1     2000     4    
1     2001     1    
1     2004     2    
2     1999     4    
2     2000     1
;

run;

proc sort data=temp;
by id  date;
run;

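data temp2;
  set temp;
  by id date;
  /* previous row's date; lag runs on every row so its queue stays in step */
  date_prev = lag(date);
  if first.id then do;
    /* first row of an id: start a new run, no previous date within this id */
    count = 1;
    date_prev = date;
  end;
  else do;
    /* compare with the pseudo date of the quarter right after date_prev */
    if mdy(month(intnx('quarter',date_prev,1,'end')),01,year(intnx('quarter',date_prev,1,'end')))=date then count+1;
    else count = 1;
  end;
  format date_prev ddmmyy10.;
run;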
