Hello, I'm struggling to create the appropriate groups for my data. The question is around data submission. A hospital can submit data non-consecutively and I need a way to understand the count of consecutive months per hospital. The last data step in the code below will sum the total number of months but I need to find a way to report on each segment of months reported (e.g., 2017-01 to 2017-05 (one sequence), 2017-07 to 201708 (second sequence), etc). Thank you! data flatfile2;
set flatfile;
beg_month=input(begin_date,anydtdte19.);
format beg_month yymms10.;
run;
proc sort data=flatfile2;
by hospital beg_month;
run;
data testing;
set flatfile2;
groupnum+(First.hospital or (dif(beg_month) GT 1));
run;
data flatfile3;
set testing;
by hospital;
begindate=first.hospital;
enddate=last.hospital;
lagdate=lag(groupnum);
if begindate=1 then do;
lagdate=groupnum-1;
end;
run;
/*this is the piece that is flagging the consecutive months*/
/*so, a hospital with records where diff= 1 0 0 0 0 0 (on separate rows) would be indicative of 6 months of consecutive data*/
/*I'm stuck here because I need to find a way to set this as one sequence.*/
data flatfile4 (drop=begindate enddate);
set flatfile3;
diff=groupnum-lagdate;
if diff <=1 then diff=0;
else if diff>1 then diff=1;
if begindate=1 and diff=0 then diff=1;
run;
proc sort data=flatfile4;
by hospital diff;
run;
data flatfile5;
set flatfile4;
by hospital diff;
if first.hospital and first.diff=1 then totalmo=0;
totalmo+1;
if last.hospital then output;
run;
proc print data=flatfile5;
run;
... View more