BookmarkSubscribeRSS Feed
seohyeonjeong
Obsidian | Level 7

Hi everyone,

 

I can't find any materials like this, so I need your help badly.

I want to find the start date and end date by ID if there are missing periods.

 

A sample of data is below.

ID_loan Period ...
A0001 201402 ...
A0001 201403 ...
A0001 201408 ...
A0002 201408 ...
A0002 201409 ...
A0002 201410 ...
A0003 201406 ...
A0003 201408 ...
A0003 201409 ...
A0004 201405 ...
... ... ...

 

And I want to get the result like this.

ID_loan Start_Date End_Date
A0001 201404 201407
A0003 201407 201407
... ... ...

 

Is it possible to make this?

Hopefully, I can get any help. 

 

I really appreciate any help you can provide.

 

 

4 REPLIES 4
mkeintz
PROC Star

Your sample data is not in the form of a working data step, so this program code is untested.

 

Assuming the PERIOD variable is a sas date variable aligned to the beginning of each month, then:

 

data want (keep=ID_LOAN START_DATE END_DATE);
  set have;
  by id_loan;
  start_date=intnx('month',lag(period),1);
  if first.id=0 and start_date^=period then do;
    end_date=intnx('month',period,-1);
    output;
  end;
  format start_date end_date yymmn6. ;
run;

If period is really just a 6-digit number, then you could insert

  _period_date=input(put(period,6.),yymmn6.);

just after the BY statement.  And then use _period_date instead of period in the subsequent statements.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Kurt_Bremser
Super User

Use the LAG function to retrieve the value from the preceding observation, and once you are at the second or more observation of a group, compare with the current observation and write a record if needed. Store periods as SAS dates, so you can make use of interval functions.

data have;
infile datalines dlm="09"x;
input ID_loan $ Period :yymmn6.;
format period yymmn6.;
datalines;
A0001	201402
A0001	201403
A0001	201408
A0002	201408
A0002	201409
A0002	201410
A0003	201406
A0003	201408
A0003	201409
A0004	201405
;

data want;
set have;
by id_loan;
format start_date end_date yymmn6.;
l_period = lag(period);
if not first.id_loan and intck('month',l_period,period) > 1
then do;
  start_date = intnx('month',l_period,1);
  end_date = intnx('month',period,-1);
  output;
end;
keep id_loan start_date end_date;
run;
sbxkoenk
SAS Super FREQ

Hello,

I would use PROC TIMEDATA !

data work.have0;
input ID_loan $ Period $;
cards;
A0001	201402
A0001	201403
A0001	201408
A0002	201408
A0002	201409
A0002	201410
A0003	201406
A0003	201408
A0003	201409
A0004	201405
;
run;

data work.have1;
 set work.have0;
 Period_num = input(put(Period,$6.)!!'01',yymmdd8.);
 format Period_num date9.;
 item1 = 1234 ;
run;

proc timedata data=have1 out=_NULL_ outarray=work.want;
   by ID_loan;
   id Period_num interval=month
                 accumulate=median
                 setmiss=missing;
   var item1;
   outarray itemshare;
   do i=1 to _length_;
      if item1[i]=. then itemshare[i] = 1;
	  else itemshare[i] = 0;
   end;
run;

PROC MEANS data=work.want MIN MAX nway;
 CLASS ID_loan;
 var Period_num;
 format Period_num date9.;
 output out=work.want_encore min= max= / autoname autolabel;
run;

proc print data=work.want_encore;
run;
/* end of program */

Koen

Ksharp
Super User
data have;
infile datalines expandtabs;
input ID_loan $ Period :yymmn6.;
format period yymmn6.;
datalines;
A0001	201402
A0001	201403
A0001	201408
A0002	201408
A0002	201409
A0002	201410
A0003	201406
A0003	201408
A0003	201409
A0004	201405
;

proc summary data=have nway;
class ID_loan;
var Period;
output out=temp min=min max=max;
run;
data temp1;
 set temp(drop=_:);
 do Period= min  to max;
  if month ne month(Period) then output;
  month=month(Period);
 end;
format Period yymmn6.;
drop month min max;
run;
data temp2;
 merge have(in=ina) temp1;
 by ID_loan Period;
 if not ina;
run;
data temp3;
 set temp2;
 by ID_loan;
 if first.ID_loan or intck('month',lag(Period),Period)>1 then group+1;
 run;
 proc summary data=temp3 nway;
 class ID_loan group;
 var Period;
 output out=want(drop=_:) min=start_date max=end_date;
 run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 613 views
  • 0 likes
  • 5 in conversation