BookmarkSubscribeRSS Feed
sustagens
Pyrite | Level 9

Hi I have a dataset "Results" that has the cost for each ID per month, along with some other variables (Class1, Class 2, OIC)

Because it is possible to have no cost for a particular month, I want to to fill in these missing rows for each ID, based on another table,  "List" which is all of the months already passed for this financial year (starts at April 2019, up to present, January 2020). For the rest of the variables, I just want to retain the values from the row above. 

 

If it is also possible to just start at the initial month in the original data, for example, ID 0002232 example should not have Apr-Aug data because data for it only started at Sep.

 

data Results;
input ID $ Class1 $ Class2 $ OIC $ Base_YM $ Cost;
datalines;
0901708 JP J Amanda 201904 56
0901708 JP J Amanda 201905 84
0901708 JP J Amanda 201906 26
0901708 JP J Amanda 201908 94
0901708 JP J Amanda 201910 47
0002232 NJP NJ Steve 201909 56
0002232 NJP NJ Steve 201911 84
5548763 NJP NJ Mark 201907 26
5548763 NJP NJ Mark 201908 94
5548763 NJP NJ Mark 201909 47
;
run;

data List;
input Base_YM $;
datalines;
201904
201905
201906
201907
201908
201909
201910
201911
201912
202001
;
run;

data want;
input ID $ Class1 $ Class 2 $ OIC $ Base_YM $ Cost;
datalines;
0901708 JP J Amanda 201904 56
0901708 JP J Amanda 201905 84
0901708 JP J Amanda 201906 26
0901708 JP J Amanda 201907 26
0901708 JP J Amanda 201908 94
0901708 JP J Amanda 201909 94
0901708 JP J Amanda 201910 47
0901708 JP J Amanda 201911 47
0901708 JP J Amanda 201912 47
0901708 JP J Amanda 202001 47
0002232 NJP NJ Steve 201909 56
0002232 NJP NJ Steve 201910 56
0002232 NJP NJ Steve 201911 84
0002232 NJP NJ Steve 201912 84
0002232 NJP NJ Steve 202001 84
5548763 NJP NJ Mark 201907 26
5548763 NJP NJ Mark 201908 94
5548763 NJP NJ Mark 201909 47
5548763 NJP NJ Mark 201910 47
5548763 NJP NJ Mark 201911 47
5548763 NJP NJ Mark 201912 47
5548763 NJP NJ Mark 202001 47
;
run;

 

 

4 REPLIES 4
Patrick
Opal | Level 21

Below code should do the job.

Please note that I'm using informat yymmn6. for reading the source data into variable Base_YM. This converts the source string into a SAS Date value.

data List;
  input Base_YM :yymmn6.;
  format Base_YM yymmn6.;
  datalines;
201904
201905
201906
201907
201908
201909
201910
201911
201912
202001
;

data Results;
  input ID $ Class1 $ Class2 $ OIC $ Base_YM :yymmn6. Cost;
  format Base_YM yymmn6.;
  datalines;
0901708 JP J Amanda 201904 56
0901708 JP J Amanda 201905 84
0901708 JP J Amanda 201906 26
0901708 JP J Amanda 201908 94
0901708 JP J Amanda 201910 47
0002232 NJP NJ Steve 201909 56
0002232 NJP NJ Steve 201911 84
5548763 NJP NJ Mark 201907 26
5548763 NJP NJ Mark 201908 94
5548763 NJP NJ Mark 201909 47
;

proc sort data=results;
  by id Base_YM;
run;

proc sql noprint;
  select put(max(Base_YM),32.) into :max_base_ym trimmed
  from list
  ;
quit;
%put &=max_base_ym is SAS Date value for yyyymm: %sysfunc(putn(&max_base_ym,yymmn6.));

data want(drop=_:);
  set results;
  by id;
  output;

  if not last.id then
    do;
      _n_+1;
      set results(keep=base_ym rename=(base_ym=_base_ym)) point=_n_;
      _cnt= intck('month',base_ym,_base_ym)-1;
      do _i=1 to _cnt;
        base_ym=intnx('month',base_ym,_i,'b');
        output;
      end;
    end;
  else
  if last.id then
    do;
      _cnt= intck('month',base_ym,&max_base_ym);
      do _i=1 to _cnt;
        base_ym=intnx('month',base_ym,1,'b');
        output;
      end;
    end;
run;

proc print;
run;
sustagens
Pyrite | Level 9

Thank you @Patrick! I modified it slightly after realising I need to set a zero value for cost, so as not to mess up the cumulative values in my next step.

 

Did this for the do blocks that add a new row:

do _i=1 to _cnt;
base_ym=intnx('month',base_ym,_i,'b');
Cost=0; /*****/
output;
end;
hashman
Ammonite | Level 13

@sustagens:

 

Try the scheme below. Note:

  • base_ym is assumed numeric and inputed as such
  • the input data set is assumed grouped by ID
data results ;                                                                                                                          
  input (id class1 class2 oic) ($) base_ym cost ;                                                                                       
  cards ;                                                                                                                               
0901708  jp  j amanda 201904 56                                                                                                         
0901708  jp  j amanda 201905 84                                                                                                         
0901708  jp  j amanda 201906 26                                                                                                         
0901708  jp  j amanda 201908 94                                                                                                         
0901708  jp  j amanda 201910 47                                                                                                         
0002232 njp nj steve  201909 56                                                                                                         
0002232 njp nj steve  201911 84                                                                                                         
5548763 njp nj mark   201907 26                                                                                                         
5548763 njp nj mark   201908 94                                                                                                         
5548763 njp nj mark   201909 47                                                                                                         
;                                                                                                                                       
data list ;                                                                                                                             
  input base_ym ;                                                                                                                       
  cards ;                                                                                                                               
201904                                                                                                                                  
201905                                                                                                                                  
201906                                                                                                                                  
201907                                                                                                                                  
201908                                                                                                                                  
201909                                                                                                                                  
201910                                                                                                                                  
201911                                                                                                                                  
201912                                                                                                                                  
202001                                                                                                                                  
;                                                                                                                                       
data want (drop = _:) ;                                                                                                                 
  dcl hash h (dataset:"list", ordered:"a") ;                                                                                            
  h.definekey ("base_ym") ;                                                                                                             
  h.definedone () ;                                                                                                                     
  dcl hiter i ("h") ;                                                                                                                   
  i.last() ;                                                                                                                            
  _bl = base_ym ;                                                                                                                       
  do p = 2 by 1 until (z) ;                                                                                                             
    set results end = z ;                                                                                                               
    by id notsorted ;                                                                                                                   
    if not last.id then set results (keep=base_ym rename=base_ym=_bm) point = p ;                                                       
    i.setcur() ;                                                                                                                        
    output ;                                                                                                                            
    do while (i.next() = 0) ;                                                                                                           
      if (last.id and base_ym <= _bl) or base_ym < _bm then output ;                                                                      
    end ;                                                                                                                               
  end ;                                                                                                                                 
run ;                                                             

Kind regards

Paul D.

 

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 830 views
  • 2 likes
  • 3 in conversation