BookmarkSubscribeRSS Feed
shanmukh2
Fluorite | Level 6

Hi.

 

I have the following table 

shanmukh2_0-1590598719308.png

 

 

 

 

For a given id, the RSN codes and RSN amounts remain constant across all the lines (line). I need to create additional lines based on following conditions.

 

For an ID, If RSN_CD_1 = XA and RSN_AMT_1 is not NULL , insert a row with line = 1001, SVC_DT as max(svc_dt) of that ID and create a new column FINAL_AMT and populate it with RSN_AMT_1.

 

For an ID, If RSN_CD_2 = XZ and RSN_AMT_2 is not NULL, insert a row with line = 1002, SVC_DT as max(svc_dt) of that ID and create a new column FINAL_AMT and populate it with RSN_AMT_2.

 

For an ID, If RSN_CD_1 and RSN_CD_2 are populated with different values (XY and XZ) including their amounts, then insert 2 rows with line = 1001 and 1002, SVC_DT as max(svc_dt) of that ID and create a new column FINAL_AMT and populate it with individual RSN_AMTs.

 

For an ID, If RSN_CD_1 and RSN_CD_2 are populated with same values (XZ for eg) including their amounts, then insert 1 row with line = 1002, SVC_DT as max(svc_dt) of that ID and create a new column FINAL_AMT and populate it with sum of  RSN_AMT_1 and RSN_AMT_2.

 

In the new lines created, the rest of the amt columns/ SVC_CD columns are set to NULL or 0. Below is the final output I am looking for

 

shanmukh2_1-1590598810581.png

 

Can some one please help me out how to do it?

 

7 REPLIES 7
biopharma
Quartz | Level 8
Can you post code to create the dataset that you HAVE? For example see how this poster has created.

https://communities.sas.com/t5/SAS-Programming/Covert-date-from-character-or-numeric-before-append/m...

General instructions to create code out of a dataset are here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...
shanmukh2
Fluorite | Level 6

Sure, here you go

 

data demo;
infile datalines missover dlm='';
input @1 id $1-4 @6 line @8 rsn_cd_1 $8-9 @11 rsn_cd_2 $11-12 @14 svc_cd $14-16 @18 svc_dt date9. amt_1 amt_2 amt_3 RSN_AMT_1 RSN_AMT_2;
format svc_dt mmddyy10.;
datalines;
D100 1 XA ABC 11JAN2020 10 20 30 40 0
D100 2 XA ABC 15JAN2020 0 0 0 40 0
D200 1 XZ DEF 13MAY2020 0 0 0 0 50
D200 2 XZ GHI 25MAY2020 0 0 0 0 50
D500 1 XA XZ PQR 20JUN2020 0 0 0 10 30
D500 2 XA XZ GHI 28JUN2020 0 0 0 10 30
D800 1 XZ XZ ABC 27MAY2020 0 0 0 20 40
D800 2 XZ XZ PQR 20JUN2020 0 0 0 20 40
;
run;

 

shanmukh2
Fluorite | Level 6

Sorry, this is much better formatted.

 

data demo;
infile datalines missover dlm='';
input @1 id $1-4 @6 line @8 rsn_cd_1 $8-9 @11 rsn_cd_2 $11-12 @14 svc_cd $14-16 @18 svc_dt date9. amt_1 amt_2 amt_3 RSN_AMT_1 RSN_AMT_2;
format svc_dt mmddyy10.;
datalines;
D100 1 XA       ABC 11JAN2020 10 20 30 40 0
D100 2 XA       ABC 15JAN2020 0 0 0 40 0
D200 1       XZ DEF 13MAY2020 0 0 0 0 50

D200 2       XZ GHI 25MAY2020 0 0 0 0 50
D500 1 XA XZ PQR 20JUN2020 0 0 0 10 30
D500 2 XA XZ GHI 28JUN2020 0 0 0 10 30
D800 1 XZ XZ ABC 27MAY2020 0 0 0 20 40
D800 2 XZ XZ PQR 20JUN2020 0 0 0 20 40
;
run;

shanmukh2
Fluorite | Level 6
I have formatted better, in the below message
biopharma
Quartz | Level 8

If something more elegant occurs, I will post later. I am sure you will hear from some more people with nifty solutions. In the meantime try this:

 

data want ;
   do _n_ = 1 by 1 until (last.id) ;
      set demo end = eof1 ;
      by id ;
      array r(2) ;
      array rsn_cd_ (2) $ ;
      array rsn_amt_ (2) ;
      array chkc (2) $ ("XA" "XZ") ;
      r(_n_) = (rsn_cd_(_n_) = chkc(_n_)) * rsn_amt_(_n_) ;
      maxdt = max(maxdt,svc_dt) ;
   end ;
   
   do _n_ = 1 by 1 until (last.id) ;
      set demo end = eof2 ;
      by id ;
      final_amt = 0 ;
      output ;
      if r(_n_) then 
         do ;
            line = 1000 + _n_ ;
            svc_dt = maxdt ;
            final_amt = rsn_amt_(_n_) ;
            call missing (of rsn_cd_(*)) ;
            do i = 1 to 2 ; rsn_amt_(i) = min(0,rsn_amt_(i)) ; end ;
            output ;
         end ;
   end ;
   drop maxdt chkc1 chkc2 r1 r2 i ;
run ;

proc sort ; 
  by id line ;
run ;
biopharma
Quartz | Level 8

Earlier code was copied and pasted code from one version too early:

data want ;
   do _n_ = 1 by 1 until (last.id) ;
      set demo end = eof1 ;
      by id ;
      array r(2) ;
      array rsn_cd_ (2) $ ;
      array rsn_amt_ (2) ;
      array chkc (2) $ ("XA" "XZ") ;
      r(_n_) = (rsn_cd_(_n_) = chkc(_n_)) * rsn_amt_(_n_) ;
      maxdt = max(maxdt,svc_dt) ;
   end ;
   
   do _n_ = 1 by 1 until (last.id) ;
      set demo end = eof2 ;
      by id ;
      final_amt = 0 ;
      output ;
      if r(_n_) then 
         do ;
            line = 1000 + _n_ ;
            svc_dt = maxdt ;
            final_amt = rsn_amt_(_n_) ;
            if rsn_cd_1 = rsn_cd_2 then final_amt = sum(of rsn_amt_(*)) ;
            call missing (of rsn_cd_(*)) ;
            do i = 1 to 2 ; rsn_amt_(i) = min(0,rsn_amt_(i)) ; end ;
            output ;
         end ;
   end ;
   drop maxdt chkc1 chkc2 r1 r2 i ;
run ;

proc sort ; 
  by id line ;
run ;

 

novinosrin
Tourmaline | Level 20

Not recommending this approach, however methinks this is no more than a solution that encompasses a one pass of by group for the Max(svc date) and then followed by boring IF THEN verbose, which the following is little incomplete for the non missing RSN_CD series as I am very lazy. Feel free to play around and complete if you would like to try.

 


data demo;
infile datalines missover dlm='';
input @1 id $ line  rsn_cd_1 $  rsn_cd_2 $ svc_cd $ svc_dt date9. amt_1 amt_2 amt_3 RSN_AMT_1 RSN_AMT_2;
format svc_dt mmddyy10.;
datalines;
D100 1 XA  .  ABC 11JAN2020 10 20 30 40 0
D100 2 XA  .  ABC 15JAN2020 0 0 0 40 0
D200 1  .  XZ DEF 13MAY2020 0 0 0 0 50
D200 2  .  XZ GHI 25MAY2020 0 0 0 0 50
D500 1 XA XZ PQR 20JUN2020 0 0 0 10 30
D500 2 XA XZ GHI 28JUN2020 0 0 0 10 30
D800 1 XZ XZ ABC 27MAY2020 0 0 0 20 40
D800 2 XZ XZ PQR 20JUN2020 0 0 0 20 40
;
run;

data want;
 do until(last.id);
  set demo;
  by id;
  _m=svc_dt max _m;
  Final_amt=0;
  output;
 end;
 retain _iorc_ 1000;
 array t RSN_AMT_:;
 if cmiss(of rsn_cd_:) then do;
  _n_=whichc(coalescec(of rsn_cd_:),of rsn_cd_:);
  line=_iorc_+_n_;
  Final_amt=t(_n_);
  svc_dt=_m;
  call missing(of rsn_cd_:,of t(*),svc_cd);
  call pokelong(repeat(put(0,rb8.),1),addrlong(t(1)),16);
  output;
 end;
/*One can continue with else do; for non missing rsn_cd: series*/
/*blah blah blah and so forth*/
 drop _:;
run;

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 3860 views
  • 0 likes
  • 3 in conversation