BookmarkSubscribeRSS Feed
shanmukh2
Fluorite | Level 6

Hi.

 

I have the following table 

shanmukh2_0-1590598719308.png

 

 

 

 

For a given id, the RSN codes and RSN amounts remain constant across all the lines (line). I need to create additional lines based on following conditions.

 

For an ID, If RSN_CD_1 = XA and RSN_AMT_1 is not NULL , insert a row with line = 1001, SVC_DT as max(svc_dt) of that ID and create a new column FINAL_AMT and populate it with RSN_AMT_1.

 

For an ID, If RSN_CD_2 = XZ and RSN_AMT_2 is not NULL, insert a row with line = 1002, SVC_DT as max(svc_dt) of that ID and create a new column FINAL_AMT and populate it with RSN_AMT_2.

 

For an ID, If RSN_CD_1 and RSN_CD_2 are populated with different values (XY and XZ) including their amounts, then insert 2 rows with line = 1001 and 1002, SVC_DT as max(svc_dt) of that ID and create a new column FINAL_AMT and populate it with individual RSN_AMTs.

 

For an ID, If RSN_CD_1 and RSN_CD_2 are populated with same values (XZ for eg) including their amounts, then insert 1 row with line = 1002, SVC_DT as max(svc_dt) of that ID and create a new column FINAL_AMT and populate it with sum of  RSN_AMT_1 and RSN_AMT_2.

 

In the new lines created, the rest of the amt columns/ SVC_CD columns are set to NULL or 0. Below is the final output I am looking for

 

shanmukh2_1-1590598810581.png

 

Can some one please help me out how to do it?

 

7 REPLIES 7
biopharma
Quartz | Level 8
Can you post code to create the dataset that you HAVE? For example see how this poster has created.

https://communities.sas.com/t5/SAS-Programming/Covert-date-from-character-or-numeric-before-append/m...

General instructions to create code out of a dataset are here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...
shanmukh2
Fluorite | Level 6

Sure, here you go

 

data demo;
infile datalines missover dlm='';
input @1 id $1-4 @6 line @8 rsn_cd_1 $8-9 @11 rsn_cd_2 $11-12 @14 svc_cd $14-16 @18 svc_dt date9. amt_1 amt_2 amt_3 RSN_AMT_1 RSN_AMT_2;
format svc_dt mmddyy10.;
datalines;
D100 1 XA ABC 11JAN2020 10 20 30 40 0
D100 2 XA ABC 15JAN2020 0 0 0 40 0
D200 1 XZ DEF 13MAY2020 0 0 0 0 50
D200 2 XZ GHI 25MAY2020 0 0 0 0 50
D500 1 XA XZ PQR 20JUN2020 0 0 0 10 30
D500 2 XA XZ GHI 28JUN2020 0 0 0 10 30
D800 1 XZ XZ ABC 27MAY2020 0 0 0 20 40
D800 2 XZ XZ PQR 20JUN2020 0 0 0 20 40
;
run;

 

shanmukh2
Fluorite | Level 6

Sorry, this is much better formatted.

 

data demo;
infile datalines missover dlm='';
input @1 id $1-4 @6 line @8 rsn_cd_1 $8-9 @11 rsn_cd_2 $11-12 @14 svc_cd $14-16 @18 svc_dt date9. amt_1 amt_2 amt_3 RSN_AMT_1 RSN_AMT_2;
format svc_dt mmddyy10.;
datalines;
D100 1 XA       ABC 11JAN2020 10 20 30 40 0
D100 2 XA       ABC 15JAN2020 0 0 0 40 0
D200 1       XZ DEF 13MAY2020 0 0 0 0 50

D200 2       XZ GHI 25MAY2020 0 0 0 0 50
D500 1 XA XZ PQR 20JUN2020 0 0 0 10 30
D500 2 XA XZ GHI 28JUN2020 0 0 0 10 30
D800 1 XZ XZ ABC 27MAY2020 0 0 0 20 40
D800 2 XZ XZ PQR 20JUN2020 0 0 0 20 40
;
run;

shanmukh2
Fluorite | Level 6
I have formatted better, in the below message
biopharma
Quartz | Level 8

If something more elegant occurs, I will post later. I am sure you will hear from some more people with nifty solutions. In the meantime try this:

 

data want ;
   do _n_ = 1 by 1 until (last.id) ;
      set demo end = eof1 ;
      by id ;
      array r(2) ;
      array rsn_cd_ (2) $ ;
      array rsn_amt_ (2) ;
      array chkc (2) $ ("XA" "XZ") ;
      r(_n_) = (rsn_cd_(_n_) = chkc(_n_)) * rsn_amt_(_n_) ;
      maxdt = max(maxdt,svc_dt) ;
   end ;
   
   do _n_ = 1 by 1 until (last.id) ;
      set demo end = eof2 ;
      by id ;
      final_amt = 0 ;
      output ;
      if r(_n_) then 
         do ;
            line = 1000 + _n_ ;
            svc_dt = maxdt ;
            final_amt = rsn_amt_(_n_) ;
            call missing (of rsn_cd_(*)) ;
            do i = 1 to 2 ; rsn_amt_(i) = min(0,rsn_amt_(i)) ; end ;
            output ;
         end ;
   end ;
   drop maxdt chkc1 chkc2 r1 r2 i ;
run ;

proc sort ; 
  by id line ;
run ;
biopharma
Quartz | Level 8

Earlier code was copied and pasted code from one version too early:

data want ;
   do _n_ = 1 by 1 until (last.id) ;
      set demo end = eof1 ;
      by id ;
      array r(2) ;
      array rsn_cd_ (2) $ ;
      array rsn_amt_ (2) ;
      array chkc (2) $ ("XA" "XZ") ;
      r(_n_) = (rsn_cd_(_n_) = chkc(_n_)) * rsn_amt_(_n_) ;
      maxdt = max(maxdt,svc_dt) ;
   end ;
   
   do _n_ = 1 by 1 until (last.id) ;
      set demo end = eof2 ;
      by id ;
      final_amt = 0 ;
      output ;
      if r(_n_) then 
         do ;
            line = 1000 + _n_ ;
            svc_dt = maxdt ;
            final_amt = rsn_amt_(_n_) ;
            if rsn_cd_1 = rsn_cd_2 then final_amt = sum(of rsn_amt_(*)) ;
            call missing (of rsn_cd_(*)) ;
            do i = 1 to 2 ; rsn_amt_(i) = min(0,rsn_amt_(i)) ; end ;
            output ;
         end ;
   end ;
   drop maxdt chkc1 chkc2 r1 r2 i ;
run ;

proc sort ; 
  by id line ;
run ;

 

novinosrin
Tourmaline | Level 20

Not recommending this approach, however methinks this is no more than a solution that encompasses a one pass of by group for the Max(svc date) and then followed by boring IF THEN verbose, which the following is little incomplete for the non missing RSN_CD series as I am very lazy. Feel free to play around and complete if you would like to try.

 


data demo;
infile datalines missover dlm='';
input @1 id $ line  rsn_cd_1 $  rsn_cd_2 $ svc_cd $ svc_dt date9. amt_1 amt_2 amt_3 RSN_AMT_1 RSN_AMT_2;
format svc_dt mmddyy10.;
datalines;
D100 1 XA  .  ABC 11JAN2020 10 20 30 40 0
D100 2 XA  .  ABC 15JAN2020 0 0 0 40 0
D200 1  .  XZ DEF 13MAY2020 0 0 0 0 50
D200 2  .  XZ GHI 25MAY2020 0 0 0 0 50
D500 1 XA XZ PQR 20JUN2020 0 0 0 10 30
D500 2 XA XZ GHI 28JUN2020 0 0 0 10 30
D800 1 XZ XZ ABC 27MAY2020 0 0 0 20 40
D800 2 XZ XZ PQR 20JUN2020 0 0 0 20 40
;
run;

data want;
 do until(last.id);
  set demo;
  by id;
  _m=svc_dt max _m;
  Final_amt=0;
  output;
 end;
 retain _iorc_ 1000;
 array t RSN_AMT_:;
 if cmiss(of rsn_cd_:) then do;
  _n_=whichc(coalescec(of rsn_cd_:),of rsn_cd_:);
  line=_iorc_+_n_;
  Final_amt=t(_n_);
  svc_dt=_m;
  call missing(of rsn_cd_:,of t(*),svc_cd);
  call pokelong(repeat(put(0,rb8.),1),addrlong(t(1)),16);
  output;
 end;
/*One can continue with else do; for non missing rsn_cd: series*/
/*blah blah blah and so forth*/
 drop _:;
run;

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 2538 views
  • 0 likes
  • 3 in conversation