BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
dearkyr
Fluorite | Level 6

Hi, I dont know how to describe in the subject but here is what I'm struggling now.

 

So I have this big table (24 months average balance (see below sample)

 

id |active_dt|avgbal_201804|avgbal_201805|.........|avgbal_201902
001|   201807|         5000|         5300|.........|         7000
002|   201901|         3200|         3000|.........|         4500

 

I need to get avgbal of 3 months before and 3 months after from the active_dt

The new fields name are avgbal_minus3, avgbal_minus2, avgbal_minus1, avgbal_plus1, avgbal_plus2, avgbal_plus3

 

id 001 should have

avgbal_minus3= take from avgbal_201804

avgbal_minus2= take from avgbal_201805

avgbal_minus1= take from avgbal_201806

avgbal_plus1= take from avgbal_201808

avgbal_plus2= take from avgbal_201809

avgbal_plus3= take from avgbal_201810

 

while id 002 will take from avgbal_201810, avgbal_201811, avgbal_201812 and avgbal_201902 (this id will have blank avgbal_plus2 and avgbal_plus3 since the data only up to February.

 

The final table should look like below

 

id |active_dt|avgbal_minus3|avgbal_minus2|avgbal_minus1|avgbal_plus1|avgbal_plus2|avgbal_plus3|
001| 201807| 5000| 5300| XXXX| XXXX| XXXX| XXXX|
002| 201901| XXXX| XXXX| XXXX| 4500| | |

Please help

 

Thanks

 

1 ACCEPTED SOLUTION

Accepted Solutions
andreas_lds
Jade | Level 19

Although transposing the data and getting rid of data in variable names, as suggested by @Kurt_Bremser, is the recommended why to solve the problem, your could skip transposing:

 

data work.have;
   length id $ 3 active_dt 8 
      avgbal_201804 avgbal_201805 avgbal_201806 avgbal_201807 avgbal_201808 avgbal_201809 
      avgbal_201810 avgbal_201811 avgbal_201812 avgbal_201901 avgbal_201902 8
   ;

   input id active_dt avgbal_201804 avgbal_201805 avgbal_201806 avgbal_201807 avgbal_201808 avgbal_201809 
      avgbal_201810 avgbal_201811 avgbal_201812 avgbal_201901 avgbal_201902;
   datalines;
001 201807 5000 5300 . . . . . . . . 7000
002 201901 3200 3000 . . . . . . . . 4500
;
run;


/* Get the names of avgval-variables */
proc sql noprint;
   select Name
      into :avgList separated by '#'
      from sashelp.vcolumn
         where LibName = 'WORK' and MemName = 'HAVE' and Name like 'avgbal%'
   ;
quit;

/* to distinguish between missing data and missing variable and to format both according to specification */ 
proc format;
   value avgbal
      LOW-HIGH = [best32.]
      . = 'XXXX' /* data is missing in value */
      .v = ' ' /* variable is missing */
   ;
run;


data work.want;
   set work.have;

   length 
      avgbal_minus3 avgbal_minus2 avgbal_minus1 avgbal_plus1 avgbal_plus2 avgbal_plus3 8
      active_date i dummy 8
      varname $ 32 
   ;

   keep id active_dt avgbal_minus3 avgbal_minus2 avgbal_minus1 avgbal_plus1 avgbal_plus2 avgbal_plus3;

   /*  dummy is required to get a continuous array */
   retain dummy 0;

   format active_date yymmn6.;

   array targets[-3:3] avgbal_minus3 avgbal_minus2 avgbal_minus1 dummy avgbal_plus1 avgbal_plus2 avgbal_plus3;

   active_date = input(cats(active_dt, '01'), yymmdd8.);
   
   do i = -3 to 3;
      if i ^= 0 then do;
         varname = cats('avgbal_', put(intnx('month', active_date, i), yymmn6.));

         if findw("&avgList.", varname, '#', 'it') then do;
            targets[i] = vvaluex(varname);
         end;
         else do;
            targets[i] = .v;
         end;
      end;
   end;

run;


proc print data=work.want;
   format avgbal: avgbal.;
run;

View solution in original post

5 REPLIES 5
Kurt_Bremser
Super User

First, transpose the data to a long format where you have id, active_dt, balance_dt (derived form the avgbal variable names) and balance.

You can then retrieve the balance grouped by id and active_dt:

data have;
infile datalines dlm='|';
input id active_dt :yymmn6. avgbal_201804 avgbal_201805 avgbal_201806 avgbal_201807 avgbal_201808 avgbal_201809 avgbal_201810 avgbal_201811 avgbal_201812 avgbal_201901 avgbal_201902;
format active_dt yymmddd10.;
datalines;
001|201807|5000|5300|5500|5600|5700|5800|5900|6000|6200|6500|7000
;
run;

proc transpose data=have out=trans;
by id active_dt;
var avgbal:;
run;

data trans2;
set trans;
balance_dt = input(scan(_name_,2,'_'),yymmn6.);
format balance_dt yymmddd10.;
drop _name_;
rename col1=avg_balance;
run;

data want;
set trans2;
by id active_dt;
retain avgbal_minus3 avgbal_minus2 avgbal_minus1 avgbal_plus1 avgbal_plus2 avgbal_plus3;
array balances_minus {3} avgbal_minus1-avgbal_minus3;
array balances_plus {3} avgbal_plus1-avgbal_plus3;
if first.active_dt
then do i = 1 to 3;
  balances_minus{i} = .;
  balances_plus{i} = .;
end;
do i = 1 to 3;
  if intck('month',active_dt,balance_dt) = i then balances_plus{i} = avg_balance;
  if intck('month',balance_dt,active_dt) = i then balances_minus{i} = avg_balance;
end;
if last.active_dt then output;
keep id active_dt avgbal_minus: avgbal_plus:;
run;

Note that a "long" dataset format without data (dates in your case) in structure (variable names) is always easier to code against.

See Maxims 19 & 33.

andreas_lds
Jade | Level 19

Although transposing the data and getting rid of data in variable names, as suggested by @Kurt_Bremser, is the recommended why to solve the problem, your could skip transposing:

 

data work.have;
   length id $ 3 active_dt 8 
      avgbal_201804 avgbal_201805 avgbal_201806 avgbal_201807 avgbal_201808 avgbal_201809 
      avgbal_201810 avgbal_201811 avgbal_201812 avgbal_201901 avgbal_201902 8
   ;

   input id active_dt avgbal_201804 avgbal_201805 avgbal_201806 avgbal_201807 avgbal_201808 avgbal_201809 
      avgbal_201810 avgbal_201811 avgbal_201812 avgbal_201901 avgbal_201902;
   datalines;
001 201807 5000 5300 . . . . . . . . 7000
002 201901 3200 3000 . . . . . . . . 4500
;
run;


/* Get the names of avgval-variables */
proc sql noprint;
   select Name
      into :avgList separated by '#'
      from sashelp.vcolumn
         where LibName = 'WORK' and MemName = 'HAVE' and Name like 'avgbal%'
   ;
quit;

/* to distinguish between missing data and missing variable and to format both according to specification */ 
proc format;
   value avgbal
      LOW-HIGH = [best32.]
      . = 'XXXX' /* data is missing in value */
      .v = ' ' /* variable is missing */
   ;
run;


data work.want;
   set work.have;

   length 
      avgbal_minus3 avgbal_minus2 avgbal_minus1 avgbal_plus1 avgbal_plus2 avgbal_plus3 8
      active_date i dummy 8
      varname $ 32 
   ;

   keep id active_dt avgbal_minus3 avgbal_minus2 avgbal_minus1 avgbal_plus1 avgbal_plus2 avgbal_plus3;

   /*  dummy is required to get a continuous array */
   retain dummy 0;

   format active_date yymmn6.;

   array targets[-3:3] avgbal_minus3 avgbal_minus2 avgbal_minus1 dummy avgbal_plus1 avgbal_plus2 avgbal_plus3;

   active_date = input(cats(active_dt, '01'), yymmdd8.);
   
   do i = -3 to 3;
      if i ^= 0 then do;
         varname = cats('avgbal_', put(intnx('month', active_date, i), yymmn6.));

         if findw("&avgList.", varname, '#', 'it') then do;
            targets[i] = vvaluex(varname);
         end;
         else do;
            targets[i] = .v;
         end;
      end;
   end;

run;


proc print data=work.want;
   format avgbal: avgbal.;
run;
dearkyr
Fluorite | Level 6

I keep getting V as the value. Do I miss something here?

Thanks

Kurt_Bremser
Super User

Run @andreas_lds' code AS IS, and you'll get this result:

 

 

Energy Expenditures for Each Region

(millions of dollars)

Beob.

id

active_dt

avgbal_minus3

avgbal_minus2

avgbal_minus1

avgbal_plus1

avgbal_plus2

avgbal_plus3

1

001

201807

5000

5300

XXXX

XXXX

XXXX

XXXX

2

002

201901

XXXX

XXXX

XXXX

4500

 

 

 

 

 

 



Keep in mind that we have no clue how your original dataset really looks, as you did not provide it in usable form (data step with datalines).

dearkyr
Fluorite | Level 6

Got it.

Because I'm using the original table but didnt change the table name at avgList part.

Thank you so much!!!

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 737 views
  • 3 likes
  • 3 in conversation