BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
joebacon
Pyrite | Level 9

As suggested by Reeza, I am trying to reorganize my data into a long structure opposed to wide to reduce processing time as well as make the coding faster. 

 

Currently, the data is organized a structure where it has main variables such as Recreation, Financial Plan, Legal, Housing, Gift, Education. However, these variables were created in a way that works like Pre_ResYr6thM_Recre and so forth.

 

I am looking for some sample code to pull the "Pre_ResYrNthM" part from each variable to make a several new variables for the Month, Quarter, Half Year and Year as well as variables for Recreation.

 

I have tried something to the effect of :

data Org;
set infl.arc_inflation_jb_102618;
if Preresyr1stM: then Month = 1;
else if Preresyr2ndM: then Month = 2;
else if Preresyr3rdM: then Month = 3;
else if Preresyr4thM: then Month =4;
else if Preresyr5thM: then Month =5;
else if Preresyr6thM: then Month =6;
else if Preresyr7thM: then Month =7;
else if Preresyr8thM: then Month =8;
else if Preresyr9thM: then Month =9;
else if Preresyr10thM: then Month =10;
else if Preresyr11thM: then Month =11;
else if Preresyr12thM: then Month =12;
run;


 

I was then going to just keep the main variables and have a structure like

StudyNo PID Day_1826 Day_InitialIW Month Quarter Half_year PreRes_Consume PreRes_COnsume_Total PostRes_Consume
PostRes_Consume_Total PreRes_Durable PreRes_Durable_Total

 

1 1 30Dec1992 05Oct1993 1 1 1 600 600 

1 1 30Dec1992 05Oct1993 2 1 1 600 1200 

 

 

I was hoping to find some sample code or an old forum for the code but not sure how to pull main character strings from the variables to create the new variables

 

I also tried an iteration that looked like this:

data Org;
set infl.arc_inflation_jb_102618;
if var like %Preresyr1stM% then Month = 1;
.
.
.
.

run;
1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Actually what you should do there is an array approach. 

 

data want;
set have;

array _month(12) <list of variables in order>;

do i=1 to dim(_month);
    interval='Month';
    Period = i;
    Value = _month(i);
     output;
end;

run;

View solution in original post

13 REPLIES 13
utrocketeng
Quartz | Level 8

could you try something using the substr function?  

here would be a start:

var2 = substr(var1, 9, length(var1)-12) 

 

the 9th character is where the month begins... and then you want to count for the length of your variable (var1 in this case) minus the length of the unnecessary characters (the first 9 and the last3).   i hope this helps give you some direction.

Reeza
Super User
Since it's the only number in the variable you could just COMPRESS() out all letters and then use INPUT() to convert it to a number.

x = input(compress(var, , 'kd'), 8.);
utrocketeng
Quartz | Level 8

that is cool, i need to remember this for the future.

joebacon
Pyrite | Level 9

Also, this is really cool. Good tip and bookmarked.

Reeza
Super User

Actually what you should do there is an array approach. 

 

data want;
set have;

array _month(12) <list of variables in order>;

do i=1 to dim(_month);
    interval='Month';
    Period = i;
    Value = _month(i);
     output;
end;

run;

joebacon
Pyrite | Level 9

This seems to be working well, except I have a few questions.

 

First, the first month value is being doubled and the last is being taken away. Could this be because the DO Loop is stopping at 11 and not 12?

 

Secondly, when I run a second array, how would I keep the previous column while generating a new one?

 

Obviously, I have never really used arrays. I appreciate all your help, I am trying to learn as much as possible in a short time and you've been very patient.

 

data Tall;
set infl.arc_inflation_jb_102618;
array _month(12) N_PreResYr1stM_House
N_PreResYr2ndM_House
N_PreResYr3rdM_House
N_PreResYr4thM_House
N_PreResYr5thM_House
N_PreResYr6thM_House
N_PreResYr7thM_House
N_PreResYr8thM_House
N_PreResYr9thM_House
N_PreResYr10thM_House
N_PreResYr11thM_House
N_PreResYr12thM_House
;

do i=1 to dim(_month);
    interval='Month';
    Month = i;
    PreResYr_House = _month(i);
     output;
end;

run;


 
data Tall;
set infl.arc_inflation_jb_102618;
array _month(12) N_PreResYr1stM_CONSUME
N_PreResYr2ndM_CONSUME
N_PreResYr3rdM_CONSUME
N_PreResYr4thM_CONSUME
N_PreResYr5thM_CONSUME
N_PreResYr6thM_CONSUME
N_PreResYr7thM_CONSUME
N_PreResYr8thM_CONSUME
N_PreResYr9thM_CONSUME
N_PreResYr10thM_CONSUME
N_PreResYr11thM_CONSUME
N_PreResYr12thM_CONSUME;

do i=1 to dim(_month);
    interval='Month';
    Month = i;
    PreResYr_CONSUME = _month(i);
     output;
end;
run;

This code generates just the PreResYr_Consume without the PreResYr_House and it also doubles the first month value as seen below:

 

i  Interval Month PreResYr_Consume

 

1  Month 1 767.101074
2 Month 2 383.550537
3 Month 3 383.550537
4 Month 4 383.550537
5 Month 5 383.550537
6 Month 6 383.550537
7 Month 7 383.550537
8 Month 8 383.550537
9 Month 9 383.550537
10 Month 10 383.550537
11 Month 11 383.550537
12 Month 12 .

Reeza
Super User

You can have multiple arrays and steps in one data step. In fact, I think you'll have to here otherwise you're going to get weird results.

 

 

 


data Tall;
set infl.arc_inflation_jb_102618;
array _month(12) N_PreResYr1stM_House
N_PreResYr2ndM_House
N_PreResYr3rdM_House
N_PreResYr4thM_House
N_PreResYr5thM_House
N_PreResYr6thM_House
N_PreResYr7thM_House
N_PreResYr8thM_House
N_PreResYr9thM_House
N_PreResYr10thM_House
N_PreResYr11thM_House
N_PreResYr12thM_House
;

do i=1 to dim(_month);
    interval='Month';
    Month = i;
    PreResYr_House = _month(i);
     output;
end;


array _month(12) N_PreResYr1stM_CONSUME
N_PreResYr2ndM_CONSUME
N_PreResYr3rdM_CONSUME
N_PreResYr4thM_CONSUME
N_PreResYr5thM_CONSUME
N_PreResYr6thM_CONSUME
N_PreResYr7thM_CONSUME
N_PreResYr8thM_CONSUME
N_PreResYr9thM_CONSUME
N_PreResYr10thM_CONSUME
N_PreResYr11thM_CONSUME
N_PreResYr12thM_CONSUME;

do i=1 to dim(_month);
    interval='Month';
    Month = i;
    PreResYr_CONSUME = _month(i);
     output;
end;
run;

 

 


@joebacon wrote:

This seems to be working well, except I have a few questions.

 

First, the first month value is being doubled and the last is being taken away. Could this be because the DO Loop is stopping at 11 and not 12?

 

Secondly, when I run a second array, how would I keep the previous column while generating a new one?

 

Obviously, I have never really used arrays. I appreciate all your help, I am trying to learn as much as possible in a short time and you've been very patient.

 

data Tall;
set infl.arc_inflation_jb_102618;
array _month(12) N_PreResYr1stM_House
N_PreResYr2ndM_House
N_PreResYr3rdM_House
N_PreResYr4thM_House
N_PreResYr5thM_House
N_PreResYr6thM_House
N_PreResYr7thM_House
N_PreResYr8thM_House
N_PreResYr9thM_House
N_PreResYr10thM_House
N_PreResYr11thM_House
N_PreResYr12thM_House
;

do i=1 to dim(_month);
    interval='Month';
    Month = i;
    PreResYr_House = _month(i);
     output;
end;

run;


 
data Tall;
set infl.arc_inflation_jb_102618;
array _month(12) N_PreResYr1stM_CONSUME
N_PreResYr2ndM_CONSUME
N_PreResYr3rdM_CONSUME
N_PreResYr4thM_CONSUME
N_PreResYr5thM_CONSUME
N_PreResYr6thM_CONSUME
N_PreResYr7thM_CONSUME
N_PreResYr8thM_CONSUME
N_PreResYr9thM_CONSUME
N_PreResYr10thM_CONSUME
N_PreResYr11thM_CONSUME
N_PreResYr12thM_CONSUME;

do i=1 to dim(_month);
    interval='Month';
    Month = i;
    PreResYr_CONSUME = _month(i);
     output;
end;
run;

This code generates just the PreResYr_Consume without the PreResYr_House and it also doubles the first month value as seen below:

 

i  Interval Month PreResYr_Consume

 

1  Month 1 767.101074
2 Month 2 383.550537
3 Month 3 383.550537
4 Month 4 383.550537
5 Month 5 383.550537
6 Month 6 383.550537
7 Month 7 383.550537
8 Month 8 383.550537
9 Month 9 383.550537
10 Month 10 383.550537
11 Month 11 383.550537
12 Month 12 .


 

joebacon
Pyrite | Level 9

I then get the error that _month has already been defined.

 

44 end;
45
46
47 array _month(12) N_PreResYr1stM_CONSUME
______
124
ERROR 124-185: The variable _month has already been defined.

 

 

However, I want the months to stay but to input the value for each month under the new Consume category as well.

 

 

joebacon
Pyrite | Level 9

I'm a dummy. I just need to use the months and create the new variables in an If when statement. Sorry.

joebacon
Pyrite | Level 9

Reeza, I apologize for all the questions, but could you point me to where my Macro went wrong?

 

%Macro Transform(x);
When(&x.) do;
PreResYr_Consume =N_PreResYr&x.thM_CONSUME;
PreResYr_Durable =N_PreResYr&x.thM_DURABLE;
PreResYr_Trans =N_PreResYr&x.thM_TRANS;
PreResYr_Health =N_PreResYr&x.thM_HEALTH;
PreResYr_Insure =N_PreResYr&x.thM_INSURE;
PreResYr_Loan =N_PreResYr&x.thM_LOAN;
PreResYr_Tax =N_PreResYr&x.thM_TAX;
PreResYr_Educa =N_PreResYr&x.thM_EDUCA;
PreResYr_Gift =N_PreResYr&x.thM_GIFT;
PreResYr_Enter =N_PreResYr&x.thM_ENTER;
PreResYr_Recre =N_PreResYr&x.thM_RECRE;
PreResYr_Finplan =N_PreResYr&x.thM_FINPLAN;
PreResYr_Legala =N_PreResYr&x.thM_LEGALA;
PreResYr_IncomeTL =N_PreResYr&x.thM_INCOMETL;
end;
%Mend Transform;
data Tall2;
set Tall;
Select(Month);
When(1) do;
PreResYr_Consume =N_PreResYr1stM_CONSUME;
PreResYr_Durable =N_PreResYr1stM_DURABLE;
PreResYr_Trans =N_PreResYr1stM_TRANS;
PreResYr_Health =N_PreResYr1stM_HEALTH;
PreResYr_Insure =N_PreResYr1stM_INSURE;
PreResYr_Loan =N_PreResYr1stM_LOAN;
PreResYr_Tax =N_PreResYr1stM_TAX;
PreResYr_Educa =N_PreResYr1stM_EDUCA;
PreResYr_Gift =N_PreResYr1stM_GIFT;
PreResYr_Enter =N_PreResYr1stM_ENTER;
PreResYr_Recre =N_PreResYr1stM_RECRE;
PreResYr_Finplan =N_PreResYr1stM_FINPLAN;
PreResYr_Legala =N_PreResYr1stM_LEGALA;
PreResYr_IncomeTL =N_PreResYr1stM_INCOMETL;
end;
When(2) do;
PreResYr_Consume =N_PreResYr2ndM_CONSUME;
PreResYr_Durable =N_PreResYr2ndM_DURABLE;
PreResYr_Trans =N_PreResYr2ndM_TRANS;
PreResYr_Health =N_PreResYr2ndM_HEALTH;
PreResYr_Insure =N_PreResYr2ndM_INSURE;
PreResYr_Loan =N_PreResYr2ndM_LOAN;
PreResYr_Tax =N_PreResYr2ndM_TAX;
PreResYr_Educa =N_PreResYr2ndM_EDUCA;
PreResYr_Gift =N_PreResYr2ndM_GIFT;
PreResYr_Enter =N_PreResYr2ndM_ENTER;
PreResYr_Recre =N_PreResYr2ndM_RECRE;
PreResYr_Finplan =N_PreResYr2ndM_FINPLAN;
PreResYr_Legala =N_PreResYr2ndM_LEGALA;
PreResYr_IncomeTL =N_PreResYr2ndM_INCOMETL;
end;
When(3) do;
PreResYr_Consume =N_PreResYr3rdM_CONSUME;
PreResYr_Durable =N_PreResYr3rdM_DURABLE;
PreResYr_Trans =N_PreResYr3rdM_TRANS;
PreResYr_Health =N_PreResYr3rdM_HEALTH;
PreResYr_Insure =N_PreResYr3rdM_INSURE;
PreResYr_Loan =N_PreResYr3rdM_LOAN;
PreResYr_Tax =N_PreResYr3rdM_TAX;
PreResYr_Educa =N_PreResYr3rdM_EDUCA;
PreResYr_Gift =N_PreResYr3rdM_GIFT;
PreResYr_Enter =N_PreResYr3rdM_ENTER;
PreResYr_Recre =N_PreResYr3rdM_RECRE;
PreResYr_Finplan =N_PreResYr3rdM_FINPLAN;
PreResYr_Legala =N_PreResYr3rdM_LEGALA;
PreResYr_IncomeTL =N_PreResYr3rdM_INCOMETL;
end;
%Transform(4-11);

Otherwise do;
PreResYr_Consume =N_PreResYr12thM_CONSUME;
PreResYr_Durable =N_PreResYr12thM_DURABLE;
PreResYr_Trans =N_PreResYr12thM_TRANS;
PreResYr_Health =N_PreResYr12thM_HEALTH;
PreResYr_Insure =N_PreResYr12thM_INSURE;
PreResYr_Loan =N_PreResYr12thM_LOAN;
PreResYr_Tax =N_PreResYr12thM_TAX;
PreResYr_Educa =N_PreResYr12thM_EDUCA;
PreResYr_Gift =N_PreResYr12thM_GIFT;
PreResYr_Enter =N_PreResYr12thM_ENTER;
PreResYr_Recre =N_PreResYr12thM_RECRE;
PreResYr_Finplan =N_PreResYr12thM_FINPLAN;
PreResYr_Legala =N_PreResYr12thM_LEGALA;
PreResYr_IncomeTL =N_PreResYr12thM_INCOMETL;
end;

run;

Can I not do a range from 4-11? I am getting 3 errors but i specified an "when" and "end" statement in the macro and the otherwise do is in context to the previous "select" :

NOTE: Line generated by the macro variable "X".
92         N_PreResYr4-11thM_INCOMETL
                         ____________
                         22
ERROR: Expecting "WHEN", "OTHERWISE", or "END".
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, *, **, +, -, /, <, <=, <>, =, >, ><, >=, AND, EQ, GE, GT, IN, 
              LE, LT, MAX, MIN, NE, NG, NL, NOTIN, OR, ^=, |, ||, ~=.  

93         end;
94         Otherwise do;
           _________
           161
ERROR 161-185: No matching DO/SELECT statement.

 

 

 

Reeza
Super User
%Transform(4-11); *this is wrong. You cannot call it like that. It's literal text replacement and you have no loop so it's not looping. 4-11 gets replaced everywhere you have the macro variable and that's not what you want. There's no loop code that I see.

To write a macro, first ensure it works in a base case, then start adding macro variables and loops.

I wrote a short tutorial here, let me know if it helps. This assumes you start with working code though.

https://github.com/statgeek/SAS-Tutorials/blob/master/Turning%20a%20program%20into%20a%20macro.md

joebacon
Pyrite | Level 9

Once more, I can't thank you enough! This is so helpful - clear, simple, and informative.

 

 

s_lassen
Meteorite | Level 14

I think you may get away with something like this:

data org;
  set infl.arc_inflation_jb_102618;
  array Pre_ResYr Pre_ResYr:;
  do _N_=1 to dim(Pre_ResYr);
    name=upcase(vname(Pre_ResYr(_N_)));
    select(substr(scan(name,1,'_'),10);
      when('1STM') month=1;
      when('2NDM') month=2;
      /* etcetera */
      otherwise month=0; /* for the stuff that you do not want */
      end;
   if month>0 then do;
      Value=Pre_ResYr(_N_);
      category=scan(name,2,'_');
      output;
      end;
   end;
  keep /* key variables here! */ month value category;
run;
 

You can then (if you want to) use PROC TRANSPOSE to create variable names from the CATEGORY variable.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 1215 views
  • 7 likes
  • 4 in conversation