DATA Step, Macro, Functions and more

Initializing a large list of variables?

Reply
Occasional Contributor
Posts: 12

Initializing a large list of variables?

[ Edited ]

SAS noob here.

Say I have a list of variable names store in a data set NEWVARS under the single variable NEWVAR. I want to take OLDSET and cycle through these names to initialize a new variable of each name (stored as NEWSET). I'm still learning about macros, macro variables, the order that things happen during compilation, etc. Here's what I tried (that didn't work):

DATA _null_;
    SET NEWVARS;
    CALL SYMPUT ('var'||_n_, newvar);
    CALL SYMPUT ('numvars', _n_);
run;

%MACRO init(i);
    &&var&i = .;
%MEND

DATA NEWSET;
    SET OLDSET;
    do i = 1 to &numvars;
        %init(i)
    end;
run;

I've tried a few other ways to do this, but nothing's doing what I want. I could potentially generate a sting of all of the variables I'm tying to initialize; there must be a better way. Plus, I'm going to have a similar problem with the next step that I need to do. How can I do something like this?

PROC Star
Posts: 8,114

Re: Initializing a large list of variables?

Are they all character, all numeric, or a mixture of the two?

 

Art, CEO, AnalystFinder.com

 

Occasional Contributor
Posts: 12

Re: Initializing a large list of variables?

All of them are numeric in this case.

PROC Star
Posts: 8,114

Re: Initializing a large list of variables?

Actually, it doesn't matter if you just want to initialize all of the variables. e.g.:

data have;
  input (x y z) ($);
  cards;
1 2 3
4 5 6
;

data want;
  set have;
  call missing(of _all_);
run;

Art, CEO, AnalystFinder.com

 

Super User
Posts: 22,853

Re: Initializing a large list of variables?

Why do you need to initialize variables and what do you mean by that? 

SAS doesn't require variable initialization or declaration. I'm assuming this is the same person posting this question from SO. 

 

If you have the list of variables in a data set you can easily convert that into a macro variable list and use it in your array statements to initialize it, but there's really no need to do that except in a few situations. 

Occasional Contributor
Posts: 12

Re: Initializing a large list of variables?

Yes, same person.

 

The main reason I want to initialize my variables is so that they are in a specific order. I have population data with variables related to region, year (8 possibilities), age range (8 possibilities), and then 3 variables that I will keep (call them 'a', 'b', and 'c'). I need to make the data "wide" instead of "long". In my new data set, I will have one observation for each region, and then 8*8*3=192 variables for each year/age-range/"a,b,c" combination, so each observation from my original data set will ultimately supply 3 entries in my new data set.. (This is for my work, not cheating  for school or anything.)

Super User
Posts: 22,853

Re: Initializing a large list of variables?

If you're trying to control order, select the variables in the order you want from the SASHELP.VCOLUMN table and then use that within a RETAIN statement or SELECT statement after your transpose.

 

Generating the list of names in your desired order seems to be the issue then, which is very different.

Does this sound correct?

 

proc transpose data=have out=wide;

run;

proc sql;
create table want as
select ...list of variables in desired order
from wide;
quit;
Occasional Contributor
Posts: 12

Re: Initializing a large list of variables?

I could also potentially PROC SORT my original set in a way that as the variables get added for the first time they will be initialized in the correct order. I'm seeing that my true problem here is either not understanding enough about how macros/macrovariables "work", or just trying to implement this in a completely backwards way.

Super User
Super User
Posts: 7,858

Re: Initializing a large list of variables?


Joshua_Sanders wrote:

I could also potentially PROC SORT my original set in a way that as the variables get added for the first time they will be initialized in the correct order. I'm seeing that my true problem here is either not understanding enough about how macros/macrovariables "work", or just trying to implement this in a completely backwards way.

 


SAS will define the order of the variables based on the first time you reference them.  If the order of the variables in your data set is important to you then just make sure to reference them in the order you want them created.  This is good habit to get into anyway as waiting for SAS to guess what type variable you actually want based on how you first use it can lead to some really strange behavior.  For example if the first reference to a variable is in the condition of a IF statement like:

if x=' ' then x='Unknown';

You end up with X defined as length=$1 and so the value is 'U' instead of 'Unknown'.

 

Also don't get fooled into thinking that FORMAT or INFORMAT define the variables.  They can help get the type defined right, but they can get the length totally wrong.  What if you have variable gender with values 'M' and 'F' and a format $GENDER that displayed those as 'Male' and 'Female'.  If the first reference to the variable is a statement like:

format gender $gender. ;

SAS will define GENDER as having a length of $6 when really you only want it to have a length of $1.

 

If you want to setup a structure for your dataset that you reuse like @ballardw suggested then just do that.  So you might create this bit of code to define a dataset with no observations.

data template ;
   length id 8 age 8 gender $1 name $50 test1-test50 8 result1-result50 $20 ;
   stop;
run;

Make it as complex as you want. Just use the code as your template, much easier to read than a table.

 

Then if you want make a new version of this your data step might look like:

data new_standard_data ;
  if 0 then set template ;
  set new_raw_data;
  * code to populate the varaibles ;
run;

So then SAS will define all of the variables it sees in TEMPLATE even though it never reads any of the lines because the IF condition is always false.

Super User
Posts: 6,542

Re: Initializing a large list of variables?

[ Edited ]

As you can see, we are all getting a different picture of what needs to be done.  It would help if you were to post a small version of OLDSET, NEWSET, and the desired outcome.

 

At any rate, I lean toward Reeza's interpretation with this being one possibility:

 

proc sql;

select trim(newvar) into : varlist separated by ' ' from newvars;

quit;

 

data want;

retain &varlist  . ;

set oldest;

run;

Super User
Posts: 13,066

Re: Initializing a large list of variables?

You might describe the desired result. Your attempted code looks like you want to copy the number of records from a data set and set all values to missing. I have to seriously question the value of a data set with many records and all missing values.

 

Perhaps you wanted a data set with the same structure and properties but no values at all.

DATA NEWSET;
    SET OLDSET (obs=0);
run;

 

Will make set with the same properties as oldest but all values will be missing and no actual records.

PROC Star
Posts: 8,114

Re: Initializing a large list of variables?

While I agree with the other responders that you may not want to do what you ask, if you actually need the data in a wide format like that, a group of us (including @Astounding) happened to write a macro a couple of years ago that can do what I think you're asking for.

 

The macro can be downloaded at: http://www.sascommunity.org/wiki/A_Better_Way_to_Flip_(Transpose)_a_SAS_Dataset

 

If I correctly understand what you have and want, the following code would do it if you first downloaded and ran that macro:

data have;
  input region year age_range a b c;
  cards;
1 2010 1 1 2 3
2 2011 2 4 5 6
;

data need;
  set have;
  newvar=catt('Year_',year,'_age_',age_range);
run;

data newvars;
  informat newvar $15.;
  input newvar $;
  order=_n_;
  cards;
Year_2010_age_1
Year_2010_age_2
Year_2010_age_3
Year_2010_age_4
Year_2010_age_5
Year_2010_age_6
Year_2010_age_7
Year_2010_age_8
Year_2011_age_1
Year_2011_age_2
Year_2011_age_3
Year_2011_age_4
Year_2011_age_5
Year_2011_age_6
Year_2011_age_7
Year_2011_age_8
Year_2012_age_1
Year_2012_age_2
Year_2012_age_3
Year_2012_age_4
Year_2012_age_5
Year_2012_age_6
Year_2012_age_7
Year_2012_age_8
Year_2013_age_1
Year_2013_age_2
Year_2013_age_3
Year_2013_age_4
Year_2013_age_5
Year_2013_age_6
Year_2013_age_7
Year_2013_age_8
Year_2014_age_1
Year_2014_age_2
Year_2014_age_3
Year_2014_age_4
Year_2014_age_5
Year_2014_age_6
Year_2014_age_7
Year_2014_age_8
Year_2015_age_1
Year_2015_age_2
Year_2015_age_3
Year_2015_age_4
Year_2015_age_5
Year_2015_age_6
Year_2015_age_7
Year_2015_age_8
Year_2016_age_1
Year_2016_age_2
Year_2016_age_3
Year_2016_age_4
Year_2016_age_5
Year_2016_age_6
Year_2016_age_7
Year_2016_age_8
Year_2017_age_1
Year_2017_age_2
Year_2017_age_3
Year_2017_age_4
Year_2017_age_5
Year_2017_age_6
Year_2017_age_7
Year_2017_age_8
;

%transpose(data=need, out=want, by=region, id=newvar, var=a b c, 
           var_first=no,delimiter=_,preloadfmt=newvars)

Art, CEO, AnalystFinder.com

 

Ask a Question
Discussion stats
  • 11 replies
  • 119 views
  • 0 likes
  • 6 in conversation