SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Most efficient way to recode large number of variables

Reply
Occasional Contributor
Posts: 15

Most efficient way to recode large number of variables

I have a data set with over 400 variables that I need to recode.  The data looks something like this:

 

Var_A = values from 1-44

Var_B = Values from 45-98

etc.

 

I need to create a new variable by adding a specific value to each old variable, for example New_Var_A = Var_A +44.  Rather than creating over 400 if then statements, is there a more efficient way to recode these new variables with the new values?  I attempted to write a code like this:

 

data new;

set old;

New_Var_A = Var_A +44;

New_Var_B = Var_B + 98;

run;

 

but the new variables are not addiing up to the proper  values.  Any suggestions are greatly appreciated!

Super User
Super User
Posts: 7,039

Re: Most efficient way to recode large number of variables

You can use an ARRAY to make the coding easier, but the end result will be the same.  So I am not sure what your issue is with the values not being created properly.  Just create arrays for the variable pairs with 44 different and separate set of arrays for the 98 different pairs.

data want ;
  set have ;
  array old1 VARA VARC ... ;
  array new1 new_VARA new_VARC ... ;
  array old2 VARB VARD ... ;
  array new2 new_VARB new_VARD ... ;
  do i=1 to dim(old1) ;
      new1(i) = old1(i) + 44 ;
  end;
  do i=1 to dim(old2);
      new2(i) = old2(i) + 98 ;
  end;
run;
Super User
Posts: 5,426

Re: Most efficient way to recode large number of variables

Your problem is mainly die to that fact that you have 400 variables. That's not a convenient way to store, and maintain data.

Transpose it to Long, and your data management will be much easier.

Data never sleeps
Super User
Posts: 11,343

Re: Most efficient way to recode large number of variables


simkinm2 wrote:

I have a data set with over 400 variables that I need to recode.  The data looks something like this:

 

Var_A = values from 1-44

Var_B = Values from 45-98

etc.

 

I need to create a new variable by adding a specific value to each old variable, for example New_Var_A = Var_A +44.  Rather than creating over 400 if then statements, is there a more efficient way to recode these new variables with the new values?  I attempted to write a code like this:

 

data new;

set old;

New_Var_A = Var_A +44;

New_Var_B = Var_B + 98;

run;

 

but the new variables are not addiing up to the proper  values.  Any suggestions are greatly appreciated!


What are the values of the variables in the Old data set? What are the actual values in the New data set. It would help to provide an example Old data set, in the form of data step code; Your full code for those variables generating values "not addiing up to the proper  values"; and the actual desired output. A sample of 5 variable and five or ten rows of data should be sufficient.

 

BTW if the values of your shown VarA and VarB actually are that way and you request to increment by 44 makes be think that you may inserting a value at the beginning of some sort of list and just shifing a bunch of possitions into the next one. If so, the whole process maybe requires some consideration.

Occasional Contributor
Posts: 15

Re: Most efficient way to recode large number of variables

Thanks for all your responses.  I realized that the issue was actually in vlookup table I had imported, which had some incorrect values.. My first step of adding a value to the old variable worked after all.

 

Thanks!

 

Ask a Question
Discussion stats
  • 4 replies
  • 336 views
  • 2 likes
  • 4 in conversation