BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Sas_R
Calcite | Level 5

Hello,

I have several data sets that have more than 7000 variables in each and I need to convert Variables with prefix XXXX  from character to numeric type.

What could be the optimal solution for this process?

Thanks in advance for your help.

1 ACCEPTED SOLUTION

Accepted Solutions
Sas_R
Calcite | Level 5

Thank you for the feedback. Here is more details on what I'm trying to accomplish;

Originally, I tried to apply below macro which is automaticallt converts char to numeric data type with original names and labels which I need it for the dataset.

http://www.nesug.org/Proceedings/nesug10/ff/ff01.pdf

However since I have too many variables ( around 4000 out of 7000 with Prefix XXXX) , this macro fails because of limit on variable counts.

I tried to modify the macro with variable creation number like first 1 to 500 and then 501 to 1000 and so on, but this is making process much more complex.

I was wondering if there is an optimal way of converting these variables from char to numeric with keeping their original name and labels ?

View solution in original post

6 REPLIES 6
snoopy369
Barite | Level 11

With that many, it may be easiest to write out to a (temporary) text file and read back in, if there are naming conveniences (like VAR1-VAR1000 or something, name-wise).  You don't provide much detail so it's hard to say exactly but that's often easier than setting up the renames.

Jagadishkatam
Amethyst | Level 16

You can try arrays to convert those many variables from character to numeric. Please check the sample code considering the variables with names var1 - var1000.

data want;

     set have;

     array chars(*) $ var1-var1000;

     array nums(*) num1-num1000;

     do i = 1 to dim(chars);

     nums(i)=input(chars(i),best.);

     end;

     drop var:;

run;

Thanks,

Jagadish

Thanks,
Jag
Astounding
PROC Star

SAS doesn't allow this.  Once a variable is character, it remains character forever.  You will need to decide upon new names for the numeric versions.  It could be as simple as using YYYY instead of XXXX as the prefix, but that has to be your first step.  The second step would be to verify that the variables you wish to convert are short enough.  SAS has a limit of roughly 15 to 16 significant digits for accurate storage of integer values.

After the conversion (and as part of the same DATA step), it would be possible to drop the XXXX variables and rename the YYYY variables to the old XXXX names.   So the net effect would be that the original names become numeric instead of character.  All of this can be automated using macro language, but the first couple of steps are up to you.

Good luck.

Sas_R
Calcite | Level 5

Thank you for the feedback. Here is more details on what I'm trying to accomplish;

Originally, I tried to apply below macro which is automaticallt converts char to numeric data type with original names and labels which I need it for the dataset.

http://www.nesug.org/Proceedings/nesug10/ff/ff01.pdf

However since I have too many variables ( around 4000 out of 7000 with Prefix XXXX) , this macro fails because of limit on variable counts.

I tried to modify the macro with variable creation number like first 1 to 500 and then 501 to 1000 and so on, but this is making process much more complex.

I was wondering if there is an optimal way of converting these variables from char to numeric with keeping their original name and labels ?

snoopy369
Barite | Level 11

Just to be clear, it is not possible to _directly_ change a variable's type.  You must go through some intermediary, whether it be a differently named variable or a text file, to convert.

The general approach would be:

1.  Actually convert them.

2.  Obtain the old labels/names and order.

3.  Apply the labels/names and order.

Step 1 can be done a number of different ways; something like

data want;

set have;

newvar1 = input(oldvar1,best12.);

rename newvar1=oldvar1;

drop oldvar1;

run;

accomplishes this.  You could macro-ize this:

%macro convert(var);

&var._new = input(&var.,best12.);

rename &var._new = &var.;

drop &var.;

%mend convert;

Then you could include some method of fixing the labels.  Here's an example.  This may not work perfectly because you may have to break up the two long macro variables (&namelist and &convertlist) if they are more than 20k characters long - but you can use varnum to filter that easily like you suggest (do first 500 then next 500 etc.)  Important is that &namelist contains ALL variables, and &convertlist contains the ones you are converting.

data class;

set sashelp.class;

newage=put(age,3.);

newheight=put(height,3.);

newweight=put(weight,3.);

rename

newage=age

newheight=height

newweight=weight

;

label

newage='Age of Person'

newheight='Height of Person'

newweight='Weight of Person'

;

drop age height weight;

run;

%macro convert(var);

&var._new = input(&var.,best12.);

rename &var._new = &var.;

drop &var.;

callstr=cats("label &var.='",vlabel(&var.),"';");

if _n_ = 1 then call execute(callstr);

%mend convert;

proc sql;

select name into :namelist separated by ' '

from dictionary.columns

where libname='WORK' and memname='CLASS'

order by varnum;

select cats('%convert(',name,')') into :convertlist separated by ' '

from dictionary.columns

where libname='WORK' and memname='CLASS'

and upcase(name) in ('AGE','HEIGHT','WEIGHT');

quit;

data class_n;

set class end=end;

if _n_ = 1 then do;

  call execute ("data class_n_f; retain &namelist.; set class_n;");

end;

&convertlist;

if end then do;

  call execute("run;");

end;

run;

Astounding
PROC Star

You can get there, although there are several pitfalls.  You noticed the first one, the number of variables, since a single macro variable may not be long enough to hold the names of all the XXXX variables.  Another pitfall is that some of the names may be long.  This particular macro depends on being able to create a new variable named NEW_oldname which will not be possible with longer variable names.  Here is one approach:

filename myprog 'path to a file that can hold a SAS program';

proc contents data=have noprint out=xxxx_list (keep=name where=(upcase(name)=:'XXXX'));

run;

data _null_;

   file myprog noprint;

      set xxxx_list end=done;

      newname = 'v' || strip(put(_n_,4.));

      put newname '= input(' name ', 16.);';

      put 'drop ' name ';' ;

      put 'rename ' newname '=' name ';';

run;

data want;

set have;

%include myprog;

run;

The code is untested, so may need a small amount of debugging.  You may want to test it on a smaller data set that contains only a few XXXX variables.

I chose FILE/PUT rather than CALL EXECUTE because I thought it would be easier to visualize the code that you are generating.

Good luck.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 3867 views
  • 0 likes
  • 4 in conversation