Hi, I have a data set with a large amount of character variables; I want to convert them to numerical variables.
Can someone show me how to do this? I'm assuming you have to use arrays.
For ease of exposition suppose I have ten character variables.
Thanks!
This is clearly an example of bad data structure. Why do you have "a large amount of character variables" on which you need to do this processing? Why not have your data elements going down the page as observations? It makes almost all programming tasks so much easier:
data have;
  x="1"; y="2"; z="3";
run;
proc transpose data=have out=t_have;
  var x y z;
run;
/* First example: you need to know all the variables */
data want1 (drop=i);
  set have;
  array xyz{3} x y z;
  array xyz_num{3} 8;
  do i=1 to 3;
    xyz_num{i}=input(xyz{i},best.);
  end;
run;
/* Second example: you only need to know there are two columns */
data want2;
  set t_have;
  xyz_num=input(col1,best.);
run;
Remember that the data you work and program with does NOT need to be the same as what is in the output; make life simple for yourself.
If all character variables carry numeric values (or are empty), then you can use this as a blueprint:
data _null_;
set have;
array charvar {*} _character_;
call symput('numvars',strip(put(dim(charvar),best.)));
stop;
run;
data want;
set have;
array charvar {*} _character_;
array newvar {&numvars} newvar1-newvar&numvars;
do i = 1 to &numvars;
  newvar{i} = input(charvar{i},best.);
end;
drop i;
run;
Hi, thanks... I'm new to programming, so I'm struggling a bit to understand your code.
If you assume that the names of the character variables that I'm trying to convert are X, Y and Z, would you mind substituting those variable names in their respective places in the code you have posted? Just so that I can clearly understand what needs to be done. Thank you so much!
OK, let's dissect my code a little:
/* the first step gets the number of character variables present in the
dataset, so I can later define the array of new variables with the correct
number of members */
data _null_; * do not create an output dataset;
set have;
array charvar {*} _character_; * define an array that contains all character variables,
 no need to know their names;
call symput('numvars',strip(put(dim(charvar),best.))); * put the value into a macro variable;
stop; * end execution in the first iteration of the data step;
run;
data want;
set have;
array charvar {*} _character_; * see above;
array newvar {&numvars} newvar1-newvar&numvars;* here I have to define names for the new (numeric) variables,
for the size I use the macro variable created in the first step;
* also note that the default for a newly defined array is numeric;
do i = 1 to &numvars; * iterate through both arrays;
  newvar{i} = input(charvar{i},best.); * convert;
end;
drop i; * get rid of the index variable;
run;
Thanks a lot! Truly appreciate it.
If you do not have a naming pattern for your variables (that enables simple iteration with an index), it makes things more complicated. To fully automate the task, you will need to use the dataset metadata to create a list of variables to be converted.
Assume you have stored the library and dataset name in macro variables:
data _null_;
set sashelp.vcolumn (
  where=(libname = upcase("&mylibname") and memname = upcase("&mydataset") and type = 'char')
) end=done;
/* get data from the view in SASHELP that describes columns;
take only those from your dataset with type character;
LIBNAME and MEMNAME are stored in uppercase, hence the upcase() */
/* also define a variable that signals the end */
if _n_ = 1 then call execute("data &mylibname..&mydataset._num; set &mylibname..&mydataset;");
/* call execute pushes code into the execution queue to be executed
immediately after the current data step ends */
/* this one starts a data step */
call execute(trim(name)!!'_num = input('!!trim(name)!!',best.);');
/* do the conversions */
if done then call execute('run;');
/* finish the data step */
run;
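To make the generated code concrete, here is a sketch of what the CALL EXECUTE step above would queue, assuming (for illustration only) that mylibname is WORK, mydataset is HAVE, and the dataset holds the character variables x, y and z from the question:

```sas
/* Example input: three character variables (assumed names x, y, z) */
data work.have;
  x="1"; y="2"; z="3";
run;

/* Roughly the code that CALL EXECUTE would generate for this dataset:
   one data step with one conversion statement per character variable */
data work.have_num;
  set work.have;
  x_num = input(x,best.);
  y_num = input(y,best.);
  z_num = input(z,best.);
run;
```

The original character variables stay in the output alongside the new _num variables; add a DROP statement if you do not need them.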
Assuming you have 10 character variables (char1-char10), you could try the untested code below, which creates 10 numeric variables:
Data want;
set have;
array ch(10) char1-char10;
array nu(10) num1-num10;
do i = 1 to 10;
nu(i)=input(ch(i),best.);
end;
run;
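If some values might not be valid numbers, the same loop can be written with the ?? modifier of the INPUT function, which suppresses the invalid-data messages and leaves the result missing. This is only a sketch: the three-variable test dataset and its values are made up for illustration, and best32. is used so that longer strings are read in full.

```sas
/* Made-up test data: one valid number, one invalid value, one blank */
data have;
  length char1-char3 $8;
  char1 = "12"; char2 = "abc"; char3 = " ";
run;

data want;
  set have;
  array ch{3} char1-char3;
  array nu{3} num1-num3;
  do i = 1 to 3;
    /* ?? keeps the log clean and _ERROR_ at 0 when a value cannot be read */
    nu{i} = input(ch{i}, ?? best32.);
  end;
  drop i;
run;
```

Here num1 is 12, while num2 and num3 are missing. Without ??, the value "abc" would produce an invalid-data note in the log for that observation.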
Converting character data to numeric
HAVE
 Variables in Creation Order
#    Variable    Type    Len
1    X           Char      1
2    Y           Char      1
3    Z           Char      1
WANT
 Variables in Creation Order
#    Variable    Type    Len
1    X           Num       8
2    Y           Num       8
3    Z           Num       8
* create some data;
data have;
  x="1"; y="2"; z="3";
run;
* create the select clause to convert char to num;
proc sql;
  select
     catx(' ','input(',name,',best.) as',name) into :namlst separated by ','
  from
     sashelp.vcolumn
  where
          libname='WORK'
     and  memname='HAVE'
     and  upcase(type) eqt 'C'
;quit;
%put &=namlst;
/*
 input( X ,best.) as X
,input( Y ,best.) as Y
,input( Z ,best.) as Z
*/
* do the conversion;
proc sql;
  create
    table want as
  select
    &namlst
  from
    have
;quit;
/*
       X         Y         Z
----------------------------
       1         2         3
*/
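For comparison, the same WANT layout (numeric X, Y and Z under their original names) can also be sketched in a DATA step by renaming the character variables on input. The temporary names _x, _y and _z are arbitrary choices for this illustration:

```sas
data have;
  x="1"; y="2"; z="3";
run;

data want;
  /* rename the character versions out of the way as they are read */
  set have(rename=(x=_x y=_y z=_z));
  /* recreate the original names as numeric variables */
  x = input(_x, best32.);
  y = input(_y, best32.);
  z = input(_z, best32.);
  drop _x _y _z;
run;
```

This needs the variable names spelled out, so for many variables the generated SELECT clause above scales better.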
Thank you, I followed your advice and changed things at the start itself.
