aaou
Obsidian | Level 7

Hi, I have a data set with a large amount of character variables; I want to convert them to numerical variables.

 

Can someone show me how to do this? I'm assuming you have to use arrays.

 

For ease of exposition suppose I have ten character variables.

 

Thanks!

 

 

Accepted Solution
RW9
Diamond | Level 26

This is clearly an example of bad data structure.  Why do you have "a large amount of character variables" on which you need to do this processing?  Why not have your data elements going down the page as observations? It makes almost all programming tasks so much easier:

data have;
  x="1"; y="2"; z="3";
run;

proc transpose data=have out=t_have;
  var x y z;
run;

/* First example: you need to know all the variables */
data want1 (drop=i);
  set have;
  array xyz{3} x y z;
  array xyz_num{3} 8;
  do i=1 to 3;
    xyz_num{i}=input(xyz{i},best.);
  end;
run;

/* Second example: you only need to know that there are two columns */
data want2;
  set t_have;
  xyz_num=input(col1,best.);
run;

Remember, the data you work with and program with does NOT need to be the same as what is in the output - make life simple for yourself.

 

 


Kurt_Bremser
Super User

If all character variables carry numeric values (or are empty), then you can use this as a blueprint:

data _null_;
set have;
array charvar {*} _character_;
call symput('numvars',strip(put(dim(charvar),best.)));
stop;
run;

data want;
set have;
array charvar {*} _character_;
array newvar {&numvars} newvar1-newvar&numvars;
do i = 1 to &numvars;
  newvar{i} = input(charvar{i},best.);
end;
drop i;
run;
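
The blueprint above relies on every character value being numeric or empty. A minimal sketch for checking that assumption first might look like the following. The dataset name check and the variables varname and value are hypothetical names chosen for illustration, the $200 length is an arbitrary assumption, and the step assumes HAVE does not already contain variables named i, varname or value. The ?? modifier on INPUT suppresses the invalid-data messages in the log, so a failed conversion simply returns a missing value.

```sas
/* List every non-empty character value that cannot be read as a number */
data check;
  set have;
  array charvar {*} _character_;   /* all character variables read from HAVE */
  length varname $32 value $200;   /* declared after the array, so not members of it */
  do i = 1 to dim(charvar);
    /* ?? suppresses the log messages and the _ERROR_ flag for invalid values */
    if not missing(charvar{i}) and missing(input(charvar{i}, ?? best32.)) then do;
      varname = vname(charvar{i}); /* name of the variable behind this array element */
      value   = charvar{i};
      output;
    end;
  end;
  keep varname value;
run;
```

If check comes out empty, the conversion should be safe. Note that a value consisting of a single period would also be listed, because it reads as a numeric missing value.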
aaou
Obsidian | Level 7

Hi thanks...I'm new to programming, so struggling a bit in understanding your code.

If you assume that the names of the character variables that I'm trying to convert are X, Y and Z... would you mind substituting those variable names in their respective places in the code you have posted? Just so that I can clearly understand what needs to be done. Thank you so much!

Kurt_Bremser
Super User

OK, let's dissect my code a little:

/* the first step gets the number of character variables present in the
dataset, so I can later define the array of new variables with the correct
number of members */

data _null_; * do not create an output dataset;
set have;
array charvar {*} _character_; * define an array that contains all character variables;
* no need to know their names;
call symput('numvars',strip(put(dim(charvar),best.))); * put the value into a macro variable;
stop; * end execution in the first iteration of the data step;
run;

data want;
set have;
array charvar {*} _character_; * see above;
array newvar {&numvars} newvar1-newvar&numvars; * here I have to define names for the new (numeric) variables;
* for the size I use the macro variable created in the first step;
* also note that the default for a newly defined array is numeric;
do i = 1 to &numvars; * iterate through both arrays;
  newvar{i} = input(charvar{i},best.); * convert;
end;
drop i; * get rid of the index variable;
run;
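
For the specific case of three character variables named X, Y and Z, a hedged sketch that keeps the original names could look like this. The _c suffix and the array names cvar and nvar are hypothetical choices for illustration, and the best32. width is an assumption made so that longer digit strings are not truncated.

```sas
/* Sketch: convert X, Y and Z in place, keeping the original names */
data want;
  set have (rename=(x=x_c y=y_c z=z_c));  /* character originals get temporary names */
  array cvar{3} x_c y_c z_c;              /* the existing character variables */
  array nvar{3} x y z;                    /* new numeric variables with the old names */
  do i = 1 to 3;
    nvar{i} = input(cvar{i}, best32.);    /* convert */
  end;
  drop i x_c y_c z_c;                     /* remove the index and the character copies */
run;
```

The rename is needed because a variable cannot change type within a step; the numeric X has to be a new variable.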
aaou
Obsidian | Level 7

Thanks a lot! Truly appreciate it.

Kurt_Bremser
Super User

@aaou wrote:

Hi thanks...I'm new to programming, so struggling a bit in understanding your code.

If you assume that the names of the character variables that I'm trying to convert are X, Y and Z... would you mind substituting those variable names in their respective places in the code you have posted? Just so that I can clearly understand what needs to be done. Thank you so much!


If you do not have a naming pattern for your variables (that enables simple iteration with an index), it makes things more complicated. To fully automate the task, you will need to use the dataset metadata to create a list of variables to be converted.

Assume you have stored the library and dataset name in macro variables:

data _null_;
set sashelp.vcolumn (
  where=(libname = upcase("&mylibname") and memname = upcase("&mydataset") and type = 'char')
) end=done;
/* get data from the view in SASHELP that describes columns;
take only those from your dataset with type character */
/* also define a variable that signals the end */
if _n_ = 1 then call execute("data &mylibname..&mydataset._num; set &mylibname..&mydataset;");
/* call execute pushes code into the execution queue to be executed
immediately after the current data step ends */
/* this one starts a data step */
call execute(trim(name)!!'_num = input('!!trim(name)!!',best.);');
/* do the conversions */
if done then call execute('run;');
/* finish the data step */
run;
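
As a usage sketch, the two macro variables could be set like this before running the step above. The values are an assumption for illustration and point at the WORK.HAVE dataset used elsewhere in this thread. SASHELP.VCOLUMN stores LIBNAME and MEMNAME in uppercase, so giving the values in uppercase is the safe choice.

```sas
/* Assumed example values: the dataset WORK.HAVE */
%let mylibname = WORK;
%let mydataset = HAVE;
```

With these values, the generated data step would create WORK.HAVE_NUM, containing the original variables plus a numeric variable named with a _num suffix for each character variable.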

 
Jagadishkatam
Amethyst | Level 16

Assume that you have 10 character variables (char1-char10); then you could try the untested code below, which creates 10 numeric variables:

 

Data want;
set have;
array ch(10) char1-char10;
array nu(10) num1-num10;
do i = 1 to 10;
nu(i)=input(ch(i),best.);
end;
run;
Thanks,
Jag
rogerjdeangelis
Barite | Level 11
Converting character data to numeric

HAVE

 Variables in Creation Order

#    Variable    Type    Len

1    X           Char      1
2    Y           Char      1
3    Z           Char      1

WANT

 Variables in Creation Order

#    Variable    Type    Len

1    X           Num       8
2    Y           Num       8
3    Z           Num       8


* create some data;
data have;
  x="1"; y="2"; z="3";
run;

* create the select clause to convert char to num;
proc sql;
  select
     catx(' ','input(',name,',best.) as',name) into :namlst separated by ','
  from
     sashelp.vcolumn
  where
          libname='WORK'
     and  memname='HAVE'
     and  upcase(type) eqt 'C'
;quit;

%put &=namlst;
/*
 input( X ,best.) as X
,input( Y ,best.) as Y
,input( Z ,best.) as Z
*/

* do the conversion;
proc sql;
  create
    table want as
  select
    &namlst
  from
    have;
;quit;

/*
       X         Y         Z
----------------------------
       1         2         3
*/


aaou
Obsidian | Level 7

Thank you, I followed your advice and changed things at the start itself.
