Hi, I'm trying to convert one of the numeric variables in a SAS dataset into a character variable, but the dataset is being wiped after I run this code. Why does this happen?
My dataset is a STATA export that I made a SAS dataset copy of according to example 2 from here https://stats.idre.ucla.edu/stata/faq/how-do-i-export-a-stata-dta-file-to-a-sas-xport-file/.
data datapath.exvar;
ETHNICIT = 1;
NEW = put(ETHNICIT,1.);
ETHNICIT = 2;
NEW = put(ETHNICIT,2.);
ETHNICIT = 3;
NEW = put(ETHNICIT,3.);
ETHNICIT = 4;
NEW = put(ETHNICIT,4.);
ETHNICIT = 5;
NEW = put(ETHNICIT,5.);
ETHNICIT = 6;
NEW = put(ETHNICIT,6.);
ETHNICIT = 7;
NEW = put(ETHNICIT,7.);
ETHNICIT = 8;
NEW = put(ETHNICIT,8.);
drop ETHNICIT;
rename NEW = ETHNICIT;
run;
Log:
NOTE: There were 23274 observations read from the data set XPTFILE.EXVAR.
NOTE: The data set DATAPATH.EXVAR has 23274 observations and 14 variables.
NOTE: PROCEDURE COPY used (Total process time):
real time 1.44 seconds
cpu time 0.17 seconds
4406 data datapath.exvar;
4407 ETHNICIT = 1;
4408 NEW = put(ETHNICIT,1.);
4409 ETHNICIT = 2;
4410 NEW = put(ETHNICIT,2.);
4411 ETHNICIT = 3;
4412 NEW = put(ETHNICIT,3.);
4413 ETHNICIT = 4;
4414 NEW = put(ETHNICIT,4.);
4415 ETHNICIT = 5;
4416 NEW = put(ETHNICIT,5.);
4417 ETHNICIT = 6;
4418 NEW = put(ETHNICIT,6.);
4419 ETHNICIT = 7;
4420 NEW = put(ETHNICIT,7.);
4421 ETHNICIT = 8;
4422 NEW = put(ETHNICIT,8.);
4423 drop ETHNICIT;
4424 rename NEW = ETHNICIT;
4425 run;
NOTE: The data set DATAPATH.EXVAR has 1 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.58 seconds
cpu time 0.01 seconds
So you have a dataset in your library XPTFILE, and you want to put it in your library DATAPATH and make some corrections.
data datapath.exvar; /* create new dataset */
set xptfile.exvar; /* read existing dataset */
NEW = put(ETHNICIT,1.); /* create new variable, with a length of 1 */
drop ETHNICIT; /* get rid of the old one */
rename NEW = ETHNICIT; /* rename the new one to the old one */
run;
This will implicitly loop over all observations in the source dataset; if you encounter numbers with more than one digit, increase the format in the put() function.
In light of your most recent post (where an external code you use already does the copying), change this to:
data datapath.exvar_corr; /* create new dataset */
set datapath.exvar; /* read existing dataset */
NEW = put(ETHNICIT,1.); /* create new variable, with a length of 1 */
drop ETHNICIT; /* get rid of the old one */
rename NEW = ETHNICIT; /* rename the new one to the old one */
run;
Note: you should never overwrite a data set in a single step (especially while developing); if something fails, you have to recreate your input dataset, which can be time consuming and annoying.
I'm trying to convert one of the numeric variables in a SAS dataset into a character variable, but the dataset is being wiped after I run this code.
Please explain this further ... what do you mean by "being wiped"? Why do you say that converting numeric to character is not working?
Also, your code and your LOG doesn't match, we need to see the actual code and actual log. It says
NOTE: There were 23274 observations read from the data set XPTFILE.EXVAR.
but this data set XPTFILE.EXVAR never appears in your code.
Hi, here's the complete code and log. What I meant by "wiped" was that the dataset goes from 23274 observations and 14 variables to 1 observation and 1 variable. To further clarify, I'm trying to convert all the observations of the ETHNICIT variable from numeric to character (ex. 1s to 1., 2s to 2., 3s to 3. etc).
%include "T:\...\exvar.sas";
libname in XPORT "T:\...\exvar.xpt";
proc contents data=in.exvar;
run;
data "T:\...\exvar";
set in.exvar;
run;
Log:
1 %include "T:\...\exvar.sas";
NOTE: Libref DATAPATH was successfully assigned as follows:
Engine: V9
Physical Name: T:\...\
NOTE: Libref XPTFILE was successfully assigned as follows:
Engine: XPORT
Physical Name: T:\...\exvar.xpt
NOTE: Writing HTML Body file: sashtml.htm
NOTE: Input library XPTFILE is sequential.
NOTE: Copying XPTFILE.EXVAR to DATAPATH.EXVAR (memtype=DATA).
NOTE: BUFSIZE is not cloned when copying across different engines.
System Option for BUFSIZE was used.
NOTE: There were 23274 observations read from the data set XPTFILE.EXVAR.
NOTE: The data set DATAPATH.EXVAR has 23274 observations and 14 variables.
NOTE: PROCEDURE COPY used (Total process time):
real time 1.94 seconds
cpu time 0.64 seconds
NOTE: Format GRADUATE has been output.
NOTE: Format ETHNICIT has been output.
NOTE: Format EMPLOYME has been output.
NOTE: Format RELATION has been output.
NOTE: Format HELP has been output.
NOTE: Format RNQ47A has been output.
NOTE: Format NQ has been output.
NOTE: Format NQA has been output.
NOTE: Format NQB has been output.
NOTE: PROCEDURE FORMAT used (Total process time):
real time 0.09 seconds
cpu time 0.07 seconds
57 libname in XPORT "T:\...\exvar.xpt";
NOTE: Libref IN was successfully assigned as follows:
Engine: XPORT
Physical Name: T:\...\exvar.xpt
58
59 proc contents data=in.exvar;
60 run;
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.08 seconds
cpu time 0.03 seconds
61
62 data "T:\...\exvar";
63 set in.exvar;
64 run;
NOTE: There were 23274 observations read from the data set IN.EXVAR.
NOTE: The data set T:\...\exvar has
23274 observations and 14 variables.
NOTE: DATA statement used (Total process time):
real time 0.55 seconds
cpu time 0.14 seconds
65 data datapath.exvar;
66 ETHNICIT = 1;
67 NEW = put(ETHNICIT,1.);
68 ETHNICIT = 2;
69 NEW = put(ETHNICIT,2.);
70 ETHNICIT = 3;
71 NEW = put(ETHNICIT,3.);
72 ETHNICIT = 4;
73 NEW = put(ETHNICIT,4.);
74 ETHNICIT = 5;
75 NEW = put(ETHNICIT,5.);
76 ETHNICIT = 6;
77 NEW = put(ETHNICIT,6.);
78 ETHNICIT = 7;
79 NEW = put(ETHNICIT,7.);
80 ETHNICIT = 8;
81 NEW = put(ETHNICIT,8.);
82 drop ETHNICIT;
83 rename NEW = ETHNICIT;
84 run;
NOTE: The data set DATAPATH.EXVAR has 1 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.54 seconds
cpu time 0.03 seconds
It's not clear to me why you want to convert numeric 1 to character 1, but the easiest solution is just to leave the values as numeric 1. Then no work is involved.
The reason you see only one record in your result was explained by @Kurt_Bremser
Well, there are SAS tools that automatically assume numbers to be continuous and characters to be categorical. I remember something like this from working with Enterprise Miner.
You do not read a dataset, so you will have only 1 iteration of the data step.
In this single iteration, you set two variables repeatedly, drp one of them, and rename the other.
Then, at the end of the single iteration, the data step will perform an implicit output, causing the final value to be written to the dataset.
Since your first occurrence of the variable new in
NEW = put(ETHNICIT,1.);
lets the data step compiler define the variable with a length of 1, all subsequent PUTs can only store the leftmost character (which is always a blank) in the variable, so you get nothing in the end.
Maxim 47: Set a Length:
data exvar;
ETHNICIT = 1;
length new $8;
NEW = put(ETHNICIT,1.);
ETHNICIT = 2;
NEW = put(ETHNICIT,2.);
ETHNICIT = 3;
NEW = put(ETHNICIT,3.);
ETHNICIT = 4;
NEW = put(ETHNICIT,4.);
ETHNICIT = 5;
NEW = put(ETHNICIT,5.);
ETHNICIT = 6;
NEW = put(ETHNICIT,6.);
ETHNICIT = 7;
NEW = put(ETHNICIT,7.);
ETHNICIT = 8;
NEW = put(ETHNICIT,8.);
drop ETHNICIT;
rename NEW = ETHNICIT;
run;
You can greatly simplify your code by using a loop:
data exvar;
length ethnicit 8 new $8;
do ethnicit = 1 to 8;
NEW = put(ETHNICIT,1.);
end;
drop ETHNICIT;
rename NEW = ETHNICIT;
run;
But if it is your intention to have 8 observations in your dataset, you need to add a OUTPUT:
data exvar;
length ethnicit 8 new $8;
do ethnicit = 1 to 8;
NEW = put(ETHNICIT,1.);
output;
end;
drop ETHNICIT;
rename NEW = ETHNICIT;
run;
Hi. Maybe I'm approaching this the wrong way. I'm trying to convert all the observations of ETHNICIT from numeric to character in the exvar dataset (ex. all 1s to 1., all 2s to 2., 3s to 3. etc). Is the simplified code the right way to do that?
Also, how would I read the dataset to do this then?
I appreciate the help!
So you have a dataset in your library XPTFILE, and you want to put it in your library DATAPATH and make some corrections.
data datapath.exvar; /* create new dataset */
set xptfile.exvar; /* read existing dataset */
NEW = put(ETHNICIT,1.); /* create new variable, with a length of 1 */
drop ETHNICIT; /* get rid of the old one */
rename NEW = ETHNICIT; /* rename the new one to the old one */
run;
This will implicitly loop over all observations in the source dataset; if you encounter numbers with more than one digit, increase the format in the put() function.
In light of your most recent post (where an external code you use already does the copying), change this to:
data datapath.exvar_corr; /* create new dataset */
set datapath.exvar; /* read existing dataset */
NEW = put(ETHNICIT,1.); /* create new variable, with a length of 1 */
drop ETHNICIT; /* get rid of the old one */
rename NEW = ETHNICIT; /* rename the new one to the old one */
run;
Note: you should never overwrite a data set in a single step (especially while developing); if something fails, you have to recreate your input dataset, which can be time consuming and annoying.
It worked! Thanks so much.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.