10-29-2014 03:17 PM
I am working with a SAS data set that has say string variable SEX that has values "01" (yes, a string) with a label "Female" and value "02" (yes, a string) with a label "Male". How do I convert that into a labeled numeric variable? How can I do this in a loop over all variables, detecting their type, or at least through a loop of the known list of variables that need to be converted?
10-29-2014 03:32 PM
We may need to standardize your description before going into code details. In SAS a Label is attached to a variable not a specific value. By any chance do mean that there is a FORMAT associated with the variable SEX that displays "Female" and "Male" or is that a behavior you would like to have?
If you are talking about formats when you say "labeled numeric" do you have the code that created the formats?
Are all of the variables you are interested in converting all integers? Do you have any of these numeric that take more than 12 characters to display?
One approach might be export to csv and then reimport into a different data set but you'll need a way to assign the formats.
10-29-2014 03:53 PM
Thanks, @ballardw -- I am coming from Stata background where the titles and concepts are different.
(1) Yes, there's a format attached to the variable SEX, so that's the existing behavior.
(2) I don't have the code that created the formats; this is a downloaded data set as is.
(3) I believe all of the variables are integers, and while some of them may have say 5 digits (zip codes), I don't think anything has more than 7.
Exporting to .csv is a painful, but probably workable solution.
10-29-2014 05:48 PM
If the data set is displaying the formatted values correctly then in at least one library with those formats. And can be recovered using the steps below. If not you'll need to write Proc Format code to create the formats.
Assuming the format is named $SEX then you can determine which library with:
Once you know the library and name of the catalog you can get a dataset with the characteristics needed to modify and make your new formats with
Proc Format cntlout=MyFormats /*or any SAS dataset name you choose*/
library=lib ; /* this is the library reported in the previous code, NOT in quoutes*/
Modify the Myformats dataset to replace C with N in the variable TYPE for those formats associated with the variables you just made numeric. You can then use the resulting data set with the cntlin option to build new formats.
proc format library=lib cntlin=MyModifiedformatdataset;run;
This will create formats with the same name but would reference Sex. instead of $Sex.
If we get through this part we can look at a number of ways to associate the formats depending on how many variables are involved.
3)The export/import is pretty easy:
proc export data=yourSASdatasetnamegoeshere
outfile = "C:\yourpath\temp.csv"
proc import datafile="C:\yourpath\temp.csv"
This will create a new data set with variables with the same names but the codes should be numeric. Note: you will lose leading zeroes.