12-18-2011 10:21 PM
I need to run the same program over many datasets which are similar but do not always use exactly the same variable names to collect the same data.
e.g. Age might be collected in all datasets but has a range of variable names: AGE, AGE1, AGE_NUM, CALCAG, AGNUM
Is there a program I can run to assist me in identifying variables collecting the same data and applying a generic variable name?
Interestingly, the label names are relatively consistent so maybe I can use these...?
PS. Clearly, I am a SAS novice!
12-18-2011 11:43 PM
One possibility depends on the names of the other variables in your various datasets. In your example, all of the age-like variables contain the string "AG". If that is the case, and none of your other variables contain the string "AG", then you could write a small proc sql call that builds a rename statement in a macro variable from the dictionary.columns view.
Let us know if the above scenario might be a solution and, if it is, then I or someone can show you how you could write such a routine.
12-18-2011 11:56 PM
I think I can definitely use something like this to rename some/most of the variables (as in the AGE example) and some guidance on the code would be very useful!
However, there are other variable names where there may not be a unique string. I guess for these, I'll just have to manually identify and rename the variables on a case-by-case basis...
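For variables without a shared name fragment, the labels might do the job, since they are reported to be fairly consistent. What follows is only a minimal sketch under stated assumptions: the lookup table label_map, the macro name std_rename, the sample dataset test1 and the label text 'Age' are all hypothetical and would need to be replaced with your own. It matches each variable's label against the lookup table and renames the variable to the standard name found there.

```sas
/* Hypothetical lookup table: one row per label and the standard name wanted. */
data label_map;
  length var_label $40 stdname $32;
  infile datalines dsd;
  input var_label $ stdname $;
  datalines;
Age,age
Sex,sex
;

/* Hypothetical input dataset with a non-standard variable name. */
data test1;
  input name $ calcag;
  label calcag = 'Age';
  datalines;
Ann 34
;

%macro std_rename(ds=, out=);
  %local renames;
  %let renames = ;

  /* Build "oldname=newname" pairs for every variable whose label is in the map. */
  proc sql noprint;
    select catx('=', c.name, m.stdname)
      into :renames separated by ' '
      from dictionary.columns as c
           inner join work.label_map as m
           on upcase(strip(c.label)) = upcase(strip(m.var_label))
      where c.libname = 'WORK'
        and c.memname = upcase("&ds.")
        and upcase(c.name) ne upcase(m.stdname);
  quit;

  /* Apply the rename option only when at least one pair was found. */
  data &out.;
    set &ds.
      %if %length(&renames.) > 0 %then %do;
        (rename=(&renames.))
      %end;
    ;
  run;
%mend std_rename;

%std_rename(ds=test1, out=test1_std);
```

The sketch assumes the datasets live in WORK and that no dataset already contains both the old and the standard name, since the rename would then fail. Labels that differ only in case or in leading and trailing blanks still match; anything looser would need manual entries in the lookup table.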
12-19-2011 08:43 AM
You could use something like:
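data test1; input name $ theage; cards;
Ann 34
;
data test2; input name $ myage; cards;
Bob 41
;
%let filename=test2; /* placeholder rows above; assumes one variable name contains "AG" */
proc sql noprint;
  select catt('rename=(', name, '=age)') into :renames from dictionary.columns
  where libname='WORK' and
        memname=upcase("&filename.") and index(upcase(name), 'AG') > 0;
quit;
data want;
  set &filename. (&renames.);
run;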
12-19-2011 05:00 PM
Thanks! I will implement this today and see how many variables I can rename.
12-19-2011 12:31 PM
Are the similar variables in the same order in the datasets that you have examined? If so, there may be some hope of using one base dataset and renaming the variables in the others based on their position.
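One way to check this is to list the variable positions side by side from dictionary.columns. A minimal sketch, assuming two datasets named TEST1 and TEST2 in the WORK library (hypothetical names and sample rows):

```sas
/* Hypothetical sample datasets; replace with your own. */
data test1;
  input name $ theage;
  datalines;
Ann 34
;

data test2;
  input myage name $;
  datalines;
41 Bob
;

/* VARNUM is each variable's position within its dataset. */
proc sql;
  select varnum, memname, name, label
    from dictionary.columns
    where libname = 'WORK'
      and memname in ('TEST1', 'TEST2')
    order by varnum, memname;
quit;
```

If the same kind of variable appears at the same VARNUM in every dataset, renaming by position is an option; if the positions differ, as in the sample above, it is not.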
12-19-2011 04:58 PM
Hi there. Unfortunately, the variable order is not always the same.