Hi all,
I have worked more with SPSS than SAS. In SPSS, I know we can recode variables easily using the following commands:
recode x (1=2) (else=copy). : to recode all the 1's as 2's in the same variable x
OR
recode y (9=sysmis). : to make all the 9's in y missing.
I don't need to create a new variable in SPSS to do so; I can simply change some of the values in the original variable.
How can the same be done in SAS? I tried using PROC FORMAT, but I think it only lets me assign value labels to the values (1=Yes, 2=No).
Please let me know if anyone has the solution. Thanks!
if x=1 then x=2;
if y=9 then y=.;
One word of caution, and a slightly expanded version of the suggestion made by Douglas: combine multiple recode statements with ELSE IF rather than writing a series of independent IF statements. An SPSS RECODE applies only the first matching rule to each value. In SAS, every IF statement in a plain series is evaluated in turn, so a value that has already been recoded can be recoded again by a later statement.
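To make the difference concrete, here is a small sketch. The dataset name demo and the variables x_if and x_else are made up for illustration; they are not from the original question.

```sas
data demo;
  input x;
  x_if   = x;   /* recoded with independent IF statements */
  x_else = x;   /* recoded with an IF / ELSE IF chain      */

  /* Independent IFs: a 1 becomes 2, and then that 2 becomes 3 */
  if x_if = 1 then x_if = 2;
  if x_if = 2 then x_if = 3;

  /* ELSE IF chain: only the first matching branch runs, so a 1 stays 2 */
  if x_else = 1 then x_else = 2;
  else if x_else = 2 then x_else = 3;
  datalines;
1
2
3
;
run;

proc print data=demo;
run;
```

For the row where x is 1, x_if ends up as 3 while x_else ends up as 2, which is the behaviour an SPSS user would expect from recode.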
Arrays are a nice feature of SAS that make recoding a series of similar variables easier.
For example, consider the SPSS convention of using a valid numeric value (like 9 or 999) to represent a missing value. You might have many variables that follow that convention and need to be set to a missing or special missing value in SAS.
data fix ;
  set original ;
  * Set the Not Available code of 9 to the special missing value .N ;
  array miss9 sex race ...... ;
  do over miss9 ;
    if miss9 = 9 then miss9=.N ;
  end;
  * Set the Not Available code of 999 to the special missing value .N ;
  array miss999 q5 q11-q22 ...... ;
  do over miss999;
    if miss999 = 999 then miss999=.N ;
  end;
run;
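DO OVER works with implicit arrays, which is an older style of SAS syntax. In case it helps, here is a rough equivalent using explicitly subscripted arrays. The tiny input dataset and the short variable lists (sex, race, q5) are assumptions for the sake of a runnable example; substitute your own lists.

```sas
/* Hypothetical sample data, only here so the step can run */
data original;
  input sex race q5;
  datalines;
1 9 999
9 2 3
;
run;

data fix;
  set original;

  /* Code 9 means Not Available: set it to the special missing value .N */
  array miss9{*} sex race;
  do i = 1 to dim(miss9);
    if miss9{i} = 9 then miss9{i} = .N;
  end;

  /* Code 999 means Not Available: set it to .N as well */
  array miss999{*} q5;
  do i = 1 to dim(miss999);
    if miss999{i} = 999 then miss999{i} = .N;
  end;

  drop i;   /* the loop counter is not needed in the output */
run;
```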
Generally, I prefer to use proc format rather than creating new variables. This makes your code more flexible and you do not complicate the datasets with extra columns.
Many procedures (both Base and STAT) can work from formatted values rather than the raw values. For example, PROC MEANS forms the levels of class variables from their formatted values, and its ORDER=DATA|FORMATTED|FREQ|UNFORMATTED option controls the order in which those levels appear; ORDER=FORMATTED sorts the levels by formatted value. Of course the class variables must be associated with the appropriate format, so you may need a FORMAT statement in the proc step, unless the format association is stored in the data set.
If you need the recoded values for filtering purposes, you can use the PUT function with a custom format. E.g., assuming a format Xfmt. is created to do your recoding of x, use where put(x,Xfmt.) = '2' . Note that PUT returns a character value, so the comparison value has to be a quoted string.
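Here is a minimal sketch of that approach. The format name agegrp, the dataset people, and its values are all invented for illustration.

```sas
/* A format that collapses a numeric variable into two groups */
proc format;
  value agegrp low -< 18 = 'Child'
               18 - high = 'Adult';
run;

/* Hypothetical sample data */
data people;
  input name $ age;
  datalines;
Ann 12
Bob 34
Cid 17
Dee 51
;
run;

/* The procedure groups by the formatted value; the stored data is unchanged */
proc freq data=people;
  tables age;
  format age agegrp.;
run;

/* Filter on the formatted value; PUT returns character, so compare to a string */
proc print data=people;
  where put(age, agegrp.) = 'Adult';
run;
```

The raw ages stay in the dataset, so a different grouping only needs a different format, not another pass through the data.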
And of course proc format can be used for enhancing output -- the labelling function you mention.
Hope this helps.
BTW, it is my understanding that PROC Format was added back in the Dark Ages (70s) in response to SPSS recode functionality:-)
If you are really sure that you want to change the variable, the SELECT statement is also useful, especially with larger lists. It is basically shorthand for multiple IF-THEN/ELSE statements.
Select (x);
when (1) x=0;
when (2) x=3;
when (3) x=2;
when (4) x= 157;
otherwise ; /* do nothing, x isn't changed */
end;
The example above, while likely to be nonsensical, is like recode x (1=0) (2=3) (3=2) (4=157) (else=copy). (if I remember my SPSS).
Remember to write to a NEW output dataset rather than overwriting the original, or else you will eventually be sorry.
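Putting those two points together, a complete step might look something like this. The dataset names original and recoded and the sample values are placeholders.

```sas
/* Hypothetical input data */
data original;
  input x;
  datalines;
1
2
3
4
5
;
run;

/* Write to a new dataset so ORIGINAL is left untouched */
data recoded;
  set original;
  select (x);
    when (1) x = 0;
    when (2) x = 3;
    when (3) x = 2;
    when (4) x = 157;
    otherwise;        /* any other value is left as it is */
  end;
run;
```

Because only one WHEN branch runs per observation, swapping 2 and 3 is safe here; a 2 that becomes 3 is not turned back into 2.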
Actually, you can also use PROC FORMAT to rewrite the values of a variable, like in SPSS.
proc format;
value $rf(default=40)
       low-'1'="9.999"
       '1'<-high="9.99 ~{super b}";
run;
data one;
input (name  age ratio) (:$40.);
ratio=put(ratio,$rf.) ;
cards;
Mathew  14  0.75
Mark 15 0.975
Luck  37 12.59
John 48   2.75
Paul 45  0.735
;
run;
Ksharp
