Hello:
I would like to rewrite the following data step in PROC SQL. I am thinking of using an "update" statement with "case...end", but I don't know how to apply it to multiple variables.
data dataout.recat1;
set dataout.test;
if age in (1,2,3,4,5) then age=1;
if marital in (2, 4) then marital=2;
if income in (1,2,3) then income=1;
run;
This is another problem that is difficult to solve accurately without the source data available for testing. However, this code will work. Whether it's what you want is another matter:
proc sql;
create table dataout.recat1 as
select ifn(age in(1, 2, 3, 4, 5), 1, .) as age,
ifn(marital in(2, 4), 2, .) as marital,
ifn(income in(1, 2, 3), 1, .) as income
from dataout.test;
quit;
The first ifn function tests whether age is 1 through 5; if it is, age is reset to 1, and if not, it is set to missing. The next two lines follow the same pattern. Note that this differs from your data step, which leaves non-matching values unchanged. I use ifn or ifc when there are only two possible outcomes; if there are more, I use case/when/else.
If you want other variables in the output dataset, you need to specify them.
Note that you would only use update if you were modifying the dataset in place; your data step code writes to a different dataset.
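If you do want the update route, here is a minimal sketch, assuming it is acceptable to copy dataout.test into dataout.recat1 first and then recode the copy in place. A single update statement can set several columns at once, each with its own case expression, and the else branch keeps the existing value (including missing) when no recode applies. I have not run this against your data, so treat it as a starting point:

```sas
proc sql;
  /* Copy the source so dataout.test itself is left untouched. */
  create table dataout.recat1 as
    select * from dataout.test;

  /* One update statement, one case expression per column.       */
  /* The else branch returns the current value, so rows that do  */
  /* not match (including missing values) are left as they are.  */
  update dataout.recat1
    set age     = case when age in (1, 2, 3, 4, 5) then 1 else age end,
        marital = case when marital in (2, 4) then 2 else marital end,
        income  = case when income in (1, 2, 3) then 1 else income end;
quit;
```

Because the copy uses select *, every other variable in dataout.test is carried along without being listed.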
I don't want to set values outside the recoded ranges to missing ("."), because the original data already contain missing values. That would cause confusion.
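One way to avoid creating extra missing values, sketched here without having been tested on the real data, is to pass the original variable as the third argument of ifn instead of a period. Values that are not recoded then come through unchanged, which matches what the data step does. Only the three recoded variables are selected here; any other variables would have to be added to the select list:

```sas
proc sql;
  create table dataout.recat1 as
    select
      /* Third argument is the original value, not missing, so */
      /* anything outside the listed codes is kept as is.      */
      ifn(age in (1, 2, 3, 4, 5), 1, age)   as age,
      ifn(marital in (2, 4), 2, marital)    as marital,
      ifn(income in (1, 2, 3), 1, income)   as income
    from dataout.test;
quit;
```

A value that was already missing in dataout.test fails the in test and is returned as is, so it stays missing, but no new missing values are introduced.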