Hello:
I would like to do the equivalent of the following data step in PROC SQL. I am thinking of using the "update" statement with "case...end", but I don't know how to handle multiple variables.
data dataout.recat1;
set dataout.test;
if age in (1,2,3,4,5) then age=1;
if marital in (2, 4) then marital=2;
if income in (1,2,3) then income=1;
run;
This is another problem that is difficult to solve accurately without the source data available for testing. However, this code will run. Whether it does what you want is another matter:
proc sql;
create table dataout.recat1 as
select ifn(age in(1, 2, 3, 4, 5), 1, .) as age,
ifn(marital in(2, 4), 2, .) as marital,
ifn(income in(1, 2, 3), 1, .) as income
from dataout.test;
quit;
The first ifn function tests whether age is one of 1 through 5; if it is, age is reset to 1, and if not, it is set to missing. The next two lines follow the same pattern. I use ifn or ifc when there are only two possibilities; if there are more, I use case/when/else.
If you want other variables in the output dataset, you need to specify them.
Note that you would only use update if you were modifying the existing dataset in place; your data step code writes to a different dataset.
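For reference, here is a minimal, untested sketch of what the update approach might look like with several variables, assuming the dataset and variable names from the question. Because update changes a table in place, the sketch first copies dataout.test to dataout.recat1 and then updates the copy. The else branches, which keep the existing value, are an assumption about the intended behavior.

```sas
proc sql;
  /* Copy the source table so that dataout.test is left untouched */
  create table dataout.recat1 as
    select * from dataout.test;

  /* One update statement can set several columns, separated by commas. */
  /* Assumption: values outside each list should keep their current value. */
  update dataout.recat1
    set age     = case when age in (1, 2, 3, 4, 5) then 1 else age end,
        marital = case when marital in (2, 4) then 2 else marital end,
        income  = case when income in (1, 2, 3) then 1 else income end;
quit;
```

Since the copy uses select *, any other variables in dataout.test carry over without being listed.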
I don't want to set the other values to missing ("."), because the original data already contain missing values. That would cause confusion.
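One possible way around this, sketched here without having been tested against the actual data, is to return the original value instead of missing when the condition is false. That appears to match the data step in the question, which leaves values outside each list unchanged. The dataset and variable names are taken from the question.

```sas
proc sql;
  create table dataout.recat1 as
    select
      /* Recode the listed values; otherwise keep the original value */
      /* (rows that are already missing simply stay missing)         */
      case when age in (1, 2, 3, 4, 5) then 1 else age end as age,
      case when marital in (2, 4) then 2 else marital end as marital,
      case when income in (1, 2, 3) then 1 else income end as income
    from dataout.test;
quit;
```

The same idea should work with ifn by passing the variable itself as the third argument, for example ifn(age in(1, 2, 3, 4, 5), 1, age). As written, only these three variables end up in the output table; any others would need to be added to the select list.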