02-08-2017 09:38 PM
I would like to use PROC SQL to reproduce the following DATA step logic. I am thinking of using an "update" statement with "case...end", but I don't know how to handle multiple variables.
if age in (1,2,3,4,5) then age=1;
if marital in (2, 4) then marital=2;
if income in (1,2,3) then income=1;
02-08-2017 10:23 PM - edited 02-08-2017 10:31 PM
This is another problem that is difficult to solve accurately without the source data available for testing. However, this code will work; whether it's what you want is another matter:
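proc sql;
create table dataout.recat1 as
select ifn(age in (1, 2, 3, 4, 5), 1, .) as age,
ifn(marital in (2, 4), 2, .) as marital,
ifn(income in (1, 2, 3), 1, .) as income
from datain.have; /* placeholder name: the source dataset was not given in the post */
quit;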
The first ifn function tests age to see if it's 1 - 5; if it is, age is reset to 1, but if not it's set to missing. The same pattern is in the next two lines. I use ifn or ifc if there are only going to be two possibilities; if there are more I use case/when/else.
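For comparison, here is a minimal sketch of the case/when/else form mentioned above. It assumes you want values outside the listed sets left unchanged, which is what the original DATA step if statements do, rather than set to missing. The dataset names work.have and work.recat_case and the sample rows are made up for illustration.

```sas
/* Hypothetical sample data: work.have is an assumed name, not from the post */
data work.have;
  input age marital income;
  datalines;
3 4 2
7 1 5
. 2 .
;
run;

proc sql;
  /* Each CASE recodes the listed values and otherwise keeps the original
     value, so existing missing values stay missing and no new ones appear */
  create table work.recat_case as
  select case when age in (1, 2, 3, 4, 5) then 1 else age end as age,
         case when marital in (2, 4) then 2 else marital end as marital,
         case when income in (1, 2, 3) then 1 else income end as income
  from work.have;
quit;
```

The same effect should be available with ifn by passing the variable itself as the third argument, for example ifn(age in (1, 2, 3, 4, 5), 1, age).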
If you want other variables in the output dataset, you need to specify them.
Note that you would only use update if you were overwriting the existing dataset in place; in your DATA step code you specify a different output dataset.
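If you did want to change the dataset in place, a rough sketch of update with several variables could look like the following. The dataset work.demo and its rows are hypothetical; since update overwrites values, it is safer to try this on a copy first.

```sas
/* Hypothetical sample data: work.demo is an assumed name, not from the post */
data work.demo;
  input age marital income;
  datalines;
3 4 2
7 1 5
. 2 .
;
run;

proc sql;
  /* One UPDATE statement can set several columns; separate the
     assignments with commas. ELSE keeps the current value for rows
     that do not match, so nothing is set to missing. */
  update work.demo
    set age     = case when age in (1, 2, 3, 4, 5) then 1 else age end,
        marital = case when marital in (2, 4) then 2 else marital end,
        income  = case when income in (1, 2, 3) then 1 else income end;
quit;
```

There is no where clause here, so every row is evaluated, but only the matching values change.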
02-08-2017 11:00 PM
I would rather not introduce extra missing values (".") when recoding, because the original data already contains missing values ("."). It would cause confusion.
02-08-2017 11:10 PM