Hi,
I have a cancer dataset "Test" covering the years 1990 to 2013. It has two variables, CaSite35 with 35 codes and CaSite81 with 81 codes; each code represents a cancer type. All cancers were reported for 1990 to 2013 except Benign Brain Cancer, which was reported for 2004 to 2013 in both variables (CaSite35 and CaSite81). I used the following SAS code to restrict the benign brain cancer records to the years 2004-2013:
data Cancer;
set Test;
if CaSite35 in (27) then do;
if 2004 le year le 2013 then output;
else; /* do nothing for the brain cancer codes*/
end;
else output; /* all the other codes for CaSite35*/
run;
data Cancer_13;
set Cancer;
if CaSite81 in (31020) then do;
if 2004 le year le 2013 then output;
else; /* do nothing for the brain cancer codes*/
end;
else output; /* all the other codes for CaSite81*/
run;
I know it's wrong to use it this way. Cancer_13 had fewer observations than the original dataset "Test", and I think I have figured out why. I tried changing the code in different ways, for example merging both steps into one, but none of them worked. Is there any way to restrict those two codes in both variables without affecting the other variables and observations in the dataset?
Thank you.
Given your code I can't see what you're calling 'wrong'.
Why is it wrong? The "else output" makes it hard to see what you're trying to do overall. Since you keep more records than you drop, it may be easier to specify what you want to delete instead.
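A minimal sketch of the "say what to delete" approach, assuming the goal is to drop benign brain cancer records (CaSite35 = 27 or CaSite81 = 31020) that fall outside 2004-2013 and keep everything else. It has not been run against the real data, so treat it as a starting point:

```sas
data Cancer_13;
  set Test;
  /* Drop benign brain cancer records reported outside 2004-2013.     */
  /* Every other observation falls through to the implicit output.    */
  /* Note: a missing year also fails the range test and is deleted.   */
  if (CaSite35 = 27 or CaSite81 = 31020)
     and not (2004 le year le 2013) then delete;
run;
```

Because there is no explicit OUTPUT statement, SAS writes each observation that reaches the end of the step, so only the DELETE condition needs to be spelled out.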
I think, as requested, some sample data showing which records you want to keep and which ones you want to delete is required to help generate the code. Your description is not clear enough, and we don't know what is wrong with your current code.
The code below should be equivalent, but I'm guessing a bit in the middle of the night so I would double check it.
data Cancer;
set Test;
/* Parentheses around the OR are needed: SAS evaluates AND before OR */
if (CaSite35 in (27) or CaSite81=31020) and (2004 le year le 2013) then output;
else if CaSite35 = 27 then delete;
else if CaSite81=31020 then delete;
else output;
run;
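One way to double check is to run both versions on a tiny test dataset and compare the results. The sketch below assumes made-up rows: the codes 5 and 10000 are hypothetical stand-ins for "any other cancer", and the dataset names Sample, Step1, Two_step, and One_step are invented for illustration. The one-step version puts explicit parentheses around the OR, since SAS evaluates AND before OR.

```sas
/* Hypothetical test rows covering each combination worth checking */
data Sample;
  input CaSite35 CaSite81 year;
  datalines;
27 31020 1995
27 31020 2005
27 10000 2003
5 31020 2003
5 10000 1995
5 10000 2010
;
run;

/* The two-step logic from the original question */
data Step1;
  set Sample;
  if CaSite35 in (27) then do;
    if 2004 le year le 2013 then output;
  end;
  else output;
run;

data Two_step;
  set Step1;
  if CaSite81 in (31020) then do;
    if 2004 le year le 2013 then output;
  end;
  else output;
run;

/* The one-step logic, with the OR grouped explicitly */
data One_step;
  set Sample;
  if (CaSite35 = 27 or CaSite81 = 31020) and (2004 le year le 2013) then output;
  else if CaSite35 = 27 then delete;
  else if CaSite81 = 31020 then delete;
  else output;
run;

/* Compare the two results observation by observation */
proc compare base=Two_step compare=One_step;
run;
```

If the two versions agree, PROC COMPARE should report no unequal values. Dropping the parentheses around the OR in One_step is a quick way to see how the grouping changes which rows survive.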
Just a couple of observations ...
Of course Cancer_13 contains fewer observations than Test. You start with Test, and output just some of its observations to get Cancer. Then you take the remaining observations, subset them again, to get Cancer_13.
The comment about showing what you want is 100% appropriate. Show 20 lines of data (just 3 or 4 variables), and show the "before" and "after" picture that you have in mind.
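A small sketch of how to pull such a sample, assuming the three variables named in the question are the ones that matter:

```sas
/* Print the first 20 observations, limited to the relevant variables */
proc print data=Test (obs=20);
  var year CaSite35 CaSite81;
run;
```

Pasting that output, along with a note on which rows should stay and which should go, gives the "before" and "after" picture.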