Variable restriction to certain years

Contributor
Posts: 35

Hi,

 

I have a cancer dataset "Test" covering the years 1990 to 2013. It has two variables, Casite35 with 35 codes and Casite81 with 81 codes; each code represents a cancer type. All cancers were reported for 1990 to 2013 except benign brain cancer, which was reported only from 2004 to 2013 in both variables (Casite35 and Casite81). I used the following SAS code to restrict those records to 2004-2013:

 

data Cancer;
    set Test;
    if CaSite35 in (27) then do;
        if 2004 le year le 2013 then output;
        else; /* do nothing for the brain cancer codes */
    end;
    else output; /* all the other codes for CaSite35 */
run;

 

 

data Cancer_13;
    set Cancer;
    if CaSite81 in (31020) then do;
        if 2004 le year le 2013 then output;
        else; /* do nothing for the brain cancer codes */
    end;
    else output; /* all the other codes for CaSite81 */
run;

 

I know it's wrong to use it this way. Cancer_13 has fewer observations than the original dataset "Test", and I think I have figured out why. I tried changing the code in different ways, for example merging both steps into one, but none worked. Is there any way to restrict those two codes in both variables without affecting the other variables and observations in the dataset?

 

Thank you.

Super User
Posts: 5,257

Re: Variable restriction to certain years

Please provide sample input and desired output to further describe your requirement.
Data never sleeps
Contributor
Posts: 35

Re: Variable restriction to certain years

Thank you LinusH. Sorry, but I do not exactly understand what you want me to provide.
Super User
Posts: 17,840

Re: Variable restriction to certain years

Given your code I can't see what you're calling 'wrong'. 

 

Why is it wrong? The else output makes it hard to see what you're trying to do overall. Since most records are output, it may be easier to specify what you want to delete instead?

 

As already requested, some sample data showing which records you want to keep and which you want to delete is needed to help write the code. Your description is not clear enough, and we don't know what is wrong with your current code.

 

The code below should be equivalent, but I'm guessing a bit in the middle of the night, so I would double-check it.

data Cancer;
    set Test;
    /* Parentheses are needed here: AND binds more tightly than OR */
    if (CaSite35 = 27 or CaSite81 = 31020) and (2004 le year le 2013) then output;
    else if CaSite35 = 27 then delete;
    else if CaSite81 = 31020 then delete;
    else output;
run;
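
The idea of stating what to delete, rather than what to output, could be sketched as a single step. This is only a sketch: it assumes CaSite35, CaSite81 and year are numeric variables in Test, that 27 and 31020 are the only benign brain codes, and that the goal is to drop those records outside 2004-2013 while keeping everything else. The output name Cancer_13 is reused from the original post.

```sas
/* Sketch under the assumptions stated above, not a verified solution. */
data Cancer_13;
    set Test;
    /* Remove benign brain records (in either coding scheme)
       that fall outside 2004-2013; a missing year also counts as outside. */
    if (CaSite35 = 27 or CaSite81 = 31020)
       and not (2004 le year le 2013) then delete;
    /* Every record not deleted is written out by the implicit OUTPUT. */
run;
```

Because no explicit OUTPUT statement appears, SAS writes each remaining observation at the end of the step, so there is no need for the else output branches.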
 
 
Super User
Posts: 5,085

Re: Variable restriction to certain years

Just a couple of observations ...

 

Of course Cancer_13 contains fewer observations than Test.  You start with Test, and output just some of its observations to get Cancer.  Then you take the remaining observations, subset them again, to get Cancer_13.

 

The comment about showing what you want is 100% appropriate.  Show 20 lines of data (just 3 or 4 variables), and show the "before" and "after" picture that you have in mind.
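
To check whether the drop in observations is only what the subsetting is supposed to remove, one option is to count the records that match the removal condition. This is a hedged sketch that assumes the same variable names and codes as the original post; n_dropped is a made-up column name.

```sas
/* Sketch: count the records in Test that the two steps would remove. */
proc sql;
    select count(*) as n_dropped
    from Test
    where (CaSite35 = 27 or CaSite81 = 31020)
      and not (year between 2004 and 2013);
quit;
```

If the number of observations in Test minus n_dropped equals the number in Cancer_13, the difference is explained entirely by the out-of-range benign brain records.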
