mayasak
Quartz | Level 8

Hi,

 

I have a cancer dataset "Test" covering the years 1990 to 2013. It has two variables, CaSite35 with 35 codes and CaSite81 with 81 codes; each code represents a cancer type. All cancers were reported for 1990 to 2013 except Benign Brain Cancer, which was reported only from 2004 to 2013 in both variables (CaSite35 and CaSite81). I used the following SAS code to restrict the brain cancer codes to 2004-2013:

 

data Cancer;

    set Test;

    if CaSite35 in (27) then do;

         if 2004 le year le 2013 then output;

         else; /* do nothing for the brain cancer codes*/

    end;

    else output; /* all the other codes for CaSite35*/

run;

 

 

data Cancer_13;

    set Cancer;

    if CaSite81 in (31020) then do;

         if 2004 le year le 2013 then output;

         else; /* do nothing for the brain cancer codes*/

    end;

    else output; /* all the other codes for CaSite81*/

run;

 

I know it's wrong to use it this way. Cancer_13 has fewer observations than the original dataset "Test", and I think I have figured out why. I tried changing the code in different ways, for example merging both steps into one, but none worked. Is there any way to restrict those two codes in both variables without affecting the other variables and the observations in the dataset?

 

Thank you.

5 REPLIES
LinusH
Tourmaline | Level 20
Please provide sample input and desired output to further describe your requirement.
Data never sleeps
mayasak
Quartz | Level 8
Thank you LinusH. Sorry, but I do not exactly understand what you want me to provide.
Reeza
Super User

Given your code I can't see what you're calling 'wrong'. 

 

Why is it wrong? The else output makes it hard to see what you're trying to do overall. Since you have more outputs, it may be easier to specify what you want to delete instead?

 

I think, as requested, some sample data showing which records you want to keep and which ones you want to delete is needed to help generate the code. Your description is not clear enough, and we don't know what is wrong with your current code.

 

The code below should be equivalent, but I'm guessing a bit in the middle of the night so I would double check it.

 

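data Cancer;
    set Test;
    /* AND binds tighter than OR in SAS, so the OR test needs its own parentheses */
    if (CaSite35 in (27) or CaSite81=31020) and (2004 le year le 2013) then output;
    /* brain cancer codes outside 2004-2013 are dropped */
    else if CaSite35 = 27 then delete;
    else if CaSite81=31020 then delete;
    /* every other code is kept for all years */
    else output;
run;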
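
The idea of specifying what to delete, rather than what to output, could be sketched as a single step like the one below. It assumes the goal is to drop records that carry either brain cancer code (27 in CaSite35 or 31020 in CaSite81) outside 2004-2013 and to keep everything else unchanged; if the goal is something different, such as keeping the rows and blanking the code, this would not apply.

```sas
data Cancer_13;
    set Test;
    /* Assumption: a record with either brain cancer code is kept only for 2004-2013. */
    /* A missing year fails the range test, so those brain cancer records are dropped too. */
    if (CaSite35 = 27 or CaSite81 = 31020)
       and not (2004 le year le 2013) then delete;
run;
```

Because there is no explicit OUTPUT statement, every record that is not deleted is written automatically at the end of each iteration.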
 
 
Astounding
PROC Star

Just a couple of observations ...

 

Of course Cancer_13 contains fewer observations than Test.  You start with Test, and output just some of its observations to get Cancer.  Then you take the remaining observations, subset them again, to get Cancer_13.
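
One way to see how many observations the two subsetting steps remove is to count the brain cancer records in Test by year. This is only a sketch, using the variable names and codes from the original post:

```sas
/* Count records that carry either brain cancer code, by year. */
/* Years outside 2004-2013 (or missing) are the rows the subsetting drops. */
proc freq data=Test;
    where CaSite35 = 27 or CaSite81 = 31020;
    tables year / missing;
run;
```

If the total for the years outside 2004-2013 matches the difference in observation counts between Test and Cancer_13, the code is doing what it was written to do.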

 

The comment about showing what you want is 100% appropriate.  Show 20 lines of data (just 3 or 4 variables), and show the "before" and "after" picture that you have in mind.
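
A "before" picture can be posted as a small DATA step with DATALINES, along the lines of the sketch below. The dataset name Test_sample, the id variable, and all values other than the codes 27 and 31020 are hypothetical placeholders, not taken from the real data.

```sas
/* Hypothetical sample: only the codes 27 and 31020 come from the thread. */
data Test_sample;
    input id year CaSite35 CaSite81;
    datalines;
1 1995 27 31020
2 2004 27 31020
3 2013 27 31020
4 1995 1 99999
5 2010 1 99999
;
run;
```

Listing which of these ids should appear in the final dataset would give the "after" picture.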
