Observations subset

Contributor
Posts: 35

Hello,

 

I have a cancer data set with 106 variables, including year and CASite67. CASite67 codes several types of cancer, including brain (coded as 3101 and 3102 for in situ and malignant, respectively):

            if Seer_site_group in ( 31010 ) and beh=2
                  then CASite67= 3101 ;  *Brain, In Situ;
            if Seer_site_group in ( 31010 ) and beh=3
                  then CASite67= 3102 ;  *Brain, Malignant;

I need to restrict the brain observations to the years 2004-2013.

 

Thanks


All Replies
Super User
Posts: 11,343

Re: Observations subset

It is not clear whether you want to create a new data set or just subset for analysis, or whether your variable CASite67 already exists in your data.

If you need to create a subset data set and that variable is in your existing data then something like this should work:

 

data want;
   set have (where= (CASite67 in (3101,3102) and (2004 le year le 2013)));
run;
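If you are not sure whether CASite67 is already in the data, a quick check along these lines may help. This is only a sketch: it assumes the input data set is called have (as above) and that year and CASite67 are numeric. If CASite67 is missing from the data, the WHERE statement in the second step will fail with an error, which answers the question as well.

```sas
/* List the variables in HAVE to confirm that CASite67 and year exist. */
proc contents data=have;
run;

/* Count the brain codes by year before subsetting.
   Assumes CASite67 and year are numeric variables in HAVE. */
proc freq data=have;
    where CASite67 in (3101, 3102);
    tables year * CASite67 / missing norow nocol nopercent;
run;
```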

Contributor
Posts: 35

Re: Observations subset

Thank you.

 

This statement kept only the brain observations in CASite67. What I need is to keep all the other cancer observations as well, such as:

if Seer_site_group in ( 20010 )
then CASite67= 101 ; *Lip;
if Seer_site_group in ( 20020 )
then CASite67= 102 ; *Tongue ;
if Seer_site_group in ( 20030 )
then CASite67= 103 ; *Salivary glands;
if Seer_site_group in ( 20040 )
then CASite67= 104 ; *Floor of Mouth;
if Seer_site_group in ( 20050)
then CASite67= 105 ; *Gum and Other Mouth;
if Seer_site_group in ( 20060)
then CASite67= 106 ; *Nasopharynx;
if Seer_site_group in ( 20070 )
then CASite67= 107 ; *Tonsil; 

for all years (1999-2013), and the brain observations only for the years 2004 to 2013.

 

Thanks

 

Solution
‎04-19-2016 10:46 AM
Super User
Posts: 11,343

Re: Observations subset

So we need a little more conditional logic than simple data set options. This should output the records with those two codes only when they occur in the specified years, along with all records for every other code.

 

data want;
    set have;
    if CaSite67 in (3101,3102) then do;
         /* brain codes: keep only the years 2004-2013 */
         if 2004 le year le 2013 then output;
         else; /* do nothing, so brain records from other years are dropped */
    end;
    else output; /* all the other codes for CaSite67, all years */
run;
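For what it's worth, the same subset can probably also be written as a single WHERE= condition: keep a record if it is not a brain code, or if its year falls in 2004-2013. The sketch below assumes the same have data set with numeric CASite67 and year; the output name want2 is made up here so it does not overwrite want. The PROC FREQ step is just a sanity check that the brain codes only show up in 2004-2013.

```sas
/* Sketch: same subset as the IF/THEN/DO version, written as one WHERE=.
   WANT2 is a hypothetical name; HAVE, CASite67 and year are assumed
   to exist as described in the thread. */
data want2;
    set have (where=(CASite67 not in (3101, 3102)
                     or (2004 le year le 2013)));
run;

/* Check: brain codes should appear only for the years 2004-2013. */
proc freq data=want2;
    where CASite67 in (3101, 3102);
    tables year * CASite67 / norow nocol nopercent;
run;
```

Records with a missing year are treated the same way in both versions: dropped for the brain codes, kept for everything else.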

Contributor
Posts: 35

Re: Observations subset

I think this worked just fine. Thank you a lot.

 

Community Manager
Posts: 567

Re: Observations subset

Great, I'm glad the solution worked for you, mayasak! Can you unmark your response and mark the appropriate "solution" from 

 
