BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
RickyS
Quartz | Level 8

Good Morning did some google searching and searching through SAS Communities and did not see a solution for what I would like to do, I'm sure its out there just can't find it.  Continuing my search but thought I'd post.

 

I have a data set with observations from 1999 - 2017, trying to figure out a simple way to select observations that appear in each and every year.

 

Cannot think of a way to translate a sub-setting statement ......id must appear in every year  OR Id cannot be missing for any year. 

 

Observations are unique by id and year but may not repeat for each year.

 

Sorry should ay that I know I could subset the file into 1 data set for every year and then rejoin based on year 1999 using id, but there must be a better way.

 

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
andreas_lds
Jade | Level 19

Maybe something like

 

proc sql;
   create table work.want as
   select *
      from work.have
         group by Id
            having count(*) = 2017-1999+1
   ;
quit;

View solution in original post

8 REPLIES 8
PeterClemmensen
Tourmaline | Level 20

There are several ways to do this.

 

Can you post some example data for us to provide a usable code answer?

RickyS
Quartz | Level 8
Id Year Name Score
1 1999 A 85
1 2000 A 85
1 2001 A 87
1 2002 A 85
1 2003 A 85
1 2004 A 85
1 2005 A 85
1 2006 A 85
1 2007 A 85
1 2008 A 90
1 2009 A 85
1 2010 A 85
1 2011 A 82
1 2012 A 85
1 2013 A 85
1 2014 A 85
1 2015 A 85
1 2016 A 85
1 2017 A 85
2 1999 B 85
2 2000 B 85
2 2002 B 85
2 2003 B 85
2 2004 B 85
2 2005 B 85
2 2006 B 85
2 2007 B 85
2 2008 B 90
2 2009 B 85
2 2010 B 85
2 2011 B 82
2 2012 B 85
2 2013 B 85
2 2014 B 85
2 2015 B 85
2 2016 B 85
2 2017 B 85

So we do not Id 2 because it had no score in 2001, if there is no score and no observation for a given year want to drop the observation

novinosrin
Tourmaline | Level 20
data have;
input Id Year Name $ Score;
cards;
1 1999 A 85
1 2000 A 85
1 2001 A 87
1 2002 A 85
1 2003 A 85
1 2004 A 85
1 2005 A 85
1 2006 A 85
1 2007 A 85
1 2008 A 90
1 2009 A 85
1 2010 A 85
1 2011 A 82
1 2012 A 85
1 2013 A 85
1 2014 A 85
1 2015 A 85
1 2016 A 85
1 2017 A 85
2 1999 B 85
2 2000 B 85
2 2002 B 85
2 2003 B 85
2 2004 B 85
2 2005 B 85
2 2006 B 85
2 2007 B 85
2 2008 B 90
2 2009 B 85
2 2010 B 85
2 2011 B 82
2 2012 B 85
2 2013 B 85
2 2014 B 85
2 2015 B 85
2 2016 B 85
2 2017 B 85
;

proc sql noprint;
select max(c) into :m 
from
(select count(id) as c from have group by id );
create table want as
select *
from have
group by id
having count(id)=&m;
quit;
PaigeMiller
Diamond | Level 26

UNTESTED CODE

 

proc summary data=have;
    class id;
    var year;
    output out=yearstats min=minyear max=maxyear n=nyear;
run;
data want;
    if _n_=1 then set yearstats(where=(_type_=0) keep=minyear maxyear _type_);
    merge have yearstats(where=(_type_=1) keep=nyear _type_);
    by id;
    if nyear<(maxyear-minyear+1) then delete;
    drop _type_;
run;
    
    

Hashtag: #PROCSUMMARYRULEZ 

--
Paige Miller
andreas_lds
Jade | Level 19

Maybe something like

 

proc sql;
   create table work.want as
   select *
      from work.have
         group by Id
            having count(*) = 2017-1999+1
   ;
quit;
RickyS
Quartz | Level 8

Thank you everyone as I looked at the solutions I think they would all work but this one seemed the most simplistic, and easiest to validate.  Did not think o using a having statement this way.

 

cheers

PaigeMiller
Diamond | Level 26

Just remember to change the solution from @andreas_lds if newer data contains 2018 or later records, or if at some point data from 1999 is removed from the analysis, etc. Or use a more general approach that figures out the max and min years.

--
Paige Miller
PeterClemmensen
Tourmaline | Level 20
data have;
input Id $ Year Name $ Score;
datalines;
1 1999 A 85
1 2000 A 85
1 2001 A 87
1 2002 A 85
1 2003 A 85
1 2004 A 85
1 2005 A 85
1 2006 A 85
1 2007 A 85
1 2008 A 90
1 2009 A 85
1 2010 A 85
1 2011 A 82
1 2012 A 85
1 2013 A 85
1 2014 A 85
1 2015 A 85
1 2016 A 85
1 2017 A 85
2 1999 B 85
2 2000 B 85
2 2002 B 85
2 2003 B 85
2 2004 B 85
2 2005 B 85
2 2006 B 85
2 2007 B 85
2 2008 B 90
2 2009 B 85
2 2010 B 85
2 2011 B 82
2 2012 B 85
2 2013 B 85
2 2014 B 85
2 2015 B 85
2 2016 B 85
2 2017 B 85
;

data _null_;
   if _N_ = 1 then do;
      declare hash h(dataset:'have');
      h.defineKey('Id', 'Year');
      h.defineDone();    

      declare hash out(dataset:'have', multidata:'Y');
      out.defineKey('Id');
      out.definedata(all:'Y');
      out.defineDone();
   end;

   set have end=eof;
   by ID;

   if h.check()=0 then c+1;

   if last.id then do;
      if c ne (2017-1999+1) then out.remove(key:Id);
      c=0;
   end;

   if eof then out.output(dataset:'want');
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1229 views
  • 2 likes
  • 5 in conversation