BookmarkSubscribeRSS Feed
GK5
Calcite | Level 5 GK5
Calcite | Level 5

Hi all SAS experts! 

 

My dataset looks something like this:

 

Unique    ID   Bacteria              date:

110          1    s.aureus              1/12 2000

111          1    e.coli                    1/12 2000 

112          1    e.coli                    1/9 2000

113          1    strep                     3/10 2004

114          2    s.aureus               1/3 2010

115          2    e.coli                    5/3 2010

116          2    pseudomonas      7/3 2010

170          3   pseudomonas       5/10 2005 

175          3  e.coli                       5/10 2005 

176          3 enterobac                 5/10 2005                

180          3 s.aureus                   1/10 2006

190          3 s.aureus                   6/12 2006

 

How can I make SAS to select the first observation from the same ID of the same date only? With this example I would like to for an example keep for ID=1 the unique variable 110 and 111 as they are from the same date but get rid of the other samples with the same ID=1.

 

Hope you guys can help!

 

Kind regards

 

GK

8 REPLIES 8
PeterClemmensen
Tourmaline | Level 20

@GK5 Hi and welcome to the SAS Community 🙂

 

Here is one way

 

data have;
input Unique ID Bacteria :$20. date :ddmmyy10.;
format date ddmmyy10.;
datalines;
110 1 s.aureus 1/12/2000
111 1 e.coli 1/12/2000 
112 1 e.coli 1/9/2000
113 1 strep 3/10/2004
114 2 s.aureus 1/3/2010
115 2 e.coli 5/3/2010
116 2 pseudomonas 7/3/2010
170 3 pseudomonas 5/10/2005 
175 3 e.coli 5/10/2005
176 3 enterobac 5/10/2005 
180 3 s.aureus 1/10/2006
190 3 s.aureus 6/12/2006
;

data want;
    array _ {99999} _temporary_;
    set have;
    by id;
    if first.id then _[id]=date;
    if _[id]=date;
run;

 

Result:

 

Unique ID Bacteria     date 
110    1  s.aureus     01/12/2000 
111    1  e.coli       01/12/2000 
114    2  s.aureus     01/03/2010 
170    3  pseudomonas  05/10/2005 
175    3  e.coli       05/10/2005 
176    3  enterobac    05/10/2005 

 

GK5
Calcite | Level 5 GK5
Calcite | Level 5

Thank you for your respons.

 

Appologies for not mentioning that this dataset is quite large - what I wrote is just an "sample" of the whole set to illustrate what my data looks like. So what should my code in SAS look like? Sorry, Rookie here 🙂

 

 

PeterClemmensen
Tourmaline | Level 20

How large? And what is your maximum value of ID?

GK5
Calcite | Level 5 GK5
Calcite | Level 5

Max. value of ID in total is 4039199 - so it's a lot. 🙂

PeterClemmensen
Tourmaline | Level 20

My code can handle that, simply add more entries to the array like this

 

data want;
    array _ {9999999} _temporary_;
    set have;
    by id;
    if first.id then _[id]=date;
    if _[id]=date;
run;
Astounding
PROC Star

I think it's actually simpler than that:

 

data want;
   set have;
   by id notsorted;
   if first.id then _date = date;
   retain _date;
   if date = _date;
run;
RTN
Calcite | Level 5 RTN
Calcite | Level 5

I tried that but it only keeps the observations from the latest date not the earliest.

Astounding
PROC Star
Only you can see your program.

Only you can see your log.

Only you can see your data.

Only you can post something helpful to help explain why you are getting the last date instead of the first.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 1053 views
  • 1 like
  • 4 in conversation