Abraham
Obsidian | Level 7

Hi,

From the data below, how can I get the first record for each pid and disease combination? (This should apply only to pid/disease groups that contain duplicate records.)

 

data abc;
input pid age disease $ country $ sno;
cards;
101 23 sarcoma US 1
101 23 sarcoma US 2
102 43 pneumonia China 1
103 56 syphilis Russia 2
103 52 syphilis Russia 1
103 57 Cpox Russia 3
103 58 Cpox Russia 4
103 59 Cpox Russia 5
103 59 Rpox Russia 6
104 75 Spox Uzbek 1
104 82 Spox Uzbek 2
104 12 Asthma Uzbek 3
104 13 Asthma Uzbek 4
104 14 Asthma Uzbek 5
;

 

Output

pid age disease country sno
101 23 sarcoma US 1
103 52 syphilis Russia 1
103 57 Cpox Russia 3
104 75 Spox Uzbek 1
104 12 Asthma Uzbek 3
Accepted Solutions
Astounding
PROC Star

If you are allowed to sort your data, that is the safest starting point (have stands for your input data set, which is abc in the question):

 

proc sort data=have;
   by pid disease sno;
run;

 

After that, you could continue:

 

data want;
   set have;
   by pid disease;
   /* first record of the group, but only when the group has more than one record */
   if first.disease=1 and last.disease=0;
run;
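
Applied to the abc data set from the question, the two steps above might look like the following sketch. It assumes abc has already been created by the DATA step in the question; the names abc_sorted and want_abc are made up for illustration, and OUT= is used so the original abc keeps its input order.

```sas
/* Sort a copy so the original abc stays untouched */
proc sort data=abc out=abc_sorted;
   by pid disease sno;
run;

/* Keep the first record of each pid/disease group that has more than one record */
data want_abc;
   set abc_sorted;
   by pid disease;
   if first.disease and not last.disease;
run;

proc print data=want_abc noobs;
run;
```

With the posted data this keeps the five records listed under Output in the question (sno 1 for sarcoma, syphilis and Spox; sno 3 for Cpox and Asthma), although they come out in sorted order rather than input order.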

 

If you are not allowed to sort your data, you have to assume that the records are already properly grouped (all records for the same pid/disease appear together, and in sno order within each group). You could then code:

 

data want;
   set have;
   /* NOTSORTED: groups must be contiguous, but need not be in sorted order */
   by pid disease notsorted;
   if first.disease=1 and last.disease=0;
run;

 

Sorting gives the safer result. Without sorting, the result is still obtainable, but only if the data are already grouped and ordered as described.
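
One way to check the ordering part of that assumption before relying on NOTSORTED is sketched below. It again assumes the abc data set from the question; out_of_order and prev_sno are hypothetical names. The step collects every record whose sno is lower than the sno of the record before it within the same pid/disease run.

```sas
data out_of_order;
   set abc;
   by pid disease notsorted;
   prev_sno = lag(sno);   /* LAG is called on every record so its queue stays in step */
   /* keep only records that break ascending sno order inside a group */
   if not first.disease and sno < prev_sno;
run;
```

For the posted data this flags one record (pid 103, syphilis, sno 1, which follows sno 2), so the NOTSORTED step would pick the sno 2 record for that group. An empty out_of_order only confirms the sno order within each run; it cannot detect a pid/disease group that is split into two separate runs.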

 

Good luck.

PaigeMiller
Diamond | Level 26

UNTESTED CODE

 

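proc sort data=abc;
    by pid disease sno;   /* sno added so the lowest sno comes first in each group */
run;
data abc2;
    set abc;
    by pid disease;
    /* first record of each group, skipping groups that have only one record */
    if first.disease and not last.disease;
run;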
--
Paige Miller
PaigeMiller
Diamond | Level 26

I have never heard of a situation where you are not allowed to sort your data.

 

Does that actually happen? For what reasons would you not be allowed to sort your data?

--
Paige Miller
Astounding
PROC Star

Paige,

 

There are situations where you wouldn't want to sort a data set ... size of the data set, existence of indices.  But even without good reason, the world of SAS provides many sources of incredible situations.  Here are just a few I have either encountered or heard about from others.

 

One supervisor would not allow a MERGE statement, forcing programmers to use IF/THEN instead.  MERGE is just too difficult to master.

 

Student comments and questions ... well real life is stranger than you could imagine.

 

x=2;

 

Student question:  Why would you want to do that?

 

a = b + c;

 

Student comment/question:  You can't add letters.

 

Sometimes students even produce code that works (or at least generates no errors) but is suitable for a puzzle:

 

if a = 1 or 2 then b=3 and c=4;

 

In SAS, as in real life, the truth can be stranger than fiction.
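
For anyone who wants to work through the puzzle, here is a minimal sketch of how SAS reads that last statement. The data set name puzzle and the starting value a = 5 are assumptions for illustration only.

```sas
data puzzle;
   a = 5;
   /* Parsed as: if (a = 1) or 2 then b = (3 and (c = 4)); */
   if a = 1 or 2 then b=3 and c=4;
   put a= b= c=;
run;
```

The log shows a=5 b=0 c=. for this step. The condition is always true because 2 is nonzero, and the THEN clause is a single assignment that stores the logical result of 3 and (c = 4) in b. Nothing is ever assigned to c, so SAS also notes that c is uninitialized.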

PGStats
Opal | Level 21

To complement @Astounding's ideas, if your records are properly grouped (all records for the same pid/disease appear together) but not always in order, you could do:

 

data want;
do until(last.disease);
    set abc; by pid disease notsorted;
    firstSno = min(firstSno, sno);
    end;
/* To skip groups with a single record */
if first.disease then call missing(firstSno);
do until(last.disease);
    set abc;  by pid disease notsorted;
    if sno = firstSno then do;
        output;
        /* To get only the first record of ties */
        call missing(firstSno);
        end;
    end;
drop firstSno;
run;
PG
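
For comparison, a PROC SQL sketch that needs neither sorting nor contiguous groups is shown below. This is a different technique from the DATA step answers above, not something proposed in the thread; it assumes the abc data set from the question, and want_sql is a made-up name. It relies on PROC SQL remerging the group statistics back onto the detail rows, which SAS reports with a note in the log.

```sas
proc sql;
   create table want_sql as
   select *
   from abc
   group by pid, disease
   having count(*) > 1      /* skip groups with a single record */
      and sno = min(sno)    /* keep the record with the lowest sno */
   order by pid, sno;
quit;
```

With the posted data this returns the five records listed under Output in the question. Unlike the double loop above, it keeps every tied record if two records in a group share the lowest sno.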
