Help using Base SAS procedures

find record

Contributor
Posts: 42

Hi,

From the data below, how can I get the first record for each pid and disease combination? (It should apply only to pid/disease combinations that have duplicate records.)

 

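data abc;
/* The default length for $ variables is 8, which would truncate "pneumonia"; the :$12. and :$10. informats avoid that */
input pid age disease :$12. country :$10. sno;
cards;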
101 23 sarcoma US 1
101 23 sarcoma US 2
102 43 pneumonia China 1
103 56 syphilis Russia 2
103 52 syphilis Russia 1
103 57 Cpox Russia 3
103 58 Cpox Russia 4
103 59 Cpox Russia 5
103 59 Rpox Russia 6
104 75 Spox Uzbek 1
104 82 Spox Uzbek 2
104 12 Asthma Uzbek 3
104 13 Asthma Uzbek 4
104 14 Asthma Uzbek 5
;

 

Desired output:

pid age disease country sno
101 23 sarcoma US 1
103 52 syphilis Russia 1
103 57 Cpox Russia 3
104 75 Spox Uzbek 1
104 12 Asthma Uzbek 3

Accepted Solution
11-16-2015 11:55 PM
Respected Advisor
Posts: 4,790

Re: find record

If you are allowed to sort your data, that would be the safest beginning:

 

proc sort data=have;
   by pid disease sno;
run;

 

After that, you could continue:

 

data want;
   set have;
   by pid disease;
   /* keep the first record of a pid/disease group only when the group has more than one record */
   if first.disease=1 and last.disease=0;
run;
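
For anyone unsure why that subsetting IF skips the single-record groups, here is a minimal sketch that writes the BY-group flags to the log. It assumes the question's data set abc stands in for have; the names sorted, is_first and is_last are made up for this illustration.

```sas
/* Sort a copy so the original data set is left untouched */
proc sort data=abc out=sorted;
   by pid disease sno;
run;

data _null_;
   set sorted;
   by pid disease;
   /* FIRST./LAST. variables are automatic and are never written to an
      output data set, so copy them to ordinary variables for printing */
   is_first = first.disease;
   is_last  = last.disease;
   put pid= disease= sno= is_first= is_last=;
run;
```

A group with only one record (pid 102, for instance) shows is_first=1 and is_last=1 on the same line, so it fails the last.disease=0 test and is dropped.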

 

If you are not allowed to sort your data, you have to assume that they are properly grouped (all records for the same pid/disease appear together, in order).  You could then code:

 

data want;
   set have;
   /* NOTSORTED: groups need not be in sorted order, but each group's records must be contiguous */
   by pid disease notsorted;
   if first.disease=1 and last.disease=0;
run;

 

So the result is safer if you are allowed to sort.  But it's still obtainable if you can't sort, as long as the data behave.

 

Good luck.


All Replies
Trusted Advisor
Posts: 1,444

Re: find record

UNTESTED CODE

 

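proc sort data=abc;
    by pid disease sno;   /* sno added so the lowest sno comes first within each group */
run;
data abc2;
    set abc;
    by pid disease;
    /* keep the first record, but only for groups with more than one record */
    if first.disease and not last.disease;
run;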

Trusted Advisor
Posts: 1,444

Re: find record

I have never heard of a situation where you are not allowed to sort your data.

 

Does that actually happen? For what reasons would you not be allowed to sort your data?

Respected Advisor
Posts: 4,790

Re: find record

Paige,

 

There are situations where you wouldn't want to sort a data set ... size of the data set, existence of indices.  But even without good reason, the world of SAS provides many sources of incredible situations.  Here are just a few I have either encountered or heard about from others.

 

One supervisor would not allow a MERGE statement, forcing programmers to use IF/THEN instead.  MERGE is just too difficult to master.

 

Student comments and questions ... well real life is stranger than you could imagine.

 

x=2;

 

Student question:  Why would you want to do that?

 

a = b + c;

 

Student comment/question:  You can't add letters.

 

Sometimes students even produce code that works (or at least generates no errors) but is suitable for a puzzle:

 

if a = 1 or 2 then b=3 and c=4;

 

In SAS, as in real life, the truth can be stranger than fiction.
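
For readers who want to work through that last puzzle, here is a small sketch of how the statement is evaluated. The starting values a = 5 and c = 4 are assumptions picked only for illustration.

```sas
data _null_;
   a = 5;
   c = 4;
   /* Condition: parsed as (a = 1) or (2). The constant 2 is nonzero,
      hence true, so the THEN clause runs for every value of a.       */
   /* THEN clause: parsed as b = (3 and (c = 4)). Only b is assigned,
      and it receives a logical result (1 or 0); c is merely compared. */
   if a = 1 or 2 then b = 3 and c = 4;
   put a= b= c=;   /* log shows: a=5 b=1 c=4 */
run;
```

So the statement runs without errors, but it neither tests "a is 1 or 2" nor sets b to 3 and c to 4. What the student presumably meant would be written as: if a in (1, 2) then do; b = 3; c = 4; end;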

Respected Advisor
Posts: 4,606

Re: find record

To complement @Astounding's ideas, if your records are properly grouped (all records for the same pid/disease appear together) but not always in order, you could do:

 

data want;
do until(last.disease);
    set abc; by pid disease notsorted;
    firstSno = min(firstSno, sno);
    end;
/* To skip groups with a single record */
if first.disease then call missing(firstSno);
do until(last.disease);
    set abc;  by pid disease notsorted;
    if sno = firstSno then do;
        output;
        /* To get only the first record of ties */
        call missing(firstSno);
        end;
    end;
drop firstSno;
run;
PG