Help sorting and removing dups

I have a data set with multiple variables and two IDs: the first ID identifies the account owner, and the second ID identifies the user and their records. Example:

a     1
a     1
a     1
a     2
a     2
a     3
a     3
b     1
b     1
b     2
...
z     5

For every account owner (a through z), I want to collect all records that belong to their FIRST user. Example:

a     1
a     1
a     1
b     1
b     1
etc.

Any thoughts on an efficient piece of code to do this?

Thanks


Accepted Solutions
Solution
‎06-21-2012 03:53 PM
Frequent Contributor
Posts: 87

Re: Help sorting and removing dups

Posted in reply to mjkinchen

A non-SQL version:

proc sort data = have ;
    by id1 id2 ;
run ;

data want ;
    set have ;
    by id1 id2 ;
    /* subsetting IF: keeps only the first row of each id1 group */
    if first.id1 ;
run ;


All Replies
Super Contributor
Posts: 1,636

Re: Help sorting and removing dups

Posted in reply to mjkinchen

How about:

data have;
input id1 $ id2;
cards;
a     1
a     1
a     1
a     2
a     2
a     3
a     3
b     1
b     1
b     2
;

proc sql;
  create table want as
    select * from have
      group by id1
        having id2=min(id2);
quit;
proc print;run;

Linlin
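
A note on the query above: it selects non-grouped columns alongside min(id2), so PROC SQL remerges the summary statistic back onto the detail rows (the log should show a NOTE about remerging). If you would rather spell that step out, a rough equivalent with an explicit join might look like the sketch below. The names want_join, h, m, and min_id2 are made up for illustration, and it assumes the same have data set as above.

```sas
proc sql;
  create table want_join as
    select h.*
      from have as h
           inner join
           /* one row per id1 holding its lowest id2 */
           (select id1, min(id2) as min_id2
              from have
              group by id1) as m
        on h.id1 = m.id1
       and h.id2 = m.min_id2;
quit;
```

Like the original query, this keeps the rows with the lowest id2 per id1, which is the "first" user only if the lowest id2 is the one you mean. Row order in the result is not guaranteed without an ORDER BY.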

New Contributor
Posts: 2

Re: Help sorting and removing dups

This worked REALLY well... Thanks! :)

Frequent Contributor
Posts: 95

Re: Help sorting and removing dups

Posted in reply to mjkinchen

Expanding on Steve's code.

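data want ;
    set have ;
    by id1 id2 ;
    retain _id1 _id2 ;
    /* at the start of each id1 group, remember its first id1/id2 pair */
    if first.id1 then do ;
        _id1 = id1 ;
        _id2 = id2 ;
    end ;
    /* keep every row that matches the remembered pair */
    if _id1 = id1 and _id2 = id2 ;
    drop _id: ;
run ;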

Respected Advisor
Posts: 4,173

Re: Help sorting and removing dups

Posted in reply to mjkinchen

If you really want the first user as the data is ordered, and not the user with the lowest ID, then Linlin's SQL won't give you the correct result. Code like the following would do:

data have;
  input ida $ idb $;
  datalines;
b 1
b 1
b 2
a 2
a 2
a 1
a 1
a 1
a 3
a 3
;
run;

data want(drop=_:);
  set have;
  by ida notsorted;
  retain _r_idb;
  if first.ida then _r_idb=idb;
  if idb=_r_idb then output;
run;
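
One caveat with BY ... NOTSORTED: first.ida fires at the start of every contiguous run of an account, so if the rows for one account are split into separate runs, the "first" user is reset at each run. If that can happen in your data, a hash lookup keyed on ida is one possible way to remember the first idb seen per account across the whole file. This is only a sketch: want_hash, seen, and _first_idb are hypothetical names, and it assumes idb is character with length 8, as it would be when read with the INPUT statement above.

```sas
data want_hash;
  if _n_ = 1 then do;
    length _first_idb $8;            /* assumed to match the length of idb */
    declare hash seen();
    seen.defineKey('ida');
    seen.defineData('_first_idb');
    seen.defineDone();
    call missing(_first_idb);
  end;
  set have;
  /* find() returns nonzero when this account has not been seen yet */
  if seen.find() ne 0 then do;
    _first_idb = idb;                /* remember the account's first user */
    seen.add();
  end;
  /* keep only rows that belong to the remembered first user */
  if idb = _first_idb;
  drop _first_idb;
run;
```

No sort is needed, and the output keeps the original row order.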
