BookmarkSubscribeRSS Feed
Tal
Pyrite | Level 9 Tal
Pyrite | Level 9

I have a dataset like this:

Fname lname id score

--------------------------

John    smith.  100 a

John    smith.  100 b

Tom.   Fonda.  200 a

Tom.   Fonda.  300 b

Mila.   Fonda.   400 a

.....

If same ppl apear more then once but with different id i randomly select only one record of that person but if they appear multiple times with the same id then they all supposed to get selected. Also single records get selected too. How do i achieve this. Anyone please?

By same ppl i mean same first and last name

5 REPLIES 5
Jagadishkatam
Amethyst | Level 16

Please try

data have;

input Fname$ lname$ id score$;

cards;

John    smith.  100 a

John    smith.  100 b

Tom.   Fonda.  200 a

Tom.   Fonda.  300 b

Mila.   Fonda.   400 a

;

proc sort data=have;

by Fname lname id;

run;

data want(drop=seq) id(keep=Fname lname id seq);

set have;

by Fname lname id;

if first.id then seq=1;

else seq+1;

output want;

if last.id then output id;

run;

data all;

  merge want id;

  by Fname lname id;

  if seq >=1;

run;

Thanks,

Jag

Thanks,
Jag
art297
Opal | Level 21

It's late, so I don't guarantee that the following meets all of your stated conditions. Besides, that's your job to make sure that code does what you want it to do.

However, to understand the following code, you may have to do some Google searches to understand how a double-dow works.

data have;

  input Fname$ lname$ id score$;

  cards;

John    smith.  100 a

John    smith.  100 b

Tom.   Fonda.  200 a

Tom.   Fonda.  300 b

Mila.   Fonda.   400 a

;

proc sort data=have;

  by Fname lname id;

run;

data want;

  do until (last.lname);

    set have;

    by Fname lname id;

    if first.lname then nameseq=1;

    else nameseq+1;

    if first.id then idseq=1;

    else idseq+1;

  end;  

  do until (last.lname);

    set have;

    by Fname lname id;

    random=ranuni(0);

    output;

  end;  

run;

proc sort data=want;

  by Fname lname random;

run;

data want (drop=nameseq idseq random);

  set want;

  by Fname lname;

  if nameseq ge 2 and idseq ge 2 then output;

  else if nameseq eq 1 and idseq eq 1 then output;

  else if first.lname then output;

run;

shubhayog
Obsidian | Level 7

Hi Tal,

I would like to update the reply from Jagadishkatam,

Please try with following

Data have;

  input FNAME$ LNAME$ ID SCORE$;

  datalines;

  John    smith.  100 a

  John    smith.  100 b

  Tom.   Fonda.  200 a

  Tom.   Fonda.  300 b

  Mila.   Fonda.   400 a

  Mila. smith. 500 a

  Zen Fonda. 600 b

  Tom. smith. 700 c

  ;

run;

proc sort

  data=have;

  by FNAME LNAME ID;

run;

data want;

  set have;

  by FNAME LNAME ID;

  if first.ID then ISEQ=1;

  else ISEQ+1;

  if first.LNAME then LSEQ=1;

  else LSEQ+1;

run;

data all (drop=ISEQ LSEQ);

  merge want have;

  by FNAME LNAME ID;

  if ISEQ eq LSEQ;

run;

Hope it will work for you,

Please share if any abnormal observations.

Thank you Jagadishkatam for hint Smiley Happy

Thanks and Regards,

Yogesh

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Well, not sure about your logic of "i randomly select only one record", would need to be some kind of logical selection.  In my example below I take the smallest for instance - first part gets records with all the same, second gets the different ones:

proc sql;

  create table WANT as

  select  A.FNAME,

          A.LNAME,

          A.ID,

          A.SCORE

  from    WORK.HAVE A

  where   not exists(select distinct FNAME from WORK.HAVE where FNAME=A.FNAME and LNAME=A.LNAME and ID ne A.ID)

  union all

  select  FNAME,

          LNAME,

          ID,

          SCORE

  from    (select * from WORK.HAVE B where exists(select distinct FNAME from WORK.HAVE where FNAME=B.FNAME and LNAME=B.LNAME and ID ne B.ID))

  group by FNAME,LNAME

  having  SCORE=min(SCORE);

quit;

Tal
Pyrite | Level 9 Tal
Pyrite | Level 9

luckely i dont need to do this  but thanks for the approaches  guys .

will  go over  them  just for practice Smiley Happy

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 924 views
  • 10 likes
  • 5 in conversation