BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SarahW13
Obsidian | Level 7

I have a question about how to remove duplicate rows based on the value of a certain variable. I'm using the example below to explain what I am effectively trying to do.

 

In this dataset, each id has many duplicate values for the variable med. In some of those rows, there is also a value for the variable flag. In those cases, I want to preserve only the rows where there is a value for flag. However, there are other rows for the id and med value where there is NO value for the variable flag (eg, where id = 1 and Med = B); in those cases, I want to just keep all of those rows intact.

 

id  med  flag   seq

1    A       1       1

1    A       .        2

1    A       .        3

1    B       .        1

1    B       .        2

2    A       1       1

2    A       .        2

2    C       .        1

2    C       0       2

2    C       .        3

2    C       .        4

2    D       1       1

 

 

This is the dataset that I want to have as the end result:

id  med  flag   seq

1    A       1       1

1    B       .        1

1    B       .        2

2    A       1       1

2    C       0       2

2    D       1       1

 

Does anyone have any advice?

1 ACCEPTED SOLUTION

Accepted Solutions
amatsu
Obsidian | Level 7
data have;
input id med$ flag seq;
cards;
1 A 1 1
1 A . 2
1 A . 3
1 B . 1
1 B . 2
2 A 1 1
2 A . 2
2 C . 1
2 C 0 2
2 C . 3
2 C . 4
2 D 1 1
;

proc sort data=have out=flag (keep=id med) nodupkey;
  by id med;
  where flag^=.;
run;

data want1;
  merge have flag(in=_in);
  by id med;
  if (_in=1 and flag^=.) or _in=0 then output;
run;


or


data want2;
  set have;

  if _n_=1 then do;
     dcl hash hs( dataset:"have(where=(flag^=.))" );
     hs.definekey("id","med");
     hs.definedone();
  end;

  rc = hs.check();
  if rc =0 and flag^=. then output;
  if rc^=0 then output;
run;

View solution in original post

4 REPLIES 4
amatsu
Obsidian | Level 7
data have;
input id med$ flag seq;
cards;
1 A 1 1
1 A . 2
1 A . 3
1 B . 1
1 B . 2
2 A 1 1
2 A . 2
2 C . 1
2 C 0 2
2 C . 3
2 C . 4
2 D 1 1
;

proc sort data=have out=flag (keep=id med) nodupkey;
  by id med;
  where flag^=.;
run;

data want1;
  merge have flag(in=_in);
  by id med;
  if (_in=1 and flag^=.) or _in=0 then output;
run;


or


data want2;
  set have;

  if _n_=1 then do;
     dcl hash hs( dataset:"have(where=(flag^=.))" );
     hs.definekey("id","med");
     hs.definedone();
  end;

  rc = hs.check();
  if rc =0 and flag^=. then output;
  if rc^=0 then output;
run;

SarahW13
Obsidian | Level 7

Thank you so much, amatsu! This worked! 🙂

Ksharp
Super User
data have;
input id med$ flag seq;
cards;
1 A 1 1
1 A . 2
1 A . 3
1 B . 1
1 B . 2
2 A 1 1
2 A . 2
2 C . 1
2 C 0 2
2 C . 3
2 C . 4
2 D 1 1
;
data want;
 do until(last.med);
   set have;
   by id med notsorted;
   if not missing(flag) then yes=1;
 end;
 do until(last.med);
   set have;
   by id med notsorted;
   if yes then do;if not missing(flag) then output;end;
    else output;
 end;
 drop yes;
 run;
novinosrin
Tourmaline | Level 20
data have;
input id med$ flag seq;
cards;
1 A 1 1
1 A . 2
1 A . 3
1 B . 1
1 B . 2
2 A 1 1
2 A . 2
2 C . 1
2 C 0 2
2 C . 3
2 C . 4
2 D 1 1
;

proc sql;
create table want as
select *
from have
group by id,med
having flag=max(flag)
order by id,med,seq;
quit;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1533 views
  • 1 like
  • 4 in conversation