BookmarkSubscribeRSS Feed
_maldini_
Barite | Level 11

I am analyzing survey data. I want to remove respondents that completed the survey more than once. A five-digit code was generated that enabled repeat responders to be identified (Variable name = privatecode).

 

I see various ways of using PROC SORT to remove duplicates and leave one observation (i.e. NODUPRECS, NODUPKEY). I want to remove ALL observations with the same private code. 

 

How can I accomplish this?

 

Thanks!

8 REPLIES 8
Shmuel
Garnet | Level 18

Do you mean that you want to filter any privatecode if occures more than once ?

 

If so do:

proc sort data=have; by privatecode; run;

 

data want;

 set have;

   by privatecode;

       if not (first.privatecode and last.privatecode) then delete;

run;

 

_maldini_
Barite | Level 11

If by "filter" you mean remove, then yes!

_maldini_
Barite | Level 11

@Shmuel 

 

There is one problem. Almost 400 observations have privatecode missing. These observations are being deleted because the privatecode is the same, missing.

 

What's the best work-around here?

 

Thanks!

ballardw
Super User

  if not (first.privatecode and last.privatecode) and not missing(privatecode)  then delete;

 

 

Shmuel
Garnet | Level 18

@ballardw gave you the right answer how do include all observations with miising privatecode.

 

@ballardw - is it possible to be done with the SQL too ?

ballardw
Super User

@Shmuel wrote:

 

@ballardw - is it possible to be done with the SQL too ?


I'm not going to recreate a dataset to test since the OP didn't provide example data (hint)

 

I would start by changing

having count(*)=1

to

having count(*)=1 or missing(a.privatecode)

this is untested.

Shmuel
Garnet | Level 18

In order to include all observations with missing privatecode, change my code to:

 

 

data want;
 set have;
   by privatecode;
if not (first.privatecode and last.privatecode)
and not missing(privatecode) then delete; run;

 

ballardw
Super User

Identify the singles and keep. Proc sql will do this:

 

proc sql;
   create table want as
   select b.*
   from (
         select distinct privatecode
         from YOURDATESET
         group by privatecode
         having count(*)=1
        ) as a 
     left join YOURDATASET as b
     on a.private=b.privatecode;
quit;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1122 views
  • 1 like
  • 3 in conversation