BookmarkSubscribeRSS Feed
_maldini_
Barite | Level 11

I am analyzing survey data. I want to remove respondents that completed the survey more than once. A five-digit code was generated that enabled repeat responders to be identified (Variable name = privatecode).

 

I see various ways of using PROC SORT to remove duplicates and leave one observation (i.e. NODUPRECS, NODUPKEY). I want to remove ALL observations with the same private code. 

 

How can I accomplish this?

 

Thanks!

8 REPLIES 8
Shmuel
Garnet | Level 18

Do you mean that you want to filter any privatecode if occures more than once ?

 

If so do:

proc sort data=have; by privatecode; run;

 

data want;

 set have;

   by privatecode;

       if not (first.privatecode and last.privatecode) then delete;

run;

 

_maldini_
Barite | Level 11

If by "filter" you mean remove, then yes!

_maldini_
Barite | Level 11

@Shmuel 

 

There is one problem. Almost 400 observations have privatecode missing. These observations are being deleted because the privatecode is the same, missing.

 

What's the best work-around here?

 

Thanks!

ballardw
Super User

  if not (first.privatecode and last.privatecode) and not missing(privatecode)  then delete;

 

 

Shmuel
Garnet | Level 18

@ballardw gave you the right answer how do include all observations with miising privatecode.

 

@ballardw - is it possible to be done with the SQL too ?

ballardw
Super User

@Shmuel wrote:

 

@ballardw - is it possible to be done with the SQL too ?


I'm not going to recreate a dataset to test since the OP didn't provide example data (hint)

 

I would start by changing

having count(*)=1

to

having count(*)=1 or missing(a.privatecode)

this is untested.

Shmuel
Garnet | Level 18

In order to include all observations with missing privatecode, change my code to:

 

 

data want;
 set have;
   by privatecode;
if not (first.privatecode and last.privatecode)
and not missing(privatecode) then delete; run;

 

ballardw
Super User

Identify the singles and keep. Proc sql will do this:

 

proc sql;
   create table want as
   select b.*
   from (
         select distinct privatecode
         from YOURDATESET
         group by privatecode
         having count(*)=1
        ) as a 
     left join YOURDATASET as b
     on a.private=b.privatecode;
quit;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1519 views
  • 1 like
  • 3 in conversation