DarthPathos
Lapis Lazuli | Level 10

Hi all

 

Although I've been doing survey analysis for a while, I've never had to dig into the data to find possible duplicates.  These surveys are public and semi-anonymous (the person can provide an email address if they want, but there is no connection between the responses and the email), so I have more things to think about than the usual face-to-face surveys I'm used to analysing.  

 

I have toyed with the idea of using IP addresses, but that breaks down on public wifi, where several respondents may share the same address. I also can't be sure that a repeat respondent will answer all the questions the same way (they may be trying to get multiple gift cards). I've looked at something called the Hamming distance (SAS documentation), but I have no idea whether that's an appropriate method.
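For reference, here is a minimal sketch of what a Hamming-distance check on responses could look like. It is written in Python purely for illustration, not SAS, and the response IDs, the coded answers, and the `max_distance` threshold are all made-up assumptions. A small distance only marks a pair for manual review; it does not prove the two responses came from the same person, and it will miss anyone who deliberately varies their answers.

```python
from itertools import combinations

def hamming(a, b):
    """Count the positions at which two equal-length answer lists differ."""
    if len(a) != len(b):
        raise ValueError("responses must cover the same number of questions")
    return sum(x != y for x, y in zip(a, b))

def near_duplicates(responses, max_distance=1):
    """Return (id_a, id_b, distance) for every pair within max_distance."""
    pairs = []
    for (id_a, ans_a), (id_b, ans_b) in combinations(responses.items(), 2):
        d = hamming(ans_a, ans_b)
        if d <= max_distance:
            pairs.append((id_a, id_b, d))
    return pairs

# Hypothetical coded answers to a five-question survey.
responses = {
    "r1": [3, 1, 4, 2, 5],
    "r2": [3, 1, 4, 2, 4],  # differs from r1 on the last question only
    "r3": [1, 5, 2, 4, 3],
}

flagged = near_duplicates(responses, max_distance=1)
print(flagged)
```

Comparing every pair grows quadratically with the number of responses, so for a large database it may be more practical to compare only each new batch against the stored responses.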

 

I apologise that this is so vague, but I literally don't even know where to begin.  Any suggestions would be appreciated!

Chris

Has my article or post helped? Please mark as Solution or Like the article!
5 REPLIES
Reeza
Super User
Are you suspecting systematic fraud, or individuals who are responding multiple times? Systematic fraud is a bit easier to catch: look for surveys that come in at regular intervals or from unusual IPs. You can also get locations from IP addresses using PROC GEOCODE, and then block or remove all responses whose locations don't make sense. https://documentation.sas.com/?docsetId=grmapref&docsetTarget=p10c0w9s4g0w3in0zocuj7bzvf2s.htm&docse...
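As a rough illustration of the "regular intervals" idea, the sketch below measures how evenly spaced a set of submissions is, using the coefficient of variation of the gaps between them. It is in Python rather than SAS, and the timestamps and the idea of scoring one source (for example one IP) at a time are assumptions for the example. A value near zero means almost perfectly regular submissions, which may be worth a closer look; where to draw the line is a judgment call.

```python
from datetime import datetime
from statistics import mean, pstdev

def interval_cv(timestamps):
    """Coefficient of variation of the gaps between consecutive submissions.

    Returns None when there are too few submissions to judge.
    """
    ordered = sorted(timestamps)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return None
    return pstdev(gaps) / mean(gaps)

# Hypothetical submission times from two sources.
regular = [
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 1, 9, 5),
    datetime(2024, 1, 1, 9, 10),
    datetime(2024, 1, 1, 9, 15),
]
irregular = [
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 1, 9, 2),
    datetime(2024, 1, 1, 11, 40),
    datetime(2024, 1, 3, 16, 15),
]

print(interval_cv(regular))    # evenly spaced gaps
print(interval_cv(irregular))  # widely varying gaps
```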
DarthPathos
Lapis Lazuli | Level 10

Darn, I was hoping that multiple times was going to be easier... We're looking for people going in and answering the survey two or more times. The current proposal is that we'll review every 15 participants, using the entire database as the comparison (someone may answer again three weeks later).

 

I have the basics down (for example, data that is clearly gibberish or made up, inconsistent answers, etc.). I also recall reading about a way to detect response patterns in surveys, but I can't remember the specifics, and the material I find is beyond complicated.

 

Appreciate your time!

Chris

Reeza
Super User
This is common in academic testing and psychometrics, so you could search within that field.
DarthPathos
Lapis Lazuli | Level 10

That I could do!  Didn't think of that, thanks so much 🙂

ballardw
Super User

Since you mention "attempting to get multiple gift cards" perhaps one place to look is the "where to send the gift card" data.
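A minimal sketch of that check, in Python for illustration: group gift-card destinations after light normalisation and list any that appear more than once. This assumes the destination is stored alongside some response or submission ID, which the thread does not confirm (the original post says the email is not linked to the responses). The IDs and addresses are invented, and stripping a `+tag` suffix is an assumption that holds for many mail providers but not all.

```python
from collections import defaultdict

def normalize_email(address):
    """Lower-case, trim, and drop any '+tag' suffix from the local part."""
    local, _, domain = address.strip().lower().partition("@")
    local = local.split("+", 1)[0]
    return f"{local}@{domain}"

def shared_destinations(entries):
    """Map each normalised address used more than once to its response IDs."""
    groups = defaultdict(list)
    for response_id, address in entries:
        groups[normalize_email(address)].append(response_id)
    return {addr: ids for addr, ids in groups.items() if len(ids) > 1}

# Hypothetical (response ID, gift-card email) pairs.
entries = [
    ("r1", "chris@example.com"),
    ("r2", " Chris+card2@Example.com"),
    ("r3", "sam@example.org"),
]

repeats = shared_destinations(entries)
print(repeats)
```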

