Hi everyone.
Can I please seek your advice on how to write a SAS program to clean my survey data in a way that supports routine data checks, because data collection is still ongoing? That is, anything that has previously been checked and confirmed as a logical, plausible, or truly unavailable answer should not show up again the next time I run the data cleaning program.
Thank you very much.
This isn't really a Q&A scenario. If I were asked to do this, I would probably look at something like this:
Say you have data:
SUBJ Q1 Q2 Q3 Q4...
Now you need to keep a record on each observation, and the above structure isn't good for that. So step one is to have a normalised dataset:
SUBJ QNUM RESULT
... 1 ...
... 2 ...
...
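As a rough sketch of that normalisation step, a DATA step with an array can turn the wide layout into one row per subject and question. The dataset names (survey_wide, survey_long), the four numeric questions q1-q4, and the sample values are all made up for illustration.

```sas
/* Hypothetical wide survey data: one row per subject, one column per question */
data survey_wide;
  input subj q1 q2 q3 q4;
  datalines;
101 1 3 . 2
102 2 2 4 1
;
run;

/* Normalise: write one row per subject and question number */
data survey_long (keep=subj qnum result);
  set survey_wide;
  array answers{*} q1-q4;      /* assumes all questions are numeric */
  do qnum = 1 to dim(answers);
    result = answers{qnum};
    output;
  end;
run;
```

If the questions are a mix of character and numeric variables, a single array will not work as written, and the results would need converting to one type first.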
Why does this change matter so much? Because you can simply add additional data to each observation this way. Say you want a flag for locked, a date for last checked, and a coded term for the outstanding query:
SUBJ QNUM RESULT LOCKED LAST_DATE TERM
... 1 ... N 12DEC2015 Result_Missing
... 2 ... Y 14JAN2016
...
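One possible way to add those tracking columns is sketched below. The dataset names and sample rows are hypothetical, and starting every item as unlocked is my assumption; the flag would be set to Y only after someone has actually reviewed the item.

```sas
/* Hypothetical normalised data: one row per subject and question */
data survey_long;
  input subj qnum result;
  datalines;
101 1 1
101 2 .
102 1 2
;
run;

/* Add tracking columns to every observation */
data survey_track;
  set survey_long;
  length locked $1 term $40;
  format last_date date9.;
  locked    = 'N';                 /* assumption: nothing is locked until reviewed */
  last_date = today();             /* date this check was run */
  if missing(result) then term = 'Result_Missing';
  else term = ' ';                 /* no outstanding query */
run;
```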
The main thing will be knowing when to update things. Say you have cleaned a data item and consider it locked: what happens if the next data transfer comes in and the value has changed...
Personally, I would run your suite of checks on the whole data at each timepoint, and just compare the results to a list of outstanding items. Pretty simple, but manual.
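A minimal sketch of that comparison, under my own assumptions: the checks (missing value, value outside 1 to 5), the dataset names, and the "cleared" list of items already reviewed and accepted are all hypothetical. Matching on the stored result as well as the keys means a cleared item comes back if its value changes in a later transfer.

```sas
/* Hypothetical normalised data from the latest transfer */
data survey_long;
  input subj qnum result;
  datalines;
101 1 1
101 3 .
102 2 .
102 3 9
;
run;

/* Hypothetical list of items already reviewed and accepted as they were */
data cleared;
  input subj qnum result;
  datalines;
101 3 .
102 3 7
;
run;

/* Run every check on the full data each time */
data flagged;
  set survey_long;
  length term $40;
  if missing(result) then term = 'Result_Missing';
  else if result not in (1,2,3,4,5) then term = 'Out_Of_Range';
  if not missing(term);            /* keep only rows that failed a check */
run;

/* Keep only failures that were not already cleared with the same value */
proc sql;
  create table new_queries as
  select f.*
  from flagged as f
  where not exists
    (select * from cleared as c
     where c.subj = f.subj and c.qnum = f.qnum and c.result = f.result);
quit;
```

With this sample data, the missing answer for subject 101 question 3 is suppressed because it was cleared as missing, while subject 102 question 3 is listed again because its value differs from the one that was cleared. This relies on SAS treating two missing numeric values as equal in a comparison.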
...
If I thought keeping track was absolutely necessary I would ensure that my original data has a unique identifier for each record.
Then after I had checked/cleaned data I would have a data set of the identifiers checked.
The "next time" I cleaned the data I would subset the data to those records whose identifiers were not in the data set of those already checked. Then update the identifier set with those checked. Repeat as needed.
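Roughly, that cycle could look like the sketch below. The identifier variable rec_id, the dataset names, and the sample rows are invented for illustration, and the review itself is left as a comment.

```sas
/* Hypothetical survey data with a unique identifier per record */
data survey;
  input rec_id subj q1 q2;
  datalines;
1 101 1 3
2 102 2 .
3 103 4 2
;
run;

/* Hypothetical set of identifiers checked in earlier rounds */
data checked_ids;
  input rec_id;
  datalines;
1
2
;
run;

/* Subset to records not yet checked */
proc sql;
  create table to_check as
  select *
  from survey
  where rec_id not in (select rec_id from checked_ids);
quit;

/* ... review and clean the records in to_check here ... */

/* Record the newly checked identifiers for the next round */
proc append base=checked_ids data=to_check(keep=rec_id);
run;
```

Note that this only tracks whether a record was looked at, so a record that changes after it was checked will not resurface unless its identifier changes too.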
But there are a number of other issues involved that I don't go into without getting paid...