I found this question in an exam and was confused by it. Two datasets were validated using PROC COMPARE. The values matched exactly, but the number of observations differed: one dataset has n=5 and the other has n=6. The question was: is the validation a success or a failure?
code:
resetline;
data a;
do x=1 to 5;
output;
end;
run;
data b;
do x=1 to 6;
output;
end;
run;
proc compare base=a compare=b;
run;
log:
11
12   proc compare base=a compare=b;
13   run;

NOTE: There were 5 observations read from the data set WORK.A.
NOTE: There were 6 observations read from the data set WORK.B.
NOTE: PROCEDURE COMPARE used (Total process time):
      real time           0.03 seconds
      user cpu time       0.04 seconds
      system cpu time     0.00 seconds
      memory              8057.07k
      OS Memory           30208.00k
output:
log with SYSINFO:
14
15 %put * &=SYSINFO. *;
* SYSINFO=128 *
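After looking at the SYSINFO return codes here:
With value 128 (the bit that is set when the comparison data set has an observation not in the base data set), I would say: "Comparison Failed". In general I consider everything greater than or equal to 64 a fail (sometimes 16 too).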
Bart
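To make the rule in the reply above concrete, here is a minimal sketch of checking the return code in code rather than by eye. It assumes the data sets a and b from the question; the macro variable name rc and the messages are made up for illustration. The bit meanings used (64, 128, 4096) are the documented PROC COMPARE SYSINFO bits for observations missing from one side and for unequal values.

```sas
/* Compare quietly; we only want the return code. */
proc compare base=a compare=b noprint;
run;

/* Save SYSINFO right away: the next step boundary resets it. */
/* rc is a hypothetical name chosen for this sketch.          */
%let rc = &sysinfo;

data _null_;
  rc = &rc;
  if rc = 0 then put 'Data sets match exactly.';
  else do;
    /* BAND does a bitwise AND, so each test isolates one SYSINFO bit. */
    if band(rc, 64)   then put 'Base has an observation not in comparison.';
    if band(rc, 128)  then put 'Comparison has an observation not in base.';
    if band(rc, 4096) then put 'A value comparison was unequal.';
    /* The ">= 64 is a fail" rule of thumb from the reply above. */
    if rc >= 64 then put 'Verdict: comparison failed.';
  end;
run;
```

Whether bits below 64 (labels, formats, lengths) should also count as a failure is a judgement call, as the reply notes for 16.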
What criteria were provided in the question for determining success or failure?
That could depend on why someone runs PROC COMPARE; the situation or reason determines what counts as "success" or "failure".
Consider: I am running code on an updated source file that is expected to have more observations than an older file, and I want to see whether my process on the updated source produces the same results for the records the two files share. That has a different "success" than if I am recovering a backup, for example, and want to verify that the compared data set exactly matches the archive copy.
If the problem gives no further description of what is being judged (success or failure), then the question is poorly constructed because it lacks the necessary details.
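The two scenarios in this reply can be sketched as two different pass rules applied to the same return code. This is only an illustration under assumptions: a and b are the data sets from the question, rc and other_bits are hypothetical names, and treating every bit other than 128 as a failure in the first scenario is a choice made here, not something the reply specifies.

```sas
proc compare base=a compare=b noprint;
run;

/* Capture SYSINFO before any later step resets it. */
%let rc = &sysinfo;

data _null_;
  rc = &rc;

  /* Scenario 1: the newer file is expected to be longer, so the  */
  /* 128 bit (comparison has an observation not in base) is       */
  /* tolerated. Subtracting that bit leaves all other findings.   */
  other_bits = rc - band(rc, 128);
  if other_bits = 0 then put 'Scenario 1 (extra observations allowed): pass';
  else put 'Scenario 1 (extra observations allowed): fail';

  /* Scenario 2: a restored backup must match the archive exactly. */
  if rc = 0 then put 'Scenario 2 (exact copy required): pass';
  else put 'Scenario 2 (exact copy required): fail';
run;
```

For the data in the question (SYSINFO=128) the first rule passes and the second fails, which is why the intended answer depends on how "validation" is defined.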
@Anandu wrote:
So then what is the answer? Did the validation succeed because the values are the same, or did it fail because the number of observations differs? It was a multiple-choice exam question.
I vote failed. Especially if the question was about validating a dataset. It's possible if someone said "validating the data in the smaller dataset" they might claim it passed, but if I had to choose for a test, I'd definitely choose failed.
I love the ERROR option for proc compare when validating data.
12   proc compare base=a compare=b error;
13   run;

ERROR: Data set WORK.B contains 1 observations not in WORK.A.
ERROR: The data sets WORK.A and WORK.B do not contain the same data. One or both data sets contain variables or observations not in the other. However, all comparisons are equal for the data in common.
NOTE: There were 5 observations read from the data set WORK.A.
NOTE: There were 6 observations read from the data set WORK.B.