A portion of my data is as in the attachment. In the attachment, there are three items, "key" is item key, COL1 to COL3 are the responses of three candidates to those three items. I want to choose any pair from three candidates, then compare the responses of each pair with the "key". COL12, COL13, COL23 are the results I wanted. Following are the simple SAS code to create COL12 variable.
data want;
set have;
if COL1=COL2=key then COL12="same resp and both correct";
else if COL1=COL2 and COL1^=key then COL12="same resp and both wrong";
else if COL1=COL2 and COL2^=key then COL12="same resp and both wrong";
else if COL1=key then COL12="different resp and one of them correct";
else if COL2=key then COL12="different resp and one of them correct";
else COL12="different resp and both wrong";
run;
Because I have many 52 columns, so the combinations of each pair would be C(52,2)=52!.(52-2)!2!=1326. Any quick way to help me get what I wanted?
PLEASE HELP!
thank you!
Hi @superbug See if this helps,
data have;
input itempos key COL1-COL3;
infile datalines dlm=',';
datalines;
1,1,3,3,1
2,3,3,3,3
3,1,1,1,1
;
data want;
set have;
array c col:;
length category $30 Pair $20;
do _i=1 to dim(c)-1;
do _j=_i+1 to dim(c);
k1=c(_i);
k2=c(_j);
Pair=catx('_','Pair',_i,_j);
if k1=k2=key then category="same resp and both wrong";
else if k1=k2^=key then category="same resp and both wrong";
else if k1=key or k2=key then category="different resp and one of them correct";
output;
end;
end;
keep itempos key category pair;
run;
proc transpose data=want out=final_want(drop=_:);
by itempos key notsorted;
var category;
id pair;
run;
Feel free to let us know if you need any clarifications. All the best!
As data in the attachment. I have 250 items , "key" is the key of each item, COL1 to COL52 are the responses of 52 candidates to those 250 items. I want to choose any pair from those 52 candidates, that is, C(52, 2)=52!/(52-2)!2!, and compare the responses of EACH PAIR with the "key" . For example,
data comb1;
set comb;
if COL1=COL2=key then category="same resp and both correct";
else if COL1=COL2 and COL1^=key then category="same resp and both wrong";
else if COL1=COL2 and COL2^=key then category="same resp and both wrong";
else if COL1=key then category="different resp and one of them correct";
else if COL2=key then category="different resp and one of them correct";
else category="different resp and both wrong";
run;
Because C(52, 2)=52! / (52-2)!2! = 1326 pairs, for each pair, I want to use the code above, how should I do it in SAS?
thank you!
@superbug in problems like this, it is always easier to solve a smaller problem, which resembles your actual problem and then scale up.
Suppose your data looks like this
data have;
input itempos key COL1-COL3;
infile datalines dlm=',';
datalines;
1,1,3,3,1
2,3,3,3,3
3,1,1,1,1
;
What do you want the result to look like?
use the small data you suggested, I'd like the result to be as in the attachment (have.csv). thank you!
Why do you want to do this? As you point out yourself, this will require 52 choose 2 = 1326 = too many new variables in your original problem.
It can be done fairly simple, but it is not advisable. It seems like a part of a bigger problem so what do you want to do with the new variables?
under each pair, there are different categories on the responses compared with the key. I want to get the freq of the categories in each pair.
you said "It can be done fairly simple, but it is not advisable", could you please let me know your simple way?
thanks!
Hi @superbug See if this helps,
data have;
input itempos key COL1-COL3;
infile datalines dlm=',';
datalines;
1,1,3,3,1
2,3,3,3,3
3,1,1,1,1
;
data want;
set have;
array c col:;
length category $30 Pair $20;
do _i=1 to dim(c)-1;
do _j=_i+1 to dim(c);
k1=c(_i);
k2=c(_j);
Pair=catx('_','Pair',_i,_j);
if k1=k2=key then category="same resp and both wrong";
else if k1=k2^=key then category="same resp and both wrong";
else if k1=key or k2=key then category="different resp and one of them correct";
output;
end;
end;
keep itempos key category pair;
run;
proc transpose data=want out=final_want(drop=_:);
by itempos key notsorted;
var category;
id pair;
run;
Feel free to let us know if you need any clarifications. All the best!
your solution worked!
Thank you so much! give you a big thumb up!!
Because I was so eager to get correct solutions in my previous post, while I was waiting for help, I reposted it before I get your correct solution.
@superbug No worries, No matter what, there is never a need to duplicate for the simple reason 1. often confuses which one to respond to 2. Ethically doesn't go well 3. Yes I fully understand, sometimes the requirement is urgent and can get us panicky, however still better to stick to one and wait.
Trust me, there are about 30+ of us who will respond and will attempt to offer the much needed solution. Particularly on gaming alike these kinda questions we treat this as fun. For a rather advisory role, I don't but on the other hand I seek and take advice from senior people like @Reeza (Thanking Reeza for merging the thread).
So either way, you have it all. Just be patient for one of the 30+ mentioned to respond, and on top of that a new person outside of the expected 30+ has surprised many a times.
All the best! Take care and have fun!
You SAS expert / masters are so amazing! I feel so fortunate to join the SAS support community recently.
A follow up problem need your help.
After I get those1326 pairs, I use the following to get freq on the category of each pair
proc freq data=final_want;
table Pair_:/out=freq;
run;
I got the freq tables in the "Results Viewer". For example, "pair_1_2" looks like following
Pair_1_2 | Frequency | Percent | Cumulative Frequency |
Cumulative Percent |
---|---|---|---|---|
different resp and both wrong | 35 | 17.50 | 35 | 17.50 |
different resp and one of them | 37 | 18.50 | 72 | 36.00 |
same resp and both correct | 118 | 59.00 | 190 | 95.00 |
same resp and both wrong | 10 | 5.00 | 200 | 100.00 |
since each pair has a freq table, could you please let me know how to stack all of those freq table? Then using the stacked freq table, I want to divided "frequency" by 149, and get a "new_pct" variable, the results I wanted will look like the following:
category | freq | pct_new | ||
different resp and both wrong | 35 | 149 | 0.23 | |
different resp and one of them | 37 | 149 | 0.25 | |
pair_1_2 | same resp and both correct | 118 | 149 | 0.79 |
same resp and both wrong | 10 | 149 | 0.07 | |
different resp and both wrong | 42 | 149 | 0.28 | |
pari_1_3 | different resp and one of them | 34 | 149 | 0.23 |
same resp and both correct | 114 | 149 | 0.77 | |
same resp and both wrong | 10 | 149 | 0.07 |
Many thanks!!
Hi @superbug That looks like you want a Tabulated report(cross) report. Am i correct? If yes, the leads to the utilization of PROC TABULATE / PROC REPORT . If this understanding of mine is correct, You are better off starting a new thread mentioning what you HAVE (the outcome of proc freq) and the report design structure you want.
That would make it easy for your to follow and for respondents to offer pointed answers. People like Ballardw , Reeza are super proficient in Report procedures while I am mostly into video games(aka programming) . Makes sense?
PS Always make sure your IF THENs are properly tested with solutions I offer in this or any other thread because sometimes I write so fast making me a typo champion. lol
Thanks much for your advice!
a while ago, you help about pick a 2 people from a certain number of people. Now if I want to choose 4 people from a certain number of people, I am trying to modify the code you help, but there are error messages, could you help please. Below is the code for pick 4 people from a certain number of people. Thanks much!
data comb1;
set comb;
array c col:;
length cat $30 Pair $20;
do _i=1 to dim(c)-1;
do _j=_i+1 to dim(c);
do _m=_j+1 to dim(c)+1;
do _n=_m+1 to dim(c)+2;
k1=c(_i);
k2=c(_j);
k3=c(_m);
k4=c(_n);
Pair=catx('_','Pair',_i,_j,_m,_n);
if k1=k2=k3=k4=itemkey then cat="same resp and all correct";
else if k1=k2=k3=k4^=itemkey then cat="same resp and all wrong";
else if _i=itemkey or _j=itemkey or _m=itemkey or _n=itemkey then cat="different resp and one of them correct";
else cat="different resp and all wrong";
output;
end;
end;
end;
end;
keep itempos itemkey cat pair;
run;
Hi @superbug Pleasure is and would be mine if i could help. If it's not urgent, can you hang in there till tomorrow EST (US time ) as I'm doing the weekend chores like laundry and cleaning at the moment. Stay safe
Thank you for replying my post. Sorry for bothering you on the weekend. Of course I'll wait until you are available.
Enjoy doing the weekend chores.
Stay safe!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.