Hi Everyone,
I have: 1 data-file with variable (a b c d...) and 1 condition-file.
Each row in the condition file contains certain condition (say a=1 and b=3 and c=5)
I want to count the number of records in the data-file that meet each condition in condition-file
So for the first condition: a =4 b =4 c =5 , there are 2 record in data-file meet that criteria.
and for the first condition: b =5 c =9 d =6 , there are 1 record in data-file meet that criteria.
I try to change a PROC IML code but it will not work.
Any help to fix the code or make new code is very much appreciated.
HHC
data have;
input date a b c d;
datalines;
1 4 4 5 5
2 4 4 5 9
3 4 5 5 0
4 3 6 8 9
5 3 5 0 0
6 4 5 1 2
7 6 5 9 6
;run;
data condition;
input id a_name $ a_value b_name $ b_value c_name $ c_value;
datalines;
1 a 4 b 4 c 5
2 b 5 c 9 d 6
;run;
proc iml;
use have(drop=date);
read all var _all_ into have[c=vnames];
close;
use condition;
read all var{a_name a_value b_name b_value c_name c_value};
close;
n=nrow(a_name);
do i=1 to n;
temp=have[,a_name[i]]||have[,b_name[i]||have[,c_name[i]];
idx=loc(temp[,1]=a_value[i] | temp[,2]=b_value[i] | temp[,3]=c_value[i]);
want=temp[idx,];
var1=var1//a_name[i];
var2=var2//b_name[i];
var3=var3//c_name[i];
value1=value1//a_value[i];
value2=value2//b_value[i];
value3=value3//c_value[i];
Total_N=Total_N//nrow(want);
N_match=N_match//ncol(loc(temp[,1]=a_value[i] & temp[,2]=b_value[i] & temp[,3]=c_value[i]));
end;
create want var {var1 var2 var3 value1 value2 value3 Total_N N_match};
append;
close;
quit;
As far as I can see your code is correct, though you were missing a ] along the way.
This code should work and I have commented where the error was.
data have;
input date a b c d;
datalines;
1 4 4 5 5
2 4 4 5 9
3 4 5 5 0
4 3 6 8 9
5 3 5 0 0
6 4 5 1 2
7 6 5 9 6
;run;
data condition;
input id a_name $ a_value b_name $ b_value c_name $ c_value;
datalines;
1 a 4 b 4 c 5
2 b 5 c 9 d 6
;run;
proc iml;
use have(drop=date);
read all var _all_ into have[c=vnames];
close have;
print have;
use condition;
read all var{a_name a_value b_name b_value c_name c_value};
close condition;
n=nrow(a_name);
do i=1 to n;
temp=have[,a_name[i]] || have[,b_name[i]] || have[,c_name[i]]; /* Your mistake was here :) */
idx=loc( temp[,1] = a_value[i] | temp[,2] = b_value[i] | temp[,3] = c_value[i] );
want=temp[idx,];
var1=var1//a_name[i];
var2=var2//b_name[i];
var3=var3//c_name[i];
value1=value1//a_value[i];
value2=value2//b_value[i];
value3=value3//c_value[i];
Total_N=Total_N//nrow(want);
N_match=N_match//ncol(loc(temp[,1]=a_value[i] & temp[,2]=b_value[i] & temp[,3]=c_value[i]));
end;
create want var {var1 var2 var3 value1 value2 value3 Total_N N_match};
append;
close want;
quit;
The N_MATCH variable gives the desired values as below
I just updated the post, I hope it is clearer now.
Thank you.
HC
As far as I can see your code is correct, though you were missing a ] along the way.
This code should work and I have commented where the error was.
data have;
input date a b c d;
datalines;
1 4 4 5 5
2 4 4 5 9
3 4 5 5 0
4 3 6 8 9
5 3 5 0 0
6 4 5 1 2
7 6 5 9 6
;run;
data condition;
input id a_name $ a_value b_name $ b_value c_name $ c_value;
datalines;
1 a 4 b 4 c 5
2 b 5 c 9 d 6
;run;
proc iml;
use have(drop=date);
read all var _all_ into have[c=vnames];
close have;
print have;
use condition;
read all var{a_name a_value b_name b_value c_name c_value};
close condition;
n=nrow(a_name);
do i=1 to n;
temp=have[,a_name[i]] || have[,b_name[i]] || have[,c_name[i]]; /* Your mistake was here :) */
idx=loc( temp[,1] = a_value[i] | temp[,2] = b_value[i] | temp[,3] = c_value[i] );
want=temp[idx,];
var1=var1//a_name[i];
var2=var2//b_name[i];
var3=var3//c_name[i];
value1=value1//a_value[i];
value2=value2//b_value[i];
value3=value3//c_value[i];
Total_N=Total_N//nrow(want);
N_match=N_match//ncol(loc(temp[,1]=a_value[i] & temp[,2]=b_value[i] & temp[,3]=c_value[i]));
end;
create want var {var1 var2 var3 value1 value2 value3 Total_N N_match};
append;
close want;
quit;
The N_MATCH variable gives the desired values as below
Oh, that is great.
Thanks a lot.
HC
HaHa. You nailed it. you really should learn some IML code . it is good for you. data have; input date a b c d; datalines; 1 4 4 5 5 2 4 4 5 9 3 4 5 5 0 4 3 6 8 9 5 3 5 0 0 6 4 5 1 2 7 6 5 9 6 ;run; data condition; input id a_name $ a_value b_name $ b_value c_name $ c_value; datalines; 1 a 4 b 4 c 5 2 b 5 c 9 d 6 ;run; proc iml; use have(drop=date); read all var _all_ into have[c=vnames]; close; use condition; read all var{id a_name a_value b_name b_value c_name c_value}; close; n=nrow(a_name); count=j(n,1,.); do i=1 to n; count[i]=sum(have[,a_name[i]]=a_value[i] & have[,b_name[i]]=b_value[i]& have[,c_name[i]]=c_value[i] ); end; create want var{id a_name a_value b_name b_value c_name c_value count}; append; close; quit;
Can you tell me what the meaning of have[c=vnames];
I dont see you use "c" anywhere but when I change "c" into something else, code not work. But "vnames", I can change to anything.
read all var _all_ into have[c=vnames];
HC
Yes. you have to specify "c=vnames" ,
It doesn't matter what 'vnames' is, you can name it any variable name.
Specify it is for using the variable name reference .
like
have[,a_name[i]]
note here a_name[i] is 'a' not 1 ,
so "c=vnames" is making sure you can refer to column of matrix as character 'a'.
Default index is like 1,2,........
"c=vnames" is copying variable names into row vector 'vnames'. C here is keyword means column
Thanks a lot.
I will try 🙂
HC
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.