Hello Everyone,
I have the following problem and cannot fully solve. Please help me if you could.
I have 2 data set: (a) original with f1-f4 variables with many records and (b) filter which show 1 condition/value for f1 f2 f4
I want to create a new column "number" in the original file to count the number of time a given variable in original data has the same value as stated in filter file.
For the below example, the first record should has "number"= 3 as f1 f2 f4 match value in filter.
second record should has "number"=2 (f1 f2)
third record should has "number"=1 (f1).
I try to write a code but somehow the output file only has 1 record.
Please help me with my code or create a new one.
Thank you very much.
HHC
data originial;
input id f1 f2 f3 f4 ;
datalines;
1 0 1 2 4
2 0 1 4 6
3 0 8 6 9
4 1 3 6 8
5 8 6 5 3
;
data filter;
input f1 f2 f4;
datalines;
0 1 4
;
data want;
set originial;
array ori{4} f1 f2 f3 f4;
number=0;
set filter ;
array fil{3} f1 f2 f4;
do i=1 to 4;
do j=1 to 3;
if vname(ori{i})=vname(fil{j}) and ori{i}=fil{j} then do; number=number+1;end;
end;
end;
run;
I apparently misunderstood what you are trying to do. If both the variables and values have to be the same, then you might be looking for something like:
data want (drop=_:);
if _n_ eq 1 then set filter (rename=(f1=_f1 f2=_f2 f4=_f4)) ;
array ori{3} f1 f2 f4;
array fil{3} _f1 _f2 _f4;
set originial;
number=0;
do _n_=1 to 3;
if fil{_n_} eq (ori{_n_}) then number=number+1;
end;
run;
I think that the following does what you want but, if I correctly understand what you are trying to do, the second record should get a number result of 3.
data want;
if _n_ eq 1 then set filter (rename=(f1=_f1 f2=_f2 f4=_f4)) ;
array ori{4} f1 f2 f3 f4;
number=0;
array fil{3} _f1 _f2 _f4;
set originial;
do i=1 to 3;
test=0;
do j=1 to 4;
if fil{i} eq (ori{j}) then test=1;
end;
if test then number+1;
end;
run;
Thank you, Arthur for your help.
I am still trying to follow your code.
For the second record, as f1=0, f2=1 and f4=6 and filter is f1=0, f2=1 and f4=4 , only 2 condition met and number should be 2.
I think in IF statement, we still need condition of Name variable = Name variable vname(ori{i})=vname(fil{j}
HHC
I apparently misunderstood what you are trying to do. If both the variables and values have to be the same, then you might be looking for something like:
data want (drop=_:);
if _n_ eq 1 then set filter (rename=(f1=_f1 f2=_f2 f4=_f4)) ;
array ori{3} f1 f2 f4;
array fil{3} _f1 _f2 _f4;
set originial;
number=0;
do _n_=1 to 3;
if fil{_n_} eq (ori{_n_}) then number=number+1;
end;
run;
Thanks, Arthur.
I follow your approach and eventually, I make a transpose dataset to record the filter. So I can just do i=1 to N.
By the way, I am not clear how SAS read " if _n_ eq 1 then set filter" vs "set filter". Could you tell me why?
Many thanks.
HHC
data originial;
input id f1 f2 f3 f4 ;
datalines;
1 0 1 2 99
2 0 1 4 6
3 0 8 6 7
4 1 3 6 8
5 9 9 9 9
;
data filter;
input f1 f2;
datalines;
0 1
;
data ff; set originial;
if first.id; run;
data ff; set ff filter;
drop id;run;
proc transpose data=ff out=f3;run;
data f3; set f3; drop _name_;
proc transpose data=f3 out=f4;run;
data f4; set f4;drop _name_;run;
*-----------------------------------------------------------;
data want;
if _n_ eq 1 then set f4;
number=0;
array fil{4} col1-col4;
set originial;
array ori{4} f1 f2 f3 f4;
test=0;
do I=1 to 4;
if fil{i} eq (ori{i}) then NUMBER=NUMBER+1;
end;
run;
*-----------------------------------------------------------;
I'm not sure why you tried an approach by making copies of the original files, then transposing them, and then trying to obtain the desired results. One of your data steps wouldn't even run as you had used first.id without ever including a by id statement.
However, the reason I used if _n_ eq 1 to set filter was so that SAS wouldn't stop reading records after it read the one record in the filter data set. With the 'if _n_ eq 1 then' condition established, SAS will happily read all of the records from original. That was the reason why your original attempt only read one record.
Thank you for your help, Arthur.
The reason I did it is that there are about 100 condition and I dont want to manually count the position to put in DO statement.
HHC
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.