Dear Consultants,
For the first time I used data step for creating dummy dataset(as I always get data in CSV files and never got a chance to use input statements), I hope now nobody will get anxious. Here is the dataset.
data dummy;
infile datalines missover;
input ID C1$ C2$ C3$ C4$ C5$ C6$ C7$ C8$ C9$ C10$ S1$ S2$ S3$ S4$ S5$;
datalines;
1234 a b . . . . . . . . b b d .
2235 b d a . . . . . . . a d a s
2568 a s d f g h i . . . x y z
2698 x x x . . . . . . . x x y z
5545 a x d f f f f f f f f s d f
5625 f k l a f . . . . . y t . . .
;
run;
Now from this dataset I want only those records in which values in columns C (C1 to C10) are repeated in columns S (S1 to S5), something like below.
1234 a b . . . . . . . . b b d .
2235 b d a . . . . . . . a d a s
2698 x x x . . . . . . . x x y z
5545 a x d f f f f f f f f s d f
Waiting for the expert advise.
Next code will do what you want:
data dummy;
infile datalines missover;
input ID C1$ C2$ C3$ C4$ C5$ C6$ C7$ C8$ C9$ C10$ S1$ S2$ S3$ S4$ S5$;
datalines;
1234 a b . . . . . . . . b b d .
2235 b d a . . . . . . . a d a s
2568 a s d f g h i . . . x y z
2698 x x x . . . . . . . x x y z
5545 a x d f f f f f f f f s d f
5625 f k l a f . . . . . y t . . .
;
run;
data want;
set dummy;
array cx $ c1-c10;
array sx $ s1-s10;
flag=0;
do i=1 to dim(cx);
if cx(i) not in (' ','.') and
whichc(cx(i),s1,s2,s3,s4,s5) then flag=1;
end;
if flag;
run;
Minor efficiency comment. Consider changing
do i=1 to dim(cx);
to
do i=1 to dim(cx) while (flag^=1);
You want that character appeared at least twice in s1-s5 ?
data dummy;
infile datalines missover;
input ID C1$ C2$ C3$ C4$ C5$ C6$ C7$ C8$ C9$ C10$ S1$ S2$ S3$ S4$ S5$;
datalines;
1234 a b . . . . . . . . b b d .
2235 b d a . . . . . . . a d a s
2568 a s d f g h i . . . x y z
2698 x x x . . . . . . . x x y z
5545 a x d f f f f f f f f s d f
5625 f k l a f . . . . . y t . . .
;
run;
data want;
set dummy;
array cx{*} $ c1-c10;
array sx{*} $ s1-s10;
do i=1 to dim(cx);
n=countc(cats(of sx{*}),cx{i});
if n gt 1 then do;output;leave;end;
end;
drop i n;
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.