Dear Consultants,
For the first time I used data step for creating dummy dataset(as I always get data in CSV files and never got a chance to use input statements), I hope now nobody will get anxious. Here is the dataset.
data dummy;
infile datalines missover;
input ID C1$ C2$ C3$ C4$ C5$ C6$ C7$ C8$ C9$ C10$ S1$ S2$ S3$ S4$ S5$;
datalines;
1234 a b . . . . . . . . b b d .
2235 b d a . . . . . . . a d a s
2568 a s d f g h i . . . x y z
2698 x x x . . . . . . . x x y z
5545 a x d f f f f f f f f s d f
5625 f k l a f . . . . . y t . . .
;
run;
Now from this dataset I want only those records in which values in columns C (C1 to C10) are repeated in columns S (S1 to S5), something like below.
1234 a b . . . . . . . . b b d .
2235 b d a . . . . . . . a d a s
2698 x x x . . . . . . . x x y z
5545 a x d f f f f f f f f s d f
Waiting for the expert advise.
Next code will do what you want:
data dummy;
infile datalines missover;
input ID C1$ C2$ C3$ C4$ C5$ C6$ C7$ C8$ C9$ C10$ S1$ S2$ S3$ S4$ S5$;
datalines;
1234 a b . . . . . . . . b b d .
2235 b d a . . . . . . . a d a s
2568 a s d f g h i . . . x y z
2698 x x x . . . . . . . x x y z
5545 a x d f f f f f f f f s d f
5625 f k l a f . . . . . y t . . .
;
run;
data want;
set dummy;
array cx $ c1-c10;
array sx $ s1-s10;
flag=0;
do i=1 to dim(cx);
if cx(i) not in (' ','.') and
whichc(cx(i),s1,s2,s3,s4,s5) then flag=1;
end;
if flag;
run;
Minor efficiency comment. Consider changing
do i=1 to dim(cx);
to
do i=1 to dim(cx) while (flag^=1);
You want that character appeared at least twice in s1-s5 ?
data dummy;
infile datalines missover;
input ID C1$ C2$ C3$ C4$ C5$ C6$ C7$ C8$ C9$ C10$ S1$ S2$ S3$ S4$ S5$;
datalines;
1234 a b . . . . . . . . b b d .
2235 b d a . . . . . . . a d a s
2568 a s d f g h i . . . x y z
2698 x x x . . . . . . . x x y z
5545 a x d f f f f f f f f s d f
5625 f k l a f . . . . . y t . . .
;
run;
data want;
set dummy;
array cx{*} $ c1-c10;
array sx{*} $ s1-s10;
do i=1 to dim(cx);
n=countc(cats(of sx{*}),cx{i});
if n gt 1 then do;output;leave;end;
end;
drop i n;
run;
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.