I am looking a way to create a subset on all columns name that "start with" a specific character. My data contains the same column name multiple time with a digit number at the end and I want the code to always select all the columns regardless the last digit numbers and give the filter criteria on all the columns together .
So here from the above table , I want only those observations where any of these columns starting with "apple" contains "one" .
Thank you
Please do not supply example data in pictures, we can't use pictures in SAS code; do it with data steps with datalines, like this:
data have;
input no (apple1-apple5) ($);
datalines;
1 one two two two two
2 two tw two three two
;
proc transpose
data=have
out=long (rename=(col1=apple))
;
by no;
var apple:;
run;
data want_long;
merge
long
long (
in=want
keep=no apple
rename=(apple=_apple)
where=(_apple = "one")
)
;
by no;
if want;
run;
proc transpose
data=want_long
out=want_wide (drop=_name_)
;
by no;
var apple;
id _name_;
run;
You will most probably find that the long layout (created by the first TRANSPOSE) is better for your work with data. Also see Maxim 19.
Here is an array solution:
data want;
set have;
array apples(*) apple1-apple5;
do _N_=1 to dim(apples);
if apples(_N_)=:'one' then do;
output;
leave;
end;
end;
run;
If you have various datasets, with different numbers of "apples", you can change the array declaration to
array apples(*) apple:;
as long as no other columns have names that begin with "apple".
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.