I am looking a way to create a subset on all columns name that "start with" a specific character. My data contains the same column name multiple time with a digit number at the end and I want the code to always select all the columns regardless the last digit numbers and give the filter criteria on all the columns together .
So here from the above table , I want only those observations where any of these columns starting with "apple" contains "one" .
Thank you
Please do not supply example data in pictures, we can't use pictures in SAS code; do it with data steps with datalines, like this:
data have;
input no (apple1-apple5) ($);
datalines;
1 one two two two two
2 two tw two three two
;
proc transpose
data=have
out=long (rename=(col1=apple))
;
by no;
var apple:;
run;
data want_long;
merge
long
long (
in=want
keep=no apple
rename=(apple=_apple)
where=(_apple = "one")
)
;
by no;
if want;
run;
proc transpose
data=want_long
out=want_wide (drop=_name_)
;
by no;
var apple;
id _name_;
run;
You will most probably find that the long layout (created by the first TRANSPOSE) is better for your work with data. Also see Maxim 19.
Here is an array solution:
data want;
set have;
array apples(*) apple1-apple5;
do _N_=1 to dim(apples);
if apples(_N_)=:'one' then do;
output;
leave;
end;
end;
run;
If you have various datasets, with different numbers of "apples", you can change the array declaration to
array apples(*) apple:;
as long as no other columns have names that begin with "apple".
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.