Summary: I have a very large table (more than 1000 columns) where I need to find (create a list that I can copy from) the variable names that have values that contain the string "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX".
I have sample code I wrote and some sample data to help:
data have;
input cust_id lorem $ ipsum $ dolor $ sit $ amet $ consectetur $ adipiscing $ elit $ sed $ do $ eiusmod $ tempor $
;
datalines;
1 XXXX KW23 RW16 BA62 KR39 HO32 FX55 VT44 RG81 XXXX SI10 ZK27
1 WW10 ST89 RT44 YQ20 FD70 ZB99 MF40 TV38 HE86 XXXX MU18 OC98
1 IQ59 VM72 XXXX FI95 AT96 GU59 JO89 AW21 FV34 XXXX TU46 TK69
2 IX95 KV92 NU30 BZ39 YS62 VU51 RW99 YX60 AG35 XXXX CM70 BV53
2 XXXX FS77 QI34 NT73 CW86 ED83 HU82 PY22 WS60 XXXX QM63 AN29
3 UP55 RU88 DO14 YW45 OR14 KH75 RA76 MZ65 FD65 XXXX BR51 TE67
4 VX45 NP14 II57 VR57 WY25 FH58 EZ58 XXXX BJ11 XXXX TH83 FY17
4 UJ73 HE29 GV27 FT69 RS17 MK32 CE47 ZJ30 KK48 XXXX KD69 XXXX
4 ZE25 UM46 XXXX UC55 XX99 YS61 YJ23 IM52 ZH76 XXXX MO29 XXXX
5 EA96 AY41 DO99 US92 VC61 HV97 ZR44 CF46 YN99 HC55 HC72 XXXX
;
run;
quit;
data have_char;
set have (KEEP=_CHARACTER_);
run;
proc contents data = have_char NOPRINT OUT=VBLS (KEEP=NAME VARNUM); run;
proc sort data = VBLS; by VARNUM; run;
PROC SQL NOPRINT;
SELECT DISTINCT NAME
INTO :var_list separated by " "
FROM VBLS;
QUIT;
data want_if_it_worked (keep=VARS_WITH_XXXX);
set have;
array val{12} &var_list.;
VARS_WITH_XXXX=vname(val(whichc("XXXX",of val(*))));
run;
This is the table I am getting with the above code that I do not want:
Below is the table I want to output from the "have" data:
Thanks for the help!
data have;
input cust_id lorem $ ipsum $ dolor $ sit $ amet $ consectetur $ adipiscing $ elit $ sed $ do $ eiusmod $ tempor $
;
datalines;
1 XXXX KW23 RW16 BA62 KR39 HO32 FX55 VT44 RG81 XXXX SI10 ZK27
1 WW10 ST89 RT44 YQ20 FD70 ZB99 MF40 TV38 HE86 XXXX MU18 OC98
1 IQ59 VM72 XXXX FI95 AT96 GU59 JO89 AW21 FV34 XXXX TU46 TK69
2 IX95 KV92 NU30 BZ39 YS62 VU51 RW99 YX60 AG35 XXXX CM70 BV53
2 XXXX FS77 QI34 NT73 CW86 ED83 HU82 PY22 WS60 XXXX QM63 AN29
3 UP55 RU88 DO14 YW45 OR14 KH75 RA76 MZ65 FD65 XXXX BR51 TE67
4 VX45 NP14 II57 VR57 WY25 FH58 EZ58 XXXX BJ11 XXXX TH83 FY17
4 UJ73 HE29 GV27 FT69 RS17 MK32 CE47 ZJ30 KK48 XXXX KD69 XXXX
4 ZE25 UM46 XXXX UC55 XX99 YS61 YJ23 IM52 ZH76 XXXX MO29 XXXX
5 EA96 AY41 DO99 US92 VC61 HV97 ZR44 CF46 YN99 HC55 HC72 XXXX
;
run;
proc transpose data=have(obs=0) out=vname;
var _character_;
run;
data _null_;
set vname end=last;
if _n_=1 then call execute('proc sql;create table temp as select ');
call execute(catx(' ','max(upcase(',_name_,')="XXXX") as ',_name_));
if not last then call execute(',');
else call execute('from have ;quit;');
run;
proc transpose data=temp out=want(where=(col1=1)) name=vars_with_xxxx;
var _all_;
run;
Like this?
data WANT;
set HAVE;
array C[*] _CHARACTER_;
if whichc('XXXX',of C[*]) then
FIRSTVAR_WITH_XXX=vname(C[whichc('XXXX',of C[*])]);
run;
FIRSTVAR_WITH_XXX
lorem
do
dolor
do
lorem
do
elit
do
dolor
tempor
data have;
input cust_id lorem $ ipsum $ dolor $ sit $ amet $ consectetur $ adipiscing $ elit $ sed $ do $ eiusmod $ tempor $
;
datalines;
1 XXXX KW23 RW16 BA62 KR39 HO32 FX55 VT44 RG81 XXXX SI10 ZK27
1 WW10 ST89 RT44 YQ20 FD70 ZB99 MF40 TV38 HE86 XXXX MU18 OC98
1 IQ59 VM72 XXXX FI95 AT96 GU59 JO89 AW21 FV34 XXXX TU46 TK69
2 IX95 KV92 NU30 BZ39 YS62 VU51 RW99 YX60 AG35 XXXX CM70 BV53
2 XXXX FS77 QI34 NT73 CW86 ED83 HU82 PY22 WS60 XXXX QM63 AN29
3 UP55 RU88 DO14 YW45 OR14 KH75 RA76 MZ65 FD65 XXXX BR51 TE67
4 VX45 NP14 II57 VR57 WY25 FH58 EZ58 XXXX BJ11 XXXX TH83 FY17
4 UJ73 HE29 GV27 FT69 RS17 MK32 CE47 ZJ30 KK48 XXXX KD69 XXXX
4 ZE25 UM46 XXXX UC55 XX99 YS61 YJ23 IM52 ZH76 XXXX MO29 XXXX
5 EA96 AY41 DO99 US92 VC61 HV97 ZR44 CF46 YN99 HC55 HC72 XXXX
;
run;
proc transpose data=have(obs=0) out=vname;
var _character_;
run;
data _null_;
set vname end=last;
if _n_=1 then call execute('proc sql;create table temp as select ');
call execute(catx(' ','max(upcase(',_name_,')="XXXX") as ',_name_));
if not last then call execute(',');
else call execute('from have ;quit;');
run;
proc transpose data=temp out=want(where=(col1=1)) name=vars_with_xxxx;
var _all_;
run;
A slightly different twist with additional information. Once you have read in the data:
data want;
set have;
array xx {*} _character_;
do _n_ = 1 to dim(xx);
if xx{_n_} = 'XXXX' then do;
xxxx_name = vname(xx{_n_});
output;
end;
end;
keep xxxx_name;
run;
proc freq data=want;
tables xxxx_name;
run;
This way, you not only get the list of variables names, you also get a count of how many times each name contains XXXX.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.