Hi everyone,
I'm about to cry because of SAS! I feel like I'm going crazy! Here's my issue: I want to create variables to flag specified keywords that are found in a TOPIC variable. Here's the code that I have:
data test ; format topics $char256. ; infile datalines dsd dlm=":"; input topics $ ; datalines; PHYSICAL COMPUTING BASIC COMPUTER TEACHING PROJECTS USING WORD PROCESSING AND DATA ANALYZING POWERPOINT GAME DESIGN WEB DESIGN AND DEVELOPMENT MOBILE APP DEVELOPMENT PHYSICAL COMPUTING MAKER ACTIVITIES WEB DESIGN AND DEVELOPMENT TEXT-BASED PROGRAMMING ; run; %macro test() ; DATA test ; SET test ; array keywords [6] $ t1-t6 ('PHYSICAL', 'ACTIVITES', 'GAME', 'DATA', 'CODING', 'MOBILE') ; array varnames [6] physcomp makeract gamedesign datasci coding mobiledev ; %do i = 1 %to 6 ; if findw(topics, keywords[&i]) > 0 then varnames[&i] = 1 ; else varnames[&i] = 0 ; %end ; RUN ; %mend ; %test() ;
which produces the following data:
PHYSICAL COMPUTING BASIC COMPUTER TEACHING PROJECTS USING WORD PROCESSING AND DATA ANALYZING POWERPOINT
1 0 0 0 0 0
GAME DESIGN
0 0 0 0 0 0
WEB DESIGN AND DEVELOPMENT MOBILE APP DEVELOPMENT
0 0 0 0 0 0
PHYSICAL COMPUTING MAKER ACTIVITIES WEB DESIGN AND DEVELOPMENT TEXT-BASED PROGRAMMING
1 0 0 0 0 0
Huh? Why isn't GAME in the second observation being recognized? Or MOBILE in the third observation? Why is PHYSICAL the only keyword being recognized? Please help, I'm at my wit's end!!!
I can't test and explore right now, but here's an idea to examine.
When you create character variables with an array the default length is 8. So "GAME" is not the value "GAME " is the value. I'm not sure how FINDW handles that, but you can easily find out. Also note that some of your keywords are longer than 8 characters long, so the values are being chopped off.
Also, the order of the keyword/varnames list doesn't matter -- even if I put game design first, it still only recognizes physical computing.
I can't test and explore right now, but here's an idea to examine.
When you create character variables with an array the default length is 8. So "GAME" is not the value "GAME " is the value. I'm not sure how FINDW handles that, but you can easily find out. Also note that some of your keywords are longer than 8 characters long, so the values are being chopped off.
Hi @jl13491 and welcome to the SAS Support Communities!
Don't worry, FINDW can really be confusing. Only minutes before I read your post I struggled with that function in a different context ...
I see three issues:
array keywords [6] $10 ...
findw(topics, trim(keywords[&i]))
In addition to the above correct comments, you probably want to avoid using macros and macro variables when they are not necessary. Arrays contain all the features you need to do this, without macros.
DATA test1;
SET test ;
array keywords [6] $ 10 _temporary_ ('PHYSICAL', 'ACTIVITIES', 'GAME', 'DATA', 'CODING', 'MOBILE') ;
array varnames [6] physcomp makeract gamedesign datasci coding mobiledev;
do i = 1 to 6 ;
if findw(topics, trim(keywords[i]))> 0 then varnames[i] = 1 ; else varnames[i]=0;
end ;
drop i;
RUN ;
@JackHamilton said:
If you had dozens or hundreds of keywords instead of half a dozen, it might be better to put the keywords and corresponding variable names into a SAS data set, and use a macro to generate the two ARRAY statements. DOSUBL might be even easier.
Why do you say that?
Also, you might want to look at the R or T modifier in the FINDW function as an alternative to using the TRIM function.
I should have thought of that!
/* Create a SAS data set containing words to be looked for and */
/* their associated variables. */
data mylist;
input @1 lookfor $32.
@34 varname $32. ;
datalines4;
PHYSICAL physcomp
ACTIVITIES makeract
GAME gamedesign
DATA datasci
CODING coding
MOBILE mobiledev
;;;;
/*** Method 1 ***/
/* Create global macro variables containing expanded parts of array statements. */
%global lookfor_list varname_list list_length;
proc sql noprint;
select
quote(trim(lookfor), "'") ,
trim(varname)
into
:LOOKFOR_LIST separated by ' ' ,
:VARNAME_LIST separated by ' '
from
mylist;
%let list_length = &SQLOBS. ;
quit;
%put INFO: &=LOOKFOR_LIST ;
%put INFO: &=VARNAME_LIST ;
%put INFO: &=LIST_LENGTH ;
data _null_;
array keywords [&LIST_LENGTH.] $32 _temporary_ (&LOOKFOR_LIST.) ;
array varnames [&LIST_LENGTH.] 4 &VARNAME_LIST. ;
put keywords[1]=;
run;
%symdel lookfor_list varname_list list_length;
/*** Method 2 ***/
/* create global macro variables containing complete array statements. */
%macro make_code();
%global lookfor_code varname_code;
%local lookfor_list varname_list list_length;
proc sql noprint;
select
quote(trim(lookfor), "'") ,
trim(varname)
into
:LOOKFOR_LIST separated by ' ' ,
:VARNAME_LIST separated by ' '
from
mylist;
%let list_length = &SQLOBS. ;
quit;
%let lookfor_code = array keywords [&LIST_LENGTH.] $32 _temporary_ (&LOOKFOR_LIST.) ;
%let varname_code = array varnames [&LIST_LENGTH.] 4 &VARNAME_LIST. ;
%put INFO: &=LOOKFOR_CODE;
%put INFO: &=VARNAME_CODE;
%mend make_code;
%make_code();
data _null_;
&LOOKFOR_CODE. ;
&VARNAME_CODE. ;
put keywords[1]=;
run;
%symdel lookfor_code varname_code;
/*** Method 3 ***/
/* Create a macro that inserts code generated from SAS data set, */
/* without causing a section break. In real life, this would be */
/* parameter-driven. */
%macro use_dosub();
%local lookfor_code varname_code;
%local lookfor_list varname_list list_length;
%let dosubl_rc = %sysfunc(dosubl(%nrstr(
proc sql noprint;
select
quote(trim(lookfor), "'") ,
trim(varname)
into
:LOOKFOR_LIST separated by ' ' ,
:VARNAME_LIST separated by ' '
from
mylist;
%let list_length = &SQLOBS. ;
quit;
%let lookfor_code = array keywords [&LIST_LENGTH.] $32 _temporary_ (&LOOKFOR_LIST.) ;
%let varname_code = array varnames [&LIST_LENGTH.] 4 &VARNAME_LIST. ;
)));
%put INFO: RC from &=DOSUBL_RC;
%put INFO: &=LOOKFOR_CODE;
%put INFO: &=VARNAME_CODE;
&LOOKFOR_CODE;
&VARNAME_CODE;
%mend use_dosub;
data _null_;
%use_dosub();
put keywords[1]=;
run;
%put _user_;
This is using a sledgehammer for a penny nail, but sometimes the ability to create code inline without generating a section break is very useful.
@JackHamilton wrote:
Personally, I find it easier to maintain a long narrow list than a short wide list, and if the list is coming from an external source it's usually easier to read it in a data step than in a macro. Data-driven instead of manually maintained array statements.
Good explanation! Thanks!
Here is example that remove macros.
First create some sample datasets. One with the list of strings to search and a second with the list of words to find.
data test ;
infile datalines truncover;
row+1;
input topics $256. ;
datalines;
PHYSICAL COMPUTING BASIC COMPUTER TEACHING PROJECTS USING WORD PROCESSING AND DATA ANALYZING POWERPOINT
GAME DESIGN
WEB DESIGN AND DEVELOPMENT MOBILE APP DEVELOPMENT
PHYSICAL COMPUTING MAKER ACTIVITIES WEB DESIGN AND DEVELOPMENT TEXT-BASED PROGRAMMING
;
data keywords;
key+1;
input keyword $30.;
datalines;
PHYSICAL
ACTIVITIES
GAME
DATA
CODING
MOBILE
;
Then do a cross join to match every keyword to every string and then using FINDW() to check if the keyword is in the string.
If you want the wide format then use PROC TRANSPOSE to create it. If the goal is the wide structure then make the join a VIEW.
proc sql ;
create view tall as
select a.*,b.*
, 0<findw(topics,keyword,' ','sit') as found
from test a
, keywords b
order by row,key
;
quit;
proc transpose data=tall out=want(drop=_name_) prefix=keyword;
by row topics;
id key;
idlabel keyword;
var found;
run;
Let's check the results:
proc print;
var row key: ;
run;
proc print label;
var row key: ;
run;
I usually find that I need to train FINDW() to SIT for it be useful.
i or I
ignores the case of the characters.
s or S
adds space characters (blank, horizontal tab, vertical tab, carriage return, line feed, and form feed) to the list of characters.
t or T
trims trailing blanks from the string, word, and character arguments.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.