Hi everyone,

Here is my issue. I have two tables:
- a table list_words with one string variable (named words) and 10 observations (or p observations in general, with p known)
- a table raw_data with one string variable (named text) and 20 observations (or n observations in general, with n known)

I would like to create a new variable main_word in the raw_data table. For each observation i of raw_data, if one of the words (i.e. a value of the variable words of the list_words table) can be found within text, then main_word takes the found word as its value for that observation. If no word is found within text for the i-th observation, then main_word receives the value "" for that observation.

Is it possible to do this without writing one DATA step on the raw_data table per word (with the word manually entered as a parameter of an INDEXW function)?

What I tried to do (and which didn't work):

1: Use an array and put each word in a macro variable.

DATA _NULL_ ;
SET list_words;
CALL SYMPUT(COMPRESS("wordsmacro"||_N_),words) ;
RUN ;
%LET array_word = array array_word $30 wordsmacro&1-wordsmacro&10 ;
DATA raw_data ;
SET raw_data ;
LENGTH main_word $ 30 ;
DO i = 1 to 10 ;
IF INDEXW(text, array_word[i]) > 0
THEN DO ;
main_word = array_word[i] ;
LEAVE ;
END ;
END ;
RUN ;

The issue here is that array_word apparently isn't defined inside the DATA step. The log extract with the errors is the following:

27 IF INDEXW(text, array_word[i]) > 0
ERROR: Undeclared array referenced: array_word.
ERROR: Variable array_word has not been declared as an array.
28 THEN DO ;
29 main_word = array_word[i] ;
ERROR: Undeclared array referenced: array_word.
ERROR: Variable array_word has not been declared as an array.
30 LEAVE ;

2: Put each word in a macro variable and loop over the words inside a macro:

DATA _NULL_ ;
SET list_words ;
CALL SYMPUT(COMPRESS("wordsmacro"||_N_),words) ;
RUN ;
%MACRO parcours ;
DATA raw_data ;
SET raw_data ;
LENGTH main_word $ 30 ;
%DO i = 1 %TO 10 :
IF INDEXW(text, &&wordsmacro&i) > 0
THEN DO ;
main_word = &&wordsmacro&i ;
LEAVE ;
END ;
%END ;
RUN ;
%MEND ;
%parcours ;

The log extract with the mistakes is the following:

23 %parcours ;
WARNING: Apparent symbolic reference I not resolved.
WARNING: Apparent symbolic reference MOTSMACRO not resolved.
WARNING: Apparent symbolic reference I not resolved.
ERROR: A character operand was found in the %EVAL function or %IF condition where a numeric operand is required. The condition was: 10 : IF INDEXW(text, &&wordsmacro&i) > 0 THEN DO
ERROR: The %TO value of the %DO I loop is invalid.
ERROR: The macro PARCOURS will stop executing.
24
25 GOPTIONS NOACCESSIBLE;
26 %LET _CLIENTTASKLABEL=;
27 %LET _CLIENTPROJECTPATH=;
28 %LET _CLIENTPROJECTNAME=;
29 %LET _SASPROGRAMFILE=;
30
31 ;*';*";*/;quit;run;
____
180
ERROR 180-322: Statement is not valid or it is used out of proper order.

3: I don't know how to code it here, but maybe a hash object could work too?

Some possible tables:

DATA list_words ;
INPUT words $ 1-40 ;
DATALINES ;
Lorem
ipsum
dolor
sit
amet
consectetur
adipiscing
elit
sed
do
RUN ;
DATA raw_data ;
INPUT text $ 1-600 ;
DATALINES ;
Lorem ipsum dolor sit amet
consectetur adipiscing elit, sed do
eiusmod tempor incididunt ut labore
et dolore magna aliqua.
Lorem ipsum dolor sit amet
consectetur adipiscing elit, sed do
eiusmod tempor incididunt ut labore
et dolore magna aliqua.
Lorem ipsum dolor sit amet
consectetur adipiscing elit, sed do
eiusmod tempor incididunt ut labore
et dolore magna aliqua.
Lorem ipsum dolor sit amet
consectetur adipiscing elit, sed do
eiusmod tempor incididunt ut labore
et dolore magna aliqua.
Lorem ipsum dolor sit amet
consectetur adipiscing elit, sed do
eiusmod tempor incididunt ut labore
et dolore magna aliqua.
RUN ;
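
For illustration, the behaviour described above could be sketched in a single DATA step roughly as follows. This is only an untested sketch under stated assumptions, not a confirmed solution: it assumes the variables are named words and text as in the description, that list_words has at most 10 observations, and it writes to a hypothetical output table raw_data_tagged. It uses a temporary array rather than macro variables or a hash object.

```sas
/* Sketch: load the word list once into a temporary array, then scan each text. */
DATA raw_data_tagged ;                    /* hypothetical output table name */
   ARRAY w[10] $ 40 _TEMPORARY_ ;         /* 10 = p, assumed upper bound on the number of words */
   LENGTH main_word $ 40 ;

   /* On the first iteration only, read every word into the array. */
   /* NOBS= is set at compile time, so n_words is known before the loop runs. */
   IF _N_ = 1 THEN DO j = 1 TO n_words ;
      SET list_words NOBS = n_words ;
      w[j] = words ;
   END ;

   SET raw_data ;

   /* Keep the first word of the list found as a whole word in text. */
   main_word = "" ;
   DO i = 1 TO n_words ;
      IF INDEXW(text, STRIP(w[i])) > 0 THEN DO ;
         main_word = w[i] ;
         LEAVE ;
      END ;
   END ;

   DROP i j words ;
RUN ;
```

With the default delimiter, INDEXW only treats blanks as word boundaries, so a word immediately followed by punctuation (such as "elit," in the sample text) would not be matched by this sketch.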