I'm trying to write a program to create a variable TARGET based on the file name. If the file name has value which is matched with value from the macro variable 'nl_rb' then I want if condition to statisfy otherwise else.
But it's not happening although filename has value which is matched with the value from the macro variable 'nl_rb'. Only else statement statisfied instead of 'if'. Any leads to understand what I was missing in 'if indexw'? Appericiate if someone of you help me understand how to look up macro variable value in the file name?
Sample File name : IFT_GT_RND_2_0417_1_20201009T075212.csv
%let inpath=/var/sasdata/GTM; %let lh_path=/var/sasdata/LH; %let nl_path=/var/sasdata/NL; /*NL RB*/ PROC SQL noprint; SELECT PARAM_VLU_TXT into: nl_rb separated by ',' FROM PARAM_VLU WHERE CMPNT_NM = 'NL' AND PARAM_NM = 'RBK'; QUIT; %put ######NL RB####### &nl_rb; /*macro variable nl_rb resolves to 0417,0437,0438,0439,0440,0441,0555,1234,0170,1100*/ data files ; length id 8 msg filename source target $256 ; did=dopen("source"); if did<=0 then do; msg=sysmsg(); put msg; stop; end; do id=1 to dnum(did); filename=dread(did,id); if scan(lowcase(filename),-1,'.')='csv' then do; source="&inpath" ||"/"|| filename; /*check for macro variable values in filename*/ if indexw(filename,'&nl_rb.') then do; target="&nl_path"||"/"||filename; end; else target="&lh_path"||"/"||filename; end; output; end; did=dclose(did); drop did msg; run;
Actual Output:
id | filename | source | target |
1 |
IFT_GT_RND_2_0417_1_20201009T075212.csv |
/var/sasdata/GTM | /var/sasdata/LH |
Excepted Output:
id | filename | source | target |
1 |
IFT_GT_RND_2_0417_1_20201009T075212.csv |
/var/sasdata/GTM | /var/sasdata/NL |
Please internalize this:
SUCCESSFUL MACRO PROGRAMMING STARTS WITH WORKING NON-MACRO CODE!
And also Maxim 34: Work in Steps.
This would have alerted you of the fact that the INDEXW is in fact not finding what you want, and that is because you did not tell it that you use the underline as a delimiter.
See this example where the string is found:
%let nl_rb=0417,0437,0438,0439,0440,0441,0555,1234,0170,1100;
data _null_;
filename = "IFT_GT_RND_2_0417_1_20201009T075212.csv";
flag = 0;
do i = 1 to countw("&nl_rb.",",");
if indexw(filename,scan("&nl_rb.",i,","),"_") then flag = 1;
end;
put flag=;
run;
Log:
73 %let nl_rb=0417,0437,0438,0439,0440,0441,0555,1234,0170,1100; 74 75 data _null_; 76 filename = "IFT_GT_RND_2_0417_1_20201009T075212.csv"; 77 flag = 0; 78 do i = 1 to countw("&nl_rb.",","); 79 if indexw(filename,scan("&nl_rb.",i,","),"_") then flag = 1; 80 end; 81 put flag=; 82 run; flag=1
But it's not happening ...
This is not enough information. When code doesn't work properly, you need to show us the LOG, or the incorrect output. If you are going to show us the log, then please follow these instructions: copy the log as text and then paste it into the window that appears when you click on the </> icon, please do not provide the log any other way.
Actual Output:
id | filename | source | target |
1 |
IFT_GT_RND_2_0417_1_20201009T075212.csv |
/var/sasdata/GTM | /var/sasdata/LH |
Excepted Output:
id | filename | source | target |
1 |
IFT_GT_RND_2_0417_1_20201009T075212.csv |
/var/sasdata/GTM | /var/sasdata/NL |
if indexw(filename,'&nl_rb.') then do;
Use double quotes
Also, if you look at the INDEXW documentation, I think it should be
if indexw("&nl_rb",filename,',') then do;
but even that doesn't work, perhaps you need to extract the fifth "word" from filename to obtain the string 0417, then it should work. Or loop through all "words" in filename.
@PaigeMiller Any example for this? I'm not certain to loop through all "words" in filename
"but even that doesn't work, perhaps you need to extract the fifth "word" from filename to obtain the string 0417, then it should work. Or loop through all "words" in filename"
@David_Billa wrote:
@PaigeMiller Any example for this? I'm not certain to loop through all "words" in filename
"but even that doesn't work, perhaps you need to extract the fifth "word" from filename to obtain the string 0417, then it should work. Or loop through all "words" in filename"
If it's always the fifth "word", no looping is needed.
Did you mean:
... If the file name has a value which is matched with a value from the macro variable 'nl_rb' then I want if condition to statisfy otherwise else.
The SCAN function is used to tokenize, that is it separates the string into its ten values, respective to the commas that delimit. the string.
nl_rb=0417,0437,0438,0439,0440,0441,0555,1234,0170,1100;
data example;
do i=1 to 10;
four=scan ("&nl_rb", i,',')
output;
end;
stop;
run;
more on this at this link: SAS Help Center: SCAN Function
@PhilC Macro variable resolves to any number of values. It's not upto 10 values everytime. Out of those values, any of the value will be in file name. If the filename contains the string matches with the macro value then if statement should statisfy.
Is it that the four digits in the filename is always positions 14 through 17?
@PhilC It's always 5th word separated by underscore '_'
then you can use
four=scan(filename,5,"_");
Replace this code:
if indexw(filename,'&nl_rb.') then do;
target="&nl_path"||"/"||filename;
end;
else target="&lh_path"||"/"||filename;
with this:
flag = 0;
do i = 1 to countw("&nl_rb.",",");
if indexw(filename,scan("&nl_rb.",i,",")) then flag = 1;
end;
if flag
then target="&nl_path"||"/"||filename;
else target="&lh_path"||"/"||filename;
- double quotes around the macro variable
- loop through the macro variable
- test each "word" in the macro variable individually
@Kurt_Bremser Still else statement statisfies. File in the source folder is - IFT_GT_RND_2_0417_1_20201009T075212.csv
Log:
54 data files ; 55 length id 8 msg filename source target $256 ; 56 did=dopen("source"); 57 if did<=0 then do; 58 msg=sysmsg(); 59 put msg; 60 stop; 61 end; 62 do id=1 to dnum(did); 63 filename=dread(did,id); 64 if scan(lowcase(filename),-1,'.')='csv' then do; 65 source="&inpath" ||"/"|| filename; SYMBOLGEN: Macro variable INPATH resolves to /var/sasdata/GTM 66 67 /*Move to NL folder*/ 68 flag = 0; 69 do i = 1 to countw("&nl_rb.",","); SYMBOLGEN: Macro variable NL_RB resolves to 0417,0437,0438,0439,0440,0441,0555,1234,0170,1100,1101,1103,1104,1105,1400,1402,5112,5115,5116,8110,8120,8130,8190,5210, 5440,5601,6010,6020,8040,8050,7005,7006,0169 70 if indexw(filename,scan("&nl_rb.",i,",")) then flag = 1; SYMBOLGEN: Macro variable NL_RB resolves to 0417,0437,0438,0439,0440,0441,0555,1234,0170,1100,1101,1103,1104,1105,1400,1402,5112,5115,5116,8110,8120,8130,8190,5210, 5440,5601,6010,6020,8040,8050,7005,7006,0169 71 end; 72 if flag 73 then target="&nl_path"||"/"||filename; SYMBOLGEN: Macro variable NL_PATH resolves to /var/sasdata/NL 74 else target="&lh_path"||"/"||filename; SYMBOLGEN: Macro variable LH_PATH resolves to /var/sasdata/LH 75 76 end; 77 output; 78 end; 79 did=dclose(did); 80 drop did msg; 81 run; NOTE: The data set WORK.FILES has 1 observations and 6 variables
Please internalize this:
SUCCESSFUL MACRO PROGRAMMING STARTS WITH WORKING NON-MACRO CODE!
And also Maxim 34: Work in Steps.
This would have alerted you of the fact that the INDEXW is in fact not finding what you want, and that is because you did not tell it that you use the underline as a delimiter.
See this example where the string is found:
%let nl_rb=0417,0437,0438,0439,0440,0441,0555,1234,0170,1100;
data _null_;
filename = "IFT_GT_RND_2_0417_1_20201009T075212.csv";
flag = 0;
do i = 1 to countw("&nl_rb.",",");
if indexw(filename,scan("&nl_rb.",i,","),"_") then flag = 1;
end;
put flag=;
run;
Log:
73 %let nl_rb=0417,0437,0438,0439,0440,0441,0555,1234,0170,1100; 74 75 data _null_; 76 filename = "IFT_GT_RND_2_0417_1_20201009T075212.csv"; 77 flag = 0; 78 do i = 1 to countw("&nl_rb.",","); 79 if indexw(filename,scan("&nl_rb.",i,","),"_") then flag = 1; 80 end; 81 put flag=; 82 run; flag=1
You have two obvious problems. One is technical, macro variable references are not resolved inside of single quotes. So use double quotes. The second is logical. The INDEXW() function cannot look for multiple possible values at once.
First forget the macro code aspects and figure out what SAS code you want to run. Once you have that you can think about how you could use macro code to generate that SAS code.
I see a numbers of pathways.
First if the location of the "word" to check for is in the same place in the filename then reverse the arguments to the INDEXW() function. Instead of checking if the filename contains one a series of words you can pull the word you want to check from the filename and test if it is in the list of words you put into the macro variable.
indexw("&nl_rb",scan(filename,5,'_'),',')
You could adapt your current method to use regular expression coding instead of INDEXW() function. Play around and figure out how to use the | in regular expression to allow you to search for any one of a list of items.
The other method is to split the step that gets the filenames out from the step that does the testing for the strings. Then join the two tables and aggregate the test for the presence of each individual word from you list.
select filename
, max(0 ne indexw(filename,trim(PARAM_VLU_TXT),'_')) as any_match
from files
, (SELECT PARAM_VLU_TXT into: nl_rb separated by ','
FROM PARAM_VLU
WHERE CMPNT_NM = 'NL' AND PARAM_NM = 'RBK') list
group by filename
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.