BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
David_Billa
Rhodochrosite | Level 12

I've the code as below. I would like to know if there is any issue in Loop under else if. I have two files in inpath and I want value of TARGET for both the records in output but I'm not getting value for first record.

 

Macro variable NL_RB resolves to 1234,4567,7865,0098

Macro variable LH_RB resolves to 0012,0013,1365,0248,7654

 

Details of the filename are provided in the result.

 

%let function_compont=LH;
%let inpath=/var/sasdata/INPUT/GT;
%let lh_path=/var/sasdata/INPUT/LH;
%let nl_path=/var/sasdata/INPUT/NL;

data files (drop= i id) ;
  length id 8 msg filename source target $256 ;
  did=dopen("source");
  if did<=0 then do;
      msg=sysmsg(); 
      put msg; 
      stop;
  end;
  do id=1 to dnum(did);
    filename=dread(did,id);

    if (scan(lowcase(filename),-1,'.')='csv' or scan(lowcase(filename),-1,'.')='xlsx') and ("&function_compont" EQ 'NL' or "&function_compont" EQ 'LH') then do;
    source="&inpath" ||"/"|| filename;

    /*NL path for NL Rb*/
    if "&function_compont" EQ 'NL' and (scan(lowcase(filename),5,'_') not in ('5601','6010','6020')) then do;
    flag = 0;
    do i = 1 to countw("&nl_rb.",",");
    if indexw(filename,scan("&nl_rb.",i,","),"_")
    then do;
    flag = 1;
    target = "&nl_path"||"/"||filename;
    end;
    end;
    end;

    /*LH path for LH Rb*/
    else if "&function_compont" EQ 'LH' and (scan(lowcase(filename),5,'_') not in ('5601','6010','6020') or scan(lowcase(filename),4,'_') not in ('5601','6010','6020')) then do;
    flag = 0;
    do i = 1 to countw("&lh_rb.",",");
    if indexw(filename,scan("&lh_rb.",i,","),"_")
    then do;
    flag = 1;
    target = "&lh_path"||"/"||filename;
    end;
    end;
    end;
    output;
    end;
  did=dclose(did);
  drop did msg;
run;

Actual Result:

 

filename source target flag
IFR_GT_MS_0012.xlsx /var/sasdata/INPUT/GT/IFR_GT_MS_0012.xlsx   0
IFR_GT_TR_1_0012_1_20201009T075212.csv /var/sasdata/INPUT/GT/IFR_GT_TR_1_0012_1_20201009T075212.csv /var/sasdata/INPUT/LH/IFR_GT_TR_1_0012_1_20201009T075212.csv 1

 

Desired Result:

 

filename source target flag
IFR_GT_MS_0012.xlsx /var/sasdata/INPUT/GT/IFR_GT_MS_0012.xlsx /var/sasdata/INPUT/LH/IFR_GT_MS_0012.xlsx 1
IFR_GT_TR_1_0012_1_20201009T075212.csv /var/sasdata/INPUT/GT/IFR_GT_TR_1_0012_1_20201009T075212.csv /var/sasdata/INPUT/LH/IFR_GT_TR_1_0012_1_20201009T075212.csv 1

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

So add another loop to loop over the prefixes you have used to define your macro variables.  So if you have these macro variables:

%let inpath=/var/sasdata/INPUT/GT;
%let lh_path=/var/sasdata/INPUT/LH;
%let nl_path=/var/sasdata/INPUT/NL;

%let nl_rb=0012;
%let lh_rb=0099;

And this list of filenames

data files ;
  length id 8 filename $256 ;
  id+1;
  input filename ;
cards;
IFR_GT_MS_0012.xlsx
IFR_GT_TR_1_0012_1_20201009T075212.csv
someotherfile.pdf
IFR_GT_MS_0099.xlsx
;

Something like this should work (notice the TRIM() to make sure you are not passing trailing spaces to INDEXW() function).

data files_to_move ;

  set files ;
  length extension $10 source target $256 ;

* Get extension on the filename ;
  if index(filename,'.') then extension=lowcase(scan(filename,-1,'.')) ;
* only keep csv or xlsx files ;
  if extension in ('csv','xlsx');

* Set source filename ;
  source = catx('/',"&inpath",filename);

/*  Check if any of the words in ..._RB string exist in the filename */
  length flag 8 mvar $32 words $500 ;
  put filename= ;
  do mvar='LH','NL' while (not flag);
    flag = 0;
    words = symget(cats(mvar,"_rb"));
    do i = 1 to countw(words,",") while(not flag);
      word=scan(words,i,',');
      if indexw(filename,trim(word),"_.") then do;
        flag = 1;
        target = catx('/',symget(cats(mvar,"_path")),filename);
      end;
    end;
  end;
  drop i mvar word words ;

run;

Results:

Obs    id    filename                                  extension

 1      1    IFR_GT_MS_0012.xlsx                         xlsx
 2      2    IFR_GT_TR_1_0012_1_20201009T075212.csv      csv
 3      4    IFR_GT_MS_0099.xlsx                         xlsx

Obs                               source

 1     /var/sasdata/INPUT/GT/IFR_GT_MS_0012.xlsx
 2     /var/sasdata/INPUT/GT/IFR_GT_TR_1_0012_1_20201009T075212.csv
 3     /var/sasdata/INPUT/GT/IFR_GT_MS_0099.xlsx

Obs                               target                               flag

 1     /var/sasdata/INPUT/NL/IFR_GT_MS_0012.xlsx                         1
 2     /var/sasdata/INPUT/NL/IFR_GT_TR_1_0012_1_20201009T075212.csv      1
 3     /var/sasdata/INPUT/LH/IFR_GT_MS_0099.xlsx                         1

View solution in original post

6 REPLIES 6
Kurt_Bremser
Super User

Format the code, so it becomes readable.

Pull out the read of the filenames into a separate step for easier debugging .

 

What is contained in macro variables nl_rb and lh_rb?

David_Billa
Rhodochrosite | Level 12

Macro variable NL_RB resolves to 1234,4567,7865,0098

Macro variable LH_RB resolves to 0012,0013,1365,0248,7654

which can be seen in filename. You can check my previous post to see how the filename looks like.

Kurt_Bremser
Super User

You need to use the dot as an additional delimiter in the INDEXW function, for cases where the search string is the last part of the filename before the filename extension.

You also have no need for setting a flag variable, with the way you run your DO loops.

If you keep the directory read in that step, make sure you initialize your target variable to empty at the start of the outermost loop. In the simplified data step I use, this is not necessary, as each separate filename is processed in a separate data step iteration.

%let function_compont=LH;
%let inpath=/var/sasdata/INPUT/GT;
%let lh_path=/var/sasdata/INPUT/LH;
%let nl_path=/var/sasdata/INPUT/NL;
%let lh_rb=0012;
%let nl_rb=0012;

data have;
input filename $80.;
datalines;
IFR_GT_MS_0012.xlsx
IFR_GT_TR_1_0012_1_20201009T075212.csv 
;

data files;
length source target $256;
set have;
if
  scan(lowcase(filename),-1,'.') in ('csv','xlsx')
  and "&function_compont" in ('NL','LH')
then do;
  source = "&inpath" || "/" || filename;
  /*NL path for NL Rb*/
  if
    "&function_compont" EQ 'NL' and
    scan(lowcase(filename),5,'_') not in ('5601','6010','6020')
  then do;
    do i = 1 to countw("&nl_rb.",",");
      if indexw(filename,scan("&nl_rb.",i,","),"_.")
      then target = "&nl_path"||"/"||filename;
    end;
  end;
  /*LH path for LH Rb*/
  else if
    "&function_compont" EQ 'LH' and
    (
      scan(lowcase(filename),5,'_') not in ('5601','6010','6020') or
      scan(lowcase(filename),4,'_') not in ('5601','6010','6020')
    )
  then do;
    do i = 1 to countw("&lh_rb.",",");
      if indexw(filename,scan("&lh_rb.",i,","),"_.")
      then target = "&lh_path"||"/"||filename;
    end;
  end;
end;
run;

I would not have been able to find the problem without structuring the code, at least it would have taken me a whole lot longer.

Tom
Super User Tom
Super User

The main issue is the delimiters you are using for the INDEXW() function.  Since the digit string is right before the extension the right hand delimiter there is a period instead of an underscore.

 

So you really just need to add period along with underscore in the list of delimiters in this function call.

indexw(filename,scan("&lh_rb.",i,","),"_.")

Here is a simpler version of what I think you are doing that takes advantage the you have the FUNCTION_COMPONT macro variable which can be used to select the right list. Although I suspect that your real problem is different that what the code you posted is doing.  I thought you wanted to be able to route files to more than one target directory at a time?  With the current logic you would want to run the filtering step twice, once with FUNCTION_COMPONT=LH and then another run with FUNCTION_COMPONT=NL to move those files.

%let inpath=/var/sasdata/INPUT/GT;
%let lh_path=/var/sasdata/INPUT/LH;
%let nl_path=/var/sasdata/INPUT/NL;

%let nl_rb=0012,5678,9012;
%let lh_rb=0012,5678,9012;

%let function_compont=LH;

* Get list of files in source directory ;
filename source "&inpath";
data files ;
  length id 8 msg filename $256 ;
  did=dopen("source");
  if did<=0 then do;
      msg=sysmsg(); 
      put msg; 
      stop;
  end;
  do id=1 to dnum(did);
    filename=dread(did,id);
    output;
  end;
  did=dclose(did);
  drop did msg;
run;

* Generate list of files to move ;
data files_to_move ;
  set files ;
  length extension $10 source target $256 ;

* Get extension on the filename ;
  if index(filename,'.') then extension=lowcase(scan(filename,-1,'.')) ;
* only keep csv or xlsx files ;
  if extension in ('csv','xlsx');
 
/*  Check if any of the words in ..._RB string exist in the filename */
  length flag 8 words $500 ;
  flag = 0;
  words = symget("&function_compont._rb");
  do i = 1 to countw(words,",") while(not flag);
    if indexw(filename,scan(words,i,","),"_.") then flag = 1;
  end;
  if flag then do;
    target = catx('/',"&&&function_compont._path",filename);
    source = catx('/',"&inpath",filename);
  end;
  drop i words ;

run;

Once the step to get the list is separated from the step to filter and calculate source and target you can now test that second step with some simulated filename data.

data files ;
  length id 8 filename $256 ;
  id+1;
  input filename ;
cards;
IFR_GT_MS_0012.xlsx
IFR_GT_TR_1_0012_1_20201009T075212.csv
someotherfile.pdf
IFR_GT_MS_0099.xlsx
;

Here is output for that list of files.

Obs    id    filename                                  extension

 1      1    IFR_GT_MS_0012.xlsx                         xlsx
 2      2    IFR_GT_TR_1_0012_1_20201009T075212.csv      csv
 3      4    IFR_GT_MS_0099.xlsx                         xlsx

Obs                               source

 1     /var/sasdata/INPUT/GT/IFR_GT_MS_0012.xlsx
 2     /var/sasdata/INPUT/GT/IFR_GT_TR_1_0012_1_20201009T075212.csv
 3

Obs                               target                               flag

 1     /var/sasdata/INPUT/LH/IFR_GT_MS_0012.xlsx                         1
 2     /var/sasdata/INPUT/LH/IFR_GT_TR_1_0012_1_20201009T075212.csv      1
 3                                                                       0
David_Billa
Rhodochrosite | Level 12
Yes, I may want to route the files to two different target and it is based
on digits in filename.

Assume I have one file with digits '1234' and other file with digits '0012'
then I want to route files to different directories.
Tom
Super User Tom
Super User

So add another loop to loop over the prefixes you have used to define your macro variables.  So if you have these macro variables:

%let inpath=/var/sasdata/INPUT/GT;
%let lh_path=/var/sasdata/INPUT/LH;
%let nl_path=/var/sasdata/INPUT/NL;

%let nl_rb=0012;
%let lh_rb=0099;

And this list of filenames

data files ;
  length id 8 filename $256 ;
  id+1;
  input filename ;
cards;
IFR_GT_MS_0012.xlsx
IFR_GT_TR_1_0012_1_20201009T075212.csv
someotherfile.pdf
IFR_GT_MS_0099.xlsx
;

Something like this should work (notice the TRIM() to make sure you are not passing trailing spaces to INDEXW() function).

data files_to_move ;

  set files ;
  length extension $10 source target $256 ;

* Get extension on the filename ;
  if index(filename,'.') then extension=lowcase(scan(filename,-1,'.')) ;
* only keep csv or xlsx files ;
  if extension in ('csv','xlsx');

* Set source filename ;
  source = catx('/',"&inpath",filename);

/*  Check if any of the words in ..._RB string exist in the filename */
  length flag 8 mvar $32 words $500 ;
  put filename= ;
  do mvar='LH','NL' while (not flag);
    flag = 0;
    words = symget(cats(mvar,"_rb"));
    do i = 1 to countw(words,",") while(not flag);
      word=scan(words,i,',');
      if indexw(filename,trim(word),"_.") then do;
        flag = 1;
        target = catx('/',symget(cats(mvar,"_path")),filename);
      end;
    end;
  end;
  drop i mvar word words ;

run;

Results:

Obs    id    filename                                  extension

 1      1    IFR_GT_MS_0012.xlsx                         xlsx
 2      2    IFR_GT_TR_1_0012_1_20201009T075212.csv      csv
 3      4    IFR_GT_MS_0099.xlsx                         xlsx

Obs                               source

 1     /var/sasdata/INPUT/GT/IFR_GT_MS_0012.xlsx
 2     /var/sasdata/INPUT/GT/IFR_GT_TR_1_0012_1_20201009T075212.csv
 3     /var/sasdata/INPUT/GT/IFR_GT_MS_0099.xlsx

Obs                               target                               flag

 1     /var/sasdata/INPUT/NL/IFR_GT_MS_0012.xlsx                         1
 2     /var/sasdata/INPUT/NL/IFR_GT_TR_1_0012_1_20201009T075212.csv      1
 3     /var/sasdata/INPUT/LH/IFR_GT_MS_0099.xlsx                         1

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 910 views
  • 4 likes
  • 3 in conversation