BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Phil_NZ
Barite | Level 11

Hi SAS Unsers,

 

This morning when I try to adjust my code, I face a noticeable problem as below.

 

My code for running through all file in a directory is:

filename mydir 'C:\Users\pnguyen\Desktop\New folder';
data _null_;
did = dopen('mydir');
do i = 1 to dnum(did);
  fname = dread(did,i);
  /*fname: AustriaEUR*/
  length short_fn $29 currency $3 ;
  short_fn= cats(substr(fname, 1,length(fname)-3),'_');
  currency=substr(fname,length(fname)-2);
  cmd=cats('%ImportAndTranspose(File=C:\Users\pnguyen\Desktop\New folder\',
      strip(fname),
      ',cur=',currency,
      ',outf=',short_fn,'sheet,startsheet=1,endsheet=45);');
	  /*strip(fname) to link 
	 File=C:\Users\pnguyen\Desktop\New folder\AustriaEUR*/
        call execute(cmd);
end;
keep fname;
run;

A part of my macro is

 

%macro ImportAndTranspose(
      File=
	  , cur=
	  , outf=
      , StartSheet=
      , EndSheet=
   );
   %local i;
   
   %do i = &StartSheet. %to &EndSheet.;

    %if &i=1 %then %do;
	
    proc import datafile= "&File." 
                  out= &outf.&i.
                  dbms= xlsx 
                  replace;
          range= "Sheet1$A:X";
          getnames= yes;
      run;
      
      proc sort data= &outf.&i.;
      by Type;
      run;
	%end;

So, I test with one file named "AustriaEUR" in the folder "New Folder". I have the PROC MPRINT as below, which is quite strange to me

MPRINT(IMPORTANDTRANSPOSE):   proc import datafile= "C:\Users\pnguyen\Desktop\New folder\AustriaEUR.xlsx" out= AustriaEUR.x_sheet1 
dbms= xlsx replace;
MPRINT(IMPORTANDTRANSPOSE):   range= "Sheet1$A:X";
MPRINT(IMPORTANDTRANSPOSE):   getnames= yes;
MPRINT(IMPORTANDTRANSPOSE):   run;
MPRINT(IMPORTANDTRANSPOSE):   proc sort data= AustriaEUR.x_sheet1;
MPRINT(IMPORTANDTRANSPOSE):   by Type;
MPRINT(IMPORTANDTRANSPOSE):   run;

The MPRINT as above means that outf = AustriaEUR.x_sheet

 

However, if we substitute to the initial code, we have:

fname=AustriaEUR

then

short_fn= cats(substr(fname, 1,length(fname)-3),'_');

So short_fn=Austria_

And 

',outf=',short_fn,'sheet

means that outf=Austria_sheet which is much different from the one shown from OPTION MPRINT (outf = AustriaEUR.x_sheet)

 

Can you please tell me where I did wrongly.

Warm regards!

 

 
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

@Phil_NZ wrote:

But why don't you put any delimiter in the condition of scan function? In case we don't put the delimiter, what is the default one?

 

And I am looking for a further comprehensive solution because other than Austria, I have other files named with two words like United Kingdom or United States (you can see the blank in the middle of the name). And in these cases, it seems that the scan function can not work properly.

In this case, of course, you must specify the delimiter because the list of default delimiters includes the blank:

fname = scan(dread(did,i),1,'.');

And, yes, I should have specified it in the first place to provide more robust code. Sorry for the omission.

View solution in original post

14 REPLIES 14
PaigeMiller
Diamond | Level 26

One way to debug this is to change data _null_; to data null; and this creates a SAS data set named NULL. Then you can look into this data set and see if the text strings are being created properly.

--
Paige Miller
Phil_NZ
Barite | Level 11

Hi @PaigeMiller !

 I replaced

data _null_;

by

data null;

However, there is no output popping up so far with some new warnings

WARNING: The data set WORK.NULL may be incomplete.  When this step was stopped there were 1 observations and 1 variables.
NOTE: Compressing data set WORK.NULL decreased size by 0.00 percent. 
      Compressed is 1 pages; un-compressed would require 1 pages.
WARNING: Data set WORK.NULL was not replaced because this step was stopped.

Thank you!

 
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
PaigeMiller
Diamond | Level 26

@Phil_NZ wrote:

Hi @PaigeMiller !

 I replaced

data _null_;

by

data null;

However, there is no output popping up so far with some new warnings

WARNING: The data set WORK.NULL may be incomplete.  When this step was stopped there were 1 observations and 1 variables.
NOTE: Compressing data set WORK.NULL decreased size by 0.00 percent. 
      Compressed is 1 pages; un-compressed would require 1 pages.
WARNING: Data set WORK.NULL was not replaced because this step was stopped.

Thank you!

 

Although you have an answer, I point out that there is not supposed to be "output popping up" here. The point here was for you to LOOK AT (with your own eyes) the data set created called NULL to see if the text strings were being properly created. This is a valuable debugging method when you are creating strings to be placed into a CALL EXECUTE.

--
Paige Miller
Phil_NZ
Barite | Level 11

Thank you @PaigeMiller , it is a valuable lesson for me, thank you!

Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
Phil_NZ
Barite | Level 11

And is there any document talking about the comparison between NULL and _NULL_, can I ask? I google it but no result so far!

Many thanks and warm regards!

 
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
PaigeMiller
Diamond | Level 26

data _null_; does not create a data set you can look at. This is fine if you have worked out the bugs in your program and you don't need an actual data set.

 

data null; (or any valid data set name except _null_) creates a data set you can look at.

--
Paige Miller
Phil_NZ
Barite | Level 11

Hi @PaigeMiller 

Thank you for your dedicated answer, I really appreciate it. From people like you, I believe that there are still a lot of great people all over the world, motivating me to be the next great person!

 

Many thanks and warmest regards.

 
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
FreelanceReinh
Jade | Level 19

Hi  @Phil_NZ,

 

The DREAD function returns the filename including the suffix, probably .xlsx, so your attempt to cut off "EUR" did in fact cut off "lsx".

 

Try if 

fname = scan(dread(did,i),1);

resolves the issue.

Phil_NZ
Barite | Level 11

Hi @FreelanceReinh 

It looks like that you are right, the MPRINT now is more reasonable

MPRINT(IMPORTANDTRANSPOSE):   proc import datafile= "C:\Users\pnguyen\Desktop\New folder\AustriaEUR" out= Austria_sheet1 dbms= xlsx 
replace;

Many thanks for your solution.

 

But why don't you put any delimiter in the condition of scan function? In case we don't put the delimiter, what is the default one?

 

And I am looking for a further comprehensive solution because other than Austria, I have other files named with two words like United Kingdom or United States (you can see the blank in the middle of the name). And in these cases, it seems that the scan function can not work properly.

 

Many thanks and warm regards.

 

 
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
FreelanceReinh
Jade | Level 19

@Phil_NZ wrote:

But why don't you put any delimiter in the condition of scan function? In case we don't put the delimiter, what is the default one?

 

And I am looking for a further comprehensive solution because other than Austria, I have other files named with two words like United Kingdom or United States (you can see the blank in the middle of the name). And in these cases, it seems that the scan function can not work properly.

In this case, of course, you must specify the delimiter because the list of default delimiters includes the blank:

fname = scan(dread(did,i),1,'.');

And, yes, I should have specified it in the first place to provide more robust code. Sorry for the omission.

Phil_NZ
Barite | Level 11

Hi @FreelanceReinh 

And apart from that, it is interesting to me, because when the filename is Austria_

when I run the code

filename mydir 'C:\Users\pnguyen\Desktop\New folder';
data _null_;
did = dopen('mydir');
do i = 1 to dnum(did);
  fname = dread(did,i);
  short_fn= cats(substr(fname, 1,3),'_');
  cmd=cats('%ImportAndTranspose(File=C:\Users\pnguyen\Desktop\New folder\',
      strip(fname),',outf=',short_fn,'sheet,startsheet=1,endsheet=45);');
  call execute(cmd);
end;
keep fname;
run;

with the macro

%macro ImportAndTranspose(
      File=
	  , outf=
      , StartSheet=
      , EndSheet=
   );
   %local i;
   
   %do i = &StartSheet. %to &EndSheet.;

    %if &i=1 %then %do;
	
    proc import datafile= "&File." 
                  out= &outf.&i.
                  dbms= xlsx 
                  replace;
          range= "Sheet1$A:X";
          getnames= yes;
      run;
      
      proc sort data= &outf.&i.;
      by Type;
      run;
	%end;

The outf seems not to include the suffix xlsx because it works smoothly

MPRINT(IMPORTANDTRANSPOSE):   proc import datafile= "C:\Users\pnguyen\Desktop\New folder\Austria_.xlsx" out= Aus_sheet1 dbms= xlsx 
replace;

It is quite an interesting part to me when I compare this workable code and the problematic code in this post that you explain because of the existence of the suffix.

I am wondering if there is any reasonable explanation for this difference?

Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
FreelanceReinh
Jade | Level 19

@Phil_NZ wrote:

Hi @FreelanceReinh 

And apart from that, it is interesting to me, because when the filename is Austria_

when I run the code

filename mydir 'C:\Users\pnguyen\Desktop\New folder';
data _null_;
did = dopen('mydir');
do i = 1 to dnum(did);
  fname = dread(did,i);
  short_fn= cats(substr(fname, 1,3),'_');
...

The outf seems not to include the suffix xlsx because it works smoothly (...)

I am wondering if there is any reasonable explanation for this difference?


Well, the above code is different in that it cuts the value of fname after the first three characters rather than before the last three. That way the suffix isn't involved.

Phil_NZ
Barite | Level 11

Hi @FreelanceReinh 

Thank you for your straight answer, one last further question is

In case I want to have EUR, I can still keep the same code without changing anything, am I correct?

currency=substr(fname,length(fname)-2);

TIA.

 

 
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
FreelanceReinh
Jade | Level 19

@Phil_NZ wrote:

 

In case I want to have EUR, I can still keep the same code without changing anything, am I correct?

currency=substr(fname,length(fname)-2);

Yes, once the suffix has been removed (by the SCAN function), fname ends in the currency code and the above statement will retrieve that code. Of course, this assumes that the currency code consists of three characters and is not followed by other name characters (e.g. an underscore).

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 14 replies
  • 1508 views
  • 5 likes
  • 3 in conversation