BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Paul_de_Barros
Obsidian | Level 7

I'm trying to use a data step to create a file, but only create it if there are records to be written to the file.  Since the data step exits when the SET statement runs and there are no more records, I though that putting the FILE statement after the SET statement would prevent the file from being created if there were no records in the dataset named in the SET statement.  However, the code below create an empty text file even when the rowcount of Work.dataset1 is zero.  Am I doing something wrong?  Why is the FILE statement executing at all?  Is what I am trying to do feasible, or should I be writing macro code?

 

 

%Let filepath = /data/file.txt;

DATA _NULL_;
    SET Work.dataset1;
    FILE "&FilePath";
    IF _N_ = 1 THEN DO;
        put 'Line 1';
        put 'Line 2';
        put ' ';
        put 'First Last DOB'; /* These are the column headers. */
    END;
    put First ' ' Last ' ' DOB;
RUN;

 

 

Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
SASJedi
SAS Super FREQ

Here is a solution set up for testing with the SASHELP.CLASS data set:

/* Make test data - uncomment stop to make empty data set */
data dataset1;
/*   stop;*/
   set sashelp.class(obs=1);
run;

DATA _NULL_;
    /* If there are observations, proceed*/
    if nobs>0 then fileloc="&FilePath";
    /*Otherwise, quit here */
    else stop;
    SET Work.dataset1 nobs=nobs;
    file x filevar=fileloc;
    IF _N_ = 1 THEN DO;
        put 'First Age'; /* These are the column headers. */
    END;
    put Name ' ' Age;
RUN;
Check out my Jedi SAS Tricks for SAS Users

View solution in original post

7 REPLIES 7
SASJedi
SAS Super FREQ

Here is a solution set up for testing with the SASHELP.CLASS data set:

/* Make test data - uncomment stop to make empty data set */
data dataset1;
/*   stop;*/
   set sashelp.class(obs=1);
run;

DATA _NULL_;
    /* If there are observations, proceed*/
    if nobs>0 then fileloc="&FilePath";
    /*Otherwise, quit here */
    else stop;
    SET Work.dataset1 nobs=nobs;
    file x filevar=fileloc;
    IF _N_ = 1 THEN DO;
        put 'First Age'; /* These are the column headers. */
    END;
    put Name ' ' Age;
RUN;
Check out my Jedi SAS Tricks for SAS Users
Paul_de_Barros
Obsidian | Level 7

@SASJedi Thank you, that definitely worked.  However, I don't entirely understand why. 

 

First, if I modify your code to put the FILE statement inside the conditional, it still creates the output file.  This means that the FILE statement seems to execute no matter what.  Does the FILE statement in your version execute, but no output file is created because fileloc is undefined?

 

Second, in your code, how is it that the value of nobs is available before you define it?  On the first iteration of the implicit loop of the data set, the conditional tests the value of nobs before the SET statement establishes it.  Shouldn't that throw an error?  Or is the value of nobs determined during compilation and not during execution?

Tom
Super User Tom
Super User

The posted code you are replying to was the test when the dataset was NOT empty, so the file should have been made.

The NOBS= option on the SET statement will populate the value of the variable before the data step actually started iterating.

 

But there is no need to test the number of observations (in this case).  When you use the FILEVAR= option on the FILE statement SAS will not attempt to open the file until the FILE statement actually executes.

 

Here is a working example:  The first time when OBS=0 dataset option the step stops before ever getting to the FILE statement so the file is not created.

%let fname=%sysfunc(pathname(work))/test.txt;

data _null_;
  set sashelp.class(obs=0);
  filevar="&fname";
  file dummy filevar=filevar;
  put 'hello there';
run;

data _null_;
  found=fileexist("&fname");
  put found=;
run;

data _null_;
  set sashelp.class(obs=2);
  filevar="&fname";
  file dummy filevar=filevar;
  put 'hello there';
run;

data _null_;
  found=fileexist("&fname");
  put found=;
run;
Paul_de_Barros
Obsidian | Level 7

This is making it a lot clearer, thank you.  A few more things, if you don't mind me continuing to pester you:

  1. I should have noted in my previous reply that I had uncommented the stop already.
  2. You mentioned that the NOBS= option on the SET statement will populate the value of the variable before the data step starts iterating.  Does this mean that it gets populated during the compilation phase?  Or is the execution phase more complicated than I thought, with SAS looking for statements to run before the rest, regardless of their order in the code?  Or is there another phase between compilation and execution?
  3. You mentioned that, when the FILEVAR= option is used on the FILE statement, SAS won't try to open the file until the FILE statement actually executes.  I assume this means that, without the FILEVAR= option included, SAS attempts to open the file before the FILE statement executes.  When does it do that?

Thanks again.

Tom
Super User Tom
Super User

I don't know exactly the methods SAS uses internally to do run the data step.  For all we know that are 47 steps between definition and execution.  But there are definitely statements like SET and FILE that have both executable parts and definition impacts on the data step.   Even a simple assignment statement can have an impact on the definition, for example if the variables referenced were never defined before the compiler sees that statement.

 

For the FILE statement the compiler must notice that the target filename is static and optimize the data step to run that first (or at least open the file first).

 

SASJedi
SAS Super FREQ

The most common form of the FILE statement uses a direct path to the file:

data _null_;
   file "/mypath/myfolder/myfile.txt";
   set work.cars;
   put "TEST";
run;

Or a fileref previously assigned to a file:

filename thisFile "/mypath/myfolder/myfile.txt";
data _null_;
   file thisFile;
   put "TEST";
run;

In both of those cases, the DATA step file reference is static - so when the DATA step complies, it knows at compile time where to write the PUT values. But what if you wanted to conditionally determine where to write? For this, we have to delay locating the output file until execution time - so we use FILEVAR instead of the hard-coded file location.

 

Now, consider the code below. During the compile phase, the sashelp.class descriptor portion is read in so that the PDV variables can be properly set up for processing. But when the execution phase begins, the STOP statement stops the DATA step before the SET statement can execute, so no observations are written to dataset1 - the data set is empty.

 

 

data dataset1;
   stop;
   set sashelp.class(obs=1);
run;

 

 

Next, this DATA step executes:   

DATA _NULL_;
    /* If there are observations, proceed*/
    if nobs>0 then fileloc="&FilePath";
    /*Otherwise, quit here */
    else stop;
    SET dataset1 nobs=nobs;
    file x filevar=fileloc;
    IF _N_ = 1 THEN DO;
        put 'First Age'; /* These are the column headers. */
    END;
    put Name ' ' Age;
RUN;

At compile time, the compiler notes that the FILE statement is using FILEVAR, and defers locating the output file until the FILE statement executes during the execution phase. This means that the output file isn't created yet. As usual, the compiler reads the descriptor portion for the input dataset, in this case dataset1, in order to set up the PDV. But because the descriptor data also contains the number of observations in the data set, and we have used the nobs= option on the SET statement, the compiler also populates the value of NOBS during the compile phase. So when the execution phase starts, we already know the number of observations in  dataset1.

 

The very first thing our DATA step does is check the value of NOBS. If there are observations, the FILEVAR value is set. When the FILE statement subsequently executes, it will create our output file. But if there are no observations, the DATA step STOP statement ends processing immediately. Beacuse the FILE statement never executed, the output file was not created. 

 

I hope that clarifies the code for you.

May the SAS be with you!
Mark

 

 

 

 

Check out my Jedi SAS Tricks for SAS Users
Paul_de_Barros
Obsidian | Level 7
@SASJedi, this was fantastic, thank you.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1470 views
  • 6 likes
  • 3 in conversation