BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JacquesR
Quartz | Level 8

Hi all

 

I have a directory with hundreds of JSON files. Some are empty.

I want to import the non-empty files and append them to one dataset.

%MACRO ImportDataTop(dset=,dfile=);
    /* First test to see if the file is not an empty JSON string */
    DATA NULL;
        INFILE "&DataDir.&dfile" LRECL=256 TRUNCOVER;
        INPUT Ln $256.;
        IF _N_ EQ 1 THEN DO;
            IF Ln EQ "{}" THEN CALL SYMPUT("JSONTest",0); ELSE CALL SYMPUT("JSONTest",1);
        END;
    RUN;
    %PUT &JSONTest;
    %IF &JSONTest EQ 1 %THEN %DO;
        /* Use the SAS json reader in LIBNAME to automatically read the files */
        LIBNAME tmpFile JSON "&DataDir.&dfile";
        DATA WORK.tmpFile;
            SET tmpFile.item;
        /* Do data calculations... */
        RUN;
        /* Append the data to a growing datafile */
            PROC DATASETS
                NOPRINT
                LIBRARY=MYLIB;
                APPEND BASE=&dset
                       DATA=WORK.tmpFile
                       FORCE;
            QUIT;
        PROC DELETE DATA=WORK.tmpFile; RUN;
    %END;
    %ELSE %DO;
        %PUT Blank JSON string in &dfile;
    %END;    
%MEND ImportDataTop;

/* Get the full list of data files */
FILENAME _FQuery PIPE "dir ""&DataDir.(*);AU*;*Queries;*;2021-06-02.json"" /b ";
DATA ListQueries; INFILE _FQuery LRECL=256 TRUNCOVER; INPUT file_name $256.; run; PROC DELETE DATA=MYLIB.CollatedDataSet;RUN; DATA NULL; SET ListQueries; CALL EXECUTE(CATT('%ImportDataTop(dset=CollatedDataSet,dfile=',file_name,');')); RUN;

But when I run this, &JSONTest is not processed properly. It is, from the %put statement, simply 1 all the time, regardless of the actual content of the file.

I cannot seem to figure out why call symput is not giving me the correct value into &JSONTest. Using options mprint, it seems as if SAS is generating all the code for each call execute iteration without actually doing the test I have coded in.

 

Any suggestions?

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

As was mentioned, %nrstr is going to be key here.  In the meantime, here are a few minor issues to consider.

 

Don't ignore log messages.  When you have numeric to character conversion, it indicates you are not necessarily getting the results you expected in that step.  So figure out why that happens and address it.  One of the log messages mentioned that a file doesn't actually exist.  So that needs attention.

 

In this step, minor changes are indicated:

    DATA NULL;
        INFILE "&DataDir.&dfile" LRECL=256 TRUNCOVER;
        INPUT Ln $256.;
        IF _N_ EQ 1 THEN DO;
            IF Ln EQ "{}" THEN CALL SYMPUT("JSONTest",0); ELSE CALL SYMPUT("JSONTest",1);
        END;
    RUN;
    %PUT &JSONTest;

Let's assume you already switched from SYMPUT to SYMPUTX.  That may or may not have an impact on the rest of the program, depending on which set of code you are actually using.  (I can see from your posts you have experimented with a few variations.)  Two more changes are indicated:

    DATA _NULL_;
        INFILE "&DataDir.&dfile" LRECL=256 TRUNCOVER;
        INPUT Ln $256.;
        IF _N_ EQ 1;
        IF Ln EQ "{}" THEN CALL SYMPUT("JSONTest",0);    
        ELSE CALL SYMPUT("JSONTest",1);
        STOP;
    RUN;
    %PUT *&JSONTest*;

NULL is actually a data set name.  You can see the log messages about the data set NULL.  To avoid creating a data set, while going through the motions of a DATA step (appropriate in this case), use _NULL_ instead.

 

Once that data step has read in the first line, there is no need to read in any more data.  Your current data step reads in every line, most of which you don't really need.  So once SYMPUTX has executed, STOP the data step.

 

Finally, the %PUT statement should write out a clearer message that will point out whether you have extra characters in your macro variable.  One way is to add the asterisks in the sample code above.

 

Despite all this, %nrstr will still be necessary and the key to successful debugging.

View solution in original post

11 REPLIES 11
Astounding
PROC Star
You are observing a feature of call execute. It immediately executes macro language statements such as %if %then, but must wait to execute SAS language statements such as call symput. To solve this issue, wrap the executed macro call in %nrstr:

call execute (CATT(%nrstr('%ImportanceDataTop'), 'dset=.....));
JacquesR
Quartz | Level 8

Nope, that is just causing errors with the macro variables in the procedure:

NOTE: CALL EXECUTE generated line.
1 + DATA NULL; INFILE "c:\Users\j...\C...\" LRECL=256
TRUNCOVER; INPUT Ln $256.; IF _N_ EQ 1 THEN DO; IF Ln
2 + EQ "{}" THEN CALL SYMPUT("JSONTest",0); ELSE CALL SYMPUT("JSONTest",1); END;
RUN;

NOTE: Numeric values have been converted to character values at the places given by:
(Line):(Column).
2:38 2:70
ERROR: Invalid file, c:\Users\j...\C... .
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.NULL may be incomplete. When this step was stopped there were 0
observations and 1 variables.
WARNING: Data set WORK.NULL was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds

 

2 +
LIBNAME tmpFile JSON "c:\Users\j...\C...\";
NOTE: JSON data is only read once. To read the JSON again, reassign the JSON LIBNAME.
ERROR: Invalid file, c:\Users\j...\C... .
ERROR: Error in the LIBNAME statement.

 

Kurt_Bremser
Super User
A NOTE about conversion of numeric to character or character to numeric is always a BAD THING.
Using 0 as second argument to CALL SYMPUT causes a conversion with the default format of BEST12., so you get a string containing 11 blanks and the 0. Since CALL SYMPUT will not strip the blanks, the comparison fails. Another reason to use CALL SYMPUTX, which converts on its own (no NOTE) and strips leading and trailing blanks.
Kurt_Bremser
Super User
It is a timing issue, as already mentioned.
Apart from that, be careful with CALL SYMPUT. Rather use CALL SYMPUTX with L as the third argument, so you do not accidentally mess up variables in an outer macro scope.
JacquesR
Quartz | Level 8

I'm sorry if I seem slow, but I'm still not getting it.

Even if I use something like:

IF Ln EQ "{}" THEN CALL SYMPUTX("JSONTest","A","L"); ELSE CALL SYMPUTX("JSONTest","B","L");

The value of &JSONTest does not change from iteration to iteration of the call execute.

And this is despite the value of ln eq "{}" changing from iteration to iteration.

Kurt_Bremser
Super User

Did you use %NRSTR?

Are there no leading blanks in variable ln?

Are there no un-displayable characters in ln?

Run the code in a new SAS session, to make sure that no previous macro variable jsontest exists in the global scope.

Quentin
Super User

The timing of CALL EXECUTE is tricky.  It might help if you start with a very simple example.

 

This macro reads in a dataset and uses CALL SYMPUT to write a value the macro var NAME:

%macro try() ;
  %local name ;
  data _null_ ;
    set sashelp.class ;
    if _n_ EQ 1 then call symputx("Name",name) ;
  run ;
  %put &=name ;
%mend try ;

%try()

If you invoke the macro with CALL EXECUTE, it does not work as you hoped:

data _null_ ;
  call execute('%try()') ;
run ;

 

Log is:

510   data _null_ ;
511     call execute('%try()') ;
512   run ;

MPRINT(TRY):   data _null_ ;
MPRINT(TRY):   set sashelp.class ;
MPRINT(TRY):   if _n_ EQ 1 then call symputx("Name",name) ;
MPRINT(TRY):   run ;
NAME=
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


NOTE: CALL EXECUTE generated line.
1    + data _null_ ;     set sashelp.class ;     if _n_ EQ 1 then call symputx("Name",name) ;   run ;

NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

So the macro variable NAME is still empty.  This happened because of the referenced timing problem.  CALL EXECUTE actually executed the macro, and when it did that it generated all the SAS code and executed that the macro statements.  So the %PUT &=Name statement executes *before* the data _null_ step which writes a value to NAME has even compiled.

 

If you wrap the macro call in %NRSTR(), it changes CALL EXECUTE from executing the macro, to only generating the macro call.

data _null_ ;
  call execute('%nrstr(%try())') ;
run ;

Log:

523   data _null_ ;
524     call execute('%nrstr(%try())') ;
525   run ;

NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


NOTE: CALL EXECUTE generated line.
1    + %try()
MPRINT(TRY):   data _null_ ;
MPRINT(TRY):   set sashelp.class ;
MPRINT(TRY):   if _n_ EQ 1 then call symputx("Name",name) ;
MPRINT(TRY):   run ;

NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


NAME=Alfred

Note that above, the line generated by CALL EXECUTE (with the + prefix) is only the macro call.   This allows the 'usual' timing when the macro executes.  The DATA step is executed first, and then the %PUT statement is executed.

 

BASUG is hosting free webinars Next up: Mike Raithel presenting on validating data files on Wednesday July 17. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
Astounding
PROC Star

As was mentioned, %nrstr is going to be key here.  In the meantime, here are a few minor issues to consider.

 

Don't ignore log messages.  When you have numeric to character conversion, it indicates you are not necessarily getting the results you expected in that step.  So figure out why that happens and address it.  One of the log messages mentioned that a file doesn't actually exist.  So that needs attention.

 

In this step, minor changes are indicated:

    DATA NULL;
        INFILE "&DataDir.&dfile" LRECL=256 TRUNCOVER;
        INPUT Ln $256.;
        IF _N_ EQ 1 THEN DO;
            IF Ln EQ "{}" THEN CALL SYMPUT("JSONTest",0); ELSE CALL SYMPUT("JSONTest",1);
        END;
    RUN;
    %PUT &JSONTest;

Let's assume you already switched from SYMPUT to SYMPUTX.  That may or may not have an impact on the rest of the program, depending on which set of code you are actually using.  (I can see from your posts you have experimented with a few variations.)  Two more changes are indicated:

    DATA _NULL_;
        INFILE "&DataDir.&dfile" LRECL=256 TRUNCOVER;
        INPUT Ln $256.;
        IF _N_ EQ 1;
        IF Ln EQ "{}" THEN CALL SYMPUT("JSONTest",0);    
        ELSE CALL SYMPUT("JSONTest",1);
        STOP;
    RUN;
    %PUT *&JSONTest*;

NULL is actually a data set name.  You can see the log messages about the data set NULL.  To avoid creating a data set, while going through the motions of a DATA step (appropriate in this case), use _NULL_ instead.

 

Once that data step has read in the first line, there is no need to read in any more data.  Your current data step reads in every line, most of which you don't really need.  So once SYMPUTX has executed, STOP the data step.

 

Finally, the %PUT statement should write out a clearer message that will point out whether you have extra characters in your macro variable.  One way is to add the asterisks in the sample code above.

 

Despite all this, %nrstr will still be necessary and the key to successful debugging.

JacquesR
Quartz | Level 8

Thanks for that tip about data null. Somehow, I seem to have got confused about that.

I think in modifying my code, I was sloppy, as I did have it right in the macro, but didn't type it over correctly.

Tom
Super User Tom
Super User

To avoid timing issue caused when the macro executes wrap the macro name into %NRSTR() .

DATA _NULL_;
    SET ListQueries;
    CALL EXECUTE(CATT('%nrstr(%ImportDataTop)(dset=CollatedDataSet,dfile=',file_name,');'));
RUN;

It will also make the SAS log easier to read.

So now your log will look like:

+    %ImportDataTop(dset=CollatedDataSet,dfile=...

Instead of 

+    DATA NULL; INFILE "...
JacquesR
Quartz | Level 8

Thanks to all for the input.

The system really should allow one to mark more than one answer as the solution.

Here, then, is the working code:

%MACRO ImportDataTop(dset=,dfile=);
    /* First test to see if the file is not an empty JSON string */
    DATA _NULL_;
        INFILE "&DataDir.&dfile" LRECL=256 TRUNCOVER;
        INPUT Ln $256.;
        IF _N_ EQ 1;
            IF Ln EQ "{}" THEN CALL SYMPUTX("JSONTest",0); ELSE CALL SYMPUTX("JSONTest",1);
        STOP;
    RUN;
    %PUT |&JSONTest.|;
    %IF &JSONTest EQ 1 %THEN %DO;
        /* Use the SAS json reader in LIBNAME to automatically read the files */
        LIBNAME tmpFile JSON "&DataDir.&dfile";
        DATA WORK.tmpFile;
            SET tmpFile.item;
        /* Do data calculations... */
        RUN;
        /* Append the data to a growing datafile */
            PROC DATASETS
                NOPRINT
                LIBRARY=MYLIB;
                APPEND BASE=&dset
                       DATA=WORK.tmpFile
                       FORCE;
            QUIT;
        PROC DELETE DATA=WORK.tmpFile; RUN;
    %END;
    %ELSE %DO;
        %PUT Blank JSON string in &dfile;
    %END;    
%MEND ImportDataTop;

/* Get the full list of data files */FILENAME _FQuery PIPE "dir ""&DataDir.(*);AU*;*Queries;*;2021-06-02.json"" /b ";DATA ListQueries;
    INFILE _FQuery LRECL=256 TRUNCOVER;
    INPUT file_name $256.;
run;

PROC DELETE DATA=MYLIB.CollatedDataSet;RUN;
DATA NULL;
    SET ListQueries;
    CALL EXECUTE(CATT('%NRSTR(%ImportDataTop)(dset=VaccineSearchesQueries,dfile=',file_name,');'));
RUN;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1281 views
  • 0 likes
  • 5 in conversation