Folks,
I have a macro that runs through N json files and parses them into SAS datasets. However, I've noticed in certain cases the dataset which has address information will be blank as it addresses are not mandatory. As a workaround to this I attempted to put in a if then else do, but SAS appears to be ignoring it. Does anyone have any idea why this is?
%macro json_map(DirectoryAndDataset= );
/*Firstly we read in the dataset called DatasetsInADirectory and keep one record each time depending on what iteration we are in*/
data DatasetsInADirectory_1;
set DatasetsInADirectory;
where DirectoryAndDataset="&DirectoryAndDataset";
run;
/*Use the filename function to read in the individual json file which we have broken into N batches*/
filename json3 "\\collect01\\Migration_03\Datasets\Rev\PM\Rec_&date\paye\&DirectoryAndDataset..json";
/*The next three steps initialises the SAS JSON engine. real_map refers to the custom json map we create which puts the
parsed data into an easier to handle SAS dataset. When the json engine stops parsing the data it outputs it to the libname called json3*/
libname json3 JSON fileref=json3;
filename real_map "\\collect01\\Migration_03\Datasets\Rev\PM\full_json_map_with_labels.map";
libname json3 json fileref=json3 map=real_map;
/* SAS is unable to directly write the parsed json files to another location. Therefore use proc copy to write the datasets
you want into the location out_json. Exclude the dataset called alldata as it is not needed
We are left with two datasets after this*/
proc copy in=json3 out=out_json memtype=data;
exclude alldata;
run;
/*Appears to be an issue copying a dataset straight over using a SAS mapping engine using a datasetp. This causes issues later on. Use an SQL
step to create the temp dataset*/
PROC SQL;
CREATE TABLE everything_1 AS
SELECT *
FROM out_json.everything;
QUIT;
/*It appears that supplying address information is not mandatory so in some of the n batches the out_json.address_addresslines will
be blank. Putting in a caluse is casues issues too so just create a dummy dataset if N=0*/;
proc sql noprint;
select
count(*)
into
:rowCount trimmed
from
json3.address_addresslines
;
quit;
%put NOTE: &=rowCount;
%if &rowCount = 0 %then %do;
data out_json.&DirectoryAndDataset;
length json_file $40;
set everything_1;
json_file="&DirectoryAndDataset";
run;
%end;
%else %do;
/*For the dataset out_json.address_addresslines we transpose the address information into one variable*/
proc transpose data=out_json.address_addresslines out=addresses prefix=add_;
by ordinal_root;
var addressLine;
run;
data address;
set addresses;
full_address=catx(',', of add_:);
run;
/*Use SQL to join the two datasets called out_json.everything and work.address to create the dataset out_json.data_of_interest
This should create a replica of what appears in the json file i.e. each payslip record in the json file will have the same record in the SAS file;
We use Ordinal root as the linking key;*/
proc sql;
create table out_json.data_of_interest as select *
from
out_json.everything
full join
work.address(rename=(ordinal_root=ordinal_temp))
on everything.ordinal_root=address.ordinal_temp;
quit;
/*Finally we use the proc datasets option to delete the temporary datasets address_addresslines and everything.
We rename the dataset out_json.data_of_interest to &DirectoryAndDataset (this depends on what Json you're
reading into) so when the macro runs through each dataset it doesn't
write over the previous dataset*/
proc datasets lib=out_json nodetails nolist;
delete address_addresslines everything;
change data_of_interest=&DirectoryAndDataset;
run;
data out_json.&DirectoryAndDataset;
length json_file $40;
set out_json.&DirectoryAndDataset;
json_file="&DirectoryAndDataset";
/*Also create a variable called json_file so you know from which record each json file comes from. This allows us to go
back into the raw json file to check if there is an issue with the actual data or just how it was proccessed*/
run;
%end;
%mend json_map;
*The above is the macro which is used, the below is the function which carries the process out for all N datasets;
/*%json_map(DirectoryAndDataset= );
/*Put a timer at begining and end to see how long processing takes for everything*/
%let datetime_start = %sysfunc(TIME());
%put START TIME: %sysfunc(datetime(),datetime14.);
data macro_call;
set DatasetsInADirectory;
*build macro call string;
str = catt('%json_map(DirectoryAndDataset =', DirectoryAndDataset, ');');
*call macro;
call execute(str);
run;
%put END TIME: %sysfunc(datetime(),datetime14.);
%put PROCESSING TIME: %sysfunc(putn(%sysevalf(%sysfunc(TIME())-&datetime_start.),mmss.)) (mm:ss);
I know dataset json file paye_123 has no address information so that dataset out_json.address_addresslines will be blank. However, SAS will not take this into account when I run macro_call function. It will however, when I individually run one dataset at a time into the macro.
Is there a reason for this?
Were you able to figure out which step in that long macro is being confused? If so please tell us so we can just concentrate on that (or at least rule it out).
Is this the step that is failing?
proc sql noprint;
select count(*) into :rowCount trimmed from json3.address_addresslines;
quit;
If so then perhaps you need to test if the dataset exists or not and do something else in that case. Something like:
%let rowCount=-1;
%if %sysfunc(exist(json3.address_addresslines)) %then %do;
proc sql noprint;
select count(*) into :rowCount trimmed from json3.address_addresslines;
quit;
%end;
Hi Tom,
It seems like SAS is ignoring the SQL count step?
NOTE: PROCEDURE SQL used (Total process time):
real time 0.23 seconds
cpu time 0.07 seconds
165 + ; proc sql noprint; select
count(*) into :rowCount trimmed from json3.address_addresslines ; quit; proc transpose
55 The SAS System 15:20 Wednesday, November 27, 2019
NOTE: PROCEDURE SQL used (Total process time):
real time 0.12 seconds
cpu time 0.12 seconds
166 + data=out_json.address_addresslines out=addresses prefix=add_; by ordinal_root; var addressLine; run; data
address; set addresses; full_address=catx(',', of add_:); run; proc sql; create table out_json.data_of_interest as
select *
NOTE: There were 0 observations read from the data set OUT_JSON.ADDRESS_ADDRESSLINES.
NOTE: The data set WORK.ADDRESSES has 1 observations and 2 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
real time 0.04 seconds
cpu time 0.00 seconds
71: LINE and COLUMN cannot be determined.
NOTE: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and COLUMN where the error has occurred.
ERROR 71-185: The CATX function call does not have enough arguments.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.ADDRESS may be incomplete. When this step was stopped there were 0 observations and 3 variables.
WARNING: Data set WORK.ADDRESS was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
That cannot be the log the posted code. SAS logs of normal program execution do not look like that.
If looks like it is log of code generated by CALL EXECUTE(). If so then you have a timing issue. When you submit macro statements with CALL EXECUTE() the macro logic runs immediately and the resulting generated code is what gets pushed into the stack to run after the data step ends. This means that the test of the macro variable generated by the INTO clause of the SQL query is tested BEFORE it has had a chance to execute.
Did you wrap that code into a macro and use call execute to call the macro? If so then use %NRSTR() in the code you push with CALL EXECUTE() to delay the execution of the macro until it is retrieved from the stack. So write your macro call like this:
call execute(cats('%nrstr(%mymacro)','(parm1,parm2,parm3)'));
And what you will see in the SAS log will look like this:
165 + %mymacro(parm1,parm2,parm3)
Instead of this smashed together confusion:
166 + data=out_json.address_addresslines out=addresses prefix=add_; by ordinal_root; var addressLine; run; data
address; set addresses; full_address=catx(',', of add_:); run; proc sql; create table out_json.data_of_interest as
select *
Thanks Tom.
So my macro call is this.
data macro_call;
set DatasetsInADirectory;
*build macro call string;
str = catt('%json_map(DirectoryAndDataset =', DirectoryAndDataset, ');');
*call macro;
call execute(str);
run;
So how would I replicate with your example. Thanks very much for your help. It's much appreciated.
What did you try? Did you try wrapping the %XXX inside of %NRSTR() like I said?
str = catt('%nsrstr(%json_map)(DirectoryAndDataset =', DirectoryAndDataset, ');');
Hi Tom,
Inserting this begins to cause a number of errors.
Namely this;
WARNING: Apparent invocation of macro NSRSTR not resolved.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.