BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Quentin
Super User

Looks like you're manually reading the json files and writing a new json file, right?

 

There is a filename=  option file the INFILE statement which will create a temporary variable that has the name of the current input file.  So it might be possible to  use that to add a field to your output json file with the name of the file each record came from.  

 

But seems like this would be easier to start by reading in each JSON file (with JSON engine), and then concatenate them in SAS.  Or perhaps make a single filreref which points to all the JSON files, and read that using the JSON engine.

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
bayzid
Obsidian | Level 7

Hi Quentin,

Can you please show me an example code?

bayzid
Obsidian | Level 7

Hi ballardw,

I know that it is very easy to add source file name while concatenating multiple SAS files. But for my situation, i need to concatenate the JSON files before converting to SAS file.

Reeza
Super User

@bayzid wrote:

Hi ballardw,

I know that it is very easy to add source file name while concatenating multiple SAS files. But for my situation, i need to concatenate the JSON files before converting to SAS file.


I think this is where we disagree with you. You should read each file separately and append the results into one SAS file with the dates. Can you clarify why you think you need to combine them first?

bayzid
Obsidian | Level 7

Hi Reeza,
For most of the variables, the length and variable type varies across datasets. SAS cannot handle this during concatenation.

Reeza
Super User

@bayzid wrote:

Hi Reeza,
For most of the variables, the length and variable type varies across datasets. SAS cannot handle this during concatenation.


That should have been your question. Fix your folder path accordingly and this will read all the files and combine them into one. Then use PROC TRANSPOSE to flip if you want the wide format. 

 

filename dirtree url 
    'https://raw.githubusercontent.com/sasutils/macros/master/dirtree.sas';
%include dirtree /source2;
%dirtree(/home/fkhurshed/CLASS1    /* Pipe delimited directory list (default=.) */
        , out=filelist  /* Output dataset name */
        , maxdepth=1 /* Maximum tree depth */);

%macro read_json(myfile=);
    filename mydata &myfile.;
    libname myjson JSON fileref=mydata;

    proc datasets lib=myjson;
    quit;

    *how to parse the file has not been specified - looks like you need to specify the lengths;

    data file2save;
        length source $200. p1 $256. Value $256.;
        set myjson.alldata;
        source=&myfile.;
    run;

    proc append base=master data=file2save force;
    run;

    proc sql;
        drop table file2save;
    quit;

    filename mydata;
    libname myjson;
%mend read_json;

proc sql;
    drop table master;
quit;

data run_list;
    set filelist;
    where type='F';
    str=catt('%read_json(myfile="/home/fkhurshed/CLASS1/', filename, '");');
    call execute(str);
run;

 

 

 

bayzid
Obsidian | Level 7

Hi Reeza,
Thanks for the code. It ran with lots of errors. The Master dataset was empty. The log file is attached.

Tom
Super User Tom
Super User

@bayzid wrote:

Hi Reeza,
For most of the variables, the length and variable type varies across datasets. SAS cannot handle this during concatenation.


When you read the JSON files use the SAME mapping each time and the variables will be defined consistently.

Quentin
Super User

I would use a macro to read one file at a time and capture the name of the file, like:

 

%macro readjson(file=/*json file*/,out=/*output sas dataset*/) ;

filename jfile "&file" ;
libname jfile json;

data &out ;
  set jfile.root ;
  length sourcefilename $200 ;
  retain sourcefilename "%scan(&file,-1,/\)" ;
run ;

filename jfile clear ;
libname jfile clear ;

%mend readjson ;

%readjson(file=C:\...\MyDrug_8c209d34-1b3c-48a5-ae3a-90763c4f4d2e.json,out=want1)
%readjson(file=C:\...\MyDrug_5b056720-1620-4e67-9f42-7a0587da9a08.json,out=want2)

data all ;
  set want1 want2 ;
run ;

One challenge with that approach is you end up with variable type conflicts (your column Schedule) and length conflicts.  So with this approach, I use a data dictionary to impose types/lengths after reading the data. Or, probably better, you could specify a json map when you read the data.  I suppose putting all of the data into one json file would avoid this problem, but it still feels messy.  If you're going to need to do this a lot, I think I would put in the time to define a map for the json file, and then use that to read each file into a SAS dataset, and then combine them.

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
bayzid
Obsidian | Level 7

Hi,
Thanks for the code. Unfortunately, it didn't work. Data want2 was not created and data want1 was empty.
I am trying to concatenate all the JSON file before conversation to SAS due to the reason you have mentioned.

ballardw
Super User

I have no idea how you would do that in JSON objects. However if your read the JSON into separate data sets SAS can add the source data set name very easily. If you want the JSON file name that would mean you want to do something when creating the SAS data set from JSON to include the file, which may depend on how you are reading it.

 

If you have two (or more) SAS data sets the example code would look like:

data want;
    set data1 data2 indsname=fname;
    source = fname;
run;

The INDSNAME= option assigns the name of the data set contributing the observation to a temporary variable named after the =, in this case Fname. Temporary means that it is not written to the data set. To have the value kept you assign it to another variable that will be in the output.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 56 replies
  • 2215 views
  • 7 likes
  • 6 in conversation