BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
bayzid
Obsidian | Level 7

I am trying to concatenate multiple JSON files and then converting the resulting file into SAS format. How can I include the source filename as a new variable in the concatenated file?

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You could try adding the DSD option, that will allow the quotes to protect the delimiters.

data tall ;
  length row col 8 name $32 value $256 ;
  length filen filename $256 ;
  infile "c:\downloads\MyDrug*.json" dlm='[{,:}]' dsd truncover length=ll column=cc filename=filen;
  input @;
  if filen ne lag(filen) then row=0;
  filename=filen;
  row+1;
  do col=1 by 1 until(cc>ll);
    input name value @;
    name=dequote(name);
    if value='null' then value=' ';
    else value=dequote(value);
    if name ne ' ' then output;
  end;
run;

The [{ or ,{ at the start of each line will cause there to be two extra empty values, but the IF statement that ignores the pairs without a name will cause those to be ignored.

 

View solution in original post

56 REPLIES 56
Quentin
PROC Star

Can you show sample code you are using to read and concatenate two JSON files?  And perhaps attach two small sample JSON files (or include code to generate the two sample JSON files).

Check out the Boston Area SAS Users Group (BASUG) video archives: https://www.basug.org/videos.
bayzid
Obsidian | Level 7

data _null_;
/* Iterate over all the files input */
infile '\json_test\MyDrug*.json' end=eof;
/* Open the file to write the output to */
file '\MyDrug_all.json';

/* Read a line of input */
input;

/* If this is the first line in the file print it to the output file verbatim */
if _n_ = 1 then
put _infile_;
/* If this is the last line print it out verbatim */
else if eof then
put _infile_;
/* Otherwise we are in the middle somewhere */
else do
/* Check what the first character of the line is */
/* Its either [, ], or , */
first_character = substr(_infile_,1,1);
select (first_character);
/* If its [ we must remove it */
when ("[") do;
line_without_first_character = substr(_infile_,2);
put line_without_first_character;
end;
/* If its ] we must swap it to a , to allow more objects in the array */
when ("]") put "," @@;
otherwise put _infile_;
end;
end;
run;

bayzid
Obsidian | Level 7
 
Tom
Super User Tom
Super User

Your JSON is sooooooo simple you could just read it as name/value pairs directly.

data tall ;
  length row col 8 name $32 value $256 ;
  length filen filename $256 ;
  infile "c:\downloads\MyDrug*.json" dlm='[{,:}]' truncover length=ll column=cc filename=filen;
  input @;
  if filen ne lag(filen) then row=0;
  filename=filen;
  row+1;
  do col=1 by 1 until(cc>ll);
    input name value @ ;
    name=dequote(name);
    if value='null' then value=' ';
    else value=dequote(value);
    if name ne ' ' then output;
  end;
run;

proc transpose data=tall out=wide(drop=_name_);
  by filename row ;
  id name;
  var value;
run;

proc print data=wide ;
 where row=1;
run;

Result

Tom_0-1681942571760.png

 

bayzid
Obsidian | Level 7

Hi Tom,
That's excellent. I am almost there. With my complete data pool (>83k JSON files into 21 groups) the tall is being created without any issue but the transpose proc is not working. It's returning the following error message.
NOTE: The above message was for the following BY group:
filename=C:\...ts\JSON_1st_0203
ERROR: The ID value "null" occurs twice in the same BY group.
ERROR: The ID value "false" occurs twice in the same BY group.

bayzid
Obsidian | Level 7

Hi Tom,
I found the reason for the error message. You are using comma as a delimiter but it also appears in some of the values. Then the problem happens.

bayzid_0-1681949526977.pngbayzid_1-1681949603832.png

Is there a way to get around this problem easily?

Tom
Super User Tom
Super User

You could try adding the DSD option, that will allow the quotes to protect the delimiters.

data tall ;
  length row col 8 name $32 value $256 ;
  length filen filename $256 ;
  infile "c:\downloads\MyDrug*.json" dlm='[{,:}]' dsd truncover length=ll column=cc filename=filen;
  input @;
  if filen ne lag(filen) then row=0;
  filename=filen;
  row+1;
  do col=1 by 1 until(cc>ll);
    input name value @;
    name=dequote(name);
    if value='null' then value=' ';
    else value=dequote(value);
    if name ne ' ' then output;
  end;
run;

The [{ or ,{ at the start of each line will cause there to be two extra empty values, but the IF statement that ignores the pairs without a name will cause those to be ignored.

 

bayzid
Obsidian | Level 7

Hi Tom,

That's fantastic. It worked.

bayzid
Obsidian | Level 7

Hi Tom,
With one of the data files i am recieving the following error message.
NOTE: 29 records were read from the infile "O:\All Staff\BESTMED\Backup\final\Resident*.json".
The minimum record length was 194.
The maximum record length was 866.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.TALL may be incomplete. When this step was stopped there were
35926321 observations and 5 variables.
I changed the length of the value to 900 but still didn't work.
The Note doesn't give exact file name. I have thousands of files as resident*.json.

Quentin
PROC Star

Check your log, there should be an ERROR message, and probably an informative dump of data.

 

You might try adding something like:

if _error_ then putlog "Uh-oh " filename= ;

after the input statement, if it looks like that is what is causing the problem.

Check out the Boston Area SAS Users Group (BASUG) video archives: https://www.basug.org/videos.
bayzid
Obsidian | Level 7

I put the statement as below. But it didn't create any text document.

input @;
if _error_ then putlog "Uh-oh " filename="O:\All Staff\BESTMED\Backup\final\error.txt" ;

 

Quentin
PROC Star

The putlog statement won't write a file, it writes a message to the log.  I was thinking place:

if _error_ then putlog "Uh-oh " filename= ;

After second input statement, to see if it ever detects that SAS has set the _error_ variable to true.  

 

But I agree with Tom, since you're reading values as character, I don't see anything that could be causing an error.

 

Can you run the step and save the log, and attach the full log.  There should be a more informative error message somewhere.

 

Tom's idea about a file lock/permissions issue is a good one.  If the json files are sitting on some remote drive, maybe try copying them to a local drive that only you can access, and see if the program works then.  If it does, that would suggest a permissions issue, or some other network issue.  If it still errors locally, the next question would be does it always error at the same point (same record count in the output file)?  

 

Another trouble-shooting step: take Tom's code, and turn it into a macro which will accept the name of a single JSON file as a parameter, and process a single file.  Then use CALL EXECUTE or whatever to call the macro once per file, and see which (if any) file causes it to error.

Check out the Boston Area SAS Users Group (BASUG) video archives: https://www.basug.org/videos.
BayzidurRahman
Obsidian | Level 7

The problem is with the delimiters. Some of the delimiters are also in some values.

BayzidurRahman_0-1682219987480.pngBayzidurRahman_1-1682220034345.png

 

Reeza
Super User
Try reading the file with the JSON libname instead then.

SAS INNOVATE 2024

Innovate_SAS_Blue.png

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Get the $99 certification deal.jpg

 

 

Back in the Classroom!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 56 replies
  • 1161 views
  • 7 likes
  • 6 in conversation