I am trying to import a ndjson file into sas using the below solution,
https://communities.sas.com/t5/SAS-Programming/Reading-NDJSON-Newline-Delimited-JSON/m-p/665720
we were able to load a smaller length size file with the above solution, but facing issues while using the same logic with a different file.
i.e. https://bcda.cms.gov/assets/data/ExplanationOfBenefit.ndjson
Some lines from the file are getting missed out while import.
Thanks.
filename jsonl "/bcda/ExplanationOfBenefit.ndjson"; filename json temp; filename map "/bcda/ExplanationOfBenefit.map"; /* Convert the NDJSON to JSON */ data _null_; infile jsonl end=eof; file json; put '[{"records":['; /* Required for JSON formatting */ do until (eof); input; line = STRIP(_infile_); if (eof=0) then line=cats(line, ','); put line; end; put ']}]'; /* Required for JSON formatting */ stop; run; libname json json automap=create map=map; proc contents data=json._all_; run;
The max length for a var also the _INFILE_ used in the example code is 32767. Many lines in your data are longer than 32767, so the code will not work.
I changed the code, so that a line that is longer than 32767 is also processed correctly. The logic is lightly different for the reading.
See here:
/* Convert the NDJSON to JSON */
data _null_;
  length inputline $ 32767;
  infile jsonl end=eod lrecl=1048576 LENGTH=_ll truncover;
  file json lrecl=1048576;
  if _n_ = 1 then do;
    put '[{"records":['; /* Required for JSON formatting */
  end;
  input @;
  pos = 1;
  do i = 1 to ceil(_ll / 32767);
    input @pos inputline $char32767. @;
    il_l = length(inputline);
    put @pos inputline $varying32767. il_l @;
    pos = i * 32767 + 1;
  end;
  if eod = 0 then do;
    put ",";
  end;
  if eod = 1 then do;
    put;   
    put ']}]'; /* Required for JSON formatting */
  end;
run;The max length for a var also the _INFILE_ used in the example code is 32767. Many lines in your data are longer than 32767, so the code will not work.
I changed the code, so that a line that is longer than 32767 is also processed correctly. The logic is lightly different for the reading.
See here:
/* Convert the NDJSON to JSON */
data _null_;
  length inputline $ 32767;
  infile jsonl end=eod lrecl=1048576 LENGTH=_ll truncover;
  file json lrecl=1048576;
  if _n_ = 1 then do;
    put '[{"records":['; /* Required for JSON formatting */
  end;
  input @;
  pos = 1;
  do i = 1 to ceil(_ll / 32767);
    input @pos inputline $char32767. @;
    il_l = length(inputline);
    put @pos inputline $varying32767. il_l @;
    pos = i * 32767 + 1;
  end;
  if eod = 0 then do;
    put ",";
  end;
  if eod = 1 then do;
    put;   
    put ']}]'; /* Required for JSON formatting */
  end;
run;Just make sure LRECL is big enough for the longest line. If lines have different lengths does not matter check your SAS log for something like:
NOTE: 1588 records were read from the infile JSONL.
      The minimum record length was 8218.
      The maximum record length was 65536.
      One or more lines were truncated.
If you do not see this message, then all should be ok.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
