Chuck_IV4
Obsidian | Level 7
I have read in a supposed JSON file from a vendor. Normally, when I get a JSON file, the data is all on one line when viewed, and the JSON engine handles this just fine.

However, the file I am receiving from this vendor is separated somehow, so that each record shows on its own line. This is causing issues with the JSON engine. It tells me...

ERROR: Invalid JSON in input near line 2 column 2: Unexpected characters found after valid JSON text.

If I take just one line of data from the file, the JSON engine reads it perfectly, but it chokes on anything past the first line. I think the file may have a CR/LF at the end of each line, and this is causing the issue. Is there some type of option I can set with the JSON engine to tell it to ignore the characters at the end of each line, or will I need to filter them out somehow, maybe via an INFILE statement?
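
For reference, here is a minimal version of the read I'm doing (the path and libref are just placeholders):

libname vend json "c:\data\export.json";   /* placeholder path to the vendor file */

/* ALLDATA is the JSON engine's flattened view of everything it parsed */
data work.raw;
   set vend.alldata;
run;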

Thank You.

11 REPLIES
RW9
Diamond | Level 26

That's a bit peculiar; most of the JSON files I have seen are formatted nicely to view in a text editor, so they have end-of-line characters. Is there perhaps a non-printing or special character in there? Maybe something happened when transferring the file, such as Unix line endings on a Windows system or something like that. Really stabbing in the dark here; I can find nothing about end-of-line characters causing an issue in JSON. Can you perhaps post a few lines of the file, or the file itself?

Chuck_IV4
Obsidian | Level 7
Here are two lines. Looking at the file, it is missing some elements of a normal JSON file (at least one that I am used to seeing)...

{"event":"ws_page_load","properties":{"time":1536988982,"distinct_id":"165dd2f00c2468-0bcb9b2618f4ee-a35346f-130980-165dd2f00c3c55","$browser":"Chrome","$browser_version":64,"$city":"Trumbull","$current_url":"https://rgs-web-prod.fingerprintplay.com/teachers/login","$initial_referrer":"$direct","$initial_ref... (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.167 Safari/537.36"}}
{"event":"ws_page_load","properties":{"time":1536988982,"distinct_id":"165dd2f00c2468-0bcb9b2618f4ee-a35346f-130980-165dd2f00c3c55","$browser":"Chrome","$browser_version":64,"$city":"Trumbull","$current_url":"https://rgs-web-prod.fingerprintplay.com/teachers/login","$initial_referrer":"$direct","$initial_ref... (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.167 Safari/537.36"}}
RW9
Diamond | Level 26

Well, looking at those lines, one thing jumped out at me in each row:

"$initial_ref... (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.167 Safari/537.36"

Doesn't appear to be a parameter:response pair. Other than that, I don't see a problem.

Maybe someone else can chime in here.

Chuck_IV4
Obsidian | Level 7
As a note, those are the first two lines in the file.
Chuck_IV4
Obsidian | Level 7
Here is the note from the vendor about the return format...

Return Format:

One event per line, sorted by increasing timestamp. Each line is a valid JSON object, although the return itself is not valid JSON but instead JSONL. Timestamps are expressed in seconds since January 1, 1970 in your project's timezone, not UTC as a true epoch timestamp. For example, if your project is set to Pacific time, you would need to add 8 hours (or 7 hours during daylight saving time) (60 min * 60 secs * 8 hours) to the timestamp in order to convert it into UTC. This means that converting the raw exported timestamps using many epoch converters will result in representing times with the incorrect offset.
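
As an aside, converting those timestamps in SAS would look something like this (assuming a Pacific-time project outside daylight saving time; the dataset and variable names are made up):

data events_utc;
   set events;                 /* hypothetical dataset holding the vendor's TIME field */
   /* vendor timestamps are project-local seconds; add the UTC offset (8h for PST) */
   time_utc = time + 8*60*60;
   /* SAS datetimes count seconds from 1960, so anchor the 1970 epoch explicitly */
   event_dt = '01jan1970:00:00:00'dt + time_utc;
   format event_dt datetime20.;
run;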


On a side note, is there a way to edit a previous post?
RW9
Diamond | Level 26

Click the gear icon in the top right of the post; you should be able to edit your own posts.

Chuck_IV4
Obsidian | Level 7

Ahh, I was in some strangely formatted version of the site and wasn't seeing the gear icon. After closing my browser and reopening it, the site looks like it used to.

Anyway, it seems that the return is something called JSONL, which the JSON engine has issues with. I'm wondering if there's some type of option, maybe in a custom mapping file, to let it know it's JSONL vs. JSON.

RW9
Diamond | Level 26

Ah, well that might be your answer. JSONL appears to be a slight deviation from JSON, so it's possible the JSON engine can't read it directly.

http://jsonlines.org/

You may need to read it some other way.

Chuck_IV4
Obsidian | Level 7

Unfortunately yes, that is my answer. I just talked to SAS, and the JSON engine specialist says it cannot read that type of file.

Hopefully they have this on their to-do list, as it seems like only a small update would be needed to handle the shift to a new line. As I mentioned, if I have a file with just one line, the JSON engine works perfectly; it just doesn't know what to do after it hits the end of the first line.

ChrisHemedinger
Community Manager

I've seen these types of JSONL files before -- you could use a SAS DATA step to pass through the file and break the larger collection up into individual files, then point the JSON engine at each of those. Then rejoin the data in a DATA step or SQL.
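
A rough sketch of that approach, assuming the export lives at c:\data\export.jsonl (the paths and names here are just illustrative):

filename src "c:\data\export.jsonl";

/* 1. Split the JSONL export into one single-line JSON file per record */
data _null_;
   length fname $200;
   infile src lrecl=32767 truncover;
   input;
   fname = cats("c:\data\split\rec_", put(_n_, z6.), ".json");
   file out filevar=fname;     /* OUT is a placeholder fileref required by FILEVAR= */
   put _infile_;
run;

/* 2. Read one piece with the JSON engine and append it to the combined table */
libname rec json "c:\data\split\rec_000001.json";

proc append base=work.events data=rec.alldata force;
run;

libname rec clear;

In practice, step 2 would be wrapped in a macro %DO loop over the generated files.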

Chuck_IV4
Obsidian | Level 7

That's an interesting thought: read it in line by line, output each line to a temp file, then use the JSON engine to read that file and append it to the current dataset. The only issue I can see is making sure I create a map file that has the max length of each field.

I may give this a shot. It sounds a lot easier than parsing the lines myself.
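
On the map file, one shortcut might be to let the engine generate a starter map from the first piece and then hand-edit the lengths (AUTOMAP= and MAP= are documented JSON engine options; the paths here are made up):

/* generate a map from the first record, then edit its CURRENT_LENGTH  */
/* entries up to the max you expect across all records                 */
libname rec json "c:\data\split\rec_000001.json"
        map="c:\data\events.map" automap=create;

/* later reads point at the edited map so every piece gets the same lengths */
libname rec2 json "c:\data\split\rec_000002.json"
        map="c:\data\events.map" automap=reuse;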

