JSON engine add object ID to ALLDATA?

Quentin · Posted 04-18-2022 05:22 PM

Hi All,

I've been playing with the JSON engine and the ALLDATA dataset it creates. It seems... almost useful?

It looks like it's just a collection of name-value pairs, without any indication of a key such as a row number / object number / ordinal root / something that would allow you to identify the group of name-value pairs that came from a single object.

For example, given this JSON data with three objects:

filename myjson temp;

data _null_;
  file myjson ;
  put '[' ;
  put  '{"x":1}';
  put  ',' ;
  put  '{"x":10, "y":100}';
  put  ',' ;
  put  '{"z":1000}';
  put ']';
run;

libname myj JSON fileref=myjson;

data want ;
  set myj.alldata ;
  put (_all_)(=) ;
run ;

Alldata has:

345  data alldata ;
346    set myj.alldata ;
347    put (_all_)(=) ;
348  run ;

P=1 P1=x V=1 Value=1
P=1 P1=x V=1 Value=10
P=1 P1=y V=1 Value=100
P=1 P1=z V=1 Value=1000

So looking at work.AllData, there is no object ID which would allow you to see with values came from which object. You might assume that the data has two objects, when in fact it has three.

Is there some way to add an Object ID to alldata?

In a simple case if I could assume that the first name-value pair in an object is always a unique ID, I wouldn't need the JSON engine to make an ObjectID for me. But since a JSON object has unordered name-value pairs, even if I know my data has an ID, it doesn't seem safe to assume that the first name-value pair will always be that ID. Without an ObjectID I'm struggling to see how this ALLDATA table could be useful. Am I missing some option?

Thanks,

-Q.

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.

ChrisHemedinger · Posted 04-18-2022 05:59 PM

It's up to whoever designs the JSON to add the extra object name/layer if it's important for the data. I've used the JSON engine with dozens of different services/sources and if there has ever been ambiguity, it's due to the construction of the original data and not the engine.

A challenging source can come from GraphQL sources, which are usually designed to fill a UI component on a web page (think React) and not fill a 2-D data model, so you really need to know more about what to expect from the data in that case. I answered a similar question on StackOverflow.

Check out SAS Innovate on-demand content! Watch the main stage sessions, keynotes, and over 20 technical breakout sessions!

Quentin · Posted 04-18-2022 06:17 PM

Thanks Chris,

I'm willing to expect the provider to have a recordID field for each object. But I don't think I should require them to enforce that the recordID is always the first name-value pair of an object.

If they give me:

filename myjson temp;

data _null_;
  file myjson ;
  put '[' ;
  put  '{"x":1,"ID":"001"}';
  put  ',' ;
  put  '{"ID":"002","x":10,"y":100}';
  put  ',' ;
  put  '{"z":1000,"ID":"003"}';
  put ']';
run;

libname myjson JSON fileref=myjson;

data want ;
  set myjson.alldata ;
  put (_all_)(=) ;
run ;

That seems fair (since JSON objects are unordered), but the ALLDATA table loses the information about which attributes are associated with each ID. It returens:

16   data want ;
17     set myjson.alldata ;
18     put (_all_)(=) ;
19   run ;

P=1 P1=x V=1 Value=1
P=1 P1=ID V=1 Value=001
P=1 P1=ID V=1 Value=002
P=1 P1=x V=1 Value=10
P=1 P1=y V=1 Value=100
P=1 P1=z V=1 Value=1000
P=1 P1=ID V=1 Value=003

I guess for ALLDATA to be useful, I have to assume that attribute order is meaningful, which seems odd, since a json object is an unordered set of name-value pairs. So a JSON provider should be free to change the order of attributes anytime they want.

--Q.

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.

JSON engine add object ID to ALLDATA?

Re: JSON engine add object ID to ALLDATA?

Re: JSON engine add object ID to ALLDATA?

JSON engine add object ID to ALLDATA?

Re: JSON engine add object ID to ALLDATA?

Re: JSON engine add object ID to ALLDATA?

Click image to register for webinar

Classroom Training Available!