BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ICL1986
Fluorite | Level 6

Hi, I'm trying to get the attached data into SAS in order to analyse it alongside another data set. I don't have any experience working with json formatted data, so am having a lot of difficulty trying to get it in in a usable format. Trawled through the boards and found this solution (with edits for my data structure):

 

data x;
infile 'C:\Desktop\psc.txt' dsd lrecl=30000000 dlm='{[:,';
input x : $2000.@@;
run;
data temp;
merge x x(firstobs=2 rename=(x=_x));
if lowcase(x) in ("company_number","data__address__address_line_1","data__address__address_line_2","data__address__care_of","data__address__country","data__address__locality","data__address__po_box","data__address__postal_code","data__address__premises","data__address__region","data__ceased_on","data__country_of_residence","data__date_of_birth__month","data__date_of_birth__year","data__description","data__etag","data__exemptions__psc_exempt_as_shares_admitted_on_market__exemption_type","data__exemptions__psc_exempt_as_shares_admitted_on_market__items","data__exemptions__psc_exempt_as_trading_on_regulated_market__exemption_type","data__exemptions__psc_exempt_as_trading_on_regulated_market__items","data__exemptions_count","data__generated_at","data__identification__country_registered","data__identification__legal_authority","data__identification__legal_form","data__identification__place_registered","data__identification__registration_number","data__kind","data__links__self","data__name","data__name_elements__forename","data__name_elements__honours","data__name_elements__middle_name","data__name_elements__surname","data__name_elements__title","data__nationality","data__natures_of_control","data__notified_on","data__persons_of_significant_control_count","data__statement","data__statements_count");
run;
data temp;
set temp;
if lowcase(x)='company_number' then group+1;
run;

proc transpose data=temp out=want(drop=_:);
by group;
id x;
var _x;
run;

 

Currently the ouput is a series 'group', and then the company numbers that have been successfully extracted. I'm a beginner with SAS programming language, but it seems like the other sets of data aren't defined in the code, but I'm really stuck with how to fix it.

 

Thanks, Ian

1 ACCEPTED SOLUTION

Accepted Solutions
BillM_SAS
SAS Employee

I found the JSON in the attached psc.txt file is invalid. I modified the file as I think it should be for what you appear to desire (and the modified file is attached to this post). With a valid JSON file, I ran the JSON LIBNAME engine (available in SAS 9.4 maintenance release 4) to produce 4 SAS data sets. I then used the SQL procedure to combine the SAS data sets into records similar to the JSON data. This demonstrates how to reassemble the hierarchical data in the JSON data file that was flatten into SAS data sets. The complete SAS program is included and shows how easy it is to import JSON data into SAS data set using the JSON LIBNAME engine.

 

libname x5 json 'psc5.txt';
proc datasets lib=x5; 
     quit;
proc contents data=x5._all_; 
     run;

proc sql;
     create table work.records as
     select root.company_number, data.etag, data.kind, data_links.self,
            data.notified_on, data.statement, data.ceased_on
     from x5.root as root, X5.DATA_LINKS as DATA_LINKS, X5.DATA as data
     where root.ordinal_root = data.ordinal_root and
           data.ordinal_data = data_links.ordinal_data
           ;
     quit;
proc print data=work.records(obs=5); 
     run;

 

 

    company_
Obs  number                       etag

  1 10869580    f327a09d5a3a08af72b03211f3a2e148814d75dd
  2 10869582    83f75b47516bb1c7bac875c7f97134cb51711f2d
  3 10869588    e730a688879c2f724e8c881267497f9b10ba2688
  4 10869591    83dcecd921dee3f8099bd90a4d355b104ffacc9f
  5 10869596    f570d6220979853777d6bad1418cafd5a4f7c418


Obs                    kind

  1 persons-with-significant-control-statement
  2 persons-with-significant-control-statement
  3 persons-with-significant-control-statement
  4 persons-with-significant-control-statement
  5 persons-with-significant-control-statement


Obs                                    self

  1 /company/10869580/persons-with-significant-control-statements/JZsKO5aylMYj
  2 /company/10869582/persons-with-significant-control-statements/5xQ7ZCMaLDHO
  3 /company/10869588/persons-with-significant-control-statements/rHluQkBUwwXw
  4 /company/10869591/persons-with-significant-control-statements/rK8w6OtlgY9f
  5 /company/10869596/persons-with-significant-control-statements/w1mzlYcY-BoB

    notified_                                                      ceased_
Obs     on                          statement                        on

  1 2017-07-17   no-individual-or-entity-with-signficant-control
  2 2017-07-17   no-individual-or-entity-with-signficant-control
  3 2017-07-17   no-individual-or-entity-with-signficant-control
  4 2017-07-17   no-individual-or-entity-with-signficant-control
  5 2017-07-17   no-individual-or-entity-with-signficant-control

 

NOTE: There were 5 observations read from the data set WORK.RECORDS.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.04 seconds
      cpu time            0.00 seconds

View solution in original post

8 REPLIES 8
ChrisBrooks
Ammonite | Level 13

This is a bit of a tough ask if you're a beginner in SAS - in my experience the best way to parse JSON is to use the Proc DS2 JSON parser. There's a good example in this artice https://blogs.sas.com/content/sasdummy/2015/09/28/parse-json-from-sas/

 

It looks like your JSON is fairly straightforward so if you give it a try and post again if you have any problems.

ICL1986
Fluorite | Level 6

Think that solution's a bit beyond me as well! I'll have an ask around at work to see whether anyone else can give a hand.

 

Thanks for you help, Ian

AlanC
Barite | Level 11

See if this helps:

 

https://blogs.sas.com/content/sasdummy/2016/12/02/json-libname-engine-sas/?postid=710499380

 

I have not tried this code yet since I normally use JSON engines outside of SAS but 9.4M4 has a new libname engine for JSON.

https://github.com/savian-net
Ksharp
Super User

Yeah. The best to solve this kind of problem is using JSON engine of LIBNAME statement, if you have SAS 9.4 m4.

 

filename x 'c:\temp\psc.txt';
data x;
 infile x recfm=n dlm='{},';
 input x : $2000.@@;
 a=dequote(scan(x,1,':'));
 b=dequote(scan(x,-1,':','m'));
 if anyspace(x)=1 then do;group+1;delete;end;
run;
proc transpose data=x out=want;
by group;
id a;
var b;
run;
ICL1986
Fluorite | Level 6

This seems to have worked! Thanks so much.

BillM_SAS
SAS Employee

I found the JSON in the attached psc.txt file is invalid. I modified the file as I think it should be for what you appear to desire (and the modified file is attached to this post). With a valid JSON file, I ran the JSON LIBNAME engine (available in SAS 9.4 maintenance release 4) to produce 4 SAS data sets. I then used the SQL procedure to combine the SAS data sets into records similar to the JSON data. This demonstrates how to reassemble the hierarchical data in the JSON data file that was flatten into SAS data sets. The complete SAS program is included and shows how easy it is to import JSON data into SAS data set using the JSON LIBNAME engine.

 

libname x5 json 'psc5.txt';
proc datasets lib=x5; 
     quit;
proc contents data=x5._all_; 
     run;

proc sql;
     create table work.records as
     select root.company_number, data.etag, data.kind, data_links.self,
            data.notified_on, data.statement, data.ceased_on
     from x5.root as root, X5.DATA_LINKS as DATA_LINKS, X5.DATA as data
     where root.ordinal_root = data.ordinal_root and
           data.ordinal_data = data_links.ordinal_data
           ;
     quit;
proc print data=work.records(obs=5); 
     run;

 

 

    company_
Obs  number                       etag

  1 10869580    f327a09d5a3a08af72b03211f3a2e148814d75dd
  2 10869582    83f75b47516bb1c7bac875c7f97134cb51711f2d
  3 10869588    e730a688879c2f724e8c881267497f9b10ba2688
  4 10869591    83dcecd921dee3f8099bd90a4d355b104ffacc9f
  5 10869596    f570d6220979853777d6bad1418cafd5a4f7c418


Obs                    kind

  1 persons-with-significant-control-statement
  2 persons-with-significant-control-statement
  3 persons-with-significant-control-statement
  4 persons-with-significant-control-statement
  5 persons-with-significant-control-statement


Obs                                    self

  1 /company/10869580/persons-with-significant-control-statements/JZsKO5aylMYj
  2 /company/10869582/persons-with-significant-control-statements/5xQ7ZCMaLDHO
  3 /company/10869588/persons-with-significant-control-statements/rHluQkBUwwXw
  4 /company/10869591/persons-with-significant-control-statements/rK8w6OtlgY9f
  5 /company/10869596/persons-with-significant-control-statements/w1mzlYcY-BoB

    notified_                                                      ceased_
Obs     on                          statement                        on

  1 2017-07-17   no-individual-or-entity-with-signficant-control
  2 2017-07-17   no-individual-or-entity-with-signficant-control
  3 2017-07-17   no-individual-or-entity-with-signficant-control
  4 2017-07-17   no-individual-or-entity-with-signficant-control
  5 2017-07-17   no-individual-or-entity-with-signficant-control

 

NOTE: There were 5 observations read from the data set WORK.RECORDS.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.04 seconds
      cpu time            0.00 seconds

Tom
Super User Tom
Super User

What was wrong with the original file and what did you do to fix it? 

Did you just wrap the whole thing in [] with commas?

So something like this?

data _null_;
   infile 'badfile' end=eof;
   file 'goodfile' ;
   retain sep '[' ;
   input ;
   put sep _infile_ ;
   if eof then put ']';
   sep=',';
run;
BillM_SAS
SAS Employee

The JSON LIBNAME engine did not like the JSON in the psc.txt file originally posted. I then ran that JSON file through a JSON pretty printer and it also noted the file was invalid. So yes, after I examined the file I added commas and wrapped the contents of the whole file with brackets to make it a valid JSON file. 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 9361 views
  • 2 likes
  • 6 in conversation