Hi, I'm trying to get the attached data into SAS in order to analyse it alongside another data set. I don't have any experience working with json formatted data, so am having a lot of difficulty trying to get it in in a usable format. Trawled through the boards and found this solution (with edits for my data structure):
data x;
infile 'C:\Desktop\psc.txt' dsd lrecl=30000000 dlm='{[:,';
input x : $2000.@@;
run;
data temp;
merge x x(firstobs=2 rename=(x=_x));
if lowcase(x) in ("company_number","data__address__address_line_1","data__address__address_line_2","data__address__care_of","data__address__country","data__address__locality","data__address__po_box","data__address__postal_code","data__address__premises","data__address__region","data__ceased_on","data__country_of_residence","data__date_of_birth__month","data__date_of_birth__year","data__description","data__etag","data__exemptions__psc_exempt_as_shares_admitted_on_market__exemption_type","data__exemptions__psc_exempt_as_shares_admitted_on_market__items","data__exemptions__psc_exempt_as_trading_on_regulated_market__exemption_type","data__exemptions__psc_exempt_as_trading_on_regulated_market__items","data__exemptions_count","data__generated_at","data__identification__country_registered","data__identification__legal_authority","data__identification__legal_form","data__identification__place_registered","data__identification__registration_number","data__kind","data__links__self","data__name","data__name_elements__forename","data__name_elements__honours","data__name_elements__middle_name","data__name_elements__surname","data__name_elements__title","data__nationality","data__natures_of_control","data__notified_on","data__persons_of_significant_control_count","data__statement","data__statements_count");
run;
data temp;
set temp;
if lowcase(x)='company_number' then group+1;
run;
proc transpose data=temp out=want(drop=_:);
by group;
id x;
var _x;
run;
Currently the ouput is a series 'group', and then the company numbers that have been successfully extracted. I'm a beginner with SAS programming language, but it seems like the other sets of data aren't defined in the code, but I'm really stuck with how to fix it.
Thanks, Ian
I found the JSON in the attached psc.txt file is invalid. I modified the file as I think it should be for what you appear to desire (and the modified file is attached to this post). With a valid JSON file, I ran the JSON LIBNAME engine (available in SAS 9.4 maintenance release 4) to produce 4 SAS data sets. I then used the SQL procedure to combine the SAS data sets into records similar to the JSON data. This demonstrates how to reassemble the hierarchical data in the JSON data file that was flatten into SAS data sets. The complete SAS program is included and shows how easy it is to import JSON data into SAS data set using the JSON LIBNAME engine.
libname x5 json 'psc5.txt';
proc datasets lib=x5;
quit;
proc contents data=x5._all_;
run;
proc sql;
create table work.records as
select root.company_number, data.etag, data.kind, data_links.self,
data.notified_on, data.statement, data.ceased_on
from x5.root as root, X5.DATA_LINKS as DATA_LINKS, X5.DATA as data
where root.ordinal_root = data.ordinal_root and
data.ordinal_data = data_links.ordinal_data
;
quit;
proc print data=work.records(obs=5);
run;
company_
Obs number etag
1 10869580 f327a09d5a3a08af72b03211f3a2e148814d75dd
2 10869582 83f75b47516bb1c7bac875c7f97134cb51711f2d
3 10869588 e730a688879c2f724e8c881267497f9b10ba2688
4 10869591 83dcecd921dee3f8099bd90a4d355b104ffacc9f
5 10869596 f570d6220979853777d6bad1418cafd5a4f7c418
Obs kind
1 persons-with-significant-control-statement
2 persons-with-significant-control-statement
3 persons-with-significant-control-statement
4 persons-with-significant-control-statement
5 persons-with-significant-control-statement
Obs self
1 /company/10869580/persons-with-significant-control-statements/JZsKO5aylMYj
2 /company/10869582/persons-with-significant-control-statements/5xQ7ZCMaLDHO
3 /company/10869588/persons-with-significant-control-statements/rHluQkBUwwXw
4 /company/10869591/persons-with-significant-control-statements/rK8w6OtlgY9f
5 /company/10869596/persons-with-significant-control-statements/w1mzlYcY-BoB
notified_ ceased_
Obs on statement on
1 2017-07-17 no-individual-or-entity-with-signficant-control
2 2017-07-17 no-individual-or-entity-with-signficant-control
3 2017-07-17 no-individual-or-entity-with-signficant-control
4 2017-07-17 no-individual-or-entity-with-signficant-control
5 2017-07-17 no-individual-or-entity-with-signficant-control
NOTE: There were 5 observations read from the data set WORK.RECORDS.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.04 seconds
cpu time 0.00 seconds
This is a bit of a tough ask if you're a beginner in SAS - in my experience the best way to parse JSON is to use the Proc DS2 JSON parser. There's a good example in this artice https://blogs.sas.com/content/sasdummy/2015/09/28/parse-json-from-sas/
It looks like your JSON is fairly straightforward so if you give it a try and post again if you have any problems.
Think that solution's a bit beyond me as well! I'll have an ask around at work to see whether anyone else can give a hand.
Thanks for you help, Ian
See if this helps:
https://blogs.sas.com/content/sasdummy/2016/12/02/json-libname-engine-sas/?postid=710499380
I have not tried this code yet since I normally use JSON engines outside of SAS but 9.4M4 has a new libname engine for JSON.
Yeah. The best to solve this kind of problem is using JSON engine of LIBNAME statement, if you have SAS 9.4 m4.
filename x 'c:\temp\psc.txt';
data x;
infile x recfm=n dlm='{},';
input x : $2000.@@;
a=dequote(scan(x,1,':'));
b=dequote(scan(x,-1,':','m'));
if anyspace(x)=1 then do;group+1;delete;end;
run;
proc transpose data=x out=want;
by group;
id a;
var b;
run;
This seems to have worked! Thanks so much.
I found the JSON in the attached psc.txt file is invalid. I modified the file as I think it should be for what you appear to desire (and the modified file is attached to this post). With a valid JSON file, I ran the JSON LIBNAME engine (available in SAS 9.4 maintenance release 4) to produce 4 SAS data sets. I then used the SQL procedure to combine the SAS data sets into records similar to the JSON data. This demonstrates how to reassemble the hierarchical data in the JSON data file that was flatten into SAS data sets. The complete SAS program is included and shows how easy it is to import JSON data into SAS data set using the JSON LIBNAME engine.
libname x5 json 'psc5.txt';
proc datasets lib=x5;
quit;
proc contents data=x5._all_;
run;
proc sql;
create table work.records as
select root.company_number, data.etag, data.kind, data_links.self,
data.notified_on, data.statement, data.ceased_on
from x5.root as root, X5.DATA_LINKS as DATA_LINKS, X5.DATA as data
where root.ordinal_root = data.ordinal_root and
data.ordinal_data = data_links.ordinal_data
;
quit;
proc print data=work.records(obs=5);
run;
company_
Obs number etag
1 10869580 f327a09d5a3a08af72b03211f3a2e148814d75dd
2 10869582 83f75b47516bb1c7bac875c7f97134cb51711f2d
3 10869588 e730a688879c2f724e8c881267497f9b10ba2688
4 10869591 83dcecd921dee3f8099bd90a4d355b104ffacc9f
5 10869596 f570d6220979853777d6bad1418cafd5a4f7c418
Obs kind
1 persons-with-significant-control-statement
2 persons-with-significant-control-statement
3 persons-with-significant-control-statement
4 persons-with-significant-control-statement
5 persons-with-significant-control-statement
Obs self
1 /company/10869580/persons-with-significant-control-statements/JZsKO5aylMYj
2 /company/10869582/persons-with-significant-control-statements/5xQ7ZCMaLDHO
3 /company/10869588/persons-with-significant-control-statements/rHluQkBUwwXw
4 /company/10869591/persons-with-significant-control-statements/rK8w6OtlgY9f
5 /company/10869596/persons-with-significant-control-statements/w1mzlYcY-BoB
notified_ ceased_
Obs on statement on
1 2017-07-17 no-individual-or-entity-with-signficant-control
2 2017-07-17 no-individual-or-entity-with-signficant-control
3 2017-07-17 no-individual-or-entity-with-signficant-control
4 2017-07-17 no-individual-or-entity-with-signficant-control
5 2017-07-17 no-individual-or-entity-with-signficant-control
NOTE: There were 5 observations read from the data set WORK.RECORDS.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.04 seconds
cpu time 0.00 seconds
What was wrong with the original file and what did you do to fix it?
Did you just wrap the whole thing in [] with commas?
So something like this?
data _null_;
infile 'badfile' end=eof;
file 'goodfile' ;
retain sep '[' ;
input ;
put sep _infile_ ;
if eof then put ']';
sep=',';
run;
The JSON LIBNAME engine did not like the JSON in the psc.txt file originally posted. I then ran that JSON file through a JSON pretty printer and it also noted the file was invalid. So yes, after I examined the file I added commas and wrapped the contents of the whole file with brackets to make it a valid JSON file.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.