Hello,
I need to convert a JSON file into a SAS dataset that I can manage and I am having trouble importing it in a way that is usable.
Any suggestions would be greatly apprecated.
I have attached the file to this post (it is the text file verison). Below is a code of what I have:
data have;
infile 'x:\SeriesDataOut2.json' recfm=n dlm='{}[],';
input value : $200. @@;
if value in: ('"IndustrYDescription":' '"NoteRef"' '"TableID"' ) then delete;
run;
data temp;
set have;
length name v $ 100;
if value =: '"Results":' then n+1;
name=scan(value,2,'"','m');
v=scan(value,-2,'"','m');
drop value;
if name='' then delete;
run;
proc sort data=temp;
by n; run;
/*the proc transpose is kicking out a lot of errors*/
proc transpose data=temp out=want(drop=_name_ n) ;
by n;
var v;
id name;
run;
But you didn't post the output yet .
data temp;
infile '/folders/myfolders/SeriesDataOut.txt' dsd lrecl=300000000 dlm='[]{},:';
input x : $2000. @@;
retain flag;
if x ='Data' then flag=1;
if x='Notes' then group+1;
if flag and x not in (' ' 'Data' 'Notes');
run;
data a b;
set temp;
if group=0 then output a;
else output b;
run;
data a;
set a;
if mod(_n_,2)=1 then n+1;
drop group flag;
run;
proc transpose data=a out=temp_a;
by n;
var x;
run;
data temp_a;
set temp_a;
if col1='TableID' then group+1;
drop n _name_;
run;
proc transpose data=temp_a out=want_a(drop=group _name_);
by group;
id col1;
var col2;
run;
data b;
set b;
if mod(_n_,2)=1 then n+1;
drop group flag;
run;
proc transpose data=b out=temp_b;
by n;
var x;
run;
data temp_b;
set temp_b;
if col1='NoteRef' then group+1;
drop n _name_;
run;
proc transpose data=temp_b out=want_b(drop=group _name_);
by group;
id col1;
var col2;
run;
Have you seen this?
http://blogs.sas.com/content/sasdummy/2015/09/28/parse-json-from-sas/
Otherwise if you want to use a data step, here's a quick piece of code that reads your file, to get you started.
Please modify to suit your needs.
data READ;
infile "%sysfunc(pathname(WORK))\SeriesDataOut.txt" recfm=n dlm='}{';
input STR : $400. @@;
if _N_=1 then
call execute ('data WANT; length Industry DataValue $8 IndustrYDescription $400 NoteRef $12;');
if STR=:'"TableID"' then do;
STR=tranwrd(STR , '":"', '"n="');
STR=tranwrd(STR, '","', '";"' );
call execute(cats(STR,';output;'));
put STR=;
end;
run;
run;
proc print noobs;
run;
Industry | DataValue | IndustrYDescription | NoteRef | TableID | Frequency | Year | Quarter |
---|---|---|---|---|---|---|---|
11 | 447.3 | Agriculture, forestry, fishing, and hunting | 15 | 15 | A | 2012 | 2012 |
111CA | 400.9 | Farms | 15 | 15 | A | 2012 | 2012 |
113FF | 46.4 | Forestry, fishing, and related activities | 15 | 15 | A | 2012 | 2012 |
21 | 570.2 | Mining | 15 | 15 | A | 2012 | 2012 |
211 | 341.3 | Oil and gas extraction | 15 | 15 | A | 2012 | 2012 |
212 | 133.5 | Mining, except oil and gas | 15 | 15 | A | 2012 | 2012 |
213 | 95.4 | Support activities for mining | 15 | 15 | A | 2012 | 2012 |
[Edit: Removed an empty table that lingered here.]
The output that you have listed is what I was aiming for. However, when I used your code for some reason I wasn't able to replicate it.
But you didn't post the output yet .
data temp;
infile '/folders/myfolders/SeriesDataOut.txt' dsd lrecl=300000000 dlm='[]{},:';
input x : $2000. @@;
retain flag;
if x ='Data' then flag=1;
if x='Notes' then group+1;
if flag and x not in (' ' 'Data' 'Notes');
run;
data a b;
set temp;
if group=0 then output a;
else output b;
run;
data a;
set a;
if mod(_n_,2)=1 then n+1;
drop group flag;
run;
proc transpose data=a out=temp_a;
by n;
var x;
run;
data temp_a;
set temp_a;
if col1='TableID' then group+1;
drop n _name_;
run;
proc transpose data=temp_a out=want_a(drop=group _name_);
by group;
id col1;
var col2;
run;
data b;
set b;
if mod(_n_,2)=1 then n+1;
drop group flag;
run;
proc transpose data=b out=temp_b;
by n;
var x;
run;
data temp_b;
set temp_b;
if col1='NoteRef' then group+1;
drop n _name_;
run;
proc transpose data=temp_b out=want_b(drop=group _name_);
by group;
id col1;
var col2;
run;
SOLUTION USING OPs txt file
HAVE OPs txt file (which seems to be incomplete?)
======================================================================
{"BEAAPI":{"Request":{"RequestParam":[{"ParameterName":"YEAR","ParameterValue":"2012"},{"ParameterNa
me":"TABLEID","ParameterValue":"15"},{"ParameterName":"DATASETNAME","ParameterValue":"GDPBYINDUSTRY"
},{"ParameterName":"RESULTFORMAT","ParameterValue":"JSON"},{"ParameterName":"INDUSTRY","ParameterVal
ue":"ALL"},{"ParameterName":"METHOD","ParameterValue":"GETDATA"},{"ParameterName":"USERID","Paramete
rValue":"715A1CFC-2F60-47A0-8DDF-82ED8831520A"},{"ParameterName":"FREQUENCY","ParameterValue":"A"}]}
,"Results": {"Statistic":"GDP by Industry Table","UTCProductionTime":"2016-10-21T14:15:11.923","Dime
......
WANT SAS Dataset JSN
=====================
Middle Observation(1 ) of Last dataset = WORK.LVLA - Total Obs 1
-- CHARACTER --
-- CHARACTER --
REQUEST_REQUESTPARAM_PARAMETRNM C 4 YEAR Request.RequestParam.ParameterName
REQUEST_REQUESTPARAM_PARAMETRVL C 4 2012 Request.RequestParam.ParameterValue
REQUEST_REQUESTPARAM_PARMTRNM_1 C 7 TABLEID Request.RequestParam.ParameterName.1
REQUEST_REQUESTPARAM_PARMTRVL_1 C 2 15 Request.RequestParam.ParameterValue.1
REQUEST_REQUESTPARAM_PARMTRNM_2 C 11 DATASETNAME Request.RequestParam.ParameterName.2
REQUEST_REQUESTPARAM_PARMTRVL_2 C 13 GDPBYINDUSTRY Request.RequestParam.ParameterValue.2
REQUEST_REQUESTPARAM_PARMTRNM_3 C 12 RESULTFORMAT Request.RequestParam.ParameterName.3
REQUEST_REQUESTPARAM_PARMTRVL_3 C 4 JSON Request.RequestParam.ParameterValue.3
REQUEST_REQUESTPARAM_PARMTRNM_4 C 8 INDUSTRY Request.RequestParam.ParameterName.4
REQUEST_REQUESTPARAM_PARMTRVL_4 C 3 ALL Request.RequestParam.ParameterValue.4
REQUEST_REQUESTPARAM_PARMTRNM_5 C 6 METHOD Request.RequestParam.ParameterName.5
REQUEST_REQUESTPARAM_PARMTRVL_5 C 7 GETDATA Request.RequestParam.ParameterValue.5
REQUEST_REQUESTPARAM_PARMTRNM_6 C 6 USERID Request.RequestParam.ParameterName.6
REQUEST_REQUESTPARAM_PARMTRVL_6 C 36 715A1CFC-2F60-47 Request.RequestParam.ParameterValue.6
REQUEST_REQUESTPARAM_PARMTRNM_7 C 9 FREQUENCY Request.RequestParam.ParameterName.7
REQUEST_REQUESTPARAM_PARMTRVL_7 C 1 A Request.RequestParam.ParameterValue.7
RESULTS_STATISTIC C 21 GDP by Industry Results.Statistic
RESULTS_UTCPRODUCTIONTIME C 23 2016-10-21T14:15 Results.UTCProductionTime
RESULTS_DIMENSIONS_ORDINAL C 1 1 Results.Dimensions.Ordinal
RESULTS_DIMENSIONS_NAME C 7 TableID Results.Dimensions.Name
...
RESULTS_NOTES_NOTEREF_2 C 6 15.3.Q Results.Notes.NoteRef.2
RESULTS_NOTES_NOTETEXT_2 C 244 3. Consists of c Results.Notes.NoteText.2
RESULTS_NOTES_NOTEREF_3 C 6 15.4.Q Results.Notes.NoteRef.3
RESULTS_NOTES_NOTETEXT_3 C 244 (u): Underlying Results.Notes.NoteText.3
RESULTS_NOTES_NOTEREF_4 C 2 15 Results.Notes.NoteRef.4
RESULTS_NOTES_NOTETEXT_4 C 46 Gross Output by Results.Notes.NoteText.4
The CONTENTS Procedure
Data Set Name WORK.JSN Observations 1
Member Type DATA Variables 856
Engine V9 Indexes 0
Created 10/25/2016 10:41:53 Observation Length 6317
Last Modified 10/25/2016 10:41:53 Deleted Observations 0
Protection Compressed NO
Data Set Type Sorted NO
Label
Data Representation WINDOWS_64
Encoding wlatin1 Western (Windows)
Engine/Host Dependent Information
Data Set Page Size 65536
Number of Data Set Pages 3
First Data Page 3
Max Obs per Page 10
Obs in First Data Page 1
Number of Data Set Repairs 0
ExtendObsCounter YES
Alphabetic List of Variables and Attributes
# Variable Type Len Format Informat Label
1 BEAAPI_REQUEST_REQSTPRM_PRMTRNM Char 4 $4. $4. BEAAPI.Request.RequestParam.ParameterName
2 BEAAPI_REQUEST_REQSTPRM_PRMTRVL Char 4 $4. $4. BEAAPI.Request.RequestParam.ParameterValue
3 BEAAPI_REQUST_RQSTPRM_PRMTRNM_1 Char 7 $7. $7. BEAAPI.Request.RequestParam.ParameterName.1
5 BEAAPI_REQUST_RQSTPRM_PRMTRNM_2 Char 11 $11. $11. BEAAPI.Request.RequestParam.ParameterName.2
7 BEAAPI_REQUST_RQSTPRM_PRMTRNM_3 Char 12 $12. $12. BEAAPI.Request.RequestParam.ParameterName.3
9 BEAAPI_REQUST_RQSTPRM_PRMTRNM_4 Char 8 $8. $8. BEAAPI.Request.RequestParam.ParameterName.4
11 BEAAPI_REQUST_RQSTPRM_PRMTRNM_5 Char 6 $6. $6. BEAAPI.Request.RequestParam.ParameterName.5
...
848 BEAAPI_RESULTS_NOTES_NOTETEXT Char 100 $100. $100. BEAAPI.Results.Notes.NoteText
849 BEAAPI_RESULTS_NOTES_NOTEREF_1 Char 8 $8. $8. BEAAPI.Results.Notes.NoteRef.1
850 BEAAPI_RESULTS_NOTES_NOTETEXT_1 Char 244 $244. $244. BEAAPI.Results.Notes.NoteText.1
851 BEAAPI_RESULTS_NOTES_NOTEREF_2 Char 6 $6. $6. BEAAPI.Results.Notes.NoteRef.2
852 BEAAPI_RESULTS_NOTES_NOTETEXT_2 Char 244 $244. $244. BEAAPI.Results.Notes.NoteText.2
853 BEAAPI_RESULTS_NOTES_NOTEREF_3 Char 6 $6. $6. BEAAPI.Results.Notes.NoteRef.3
854 BEAAPI_RESULTS_NOTES_NOTETEXT_3 Char 244 $244. $244. BEAAPI.Results.Notes.NoteText.3
855 BEAAPI_RESULTS_NOTES_NOTEREF_4 Char 2 $2. $2. BEAAPI.Results.Notes.NoteRef.4
856 BEAAPI_RESULTS_NOTES_NOTETEXT_4 Char 46 $46. $46. BEAAPI.Results.Notes.NoteText.4
SOLUTION R WORKING CODE
=======================
jsn <- as.data.frame(fromJSON(paste(readLines(json_file), collapse="")));
FULL SOLUTION
It is possible to reshape the 'list' stucture in R.
SAS requires a rectangular structure.
%utl_submit_r64('
library("rjson");
library(foreign);
json_file <- "d:/json/seriesdataout.txt" ;
jsn <- as.data.frame(fromJSON(paste(readLines(json_file), collapse="")));
write.dta(jsn,convert.factors="string",version = 10L, "d:/dta/jsn.dta");
');
* if you have access to IML interface to R you can eliminate the truncation.
package 'rjson' was built under R version 3.2.5
Warning message: **
In readLines(json_file) :
incomplete final line found on 'd:/json/seriesdataout.txt'
Warning messages:
1: In write.dta(jsn, convert.factors = "string", version = 10L, "d:/dta/jsn.dta") :
abbreviating variable names
2: In write.dta(jsn, convert.factors = "string", version = 10L, "d:/dta/jsn.dta") :
character strings of >244 bytes in column 850 will be truncated
3: In write.dta(jsn, convert.factors = "string", version = 10L, "d:/dta/jsn.dta") :
character strings of >244 bytes in column 852 will be truncated
4: In write.dta(jsn, convert.factors = "string", version = 10L, "d:/dta/jsn.dta") :
character strings of >244 bytes in column 854 will be truncated
proc import out=jsn file='d:\dta\jsn.dta' replace;
;run;quit;
proc print data=jsnlyt;
run;quit;
Thank you. Your code gave me the results that was closest to what I was looking for. Sorry about not posting output. I have posted the output to the most recent response to this topics dialog.
Hello everyone,
Sorry about the late response, I was tied up at work. The output that I needed was supposed to look like the attached picture.
Mainly as the year as the variable and the different industries as rows. I understand that the JSON file had more information then what is in the attachment but it is a good start.
Thank you for posting the responses. All the answers was very useful. Ksharp response gave me results that was the closest to what I was looking for. Massive thanks Ksharp.
Mike
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.