Hello,
I need to convert a JSON file into a SAS dataset that I can manage and I am having trouble importing it in a way that is usable.
Any suggestions would be greatly apprecated.
I have attached the file to this post (it is the text file verison). Below is a code of what I have:
data have;
infile 'x:\SeriesDataOut2.json' recfm=n dlm='{}[],';
input value : $200. @@;
if value in: ('"IndustrYDescription":' '"NoteRef"' '"TableID"' ) then delete;
run;
data temp;
set have;
length name v $ 100;
if value =: '"Results":' then n+1;
name=scan(value,2,'"','m');
v=scan(value,-2,'"','m');
drop value;
if name='' then delete;
run;
proc sort data=temp;
by n; run;
/*the proc transpose is kicking out a lot of errors*/
proc transpose data=temp out=want(drop=_name_ n) ;
by n;
var v;
id name;
run;
But you didn't post the output yet .
data temp;
infile '/folders/myfolders/SeriesDataOut.txt' dsd lrecl=300000000 dlm='[]{},:';
input x : $2000. @@;
retain flag;
if x ='Data' then flag=1;
if x='Notes' then group+1;
if flag and x not in (' ' 'Data' 'Notes');
run;
data a b;
set temp;
if group=0 then output a;
else output b;
run;
data a;
set a;
if mod(_n_,2)=1 then n+1;
drop group flag;
run;
proc transpose data=a out=temp_a;
by n;
var x;
run;
data temp_a;
set temp_a;
if col1='TableID' then group+1;
drop n _name_;
run;
proc transpose data=temp_a out=want_a(drop=group _name_);
by group;
id col1;
var col2;
run;
data b;
set b;
if mod(_n_,2)=1 then n+1;
drop group flag;
run;
proc transpose data=b out=temp_b;
by n;
var x;
run;
data temp_b;
set temp_b;
if col1='NoteRef' then group+1;
drop n _name_;
run;
proc transpose data=temp_b out=want_b(drop=group _name_);
by group;
id col1;
var col2;
run;
Have you seen this?
http://blogs.sas.com/content/sasdummy/2015/09/28/parse-json-from-sas/
Otherwise if you want to use a data step, here's a quick piece of code that reads your file, to get you started.
Please modify to suit your needs.
data READ;
infile "%sysfunc(pathname(WORK))\SeriesDataOut.txt" recfm=n dlm='}{';
input STR : $400. @@;
if _N_=1 then
call execute ('data WANT; length Industry DataValue $8 IndustrYDescription $400 NoteRef $12;');
if STR=:'"TableID"' then do;
STR=tranwrd(STR , '":"', '"n="');
STR=tranwrd(STR, '","', '";"' );
call execute(cats(STR,';output;'));
put STR=;
end;
run;
run;
proc print noobs;
run;
Industry | DataValue | IndustrYDescription | NoteRef | TableID | Frequency | Year | Quarter |
---|---|---|---|---|---|---|---|
11 | 447.3 | Agriculture, forestry, fishing, and hunting | 15 | 15 | A | 2012 | 2012 |
111CA | 400.9 | Farms | 15 | 15 | A | 2012 | 2012 |
113FF | 46.4 | Forestry, fishing, and related activities | 15 | 15 | A | 2012 | 2012 |
21 | 570.2 | Mining | 15 | 15 | A | 2012 | 2012 |
211 | 341.3 | Oil and gas extraction | 15 | 15 | A | 2012 | 2012 |
212 | 133.5 | Mining, except oil and gas | 15 | 15 | A | 2012 | 2012 |
213 | 95.4 | Support activities for mining | 15 | 15 | A | 2012 | 2012 |
[Edit: Removed an empty table that lingered here.]
The output that you have listed is what I was aiming for. However, when I used your code for some reason I wasn't able to replicate it.
But you didn't post the output yet .
data temp;
infile '/folders/myfolders/SeriesDataOut.txt' dsd lrecl=300000000 dlm='[]{},:';
input x : $2000. @@;
retain flag;
if x ='Data' then flag=1;
if x='Notes' then group+1;
if flag and x not in (' ' 'Data' 'Notes');
run;
data a b;
set temp;
if group=0 then output a;
else output b;
run;
data a;
set a;
if mod(_n_,2)=1 then n+1;
drop group flag;
run;
proc transpose data=a out=temp_a;
by n;
var x;
run;
data temp_a;
set temp_a;
if col1='TableID' then group+1;
drop n _name_;
run;
proc transpose data=temp_a out=want_a(drop=group _name_);
by group;
id col1;
var col2;
run;
data b;
set b;
if mod(_n_,2)=1 then n+1;
drop group flag;
run;
proc transpose data=b out=temp_b;
by n;
var x;
run;
data temp_b;
set temp_b;
if col1='NoteRef' then group+1;
drop n _name_;
run;
proc transpose data=temp_b out=want_b(drop=group _name_);
by group;
id col1;
var col2;
run;
SOLUTION USING OPs txt file
HAVE OPs txt file (which seems to be incomplete?)
======================================================================
{"BEAAPI":{"Request":{"RequestParam":[{"ParameterName":"YEAR","ParameterValue":"2012"},{"ParameterNa
me":"TABLEID","ParameterValue":"15"},{"ParameterName":"DATASETNAME","ParameterValue":"GDPBYINDUSTRY"
},{"ParameterName":"RESULTFORMAT","ParameterValue":"JSON"},{"ParameterName":"INDUSTRY","ParameterVal
ue":"ALL"},{"ParameterName":"METHOD","ParameterValue":"GETDATA"},{"ParameterName":"USERID","Paramete
rValue":"715A1CFC-2F60-47A0-8DDF-82ED8831520A"},{"ParameterName":"FREQUENCY","ParameterValue":"A"}]}
,"Results": {"Statistic":"GDP by Industry Table","UTCProductionTime":"2016-10-21T14:15:11.923","Dime
......
WANT SAS Dataset JSN
=====================
Middle Observation(1 ) of Last dataset = WORK.LVLA - Total Obs 1
-- CHARACTER --
-- CHARACTER --
REQUEST_REQUESTPARAM_PARAMETRNM C 4 YEAR Request.RequestParam.ParameterName
REQUEST_REQUESTPARAM_PARAMETRVL C 4 2012 Request.RequestParam.ParameterValue
REQUEST_REQUESTPARAM_PARMTRNM_1 C 7 TABLEID Request.RequestParam.ParameterName.1
REQUEST_REQUESTPARAM_PARMTRVL_1 C 2 15 Request.RequestParam.ParameterValue.1
REQUEST_REQUESTPARAM_PARMTRNM_2 C 11 DATASETNAME Request.RequestParam.ParameterName.2
REQUEST_REQUESTPARAM_PARMTRVL_2 C 13 GDPBYINDUSTRY Request.RequestParam.ParameterValue.2
REQUEST_REQUESTPARAM_PARMTRNM_3 C 12 RESULTFORMAT Request.RequestParam.ParameterName.3
REQUEST_REQUESTPARAM_PARMTRVL_3 C 4 JSON Request.RequestParam.ParameterValue.3
REQUEST_REQUESTPARAM_PARMTRNM_4 C 8 INDUSTRY Request.RequestParam.ParameterName.4
REQUEST_REQUESTPARAM_PARMTRVL_4 C 3 ALL Request.RequestParam.ParameterValue.4
REQUEST_REQUESTPARAM_PARMTRNM_5 C 6 METHOD Request.RequestParam.ParameterName.5
REQUEST_REQUESTPARAM_PARMTRVL_5 C 7 GETDATA Request.RequestParam.ParameterValue.5
REQUEST_REQUESTPARAM_PARMTRNM_6 C 6 USERID Request.RequestParam.ParameterName.6
REQUEST_REQUESTPARAM_PARMTRVL_6 C 36 715A1CFC-2F60-47 Request.RequestParam.ParameterValue.6
REQUEST_REQUESTPARAM_PARMTRNM_7 C 9 FREQUENCY Request.RequestParam.ParameterName.7
REQUEST_REQUESTPARAM_PARMTRVL_7 C 1 A Request.RequestParam.ParameterValue.7
RESULTS_STATISTIC C 21 GDP by Industry Results.Statistic
RESULTS_UTCPRODUCTIONTIME C 23 2016-10-21T14:15 Results.UTCProductionTime
RESULTS_DIMENSIONS_ORDINAL C 1 1 Results.Dimensions.Ordinal
RESULTS_DIMENSIONS_NAME C 7 TableID Results.Dimensions.Name
...
RESULTS_NOTES_NOTEREF_2 C 6 15.3.Q Results.Notes.NoteRef.2
RESULTS_NOTES_NOTETEXT_2 C 244 3. Consists of c Results.Notes.NoteText.2
RESULTS_NOTES_NOTEREF_3 C 6 15.4.Q Results.Notes.NoteRef.3
RESULTS_NOTES_NOTETEXT_3 C 244 (u): Underlying Results.Notes.NoteText.3
RESULTS_NOTES_NOTEREF_4 C 2 15 Results.Notes.NoteRef.4
RESULTS_NOTES_NOTETEXT_4 C 46 Gross Output by Results.Notes.NoteText.4
The CONTENTS Procedure
Data Set Name WORK.JSN Observations 1
Member Type DATA Variables 856
Engine V9 Indexes 0
Created 10/25/2016 10:41:53 Observation Length 6317
Last Modified 10/25/2016 10:41:53 Deleted Observations 0
Protection Compressed NO
Data Set Type Sorted NO
Label
Data Representation WINDOWS_64
Encoding wlatin1 Western (Windows)
Engine/Host Dependent Information
Data Set Page Size 65536
Number of Data Set Pages 3
First Data Page 3
Max Obs per Page 10
Obs in First Data Page 1
Number of Data Set Repairs 0
ExtendObsCounter YES
Alphabetic List of Variables and Attributes
# Variable Type Len Format Informat Label
1 BEAAPI_REQUEST_REQSTPRM_PRMTRNM Char 4 $4. $4. BEAAPI.Request.RequestParam.ParameterName
2 BEAAPI_REQUEST_REQSTPRM_PRMTRVL Char 4 $4. $4. BEAAPI.Request.RequestParam.ParameterValue
3 BEAAPI_REQUST_RQSTPRM_PRMTRNM_1 Char 7 $7. $7. BEAAPI.Request.RequestParam.ParameterName.1
5 BEAAPI_REQUST_RQSTPRM_PRMTRNM_2 Char 11 $11. $11. BEAAPI.Request.RequestParam.ParameterName.2
7 BEAAPI_REQUST_RQSTPRM_PRMTRNM_3 Char 12 $12. $12. BEAAPI.Request.RequestParam.ParameterName.3
9 BEAAPI_REQUST_RQSTPRM_PRMTRNM_4 Char 8 $8. $8. BEAAPI.Request.RequestParam.ParameterName.4
11 BEAAPI_REQUST_RQSTPRM_PRMTRNM_5 Char 6 $6. $6. BEAAPI.Request.RequestParam.ParameterName.5
...
848 BEAAPI_RESULTS_NOTES_NOTETEXT Char 100 $100. $100. BEAAPI.Results.Notes.NoteText
849 BEAAPI_RESULTS_NOTES_NOTEREF_1 Char 8 $8. $8. BEAAPI.Results.Notes.NoteRef.1
850 BEAAPI_RESULTS_NOTES_NOTETEXT_1 Char 244 $244. $244. BEAAPI.Results.Notes.NoteText.1
851 BEAAPI_RESULTS_NOTES_NOTEREF_2 Char 6 $6. $6. BEAAPI.Results.Notes.NoteRef.2
852 BEAAPI_RESULTS_NOTES_NOTETEXT_2 Char 244 $244. $244. BEAAPI.Results.Notes.NoteText.2
853 BEAAPI_RESULTS_NOTES_NOTEREF_3 Char 6 $6. $6. BEAAPI.Results.Notes.NoteRef.3
854 BEAAPI_RESULTS_NOTES_NOTETEXT_3 Char 244 $244. $244. BEAAPI.Results.Notes.NoteText.3
855 BEAAPI_RESULTS_NOTES_NOTEREF_4 Char 2 $2. $2. BEAAPI.Results.Notes.NoteRef.4
856 BEAAPI_RESULTS_NOTES_NOTETEXT_4 Char 46 $46. $46. BEAAPI.Results.Notes.NoteText.4
SOLUTION R WORKING CODE
=======================
jsn <- as.data.frame(fromJSON(paste(readLines(json_file), collapse="")));
FULL SOLUTION
It is possible to reshape the 'list' stucture in R.
SAS requires a rectangular structure.
%utl_submit_r64('
library("rjson");
library(foreign);
json_file <- "d:/json/seriesdataout.txt" ;
jsn <- as.data.frame(fromJSON(paste(readLines(json_file), collapse="")));
write.dta(jsn,convert.factors="string",version = 10L, "d:/dta/jsn.dta");
');
* if you have access to IML interface to R you can eliminate the truncation.
package 'rjson' was built under R version 3.2.5
Warning message: **
In readLines(json_file) :
incomplete final line found on 'd:/json/seriesdataout.txt'
Warning messages:
1: In write.dta(jsn, convert.factors = "string", version = 10L, "d:/dta/jsn.dta") :
abbreviating variable names
2: In write.dta(jsn, convert.factors = "string", version = 10L, "d:/dta/jsn.dta") :
character strings of >244 bytes in column 850 will be truncated
3: In write.dta(jsn, convert.factors = "string", version = 10L, "d:/dta/jsn.dta") :
character strings of >244 bytes in column 852 will be truncated
4: In write.dta(jsn, convert.factors = "string", version = 10L, "d:/dta/jsn.dta") :
character strings of >244 bytes in column 854 will be truncated
proc import out=jsn file='d:\dta\jsn.dta' replace;
;run;quit;
proc print data=jsnlyt;
run;quit;
Thank you. Your code gave me the results that was closest to what I was looking for. Sorry about not posting output. I have posted the output to the most recent response to this topics dialog.
Hello everyone,
Sorry about the late response, I was tied up at work. The output that I needed was supposed to look like the attached picture.
Mainly as the year as the variable and the different industries as rows. I understand that the JSON file had more information then what is in the attachment but it is a good start.
Thank you for posting the responses. All the answers was very useful. Ksharp response gave me results that was the closest to what I was looking for. Massive thanks Ksharp.
Mike
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.