Hello,
I need to convert a JSON file into a SAS dataset that I can manage and I am having trouble importing it in a way that is usable.
Any suggestions would be greatly apprecated.
I have attached the file to this post (it is the text file verison). Below is a code of what I have:
data have;
infile 'x:\SeriesDataOut2.json' recfm=n dlm='{}[],';
input value : $200. @@;
if value in: ('"IndustrYDescription":' '"NoteRef"' '"TableID"' ) then delete;
run;
data temp;
 set have;
 length name v $ 100;
 if value =: '"Results":' then n+1;
 name=scan(value,2,'"','m');
 v=scan(value,-2,'"','m');
 drop value;
 if name=''  then delete;
run;
proc sort data=temp;
by n; run;
/*the proc transpose is kicking out a lot of errors*/
proc transpose data=temp out=want(drop=_name_ n)  ;
by n;
var v;
id name;
run;
But you didn't post the output yet .
data temp;
infile '/folders/myfolders/SeriesDataOut.txt' dsd lrecl=300000000 dlm='[]{},:';
input x : $2000. @@;
retain flag;
if x ='Data' then flag=1;
if x='Notes' then group+1;
if flag and x not in (' ' 'Data' 'Notes');
run;
data a b;
 set temp;
 if group=0 then output a;
  else output b;
run;
data a;
 set a;
 if mod(_n_,2)=1 then n+1;
 drop group flag;
run;
proc transpose data=a out=temp_a;
 by n;
 var x;
run;
data temp_a;
 set temp_a;
 if col1='TableID' then group+1;
 drop n _name_;
run;
proc transpose data=temp_a out=want_a(drop=group _name_);
 by group;
 id col1;
 var col2;
run;
data b;
 set b;
 if mod(_n_,2)=1 then n+1;
 drop group flag;
run;
proc transpose data=b out=temp_b;
 by n;
 var x;
run;
data temp_b;
 set temp_b;
 if col1='NoteRef' then group+1;
 drop n _name_;
run;
proc transpose data=temp_b out=want_b(drop=group _name_);
 by group;
 id col1;
 var col2;
run;Have you seen this?
http://blogs.sas.com/content/sasdummy/2015/09/28/parse-json-from-sas/
Otherwise if you want to use a data step, here's a quick piece of code that reads your file, to get you started.
Please modify to suit your needs.
data READ;
  infile "%sysfunc(pathname(WORK))\SeriesDataOut.txt" recfm=n dlm='}{';
  input STR : $400. @@;
  if _N_=1 then 
    call execute ('data WANT; length Industry DataValue $8 IndustrYDescription $400 NoteRef $12;');
  if STR=:'"TableID"' then do;
    STR=tranwrd(STR , '":"', '"n="'); 
    STR=tranwrd(STR, '","', '";"' );
    call execute(cats(STR,';output;'));
  put STR=;
  end;  
run;
run;
proc print noobs;
run;
| Industry | DataValue | IndustrYDescription | NoteRef | TableID | Frequency | Year | Quarter | 
|---|---|---|---|---|---|---|---|
| 11 | 447.3 | Agriculture, forestry, fishing, and hunting | 15 | 15 | A | 2012 | 2012 | 
| 111CA | 400.9 | Farms | 15 | 15 | A | 2012 | 2012 | 
| 113FF | 46.4 | Forestry, fishing, and related activities | 15 | 15 | A | 2012 | 2012 | 
| 21 | 570.2 | Mining | 15 | 15 | A | 2012 | 2012 | 
| 211 | 341.3 | Oil and gas extraction | 15 | 15 | A | 2012 | 2012 | 
| 212 | 133.5 | Mining, except oil and gas | 15 | 15 | A | 2012 | 2012 | 
| 213 | 95.4 | Support activities for mining | 15 | 15 | A | 2012 | 2012 | 
[Edit: Removed an empty table that lingered here.]
The output that you have listed is what I was aiming for. However, when I used your code for some reason I wasn't able to replicate it.
But you didn't post the output yet .
data temp;
infile '/folders/myfolders/SeriesDataOut.txt' dsd lrecl=300000000 dlm='[]{},:';
input x : $2000. @@;
retain flag;
if x ='Data' then flag=1;
if x='Notes' then group+1;
if flag and x not in (' ' 'Data' 'Notes');
run;
data a b;
 set temp;
 if group=0 then output a;
  else output b;
run;
data a;
 set a;
 if mod(_n_,2)=1 then n+1;
 drop group flag;
run;
proc transpose data=a out=temp_a;
 by n;
 var x;
run;
data temp_a;
 set temp_a;
 if col1='TableID' then group+1;
 drop n _name_;
run;
proc transpose data=temp_a out=want_a(drop=group _name_);
 by group;
 id col1;
 var col2;
run;
data b;
 set b;
 if mod(_n_,2)=1 then n+1;
 drop group flag;
run;
proc transpose data=b out=temp_b;
 by n;
 var x;
run;
data temp_b;
 set temp_b;
 if col1='NoteRef' then group+1;
 drop n _name_;
run;
proc transpose data=temp_b out=want_b(drop=group _name_);
 by group;
 id col1;
 var col2;
run;SOLUTION USING OPs txt file
HAVE OPs txt file (which seems to be incomplete?)
======================================================================
{"BEAAPI":{"Request":{"RequestParam":[{"ParameterName":"YEAR","ParameterValue":"2012"},{"ParameterNa
me":"TABLEID","ParameterValue":"15"},{"ParameterName":"DATASETNAME","ParameterValue":"GDPBYINDUSTRY"
},{"ParameterName":"RESULTFORMAT","ParameterValue":"JSON"},{"ParameterName":"INDUSTRY","ParameterVal
ue":"ALL"},{"ParameterName":"METHOD","ParameterValue":"GETDATA"},{"ParameterName":"USERID","Paramete
rValue":"715A1CFC-2F60-47A0-8DDF-82ED8831520A"},{"ParameterName":"FREQUENCY","ParameterValue":"A"}]}
,"Results": {"Statistic":"GDP by Industry Table","UTCProductionTime":"2016-10-21T14:15:11.923","Dime
......
WANT  SAS Dataset JSN
=====================
Middle Observation(1 ) of Last dataset = WORK.LVLA - Total Obs 1
 -- CHARACTER --
 -- CHARACTER --
REQUEST_REQUESTPARAM_PARAMETRNM  C    4       YEAR                Request.RequestParam.ParameterName
REQUEST_REQUESTPARAM_PARAMETRVL  C    4       2012                Request.RequestParam.ParameterValue
REQUEST_REQUESTPARAM_PARMTRNM_1  C    7       TABLEID             Request.RequestParam.ParameterName.1
REQUEST_REQUESTPARAM_PARMTRVL_1  C    2       15                  Request.RequestParam.ParameterValue.1
REQUEST_REQUESTPARAM_PARMTRNM_2  C    11      DATASETNAME         Request.RequestParam.ParameterName.2
REQUEST_REQUESTPARAM_PARMTRVL_2  C    13      GDPBYINDUSTRY       Request.RequestParam.ParameterValue.2
REQUEST_REQUESTPARAM_PARMTRNM_3  C    12      RESULTFORMAT        Request.RequestParam.ParameterName.3
REQUEST_REQUESTPARAM_PARMTRVL_3  C    4       JSON                Request.RequestParam.ParameterValue.3
REQUEST_REQUESTPARAM_PARMTRNM_4  C    8       INDUSTRY            Request.RequestParam.ParameterName.4
REQUEST_REQUESTPARAM_PARMTRVL_4  C    3       ALL                 Request.RequestParam.ParameterValue.4
REQUEST_REQUESTPARAM_PARMTRNM_5  C    6       METHOD              Request.RequestParam.ParameterName.5
REQUEST_REQUESTPARAM_PARMTRVL_5  C    7       GETDATA             Request.RequestParam.ParameterValue.5
REQUEST_REQUESTPARAM_PARMTRNM_6  C    6       USERID              Request.RequestParam.ParameterName.6
REQUEST_REQUESTPARAM_PARMTRVL_6  C    36      715A1CFC-2F60-47    Request.RequestParam.ParameterValue.6
REQUEST_REQUESTPARAM_PARMTRNM_7  C    9       FREQUENCY           Request.RequestParam.ParameterName.7
REQUEST_REQUESTPARAM_PARMTRVL_7  C    1       A                   Request.RequestParam.ParameterValue.7
RESULTS_STATISTIC                C    21      GDP by Industry     Results.Statistic
RESULTS_UTCPRODUCTIONTIME        C    23      2016-10-21T14:15    Results.UTCProductionTime
RESULTS_DIMENSIONS_ORDINAL       C    1       1                   Results.Dimensions.Ordinal
RESULTS_DIMENSIONS_NAME          C    7       TableID             Results.Dimensions.Name
...
RESULTS_NOTES_NOTEREF_2          C    6       15.3.Q              Results.Notes.NoteRef.2
RESULTS_NOTES_NOTETEXT_2         C    244     3. Consists of c    Results.Notes.NoteText.2
RESULTS_NOTES_NOTEREF_3          C    6       15.4.Q              Results.Notes.NoteRef.3
RESULTS_NOTES_NOTETEXT_3         C    244     (u): Underlying     Results.Notes.NoteText.3
RESULTS_NOTES_NOTEREF_4          C    2       15                  Results.Notes.NoteRef.4
RESULTS_NOTES_NOTETEXT_4         C    46      Gross Output by     Results.Notes.NoteText.4
The CONTENTS Procedure
Data Set Name        WORK.JSN                      Observations          1
Member Type          DATA                          Variables             856
Engine               V9                            Indexes               0
Created              10/25/2016 10:41:53           Observation Length    6317
Last Modified        10/25/2016 10:41:53           Deleted Observations  0
Protection                                         Compressed            NO
Data Set Type                                      Sorted                NO
Label
Data Representation  WINDOWS_64
Encoding             wlatin1  Western (Windows)
                  Engine/Host Dependent Information
Data Set Page Size          65536
Number of Data Set Pages    3
First Data Page             3
Max Obs per Page            10
Obs in First Data Page      1
Number of Data Set Repairs  0
ExtendObsCounter            YES
                                        Alphabetic List of Variables and Attributes
  #    Variable                           Type    Len    Format    Informat    Label
  1    BEAAPI_REQUEST_REQSTPRM_PRMTRNM    Char      4    $4.       $4.         BEAAPI.Request.RequestParam.ParameterName
  2    BEAAPI_REQUEST_REQSTPRM_PRMTRVL    Char      4    $4.       $4.         BEAAPI.Request.RequestParam.ParameterValue
  3    BEAAPI_REQUST_RQSTPRM_PRMTRNM_1    Char      7    $7.       $7.         BEAAPI.Request.RequestParam.ParameterName.1
  5    BEAAPI_REQUST_RQSTPRM_PRMTRNM_2    Char     11    $11.      $11.        BEAAPI.Request.RequestParam.ParameterName.2
  7    BEAAPI_REQUST_RQSTPRM_PRMTRNM_3    Char     12    $12.      $12.        BEAAPI.Request.RequestParam.ParameterName.3
  9    BEAAPI_REQUST_RQSTPRM_PRMTRNM_4    Char      8    $8.       $8.         BEAAPI.Request.RequestParam.ParameterName.4
 11    BEAAPI_REQUST_RQSTPRM_PRMTRNM_5    Char      6    $6.       $6.         BEAAPI.Request.RequestParam.ParameterName.5
  ...
848    BEAAPI_RESULTS_NOTES_NOTETEXT      Char    100    $100.     $100.       BEAAPI.Results.Notes.NoteText
849    BEAAPI_RESULTS_NOTES_NOTEREF_1     Char      8    $8.       $8.         BEAAPI.Results.Notes.NoteRef.1
850    BEAAPI_RESULTS_NOTES_NOTETEXT_1    Char    244    $244.     $244.       BEAAPI.Results.Notes.NoteText.1
851    BEAAPI_RESULTS_NOTES_NOTEREF_2     Char      6    $6.       $6.         BEAAPI.Results.Notes.NoteRef.2
852    BEAAPI_RESULTS_NOTES_NOTETEXT_2    Char    244    $244.     $244.       BEAAPI.Results.Notes.NoteText.2
853    BEAAPI_RESULTS_NOTES_NOTEREF_3     Char      6    $6.       $6.         BEAAPI.Results.Notes.NoteRef.3
854    BEAAPI_RESULTS_NOTES_NOTETEXT_3    Char    244    $244.     $244.       BEAAPI.Results.Notes.NoteText.3
855    BEAAPI_RESULTS_NOTES_NOTEREF_4     Char      2    $2.       $2.         BEAAPI.Results.Notes.NoteRef.4
856    BEAAPI_RESULTS_NOTES_NOTETEXT_4    Char     46    $46.      $46.        BEAAPI.Results.Notes.NoteText.4
SOLUTION R WORKING CODE
=======================
   jsn <- as.data.frame(fromJSON(paste(readLines(json_file), collapse="")));
FULL SOLUTION
It is possible to reshape the 'list' stucture in R.
SAS requires a rectangular structure.
%utl_submit_r64('
library("rjson");
library(foreign);
json_file <- "d:/json/seriesdataout.txt" ;
jsn <- as.data.frame(fromJSON(paste(readLines(json_file), collapse="")));
write.dta(jsn,convert.factors="string",version = 10L, "d:/dta/jsn.dta");
');
* if you have access to IML interface to R you can eliminate the truncation.
package 'rjson' was built under R version 3.2.5
Warning message:                **
In readLines(json_file) :
  incomplete final line found on 'd:/json/seriesdataout.txt'
Warning messages:
1: In write.dta(jsn, convert.factors = "string", version = 10L, "d:/dta/jsn.dta") :
  abbreviating variable names
2: In write.dta(jsn, convert.factors = "string", version = 10L, "d:/dta/jsn.dta") :
  character strings of >244 bytes in column 850 will be truncated
3: In write.dta(jsn, convert.factors = "string", version = 10L, "d:/dta/jsn.dta") :
  character strings of >244 bytes in column 852 will be truncated
4: In write.dta(jsn, convert.factors = "string", version = 10L, "d:/dta/jsn.dta") :
  character strings of >244 bytes in column 854 will be truncated
proc import out=jsn file='d:\dta\jsn.dta' replace;
;run;quit;
proc print data=jsnlyt;
run;quit;Thank you. Your code gave me the results that was closest to what I was looking for. Sorry about not posting output. I have posted the output to the most recent response to this topics dialog.
Hello everyone,
Sorry about the late response, I was tied up at work. The output that I needed was supposed to look like the attached picture.
Mainly as the year as the variable and the different industries as rows. I understand that the JSON file had more information then what is in the attachment but it is a good start.
Thank you for posting the responses. All the answers was very useful. Ksharp response gave me results that was the closest to what I was looking for. Massive thanks Ksharp.
Mike
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
