DATA Step, Macro, Functions and more

Need to convert a JSON file to SAS dataset

Accepted Solution Solved
Reply
Occasional Contributor yo1
Occasional Contributor
Posts: 19
Accepted Solution

Need to convert a JSON file to SAS dataset

Hello,

 

I need to convert a JSON file into a SAS dataset that I can manage and I am having trouble importing it in a way that is usable. 

Any suggestions would be greatly apprecated.

 

I have attached the file to this post (it is the text file verison).  Below is a code of what I have:

 

 

data have;
infile 'x:\SeriesDataOut2.json' recfm=n dlm='{}[],';
input value : $200. @@;
if value in: ('"IndustrYDescription":' '"NoteRef"' '"TableID"' ) then delete;
run;

 

data temp;
 set have;
 length name v $ 100;
 if value =: '"Results":' then n+1;
 name=scan(value,2,'"','m');
 v=scan(value,-2,'"','m');
 drop value;
 if name=''  then delete;
run;
proc sort data=temp;
by n; run;

 

/*the proc transpose is kicking out a lot of errors*/
proc transpose data=temp out=want(drop=_name_ n)  ;
by n;
var v;
id name;
run;


Accepted Solutions
Solution
‎10-31-2016 01:08 PM
Super User
Posts: 10,044

Re: Need to convert a JSON file to SAS dataset

But you didn't post the output yet .

 

data temp;
infile '/folders/myfolders/SeriesDataOut.txt' dsd lrecl=300000000 dlm='[]{},:';
input x : $2000. @@;
retain flag;
if x ='Data' then flag=1;
if x='Notes' then group+1;
if flag and x not in (' ' 'Data' 'Notes');
run;

data a b;
 set temp;
 if group=0 then output a;
  else output b;
run;





data a;
 set a;
 if mod(_n_,2)=1 then n+1;
 drop group flag;
run;
proc transpose data=a out=temp_a;
 by n;
 var x;
run;
data temp_a;
 set temp_a;
 if col1='TableID' then group+1;
 drop n _name_;
run;
proc transpose data=temp_a out=want_a(drop=group _name_);
 by group;
 id col1;
 var col2;
run;






data b;
 set b;
 if mod(_n_,2)=1 then n+1;
 drop group flag;
run;
proc transpose data=b out=temp_b;
 by n;
 var x;
run;
data temp_b;
 set temp_b;
 if col1='NoteRef' then group+1;
 drop n _name_;
run;
proc transpose data=temp_b out=want_b(drop=group _name_);
 by group;
 id col1;
 var col2;
run;

View solution in original post


All Replies
PROC Star
Posts: 1,760

Re: Need to convert a JSON file to SAS dataset

[ Edited ]

Have you seen this?

http://blogs.sas.com/content/sasdummy/2015/09/28/parse-json-from-sas/

 

Otherwise if you want to use a data step, here's a quick piece of code that reads your file, to get you started.

Please modify to suit your needs.

 

 

data READ;
  infile "%sysfunc(pathname(WORK))\SeriesDataOut.txt" recfm=n dlm='}{';
  input STR : $400. @@;
  if _N_=1 then 
    call execute ('data WANT; length Industry DataValue $8 IndustrYDescription $400 NoteRef $12;');
  if STR=:'"TableID"' then do;
    STR=tranwrd(STR , '":"', '"n="'); 
    STR=tranwrd(STR, '","', '";"' );
    call execute(cats(STR,';output;'));
  put STR=;
  end;  
run;

run;
proc print noobs;
run;

 

Industry DataValue IndustrYDescription NoteRef TableID Frequency Year Quarter
11 447.3 Agriculture, forestry, fishing, and hunting 15 15 A 2012 2012
111CA 400.9 Farms 15 15 A 2012 2012
113FF 46.4 Forestry, fishing, and related activities 15 15 A 2012 2012
21 570.2 Mining 15 15 A 2012 2012
211 341.3 Oil and gas extraction 15 15 A 2012 2012
212 133.5 Mining, except oil and gas 15 15 A 2012 2012
213 95.4 Support activities for mining 15 15 A 2012 2012

 

[Edit: Removed an empty table that lingered here.]

Occasional Contributor yo1
Occasional Contributor
Posts: 19

Re: Need to convert a JSON file to SAS dataset

The output that you have listed is what I was aiming for.  However, when I used your code for some reason I wasn't able to replicate it.

 

 

Solution
‎10-31-2016 01:08 PM
Super User
Posts: 10,044

Re: Need to convert a JSON file to SAS dataset

But you didn't post the output yet .

 

data temp;
infile '/folders/myfolders/SeriesDataOut.txt' dsd lrecl=300000000 dlm='[]{},:';
input x : $2000. @@;
retain flag;
if x ='Data' then flag=1;
if x='Notes' then group+1;
if flag and x not in (' ' 'Data' 'Notes');
run;

data a b;
 set temp;
 if group=0 then output a;
  else output b;
run;





data a;
 set a;
 if mod(_n_,2)=1 then n+1;
 drop group flag;
run;
proc transpose data=a out=temp_a;
 by n;
 var x;
run;
data temp_a;
 set temp_a;
 if col1='TableID' then group+1;
 drop n _name_;
run;
proc transpose data=temp_a out=want_a(drop=group _name_);
 by group;
 id col1;
 var col2;
run;






data b;
 set b;
 if mod(_n_,2)=1 then n+1;
 drop group flag;
run;
proc transpose data=b out=temp_b;
 by n;
 var x;
run;
data temp_b;
 set temp_b;
 if col1='NoteRef' then group+1;
 drop n _name_;
run;
proc transpose data=temp_b out=want_b(drop=group _name_);
 by group;
 id col1;
 var col2;
run;
Valued Guide
Posts: 505

Re: Need to convert a JSON file to SAS dataset

SOLUTION USING OPs txt file


HAVE OPs txt file (which seems to be incomplete?)
======================================================================


{"BEAAPI":{"Request":{"RequestParam":[{"ParameterName":"YEAR","ParameterValue":"2012"},{"ParameterNa
me":"TABLEID","ParameterValue":"15"},{"ParameterName":"DATASETNAME","ParameterValue":"GDPBYINDUSTRY"
},{"ParameterName":"RESULTFORMAT","ParameterValue":"JSON"},{"ParameterName":"INDUSTRY","ParameterVal
ue":"ALL"},{"ParameterName":"METHOD","ParameterValue":"GETDATA"},{"ParameterName":"USERID","Paramete
rValue":"715A1CFC-2F60-47A0-8DDF-82ED8831520A"},{"ParameterName":"FREQUENCY","ParameterValue":"A"}]}
,"Results": {"Statistic":"GDP by Industry Table","UTCProductionTime":"2016-10-21T14:15:11.923","Dime
......

WANT  SAS Dataset JSN
=====================

Middle Observation(1 ) of Last dataset = WORK.LVLA - Total Obs 1

 -- CHARACTER --

 -- CHARACTER --
REQUEST_REQUESTPARAM_PARAMETRNM  C    4       YEAR                Request.RequestParam.ParameterName
REQUEST_REQUESTPARAM_PARAMETRVL  C    4       2012                Request.RequestParam.ParameterValue
REQUEST_REQUESTPARAM_PARMTRNM_1  C    7       TABLEID             Request.RequestParam.ParameterName.1
REQUEST_REQUESTPARAM_PARMTRVL_1  C    2       15                  Request.RequestParam.ParameterValue.1
REQUEST_REQUESTPARAM_PARMTRNM_2  C    11      DATASETNAME         Request.RequestParam.ParameterName.2
REQUEST_REQUESTPARAM_PARMTRVL_2  C    13      GDPBYINDUSTRY       Request.RequestParam.ParameterValue.2
REQUEST_REQUESTPARAM_PARMTRNM_3  C    12      RESULTFORMAT        Request.RequestParam.ParameterName.3
REQUEST_REQUESTPARAM_PARMTRVL_3  C    4       JSON                Request.RequestParam.ParameterValue.3
REQUEST_REQUESTPARAM_PARMTRNM_4  C    8       INDUSTRY            Request.RequestParam.ParameterName.4
REQUEST_REQUESTPARAM_PARMTRVL_4  C    3       ALL                 Request.RequestParam.ParameterValue.4
REQUEST_REQUESTPARAM_PARMTRNM_5  C    6       METHOD              Request.RequestParam.ParameterName.5
REQUEST_REQUESTPARAM_PARMTRVL_5  C    7       GETDATA             Request.RequestParam.ParameterValue.5
REQUEST_REQUESTPARAM_PARMTRNM_6  C    6       USERID              Request.RequestParam.ParameterName.6
REQUEST_REQUESTPARAM_PARMTRVL_6  C    36      715A1CFC-2F60-47    Request.RequestParam.ParameterValue.6
REQUEST_REQUESTPARAM_PARMTRNM_7  C    9       FREQUENCY           Request.RequestParam.ParameterName.7
REQUEST_REQUESTPARAM_PARMTRVL_7  C    1       A                   Request.RequestParam.ParameterValue.7
RESULTS_STATISTIC                C    21      GDP by Industry     Results.Statistic
RESULTS_UTCPRODUCTIONTIME        C    23      2016-10-21T14:15    Results.UTCProductionTime
RESULTS_DIMENSIONS_ORDINAL       C    1       1                   Results.Dimensions.Ordinal
RESULTS_DIMENSIONS_NAME          C    7       TableID             Results.Dimensions.Name

...
RESULTS_NOTES_NOTEREF_2          C    6       15.3.Q              Results.Notes.NoteRef.2
RESULTS_NOTES_NOTETEXT_2         C    244     3. Consists of c    Results.Notes.NoteText.2
RESULTS_NOTES_NOTEREF_3          C    6       15.4.Q              Results.Notes.NoteRef.3
RESULTS_NOTES_NOTETEXT_3         C    244     (u): Underlying     Results.Notes.NoteText.3
RESULTS_NOTES_NOTEREF_4          C    2       15                  Results.Notes.NoteRef.4
RESULTS_NOTES_NOTETEXT_4         C    46      Gross Output by     Results.Notes.NoteText.4


The CONTENTS Procedure

Data Set Name        WORK.JSN                      Observations          1
Member Type          DATA                          Variables             856
Engine               V9                            Indexes               0
Created              10/25/2016 10:41:53           Observation Length    6317
Last Modified        10/25/2016 10:41:53           Deleted Observations  0
Protection                                         Compressed            NO
Data Set Type                                      Sorted                NO
Label
Data Representation  WINDOWS_64
Encoding             wlatin1  Western (Windows)


                  Engine/Host Dependent Information

Data Set Page Size          65536
Number of Data Set Pages    3
First Data Page             3
Max Obs per Page            10
Obs in First Data Page      1
Number of Data Set Repairs  0
ExtendObsCounter            YES



                                        Alphabetic List of Variables and Attributes

  #    Variable                           Type    Len    Format    Informat    Label

  1    BEAAPI_REQUEST_REQSTPRM_PRMTRNM    Char      4    $4.       $4.         BEAAPI.Request.RequestParam.ParameterName
  2    BEAAPI_REQUEST_REQSTPRM_PRMTRVL    Char      4    $4.       $4.         BEAAPI.Request.RequestParam.ParameterValue
  3    BEAAPI_REQUST_RQSTPRM_PRMTRNM_1    Char      7    $7.       $7.         BEAAPI.Request.RequestParam.ParameterName.1
  5    BEAAPI_REQUST_RQSTPRM_PRMTRNM_2    Char     11    $11.      $11.        BEAAPI.Request.RequestParam.ParameterName.2
  7    BEAAPI_REQUST_RQSTPRM_PRMTRNM_3    Char     12    $12.      $12.        BEAAPI.Request.RequestParam.ParameterName.3
  9    BEAAPI_REQUST_RQSTPRM_PRMTRNM_4    Char      8    $8.       $8.         BEAAPI.Request.RequestParam.ParameterName.4
 11    BEAAPI_REQUST_RQSTPRM_PRMTRNM_5    Char      6    $6.       $6.         BEAAPI.Request.RequestParam.ParameterName.5
  ...

848    BEAAPI_RESULTS_NOTES_NOTETEXT      Char    100    $100.     $100.       BEAAPI.Results.Notes.NoteText
849    BEAAPI_RESULTS_NOTES_NOTEREF_1     Char      8    $8.       $8.         BEAAPI.Results.Notes.NoteRef.1
850    BEAAPI_RESULTS_NOTES_NOTETEXT_1    Char    244    $244.     $244.       BEAAPI.Results.Notes.NoteText.1
851    BEAAPI_RESULTS_NOTES_NOTEREF_2     Char      6    $6.       $6.         BEAAPI.Results.Notes.NoteRef.2
852    BEAAPI_RESULTS_NOTES_NOTETEXT_2    Char    244    $244.     $244.       BEAAPI.Results.Notes.NoteText.2
853    BEAAPI_RESULTS_NOTES_NOTEREF_3     Char      6    $6.       $6.         BEAAPI.Results.Notes.NoteRef.3
854    BEAAPI_RESULTS_NOTES_NOTETEXT_3    Char    244    $244.     $244.       BEAAPI.Results.Notes.NoteText.3
855    BEAAPI_RESULTS_NOTES_NOTEREF_4     Char      2    $2.       $2.         BEAAPI.Results.Notes.NoteRef.4
856    BEAAPI_RESULTS_NOTES_NOTETEXT_4    Char     46    $46.      $46.        BEAAPI.Results.Notes.NoteText.4


SOLUTION R WORKING CODE
=======================

   jsn <- as.data.frame(fromJSON(paste(readLines(json_file), collapse="")));


FULL SOLUTION

It is possible to reshape the 'list' stucture in R.
SAS requires a rectangular structure.

%utl_submit_r64('
library("rjson");
library(foreign);
json_file <- "d:/json/seriesdataout.txt" ;
jsn <- as.data.frame(fromJSON(paste(readLines(json_file), collapse="")));
write.dta(jsn,convert.factors="string",version = 10L, "d:/dta/jsn.dta");
');

* if you have access to IML interface to R you can eliminate the truncation.

package 'rjson' was built under R version 3.2.5
Warning message:                **
In readLines(json_file) :
  incomplete final line found on 'd:/json/seriesdataout.txt'
Warning messages:
1: In write.dta(jsn, convert.factors = "string", version = 10L, "d:/dta/jsn.dta") :
  abbreviating variable names
2: In write.dta(jsn, convert.factors = "string", version = 10L, "d:/dta/jsn.dta") :
  character strings of >244 bytes in column 850 will be truncated
3: In write.dta(jsn, convert.factors = "string", version = 10L, "d:/dta/jsn.dta") :
  character strings of >244 bytes in column 852 will be truncated
4: In write.dta(jsn, convert.factors = "string", version = 10L, "d:/dta/jsn.dta") :
  character strings of >244 bytes in column 854 will be truncated

proc import out=jsn file='d:\dta\jsn.dta' replace;
;run;quit;

proc print data=jsnlyt;
run;quit;
Occasional Contributor yo1
Occasional Contributor
Posts: 19

Re: Need to convert a JSON file to SAS dataset

Thank you.  Your code gave me the results that was closest to what I was looking for.  Sorry about not posting output. I have posted the output to the most recent response to this topics dialog.

Occasional Contributor yo1
Occasional Contributor
Posts: 19

Re: Need to convert a JSON file to SAS dataset

Hello everyone,

 

Sorry about the late response, I was tied up at work. The output that I needed was supposed to look like the attached picture.

 

 

Mainly as the year as the variable and the different industries as rows.  I understand that the JSON file had more information then what is in the attachment but it is a good start.

 

Thank you for posting the responses. All the answers was very useful. Ksharp response gave me results that was the closest to what I was looking for. Massive thanks Ksharp.

 

Mike


OUTPUT_FOR_JSON.JPG
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 743 views
  • 2 likes
  • 4 in conversation