The SAS Output Delivery System and reporting techniques

importing JSON data

Reply
Super Contributor
Posts: 259

importing JSON data

[ Edited ]

Hi

 

I have a text file (example attached) in the form:

 

{"market_cap_by_available_supply": [[1367174841000, 1500517590], [1367261101000, 1575032004], ............

 

"volume_usd": [[1367174841000, 0.0], [1367261101000, 0.0], ........

 

So there is market cap information given where the first number in the square brackets is the time and the second is the market supply. Then halfway through the txt file the market volume is given in a similar format (he first number in the square brackets is the time and the second is the market volume). If you look at it in excel it just takes up one row. I want to inport into SAS and organise it like 

 

Time Market Cap Volume
1370122802000 1500517590 2450066
1370215501000 1500530000 2354556

  ....                             .....                 .....

 

Thank you 

 

 

Super User
Posts: 11,343

Re: Formatting unusual txt data

Your example file seems to end with the character "f" and has a blank line between the specified values. I am a bit concerned that you may have provided an edited file that a solution to read may not work with your real data. Also the first character is a { and that is not closed (no matching "}")  also indicating that the example file may have issues that you didn't specify.

Valued Guide
Posts: 505

Re: Formatting unusual txt data

Your txt file is in JSON format. I think it may be better to use proven code than roll your own code.

Python, Perl or R?
Super Contributor
Posts: 259

Re: Formatting unusual txt data

Posted in reply to rogerjdeangelis

I'm not familiar with those coding languages... I just use SAS so a solution in sas would be easier for me. 

Super Contributor
Posts: 259

Re: Formatting unusual txt data

Hi, I have attached the full file. There are no spaces but there are four bits of information provided... first, the market cap, the the price, then another price and finnally the volume. Each new piece of info starts with some text 

 

"market_cap_by_available_supply"

 

"price_btc"

 

"price_usd"

 

"volume_usd"

 

If you search these you will see them in the attached

Super User
Posts: 19,789

Re: Formatting unusual txt data

What version of SAS do you have? I suggested JSON method earlier with a link to sample code v

Did that method not work?

Super Contributor
Posts: 259

Re: Formatting unusual txt data

9.3

Valued Guide
Posts: 505

Re: Formatting unusual txt data

/* T1003440 SAS/WPS/R: Convert JSON file to SAS dataset

SAS/WPS/R: Convert JSON file to SAS dataset

If you have the IML interface to R you can jusrt cut and paste the code below.

see
https://goo.gl/QrPkcT
https://communities.sas.com/t5/Base-SAS-Programming/Importing-data-from-internet/m-p/346519

Key results from a voo doo analysis on the end

I have not checked the extract? Seems correct?

HAVE JSON file with four sections
===================================================

location
"https://graphs.coinmarketcap.com/v1/datapoints/bitcoin/1367174841000/1491151745000/"

MARKET_CAP_BY_AVAILABLE_SUPPLY

{"market_cap_by_available_supply": [[1367174841000, 1500517590], [1367261101000, 1575032004],
[1367347502000, 1501657492], [1367433902000, 1298951550], [1367522401000, 1148667722],
[1367608801000, 1011066494], [1367695201000, 1236351843], [1367781901000, 1298377788],
[1367868303000, 1315992303], [1367954701000, 1183766500], [1368041101000, 1253655040],
...

PRICE_BTC

{"price_btc": [[1367174841000, 1.0], [1367261101000, 1.0], [1367347502000, 1.0], [
0], [1368390301000, 1.0], [1368476701000, 1.0], [1368563101000, 1.0], [1368650101
62101000, 1.0], [1369948501000, 1.0], [1370036402000, 1.0], [1370122802000, 1.0],
1.0], [1371431702000, 1.0], [1371518404000, 1.0], [1371605102000, 1.0], [13716915
2906202000, 1.0], [1372992361000, 1.0], [1373078762000, 1.0], [1373165161000, 1.0

PRICE_USD
...

VOLUME_USD
...


WANT (SAS dataset)
==================

Up to 40 obs from want total obs=1,422


                    MARKET_    PRICE_  PRICE_  VOLUME_
Obs  PRIMARY_KEY      CAP2      BTC2    USD2    USD2

   1 1367174841000  1500517590    1      135       0
   2 1367261101000  1575032004    1      142       0
   3 1367347502000  1501657492    1      135       0
   4 1367433902000  1298951550    1      117       0
   5 1367522401000  1148667722    1      103       0
   6 1367608801000  1011066494    1       91       0
   7 1367695201000  1236351843    1      111       0
   8 1367781901000  1298377788    1      117       0
   9 1367868303000  1315992303    1      118       0
...

1419 1490968747000  16914284005    1     1041   392275000
1420 1491055148000  17609934881    1     1084   357781000
1421 1491141549000  17631751754    1     1085   311580000
1422 1491151745000  17831029868    1     1097   348902000

WORKING CODE (prety much it)
============

   datawps <- as.data.frame(unlist(fromJSON(
      "https://graphs.coinmarketcap.com/v1/datapoints/bitcoin/1367174841000/1491151745000/")));

FULL SOLUTION
=============

* to save time I downloaded, converted to a list and saved the result;
%utl_submit_r64('
source("C:/Program Files/R/R-3.3.2/etc/Rprofile.site", echo=T);
library(RJSONIO);
datawps <-fromJSON("https://graphs.coinmarketcap.com/v1/datapoints/bitcoin/1367174841000/1491151745000/");
save(datawps,file="d:/rda/hsonr64.rda");
');

* all this to reformat for sas/wps;
* I am not skilled in converting list structures to SAS datasets;
proc datasets lib=work kill;
run;quit;
data class;set sashelp.class;
run;quit;
%utl_submit_wps64('
options set=R_HOME "C:/Program Files/R/R-3.3.2";
libname wrk "%sysfunc(pathname(work))";
proc r;
submit;
source("C:/Program Files/R/R-3.3.2/etc/Rprofile.site", echo=T);
library(RJSONIO);
load("d:/rda/hsonr64.rda");
str(datawps);
market_cap <-t(as.data.frame(datawps[[1]]));
price_btc  <-t(as.data.frame(datawps[[2]]));
price_usd  <-t(as.data.frame(datawps[[3]]));
volume_usd <-t(as.data.frame(datawps[[4]]));
endsubmit;
import r=market_cap  data=wrk.market_cap ;
import r=price_btc   data=wrk.price_btc  ;
import r=price_usd   data=wrk.price_usd  ;
import r=volume_usd  data=wrk.volume_usd ;
run;quit;
');

data want;
  retain char "NEED";
  merge
     market_cap (rename=(v1= primary_key v2=market_cap2 ))
     price_btc  (rename=(v1= price_btc1  v2=price_btc2  ))
     price_usd  (rename=(v1= price_usd1  v2=price_usd2  ))
     volume_usd (rename=(v1= volume_usd1 v2=volume_usd2 ))
  ;
  drop
    price_btc1
    price_usd1
    volume_usd1
  ;
run;quit;

proc print data=want width=min;
format _numeric_ 23.;
run;quit;



*                           _
__   _____   ___         __| | ___   ___
\ \ / / _ \ / _ \ _____ / _` |/ _ \ / _ \
 \ V / (_) | (_) |_____| (_| | (_) | (_) |
  \_/ \___/ \___/       \__,_|\___/ \___/

;


see https://www.dropbox.com/s/po3ahepe7r7dnm7/oto_voodoo.sas?dl=0

%inc "c:/oto/oto_voodoo.sas";

%utlvdoc
    (
    libname        = work         /* libname of input dataset */
    ,data          = want         /* name of input dataset */
    ,key           = primary_key  /* 0 or variable */
    ,ExtrmVal      = 10           /* display top and bottom 30 frequencies */
    ,UniPlot       = 1            /* 'true' enables ('false' disables) plot option on univariate output */
    ,UniVar        = 1            /* 'true' enables ('false' disables) plot option on univariate output */
    ,misspat       = 1            /* 0 or 1 missing patterns */
    ,chart         = 1            /* 0 or 1 line printer chart */
    ,taball        = 0
    ,tabone        = 0            /* 0 or  variable vs all other variables          */
    ,mispop        = 1            /* 0 or 1  missing vs populated*/
    ,dupcol        = 1            /* 0 or 1  columns duplicated  */
    ,unqtwo        = 0                                                                              /* 0 */
    ,vdocor        = 1            /* 0 or 1  correlation of numeric variables */
    ,oneone        = 1            /* 0 or 1  one to one - one to many - many to many */
    ,cramer        = 0            /* 0 or 1  association of character variables    */
    ,optlength     = 1
    ,maxmin        = 1
    ,unichr        = 0
    ,printto       = d:\txt\vdo\&data..txt        /* file or output if output window */
    ,Cleanup       = 0           /* 0 or 1 delete intermediate datasets */
    );


Observations:       1422     .

Missing or uniformly evaluated variables:
 ------------------------------------------
   - missing for all observations:      0
   - uniformly evaluated -- all:        2
       with one or more missing values:      0
       with no missing values:               2
 ==================================================


 #     Variable         Unique Values      Type    Length  Format
---    --------         -------------      ----    ------  ------

  1    MARKET_CAP2             1,422       num       8  ( unique could be primary key)
  2    PRICE_BTC2                  1       num       8  only one value
  3    PRICE_USD2              1,417       num       8
  4    PRIMARY_KEY             1,422       num       8  ( primary key)
  5    VOLUME_USD2             1,180       num       8


10 most frequent values of VOLUME_USD2   ( )

              1,180 distinct values in total

 Rank     Value           Frequency
 ----     -----           ---------


   1    0                      242   (has three levels 242,2 and
   2    26078800                 2
   3    178002                   1
   4    2865040                  1

Quantiles (Definition 5)

Level           Quantile

100% Max       783283000
99%            354087000
95%            171667000
90%            114977000
75% Q3          62071700
50% Median      28060550
25% Q1          13062400
10%                    0
5%                     0
1%                     0
0% Min                 0


10 most frequent values of PRICE_USD2    ( )

                 1,417 distinct values in total

    Rank     Value     Frequency
    ----     -----     ---------


      1    135.3              3
      2    119                2
      3    121.18             2
      4    130.18             2
      5    67.809             1
      6    68.0831            1


Variable          N    Miss         Minimum         Maximum            Mean          Median         Std Dev             Sum
---------------------------------------------------------------------------------------------------------------------------
MARKET_CAP2    1422       0       771368060     20588754113      6606387737      5998179929      4083586970    9.3942834E12
PRICE_USD2     1422       0      67.8090000         1270.90     459.0338500     422.3765000     258.9802662       652746.13
PRIMARY_KEY    1422       0    1.3671748E12    1.4911517E12    1.4293695E12    1.4295505E12     35927896948    2.0325635E15
VOLUME_USD2    1422       0               0       783283000     50823830.61     28060550.00     71096834.83     72271487132
---------------------------------------------------------------------------------------------------------------------------


Maximums and Minimums work.want

VARIABLE       MIN             MAX

PRIMARY_KEY    1.3671748E12    1.4911517E12
MARKET_CAP2    771368060       20588754113
PRICE_BTC2     1               1
PRICE_USD2     67.809          1270.9
VOLUME_USD2    0               783283000

Maximum Number of Bytes to hold Character and Numeric Vales Exactly

                VARIABLE_                    NEW_
Obs    __TYP      TYPE       NAME           LENGTH    ORIGINAL    SAVINGS

 1       C      Character    CHAR              4          4          0
 2       N      Numeric      MARKET_CAP2       6          8          2
 3       N      Numeric      PRICE_BTC2        3          8          5
 4       N      Numeric      PRICE_USD2        8          8          0
 5       N      Numeric      PRIMARY_KEY       7          8          1
 6       N      Numeric      VOLUME_USD2       5          8          3


Relationship OF VARIABLES WHERE MAX LEVELS IS 2000 AND MAX NUMBER OF VARIABLES IS 100
One to One  -- One to many  --  Many to One -- Many to Many

Obs                       OUT

 1     Many to One     MARKET_CAP2 to PRICE_USD2

 2     One to One      MARKET_CAP2 to PRIMARY_KEY

 3     Many to One     MARKET_CAP2 to VOLUME_USD2
 4     One to Many     PRICE_USD2 to PRIMARY_KEY
 5     Many to One     PRICE_USD2 to VOLUME_USD2
 6     Many to One     PRIMARY_KEY to VOLUME_USD2


There are no missing values


                                            Cumulative    Cumulative
    PRIMARY_KEY    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Positive               1422      100.00          1422       100.00


                                            Cumulative    Cumulative
    MARKET_CAP2    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Positive               1422      100.00          1422       100.00


                                            Cumulative    Cumulative
     PRICE_BTC2    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Positive               1422      100.00          1422       100.00


                                            Cumulative    Cumulative
     PRICE_USD2    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Positive               1422      100.00          1422       100.00


                                            Cumulative    Cumulative
    VOLUME_USD2    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Zero                    242       17.02           242        17.02
Positive               1180       82.98          1422       100.00


check results below

 filename inp "d:/txt/json.txt"  recfm=n;
 data cntcma;
   retain cntcma 0;
   infile inp end=dne;
   input byte $char1.;
   cntcma=cntcma + (byte=',');
   output;
 run;quit;

148975 commas

11372=1422*8 - 4 =11372

proc means data=cntcma;
run;quit;
Super Contributor
Posts: 259

Re: Formatting unusual txt data

Posted in reply to rogerjdeangelis

Thank you but I don't have SAS/IML unfortunately (just the attached).  


Capture.GIF
Super User
Posts: 19,789

Re: Formatting unusual txt data

Super Contributor
Posts: 259

Re: Formatting unusual txt data

This requires version 9.4 which I don't have unfortunately 

Super Contributor
Posts: 259

Re: Formatting unusual txt data

In tried this code

 

filename api url "https://graphs.coinmarketcap.com/v1/datapoints/bitcoin.json";

DATA example;
INFILE api LRECL = 1000000 SCANOVER encoding="utf-8" truncover dsd dlm="}},{";
INPUT
@'"market_cap_by_available_supply":' Mkt_Cap : best32.
@'"price_btc":' Price_BTC : best32.
@'"price_usd":' Price_USD : best32.
@'"volume_usd":' Vvolume_usd : best32.
@@;
run;

 

 

But I get the same error

 

"ERROR: Invalid reply received from the HTTP server. Use the debug
option for more info."

Super User
Posts: 19,789

Re: Formatting unusual txt data

In 9.3 I think it was PROC GROOVY that was used. 

I've used this before it's a pain...is there an option to use a file converter and convert the file to a different format or do you need a direct connection to the web data. I guess, is this a one time process or are you setting up a process that will be repeated?

 

https://communities.sas.com/t5/SAS-Procedures/Parse-json-file-with-Proc-Groovy/td-p/187484

Super Contributor
Posts: 259

Re: importing JSON data

Hi reeza, I'm setting up a process that will be repeated
Super User
Posts: 19,789

Re: importing JSON data

Then PROC GROOVY is your best option.

Ask a Question
Discussion stats
  • 14 replies
  • 538 views
  • 1 like
  • 4 in conversation