DATA Step, Macro, Functions and more

Extracting Data From Webpages

Accepted Solution Solved
Reply
Super Contributor
Posts: 259
Accepted Solution

Extracting Data From Webpages

 

Hi

 

My sas version is 9.3 Base. I tried using the following code but I got an error:

 


filename bit url "https://graphs.coinmarketcap.com/v1/datapoints/bitcoin.txt";


data bitcoin;
infile bit truncover;
input @1 bigline $char256.;
run;

 

The data is on a webpage (link https://graphs.coinmarketcap.com/v1/datapoints/bitcoin/) so perhaps I have the wrong filename code. 

 

The error I'm getting is 

 

ERROR: Invalid reply received from the HTTP server. Use the debug
option for more info.

 


Accepted Solutions
Solution
‎04-03-2017 11:55 AM
Valued Guide
Posts: 505

Re: Extracting Data From Webpages

/* T1003430 Transposing a very fat (1,568 columns) excel sheet to 78,250  key/value pairs

WPS/SAS/R: Transposing a very fat (1,568 columns) excel sheet to 78,250  key/value pairs

inspired by (this message) also on SAS-L
https://goo.gl/ohpG8K
https://communities.sas.com/t5/General-SAS-Programming/datastream-into-SAS/m-p/346447/highlight/false#M44677

Only WPS/R or IML/R could create the SAS dataset from the fat sheet?

proc import seemed failed (after column 4/30/2009 all data was missing (bad number conversion?)
Did not try to figure it out dicided to switch to WPS-R interface

libname failed (255 col limit)

There are ways to do this with SAS but perhaps not as direct as R, NA is missing in R.


HAVE (An EXCEL sheet with 1,568 columns)
=========================================

Middle Observation(50 ) of she - Total Obs 100

Column      -- CHARACTER --

1          NAME                      C    57      SEMBCORP INDUSTRIES
2          DATATYPE                  C    2       RI
3          CURRENCY                  C    2       S$
4          2009:01:01:00:00:00       C    7       248.88
5          2009:01:02:00:00:00       C    7       260.74
6          2009:01:05:00:00:00       C    7       284.41
           ...
1566       X2014_12_29_00_00_00      C    8       577.87
1567       X2014_12_30_00_00_00      C    8       579.25
1568       X2014_12_31_00_00_00      C    8       581.92


WANT
===
check 100*(1568-3(dimension values)) = 156,500. we do get the right number of rows)

Up to 40 obs from want total obs=156,500

 Obs    DATATYPE          NAME           CURRENCY       DTE       VAL

    1       P        SINGAPORE TELECOM       S$       01JAN2009    2.55
    2       P        SINGAPORE TELECOM       S$       02JAN2009    2.67
    3       P        SINGAPORE TELECOM       S$       05JAN2009    2.61
    4       P        SINGAPORE TELECOM       S$       06JAN2009    2.6
    5       P        SINGAPORE TELECOM       S$       07JAN2009    2.52
    6       P        SINGAPORE TELECOM       S$       08JAN2009    2.49
    7       P        SINGAPORE TELECOM       S$       09JAN2009    2.52
    8       P        SINGAPORE TELECOM       S$       12JAN2009    2.55
    9       P        SINGAPORE TELECOM       S$       13JAN2009    2.53
   10       P        SINGAPORE TELECOM       S$       14JAN2009    2.52
....
78247       P        CHINA AVTN.OIL (SING.)  S$       26DEC2014    0.66
78248       P        CHINA AVTN.OIL (SING.)  S$       29DEC2014    0.655
78249       P        CHINA AVTN.OIL (SING.)  S$       30DEC2014    0.68
78250       P        CHINA AVTN.OIL (SING.)  S$       31DEC2014    0.67


Middle Observation(78250 ) of want - Total Obs 156,500

 -- CHARACTER --
DATATYPE                         C    2       RI                  DATATYPE
NAME                             C    57      SEMBCORP INDUSTR    NAME
CURRENCY                         C    2       S$                  CURRENCY
VAL                              C    8       581.92              VAL

 -- NUMERIC --
DTE                              N    8       20088               DTE

WORKING CODE

       wb <- loadWorkbook("d:/xls/sample.xlsx",create = FALSE);
       import r=she data=wrk.she;

       * key/value pairs;
       array vals X:;
       dte=input(substr(vname(vals),2,10),yymmdd10.);
       val=vals;


*     _                        _____ _         _       _
__  _| |___       ___  __ _ __|___  | |__   __| | __ _| |_
\ \/ / / __|_____/ __|/ _` / __| / /| '_ \ / _` |/ _` | __|
 >  <| \__ \_____\__ \ (_| \__ \/ / | |_) | (_| | (_| | |_
/_/\_\_|___/     |___/\__,_|___/_/  |_.__/ \__,_|\__,_|\__|

;

%utl_submit_r64('
library(XLConnect);
wb <- loadWorkbook("d:/xls/sample.xlsx",create = FALSE);
she<-readWorksheet(wb, "sheet1",colTypes="character", forceConversion=TRUE);
endsubmit;
import r=she data=wrk.she;
run;quit;
');

*_        ______  ____       ____    _    ____
\ \      / /  _ \/ ___|     / ___|  / \  / ___|
 \ \ /\ / /| |_) \___ \ ____\___ \ / _ \ \___ \
  \ V  V / |  __/ ___) |_____|__) / ___ \ ___) |
   \_/\_/  |_|   |____/     |____/_/   \_\____/

;

data want (keep=datatype name currency dte val);
  retain datatype name currency dte val ;
  format dte date9.;
  set she(where=(datatype in ('RI','P')));
      array vals X:;
      do over vals;
         dte=input(substr(vname(vals),2,10),yymmdd10.);
         val=vals;
         output;
      end;
  run;quit;
run;quit;




/* T1003440 SAS/WPS/R: Convert JSON file to SAS dataset

SAS/WPS/R: Convert JSON file to SAS dataset

If you have the IML interface to R you can jusrt cut and paste the code below.

see
https://goo.gl/QrPkcT
https://communities.sas.com/t5/Base-SAS-Programming/Importing-data-from-internet/m-p/346519

Key results from a voo doo analysis on the end

I have not checked the extract? Seems correct?

HAVE JSON file with four sections
===================================================

location
"https://graphs.coinmarketcap.com/v1/datapoints/bitcoin/1367174841000/1491151745000/"

MARKET_CAP_BY_AVAILABLE_SUPPLY

{"market_cap_by_available_supply": [[1367174841000, 1500517590], [1367261101000, 1575032004],
[1367347502000, 1501657492], [1367433902000, 1298951550], [1367522401000, 1148667722],
[1367608801000, 1011066494], [1367695201000, 1236351843], [1367781901000, 1298377788],
[1367868303000, 1315992303], [1367954701000, 1183766500], [1368041101000, 1253655040],
...

PRICE_BTC

{"price_btc": [[1367174841000, 1.0], [1367261101000, 1.0], [1367347502000, 1.0], [
0], [1368390301000, 1.0], [1368476701000, 1.0], [1368563101000, 1.0], [1368650101
62101000, 1.0], [1369948501000, 1.0], [1370036402000, 1.0], [1370122802000, 1.0],
1.0], [1371431702000, 1.0], [1371518404000, 1.0], [1371605102000, 1.0], [13716915
2906202000, 1.0], [1372992361000, 1.0], [1373078762000, 1.0], [1373165161000, 1.0

PRICE_USD
...

VOLUME_USD
...


WANT (SAS dataset)
==================

Up to 40 obs from want total obs=1,422


                    MARKET_    PRICE_  PRICE_  VOLUME_
Obs  PRIMARY_KEY      CAP2      BTC2    USD2    USD2

   1 1367174841000  1500517590    1      135       0
   2 1367261101000  1575032004    1      142       0
   3 1367347502000  1501657492    1      135       0
   4 1367433902000  1298951550    1      117       0
   5 1367522401000  1148667722    1      103       0
   6 1367608801000  1011066494    1       91       0
   7 1367695201000  1236351843    1      111       0
   8 1367781901000  1298377788    1      117       0
   9 1367868303000  1315992303    1      118       0
...

1419 1490968747000  16914284005    1     1041   392275000
1420 1491055148000  17609934881    1     1084   357781000
1421 1491141549000  17631751754    1     1085   311580000
1422 1491151745000  17831029868    1     1097   348902000

WORKING CODE (prety much it)
============

   datawps <- as.data.frame(unlist(fromJSON(
      "https://graphs.coinmarketcap.com/v1/datapoints/bitcoin/1367174841000/1491151745000/")));

FULL SOLUTION
=============

* to save time I downloaded, converted to a list and saved the result;
%utl_submit_r64('
source("C:/Program Files/R/R-3.3.2/etc/Rprofile.site", echo=T);
library(RJSONIO);
datawps <-fromJSON("https://graphs.coinmarketcap.com/v1/datapoints/bitcoin/1367174841000/1491151745000/");
save(datawps,file="d:/rda/hsonr64.rda");
');

* all this to reformat for sas/wps;
* I am not skilled in converting list structures to SAS datasets;
proc datasets lib=work kill;
run;quit;
data class;set sashelp.class;
run;quit;
%utl_submit_wps64('
options set=R_HOME "C:/Program Files/R/R-3.3.2";
libname wrk "%sysfunc(pathname(work))";
proc r;
submit;
source("C:/Program Files/R/R-3.3.2/etc/Rprofile.site", echo=T);
library(RJSONIO);
load("d:/rda/hsonr64.rda");
str(datawps);
market_cap <-t(as.data.frame(datawps[[1]]));
price_btc  <-t(as.data.frame(datawps[[2]]));
price_usd  <-t(as.data.frame(datawps[[3]]));
volume_usd <-t(as.data.frame(datawps[[4]]));
endsubmit;
import r=market_cap  data=wrk.market_cap ;
import r=price_btc   data=wrk.price_btc  ;
import r=price_usd   data=wrk.price_usd  ;
import r=volume_usd  data=wrk.volume_usd ;
run;quit;
');

data want;
  retain char "NEED";
  merge
     market_cap (rename=(v1= primary_key v2=market_cap2 ))
     price_btc  (rename=(v1= price_btc1  v2=price_btc2  ))
     price_usd  (rename=(v1= price_usd1  v2=price_usd2  ))
     volume_usd (rename=(v1= volume_usd1 v2=volume_usd2 ))
  ;
  drop
    price_btc1
    price_usd1
    volume_usd1
  ;
run;quit;

proc print data=want width=min;
format _numeric_ 23.;
run;quit;



*                           _
__   _____   ___         __| | ___   ___
\ \ / / _ \ / _ \ _____ / _` |/ _ \ / _ \
 \ V / (_) | (_) |_____| (_| | (_) | (_) |
  \_/ \___/ \___/       \__,_|\___/ \___/

;


see https://www.dropbox.com/s/po3ahepe7r7dnm7/oto_voodoo.sas?dl=0

%inc "c:/oto/oto_voodoo.sas";

%utlvdoc
    (
    libname        = work         /* libname of input dataset */
    ,data          = want         /* name of input dataset */
    ,key           = primary_key  /* 0 or variable */
    ,ExtrmVal      = 10           /* display top and bottom 30 frequencies */
    ,UniPlot       = 1            /* 'true' enables ('false' disables) plot option on univariate output */
    ,UniVar        = 1            /* 'true' enables ('false' disables) plot option on univariate output */
    ,misspat       = 1            /* 0 or 1 missing patterns */
    ,chart         = 1            /* 0 or 1 line printer chart */
    ,taball        = 0
    ,tabone        = 0            /* 0 or  variable vs all other variables          */
    ,mispop        = 1            /* 0 or 1  missing vs populated*/
    ,dupcol        = 1            /* 0 or 1  columns duplicated  */
    ,unqtwo        = 0                                                                              /* 0 */
    ,vdocor        = 1            /* 0 or 1  correlation of numeric variables */
    ,oneone        = 1            /* 0 or 1  one to one - one to many - many to many */
    ,cramer        = 0            /* 0 or 1  association of character variables    */
    ,optlength     = 1
    ,maxmin        = 1
    ,unichr        = 0
    ,printto       = d:\txt\vdo\&data..txt        /* file or output if output window */
    ,Cleanup       = 0           /* 0 or 1 delete intermediate datasets */
    );


Observations:       1422     .

Missing or uniformly evaluated variables:
 ------------------------------------------
   - missing for all observations:      0
   - uniformly evaluated -- all:        2
       with one or more missing values:      0
       with no missing values:               2
 ==================================================


 #     Variable         Unique Values      Type    Length  Format
---    --------         -------------      ----    ------  ------

  1    MARKET_CAP2             1,422       num       8  ( unique could be primary key)
  2    PRICE_BTC2                  1       num       8  only one value
  3    PRICE_USD2              1,417       num       8
  4    PRIMARY_KEY             1,422       num       8  ( primary key)
  5    VOLUME_USD2             1,180       num       8


10 most frequent values of VOLUME_USD2   ( )

              1,180 distinct values in total

 Rank     Value           Frequency
 ----     -----           ---------


   1    0                      242   (has three levels 242,2 and
   2    26078800                 2
   3    178002                   1
   4    2865040                  1

Quantiles (Definition 5)

Level           Quantile

100% Max       783283000
99%            354087000
95%            171667000
90%            114977000
75% Q3          62071700
50% Median      28060550
25% Q1          13062400
10%                    0
5%                     0
1%                     0
0% Min                 0


10 most frequent values of PRICE_USD2    ( )

                 1,417 distinct values in total

    Rank     Value     Frequency
    ----     -----     ---------


      1    135.3              3
      2    119                2
      3    121.18             2
      4    130.18             2
      5    67.809             1
      6    68.0831            1


Variable          N    Miss         Minimum         Maximum            Mean          Median         Std Dev             Sum
---------------------------------------------------------------------------------------------------------------------------
MARKET_CAP2    1422       0       771368060     20588754113      6606387737      5998179929      4083586970    9.3942834E12
PRICE_USD2     1422       0      67.8090000         1270.90     459.0338500     422.3765000     258.9802662       652746.13
PRIMARY_KEY    1422       0    1.3671748E12    1.4911517E12    1.4293695E12    1.4295505E12     35927896948    2.0325635E15
VOLUME_USD2    1422       0               0       783283000     50823830.61     28060550.00     71096834.83     72271487132
---------------------------------------------------------------------------------------------------------------------------


Maximums and Minimums work.want

VARIABLE       MIN             MAX

PRIMARY_KEY    1.3671748E12    1.4911517E12
MARKET_CAP2    771368060       20588754113
PRICE_BTC2     1               1
PRICE_USD2     67.809          1270.9
VOLUME_USD2    0               783283000

Maximum Number of Bytes to hold Character and Numeric Vales Exactly

                VARIABLE_                    NEW_
Obs    __TYP      TYPE       NAME           LENGTH    ORIGINAL    SAVINGS

 1       C      Character    CHAR              4          4          0
 2       N      Numeric      MARKET_CAP2       6          8          2
 3       N      Numeric      PRICE_BTC2        3          8          5
 4       N      Numeric      PRICE_USD2        8          8          0
 5       N      Numeric      PRIMARY_KEY       7          8          1
 6       N      Numeric      VOLUME_USD2       5          8          3


Relationship OF VARIABLES WHERE MAX LEVELS IS 2000 AND MAX NUMBER OF VARIABLES IS 100
One to One  -- One to many  --  Many to One -- Many to Many

Obs                       OUT

 1     Many to One     MARKET_CAP2 to PRICE_USD2

 2     One to One      MARKET_CAP2 to PRIMARY_KEY

 3     Many to One     MARKET_CAP2 to VOLUME_USD2
 4     One to Many     PRICE_USD2 to PRIMARY_KEY
 5     Many to One     PRICE_USD2 to VOLUME_USD2
 6     Many to One     PRIMARY_KEY to VOLUME_USD2


There are no missing values


                                            Cumulative    Cumulative
    PRIMARY_KEY    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Positive               1422      100.00          1422       100.00


                                            Cumulative    Cumulative
    MARKET_CAP2    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Positive               1422      100.00          1422       100.00


                                            Cumulative    Cumulative
     PRICE_BTC2    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Positive               1422      100.00          1422       100.00


                                            Cumulative    Cumulative
     PRICE_USD2    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Positive               1422      100.00          1422       100.00


                                            Cumulative    Cumulative
    VOLUME_USD2    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Zero                    242       17.02           242        17.02
Positive               1180       82.98          1422       100.00


check results below

 filename inp "d:/txt/json.txt"  recfm=n;
 data cntcma;
   retain cntcma 0;
   infile inp end=dne;
   input byte $char1.;
   cntcma=cntcma + (byte=',');
   output;
 run;quit;

148975 commas

11372=1422*8 - 4 =11372

proc means data=cntcma;
run;quit;

View solution in original post


All Replies
Contributor
Posts: 54

Re: Extracting Data From Webpages

Try running the same option with DEBUG option enabled.

filename bit url 'https://graphs.coinmarketcap.com/v1/datapoints/bitcoin.txt' DEBUG ;

The DEBUG option will generate detailed information to the log. this will enable you to see the GET request being sent from SAS, and
the response as well
Super Contributor
Posts: 259

Re: Extracting Data From Webpages

Posted in reply to naveenraj

Hi Naveenraj

 

I ran using the debug and got the error message with lots of other detail on the log. Do you know how I could change my code to import the data correctly? 

 

Thanks


Debug.GIF
Super User
Posts: 10,046

Re: Extracting Data From Webpages

You are connecting HTTPS not HTTP. 

So you need configure a couple of things at your sas side.

That is very cumbersome, Search it at support.sas.com

 

Or you could use the third part software(Like :  CURL) to get that HTML and import it into sas again.

Super Contributor
Posts: 259

Re: Extracting Data From Webpages

Hi Ksharp

 

Curl isn't an option for me and I don't think the instruction in the sas support is helpful. Can you not provide some code here that might assist? 

Solution
‎04-03-2017 11:55 AM
Valued Guide
Posts: 505

Re: Extracting Data From Webpages

/* T1003430 Transposing a very fat (1,568 columns) excel sheet to 78,250  key/value pairs

WPS/SAS/R: Transposing a very fat (1,568 columns) excel sheet to 78,250  key/value pairs

inspired by (this message) also on SAS-L
https://goo.gl/ohpG8K
https://communities.sas.com/t5/General-SAS-Programming/datastream-into-SAS/m-p/346447/highlight/false#M44677

Only WPS/R or IML/R could create the SAS dataset from the fat sheet?

proc import seemed failed (after column 4/30/2009 all data was missing (bad number conversion?)
Did not try to figure it out dicided to switch to WPS-R interface

libname failed (255 col limit)

There are ways to do this with SAS but perhaps not as direct as R, NA is missing in R.


HAVE (An EXCEL sheet with 1,568 columns)
=========================================

Middle Observation(50 ) of she - Total Obs 100

Column      -- CHARACTER --

1          NAME                      C    57      SEMBCORP INDUSTRIES
2          DATATYPE                  C    2       RI
3          CURRENCY                  C    2       S$
4          2009:01:01:00:00:00       C    7       248.88
5          2009:01:02:00:00:00       C    7       260.74
6          2009:01:05:00:00:00       C    7       284.41
           ...
1566       X2014_12_29_00_00_00      C    8       577.87
1567       X2014_12_30_00_00_00      C    8       579.25
1568       X2014_12_31_00_00_00      C    8       581.92


WANT
===
check 100*(1568-3(dimension values)) = 156,500. we do get the right number of rows)

Up to 40 obs from want total obs=156,500

 Obs    DATATYPE          NAME           CURRENCY       DTE       VAL

    1       P        SINGAPORE TELECOM       S$       01JAN2009    2.55
    2       P        SINGAPORE TELECOM       S$       02JAN2009    2.67
    3       P        SINGAPORE TELECOM       S$       05JAN2009    2.61
    4       P        SINGAPORE TELECOM       S$       06JAN2009    2.6
    5       P        SINGAPORE TELECOM       S$       07JAN2009    2.52
    6       P        SINGAPORE TELECOM       S$       08JAN2009    2.49
    7       P        SINGAPORE TELECOM       S$       09JAN2009    2.52
    8       P        SINGAPORE TELECOM       S$       12JAN2009    2.55
    9       P        SINGAPORE TELECOM       S$       13JAN2009    2.53
   10       P        SINGAPORE TELECOM       S$       14JAN2009    2.52
....
78247       P        CHINA AVTN.OIL (SING.)  S$       26DEC2014    0.66
78248       P        CHINA AVTN.OIL (SING.)  S$       29DEC2014    0.655
78249       P        CHINA AVTN.OIL (SING.)  S$       30DEC2014    0.68
78250       P        CHINA AVTN.OIL (SING.)  S$       31DEC2014    0.67


Middle Observation(78250 ) of want - Total Obs 156,500

 -- CHARACTER --
DATATYPE                         C    2       RI                  DATATYPE
NAME                             C    57      SEMBCORP INDUSTR    NAME
CURRENCY                         C    2       S$                  CURRENCY
VAL                              C    8       581.92              VAL

 -- NUMERIC --
DTE                              N    8       20088               DTE

WORKING CODE

       wb <- loadWorkbook("d:/xls/sample.xlsx",create = FALSE);
       import r=she data=wrk.she;

       * key/value pairs;
       array vals X:;
       dte=input(substr(vname(vals),2,10),yymmdd10.);
       val=vals;


*     _                        _____ _         _       _
__  _| |___       ___  __ _ __|___  | |__   __| | __ _| |_
\ \/ / / __|_____/ __|/ _` / __| / /| '_ \ / _` |/ _` | __|
 >  <| \__ \_____\__ \ (_| \__ \/ / | |_) | (_| | (_| | |_
/_/\_\_|___/     |___/\__,_|___/_/  |_.__/ \__,_|\__,_|\__|

;

%utl_submit_r64('
library(XLConnect);
wb <- loadWorkbook("d:/xls/sample.xlsx",create = FALSE);
she<-readWorksheet(wb, "sheet1",colTypes="character", forceConversion=TRUE);
endsubmit;
import r=she data=wrk.she;
run;quit;
');

*_        ______  ____       ____    _    ____
\ \      / /  _ \/ ___|     / ___|  / \  / ___|
 \ \ /\ / /| |_) \___ \ ____\___ \ / _ \ \___ \
  \ V  V / |  __/ ___) |_____|__) / ___ \ ___) |
   \_/\_/  |_|   |____/     |____/_/   \_\____/

;

data want (keep=datatype name currency dte val);
  retain datatype name currency dte val ;
  format dte date9.;
  set she(where=(datatype in ('RI','P')));
      array vals X:;
      do over vals;
         dte=input(substr(vname(vals),2,10),yymmdd10.);
         val=vals;
         output;
      end;
  run;quit;
run;quit;




/* T1003440 SAS/WPS/R: Convert JSON file to SAS dataset

SAS/WPS/R: Convert JSON file to SAS dataset

If you have the IML interface to R you can jusrt cut and paste the code below.

see
https://goo.gl/QrPkcT
https://communities.sas.com/t5/Base-SAS-Programming/Importing-data-from-internet/m-p/346519

Key results from a voo doo analysis on the end

I have not checked the extract? Seems correct?

HAVE JSON file with four sections
===================================================

location
"https://graphs.coinmarketcap.com/v1/datapoints/bitcoin/1367174841000/1491151745000/"

MARKET_CAP_BY_AVAILABLE_SUPPLY

{"market_cap_by_available_supply": [[1367174841000, 1500517590], [1367261101000, 1575032004],
[1367347502000, 1501657492], [1367433902000, 1298951550], [1367522401000, 1148667722],
[1367608801000, 1011066494], [1367695201000, 1236351843], [1367781901000, 1298377788],
[1367868303000, 1315992303], [1367954701000, 1183766500], [1368041101000, 1253655040],
...

PRICE_BTC

{"price_btc": [[1367174841000, 1.0], [1367261101000, 1.0], [1367347502000, 1.0], [
0], [1368390301000, 1.0], [1368476701000, 1.0], [1368563101000, 1.0], [1368650101
62101000, 1.0], [1369948501000, 1.0], [1370036402000, 1.0], [1370122802000, 1.0],
1.0], [1371431702000, 1.0], [1371518404000, 1.0], [1371605102000, 1.0], [13716915
2906202000, 1.0], [1372992361000, 1.0], [1373078762000, 1.0], [1373165161000, 1.0

PRICE_USD
...

VOLUME_USD
...


WANT (SAS dataset)
==================

Up to 40 obs from want total obs=1,422


                    MARKET_    PRICE_  PRICE_  VOLUME_
Obs  PRIMARY_KEY      CAP2      BTC2    USD2    USD2

   1 1367174841000  1500517590    1      135       0
   2 1367261101000  1575032004    1      142       0
   3 1367347502000  1501657492    1      135       0
   4 1367433902000  1298951550    1      117       0
   5 1367522401000  1148667722    1      103       0
   6 1367608801000  1011066494    1       91       0
   7 1367695201000  1236351843    1      111       0
   8 1367781901000  1298377788    1      117       0
   9 1367868303000  1315992303    1      118       0
...

1419 1490968747000  16914284005    1     1041   392275000
1420 1491055148000  17609934881    1     1084   357781000
1421 1491141549000  17631751754    1     1085   311580000
1422 1491151745000  17831029868    1     1097   348902000

WORKING CODE (prety much it)
============

   datawps <- as.data.frame(unlist(fromJSON(
      "https://graphs.coinmarketcap.com/v1/datapoints/bitcoin/1367174841000/1491151745000/")));

FULL SOLUTION
=============

* to save time I downloaded, converted to a list and saved the result;
%utl_submit_r64('
source("C:/Program Files/R/R-3.3.2/etc/Rprofile.site", echo=T);
library(RJSONIO);
datawps <-fromJSON("https://graphs.coinmarketcap.com/v1/datapoints/bitcoin/1367174841000/1491151745000/");
save(datawps,file="d:/rda/hsonr64.rda");
');

* all this to reformat for sas/wps;
* I am not skilled in converting list structures to SAS datasets;
proc datasets lib=work kill;
run;quit;
data class;set sashelp.class;
run;quit;
%utl_submit_wps64('
options set=R_HOME "C:/Program Files/R/R-3.3.2";
libname wrk "%sysfunc(pathname(work))";
proc r;
submit;
source("C:/Program Files/R/R-3.3.2/etc/Rprofile.site", echo=T);
library(RJSONIO);
load("d:/rda/hsonr64.rda");
str(datawps);
market_cap <-t(as.data.frame(datawps[[1]]));
price_btc  <-t(as.data.frame(datawps[[2]]));
price_usd  <-t(as.data.frame(datawps[[3]]));
volume_usd <-t(as.data.frame(datawps[[4]]));
endsubmit;
import r=market_cap  data=wrk.market_cap ;
import r=price_btc   data=wrk.price_btc  ;
import r=price_usd   data=wrk.price_usd  ;
import r=volume_usd  data=wrk.volume_usd ;
run;quit;
');

data want;
  retain char "NEED";
  merge
     market_cap (rename=(v1= primary_key v2=market_cap2 ))
     price_btc  (rename=(v1= price_btc1  v2=price_btc2  ))
     price_usd  (rename=(v1= price_usd1  v2=price_usd2  ))
     volume_usd (rename=(v1= volume_usd1 v2=volume_usd2 ))
  ;
  drop
    price_btc1
    price_usd1
    volume_usd1
  ;
run;quit;

proc print data=want width=min;
format _numeric_ 23.;
run;quit;



*                           _
__   _____   ___         __| | ___   ___
\ \ / / _ \ / _ \ _____ / _` |/ _ \ / _ \
 \ V / (_) | (_) |_____| (_| | (_) | (_) |
  \_/ \___/ \___/       \__,_|\___/ \___/

;


see https://www.dropbox.com/s/po3ahepe7r7dnm7/oto_voodoo.sas?dl=0

%inc "c:/oto/oto_voodoo.sas";

%utlvdoc
    (
    libname        = work         /* libname of input dataset */
    ,data          = want         /* name of input dataset */
    ,key           = primary_key  /* 0 or variable */
    ,ExtrmVal      = 10           /* display top and bottom 30 frequencies */
    ,UniPlot       = 1            /* 'true' enables ('false' disables) plot option on univariate output */
    ,UniVar        = 1            /* 'true' enables ('false' disables) plot option on univariate output */
    ,misspat       = 1            /* 0 or 1 missing patterns */
    ,chart         = 1            /* 0 or 1 line printer chart */
    ,taball        = 0
    ,tabone        = 0            /* 0 or  variable vs all other variables          */
    ,mispop        = 1            /* 0 or 1  missing vs populated*/
    ,dupcol        = 1            /* 0 or 1  columns duplicated  */
    ,unqtwo        = 0                                                                              /* 0 */
    ,vdocor        = 1            /* 0 or 1  correlation of numeric variables */
    ,oneone        = 1            /* 0 or 1  one to one - one to many - many to many */
    ,cramer        = 0            /* 0 or 1  association of character variables    */
    ,optlength     = 1
    ,maxmin        = 1
    ,unichr        = 0
    ,printto       = d:\txt\vdo\&data..txt        /* file or output if output window */
    ,Cleanup       = 0           /* 0 or 1 delete intermediate datasets */
    );


Observations:       1422     .

Missing or uniformly evaluated variables:
 ------------------------------------------
   - missing for all observations:      0
   - uniformly evaluated -- all:        2
       with one or more missing values:      0
       with no missing values:               2
 ==================================================


 #     Variable         Unique Values      Type    Length  Format
---    --------         -------------      ----    ------  ------

  1    MARKET_CAP2             1,422       num       8  ( unique could be primary key)
  2    PRICE_BTC2                  1       num       8  only one value
  3    PRICE_USD2              1,417       num       8
  4    PRIMARY_KEY             1,422       num       8  ( primary key)
  5    VOLUME_USD2             1,180       num       8


10 most frequent values of VOLUME_USD2   ( )

              1,180 distinct values in total

 Rank     Value           Frequency
 ----     -----           ---------


   1    0                      242   (has three levels 242,2 and
   2    26078800                 2
   3    178002                   1
   4    2865040                  1

Quantiles (Definition 5)

Level           Quantile

100% Max       783283000
99%            354087000
95%            171667000
90%            114977000
75% Q3          62071700
50% Median      28060550
25% Q1          13062400
10%                    0
5%                     0
1%                     0
0% Min                 0


10 most frequent values of PRICE_USD2    ( )

                 1,417 distinct values in total

    Rank     Value     Frequency
    ----     -----     ---------


      1    135.3              3
      2    119                2
      3    121.18             2
      4    130.18             2
      5    67.809             1
      6    68.0831            1


Variable          N    Miss         Minimum         Maximum            Mean          Median         Std Dev             Sum
---------------------------------------------------------------------------------------------------------------------------
MARKET_CAP2    1422       0       771368060     20588754113      6606387737      5998179929      4083586970    9.3942834E12
PRICE_USD2     1422       0      67.8090000         1270.90     459.0338500     422.3765000     258.9802662       652746.13
PRIMARY_KEY    1422       0    1.3671748E12    1.4911517E12    1.4293695E12    1.4295505E12     35927896948    2.0325635E15
VOLUME_USD2    1422       0               0       783283000     50823830.61     28060550.00     71096834.83     72271487132
---------------------------------------------------------------------------------------------------------------------------


Maximums and Minimums work.want

VARIABLE       MIN             MAX

PRIMARY_KEY    1.3671748E12    1.4911517E12
MARKET_CAP2    771368060       20588754113
PRICE_BTC2     1               1
PRICE_USD2     67.809          1270.9
VOLUME_USD2    0               783283000

Maximum Number of Bytes to hold Character and Numeric Vales Exactly

                VARIABLE_                    NEW_
Obs    __TYP      TYPE       NAME           LENGTH    ORIGINAL    SAVINGS

 1       C      Character    CHAR              4          4          0
 2       N      Numeric      MARKET_CAP2       6          8          2
 3       N      Numeric      PRICE_BTC2        3          8          5
 4       N      Numeric      PRICE_USD2        8          8          0
 5       N      Numeric      PRIMARY_KEY       7          8          1
 6       N      Numeric      VOLUME_USD2       5          8          3


Relationship OF VARIABLES WHERE MAX LEVELS IS 2000 AND MAX NUMBER OF VARIABLES IS 100
One to One  -- One to many  --  Many to One -- Many to Many

Obs                       OUT

 1     Many to One     MARKET_CAP2 to PRICE_USD2

 2     One to One      MARKET_CAP2 to PRIMARY_KEY

 3     Many to One     MARKET_CAP2 to VOLUME_USD2
 4     One to Many     PRICE_USD2 to PRIMARY_KEY
 5     Many to One     PRICE_USD2 to VOLUME_USD2
 6     Many to One     PRIMARY_KEY to VOLUME_USD2


There are no missing values


                                            Cumulative    Cumulative
    PRIMARY_KEY    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Positive               1422      100.00          1422       100.00


                                            Cumulative    Cumulative
    MARKET_CAP2    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Positive               1422      100.00          1422       100.00


                                            Cumulative    Cumulative
     PRICE_BTC2    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Positive               1422      100.00          1422       100.00


                                            Cumulative    Cumulative
     PRICE_USD2    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Positive               1422      100.00          1422       100.00


                                            Cumulative    Cumulative
    VOLUME_USD2    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Zero                    242       17.02           242        17.02
Positive               1180       82.98          1422       100.00


check results below

 filename inp "d:/txt/json.txt"  recfm=n;
 data cntcma;
   retain cntcma 0;
   infile inp end=dne;
   input byte $char1.;
   cntcma=cntcma + (byte=',');
   output;
 run;quit;

148975 commas

11372=1422*8 - 4 =11372

proc means data=cntcma;
run;quit;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 182 views
  • 0 likes
  • 4 in conversation