DATA Step, Macro, Functions and more

Importing data from internet

Reply
Super Contributor
Posts: 259

Importing data from internet

Hi

Is it possible to import data into sas from
The following location:

https://graphs.coinmarketcap.com/v1/datapoints/bitcoin/1367174841000/1491151745000/

The data is structured in the form [,],[,] ... where the first number in the square bracket is a Unix timestamp and the second number is the historic market capitalisation for a stock. I want to import this and then restructure it so that I have a field with a sas date, another field with a sas time and another field with the market cap. Another complexity is that the the data contains 3 pieces of info and a timestamp against each one: market cap, then price, then volume.... and I want my dataset to have these as fields. Is this possible! Thanks
Super User
Posts: 19,877

Re: Importing data from internet

I believe this is in the JSON format? Or XML? 

 

If you're on the latest version of SAS you can try the methods here:

http://blogs.sas.com/content/sasdummy/2016/12/02/json-libname-engine-sas/

 

Either way, it's a text file so you should like at FILENAME URL to connect to a text file online and then to parse it with either data step hard code or XML or JSON library. 

Valued Guide
Posts: 505

Re: Importing data from internet

SAS/WPS/R: Convert JSON file to SAS dataset

If you have the IML interface to R you can jusrt cut and paste the code below.

see (this isa link to this message)
https://goo.gl/QrPkcT
https://communities.sas.com/t5/Base-SAS-Programming/Importing-data-from-internet/m-p/346519

Key results from a voo doo analysis on the end

I did a quick check on the end (counted commas)

HAVE JSON file with four sections
===================================================

location
"https://graphs.coinmarketcap.com/v1/datapoints/bitcoin/1367174841000/1491151745000/"

MARKET_CAP_BY_AVAILABLE_SUPPLY

{"market_cap_by_available_supply": [[1367174841000, 1500517590], [1367261101000, 1575032004],
[1367347502000, 1501657492], [1367433902000, 1298951550], [1367522401000, 1148667722],
[1367608801000, 1011066494], [1367695201000, 1236351843], [1367781901000, 1298377788],
[1367868303000, 1315992303], [1367954701000, 1183766500], [1368041101000, 1253655040],
...

PRICE_BTC

{"price_btc": [[1367174841000, 1.0], [1367261101000, 1.0], [1367347502000, 1.0], [
0], [1368390301000, 1.0], [1368476701000, 1.0], [1368563101000, 1.0], [1368650101
62101000, 1.0], [1369948501000, 1.0], [1370036402000, 1.0], [1370122802000, 1.0],
1.0], [1371431702000, 1.0], [1371518404000, 1.0], [1371605102000, 1.0], [13716915
2906202000, 1.0], [1372992361000, 1.0], [1373078762000, 1.0], [1373165161000, 1.0

PRICE_USD
...

VOLUME_USD
...


WANT (SAS dataset)
==================

Up to 40 obs from want total obs=1,422


                    MARKET_    PRICE_  PRICE_  VOLUME_
Obs  PRIMARY_KEY      CAP2      BTC2    USD2    USD2

   1 1367174841000  1500517590    1      135       0
   2 1367261101000  1575032004    1      142       0
   3 1367347502000  1501657492    1      135       0
   4 1367433902000  1298951550    1      117       0
   5 1367522401000  1148667722    1      103       0
   6 1367608801000  1011066494    1       91       0
   7 1367695201000  1236351843    1      111       0
   8 1367781901000  1298377788    1      117       0
   9 1367868303000  1315992303    1      118       0
...

1419 1490968747000  16914284005    1     1041   392275000
1420 1491055148000  17609934881    1     1084   357781000
1421 1491141549000  17631751754    1     1085   311580000
1422 1491151745000  17831029868    1     1097   348902000

WORKING CODE (prety much it)
============

   datawps <- as.data.frame(unlist(fromJSON(
      "https://graphs.coinmarketcap.com/v1/datapoints/bitcoin/1367174841000/1491151745000/")));

FULL SOLUTION
=============

* to save time I downloaded, converted to a list and saved the result;
%utl_submit_r64('
source("C:/Program Files/R/R-3.3.2/etc/Rprofile.site", echo=T);
library(RJSONIO);
datawps <-fromJSON("https://graphs.coinmarketcap.com/v1/datapoints/bitcoin/1367174841000/1491151745000/");
save(datawps,file="d:/rda/hsonr64.rda");
');

* all this to reformat for sas/wps;
* I am not skilled in converting list structures to SAS datasets;
proc datasets lib=work kill;
run;quit;
data class;set sashelp.class;
run;quit;
%utl_submit_wps64('
options set=R_HOME "C:/Program Files/R/R-3.3.2";
libname wrk "%sysfunc(pathname(work))";
proc r;
submit;
source("C:/Program Files/R/R-3.3.2/etc/Rprofile.site", echo=T);
library(RJSONIO);
load("d:/rda/hsonr64.rda");
str(datawps);
market_cap <-t(as.data.frame(datawps[[1]]));
price_btc  <-t(as.data.frame(datawps[[2]]));
price_usd  <-t(as.data.frame(datawps[[3]]));
volume_usd <-t(as.data.frame(datawps[[4]]));
endsubmit;
import r=market_cap  data=wrk.market_cap ;
import r=price_btc   data=wrk.price_btc  ;
import r=price_usd   data=wrk.price_usd  ;
import r=volume_usd  data=wrk.volume_usd ;
run;quit;
');

data want;
  retain char "NEED";
  merge
     market_cap (rename=(v1= primary_key v2=market_cap2 ))
     price_btc  (rename=(v1= price_btc1  v2=price_btc2  ))
     price_usd  (rename=(v1= price_usd1  v2=price_usd2  ))
     volume_usd (rename=(v1= volume_usd1 v2=volume_usd2 ))
  ;
  drop
    price_btc1
    price_usd1
    volume_usd1
  ;
run;quit;

proc print data=want width=min;
format _numeric_ 23.;
run;quit;



*                           _
__   _____   ___         __| | ___   ___
\ \ / / _ \ / _ \ _____ / _` |/ _ \ / _ \
 \ V / (_) | (_) |_____| (_| | (_) | (_) |
  \_/ \___/ \___/       \__,_|\___/ \___/

;


see https://www.dropbox.com/s/po3ahepe7r7dnm7/oto_voodoo.sas?dl=0

%inc "c:/oto/oto_voodoo.sas";

%utlvdoc
    (
    libname        = work         /* libname of input dataset */
    ,data          = want         /* name of input dataset */
    ,key           = primary_key  /* 0 or variable */
    ,ExtrmVal      = 10           /* display top and bottom 30 frequencies */
    ,UniPlot       = 1            /* 'true' enables ('false' disables) plot option on univariate output */
    ,UniVar        = 1            /* 'true' enables ('false' disables) plot option on univariate output */
    ,misspat       = 1            /* 0 or 1 missing patterns */
    ,chart         = 1            /* 0 or 1 line printer chart */
    ,taball        = 0
    ,tabone        = 0            /* 0 or  variable vs all other variables          */
    ,mispop        = 1            /* 0 or 1  missing vs populated*/
    ,dupcol        = 1            /* 0 or 1  columns duplicated  */
    ,unqtwo        = 0                                                                              /* 0 */
    ,vdocor        = 1            /* 0 or 1  correlation of numeric variables */
    ,oneone        = 1            /* 0 or 1  one to one - one to many - many to many */
    ,cramer        = 0            /* 0 or 1  association of character variables    */
    ,optlength     = 1
    ,maxmin        = 1
    ,unichr        = 0
    ,printto       = d:\txt\vdo\&data..txt        /* file or output if output window */
    ,Cleanup       = 0           /* 0 or 1 delete intermediate datasets */
    );


Observations:       1422     .

Missing or uniformly evaluated variables:
 ------------------------------------------
   - missing for all observations:      0
   - uniformly evaluated -- all:        2
       with one or more missing values:      0
       with no missing values:               2
 ==================================================


 #     Variable         Unique Values      Type    Length  Format
---    --------         -------------      ----    ------  ------

  1    MARKET_CAP2             1,422       num       8  ( unique could be primary key)
  2    PRICE_BTC2                  1       num       8  only one value
  3    PRICE_USD2              1,417       num       8
  4    PRIMARY_KEY             1,422       num       8  ( primary key)
  5    VOLUME_USD2             1,180       num       8


10 most frequent values of VOLUME_USD2   ( )

              1,180 distinct values in total

 Rank     Value           Frequency
 ----     -----           ---------


   1    0                      242   (has three levels 242,2 and
   2    26078800                 2
   3    178002                   1
   4    2865040                  1

Quantiles (Definition 5)

Level           Quantile

100% Max       783283000
99%            354087000
95%            171667000
90%            114977000
75% Q3          62071700
50% Median      28060550
25% Q1          13062400
10%                    0
5%                     0
1%                     0
0% Min                 0


10 most frequent values of PRICE_USD2    ( )

                 1,417 distinct values in total

    Rank     Value     Frequency
    ----     -----     ---------


      1    135.3              3
      2    119                2
      3    121.18             2
      4    130.18             2
      5    67.809             1
      6    68.0831            1


Variable          N    Miss         Minimum         Maximum            Mean          Median         Std Dev             Sum
---------------------------------------------------------------------------------------------------------------------------
MARKET_CAP2    1422       0       771368060     20588754113      6606387737      5998179929      4083586970    9.3942834E12
PRICE_USD2     1422       0      67.8090000         1270.90     459.0338500     422.3765000     258.9802662       652746.13
PRIMARY_KEY    1422       0    1.3671748E12    1.4911517E12    1.4293695E12    1.4295505E12     35927896948    2.0325635E15
VOLUME_USD2    1422       0               0       783283000     50823830.61     28060550.00     71096834.83     72271487132
---------------------------------------------------------------------------------------------------------------------------


Maximums and Minimums work.want

VARIABLE       MIN             MAX

PRIMARY_KEY    1.3671748E12    1.4911517E12
MARKET_CAP2    771368060       20588754113
PRICE_BTC2     1               1
PRICE_USD2     67.809          1270.9
VOLUME_USD2    0               783283000

Maximum Number of Bytes to hold Character and Numeric Vales Exactly

                VARIABLE_                    NEW_
Obs    __TYP      TYPE       NAME           LENGTH    ORIGINAL    SAVINGS

 1       C      Character    CHAR              4          4          0
 2       N      Numeric      MARKET_CAP2       6          8          2
 3       N      Numeric      PRICE_BTC2        3          8          5
 4       N      Numeric      PRICE_USD2        8          8          0
 5       N      Numeric      PRIMARY_KEY       7          8          1
 6       N      Numeric      VOLUME_USD2       5          8          3


Relationship OF VARIABLES WHERE MAX LEVELS IS 2000 AND MAX NUMBER OF VARIABLES IS 100
One to One  -- One to many  --  Many to One -- Many to Many

Obs                       OUT

 1     Many to One     MARKET_CAP2 to PRICE_USD2

 2     One to One      MARKET_CAP2 to PRIMARY_KEY

 3     Many to One     MARKET_CAP2 to VOLUME_USD2
 4     One to Many     PRICE_USD2 to PRIMARY_KEY
 5     Many to One     PRICE_USD2 to VOLUME_USD2
 6     Many to One     PRIMARY_KEY to VOLUME_USD2


There are no missing values


                                            Cumulative    Cumulative
    PRIMARY_KEY    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Positive               1422      100.00          1422       100.00


                                            Cumulative    Cumulative
    MARKET_CAP2    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Positive               1422      100.00          1422       100.00


                                            Cumulative    Cumulative
     PRICE_BTC2    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Positive               1422      100.00          1422       100.00


                                            Cumulative    Cumulative
     PRICE_USD2    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Positive               1422      100.00          1422       100.00


                                            Cumulative    Cumulative
    VOLUME_USD2    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
Zero                    242       17.02           242        17.02
Positive               1180       82.98          1422       100.00


check results below

 filename inp "d:/txt/json.txt"  recfm=n;
 data cntcma;
   retain cntcma 0;
   infile inp end=dne;
   input byte $char1.;
   cntcma=cntcma + (byte=',');
   output;
 run;quit;

148975 commas

11372=1422*8 - 4 =11372

proc means data=cntcma;
run;quit;



Super Contributor
Posts: 259

Re: Importing data from internet

Hi Reeza

 

My sas version is 9.3 Base. I tried using the following code but I got an error:

 


filename bit url "https://graphs.coinmarketcap.com/v1/datapoints/bitcoin.txt";


data bitcoin;
infile bit truncover;
input @1 bigline $char256.;
run;

 

The data is on a webpage (link https://graphs.coinmarketcap.com/v1/datapoints/bitcoin/) so perhaps I have the wrong filename code. 

 

The error I'm getting is 

 

ERROR: Invalid reply received from the HTTP server. Use the debug
option for more info.

 

 

 

Ask a Question
Discussion stats
  • 3 replies
  • 230 views
  • 0 likes
  • 3 in conversation