I am trying to get data (nearly 4,000 rows) from a page that uses infinite scroll, but PROC HTTP reads only the first 100 rows.
Could someone please help me solve this?
JAR
Can you provide the code, including the URL?
Hello Kawakami,
Kindly excuse my style, as I am not professionally trained to write SAS programs:
/* Page to download */
%let site = "https://www.investing.com/crypto/currencies";
filename source temp;

/* Fetch the raw HTML into the temporary file */
proc http
    url = &site
    out = source
    method = "GET";
run;
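
/* Keep only the lines between the opening table tag and </table> */
data Temp (drop = line);
    infile source length = recLen lrecl = 32767;
    input line $varying32767. recLen;
    /* Skip ahead to the line that opens the table */
    do until (_Top);
        _Top = find(_infile_, '<table class="genTbl openTbl');
        if not _Top then input;
    end;
    /* Write one observation per line until the table closes */
    do until (_Bottom);
        input;
        _Bottom = find(_infile_, '</table>');
        if _Bottom then stop;
        else do;
            Record = _infile_;
            output;
        end;
    end;
    drop _:;
run;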

/* On every 12th observation, emit the accumulated pipe-delimited row with HTML tags stripped */
data Temp2;
    rx1 = prxparse("s/<.*?>//");                /* matches any HTML tag */
    set Temp (firstobs = 16);
    length result FinalResult $ 2000;
    retain result;
    if mod(_n_, 12) ne 0 then result = cats(result, "|", record);
    else do;
        FinalResult = result;
        call prxchange(rx1, -1, FinalResult);   /* -1 replaces every match */
        result = "";
    end;
    if FinalResult = "" then delete;
    keep FinalResult;
    /* I will use scan() to extract each variable from FinalResult */
run;
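
The scan() step mentioned in the last comment could look roughly like the sketch below. It assumes the Temp2 data set from the step above already exists and that "|" is the only delimiter. Because the thread does not say which position holds which column, it writes one observation per field instead of guessing variable names; Fields, FieldNo and Field are hypothetical names.

```sas
/* Sketch only: split each pipe-delimited FinalResult into separate fields. */
/* Assumes Temp2 exists; Fields, FieldNo and Field are hypothetical names.  */
data Fields;
    set Temp2;
    length Field $ 200;
    /* countw() returns the number of '|'-separated words in the row */
    do FieldNo = 1 to countw(FinalResult, '|');
        Field = strip(scan(FinalResult, FieldNo, '|'));
        output;
    end;
    keep FieldNo Field;
run;
```

Once the positions are known, each one can be assigned to its own variable with a direct scan(FinalResult, n, '|') call.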
The data embedded in the HTML source seems to contain only the first 100 rows to begin with.
I think the JavaScript (the window.siteData.defaultDomainCurrData part) is probably updating the data in real time.
PROC HTTP only retrieves the server's response and does not run scripts, so I don't think it can handle JavaScript.
I found this paper, but I'm not sure if it's helpful.
https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2019/3169-2019.pdf
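
One way to check how many rows the static HTML really contains is to count the table-row tags in what PROC HTTP downloads. This is only a rough sketch: it assumes every data row in the page source is marked by a "<tr" tag, and it counts rows from every table on the page (header rows included), so the total is an upper bound on the number of data rows.

```sas
/* Sketch only: count '<tr' tags in the HTML that PROC HTTP receives.    */
/* Assumption: each table row in the page source is marked by a <tr tag. */
filename source temp;

proc http
    url = "https://www.investing.com/crypto/currencies"
    out = source
    method = "GET";
run;

data _null_;
    infile source length = recLen lrecl = 32767 end = eof;
    input line $varying32767. recLen;
    /* count() tallies every occurrence on the line; 'i' ignores case */
    rowCount + count(line, '<tr', 'i');
    if eof then put 'Table row tags in the static HTML: ' rowCount;
run;
```

If the count written to the log is close to 100, the remaining rows are not in the response at all and would have to come from whatever request the page's JavaScript makes.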