Hi everyone,
I have the following website URL:
https://www.dailymaverick.co.za/wp-admin/admin-ajax.php?action=ajaxCategoryPost&lazy_load_offset=0&ajax_count=1&cat_name=dm168&taxonomy_name=section
The above is the link to the first page of the website.
To go to the 2nd page, the lazy_load_offset value is incremented by 21 and the ajax_count value changes to 2.
So the link to the 2nd page will be
https://www.dailymaverick.co.za/wp-admin/admin-ajax.php?action=ajaxCategoryPost&lazy_load_offset=21&ajax_count=2&cat_name=dm168&taxonomy_name=section
I am trying to implement the above logic to browse n pages, but I am finding it difficult to compute the lazy_load_offset value for each page.
Any suggestions on how I can get the correct lazy_load_offset value?
The code I am using:
option mprint mlogic symbolgen;
/*macro to extract multiple pages */
options dlcreatedir;
%let htmldir = %sysfunc(getoption(WORK))/html;
libname html "&htmldir.";
libname html clear;
%let page=0;
%macro getPages(num=);
%do i = 1 %to &num.;
%if i=1 %then %do;
%let off=&page;
%end;
%if i gt 1 %then %do;
%let off=%sysevalf(&&i-1)*21);
%end;
%let url=https://www.dailymaverick.co.za/wp-admin/admin-ajax.php?action=ajaxCategoryPost%str(&)lazy_load_offset=&off%str(&)ajax_count=&i%str(&)cat_name=dm168%str(&)taxonomy_name=section;
filename dest "&htmldir./page&i..html";
proc http
url = "&url."
out = dest
method = "GET" ;
run;
%put &url &off &i;
%end;
%mend;
/*calling the macro*/
%getPages(num=2);
Log:
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
68
69 option mprint mlogic symbolgen;
70 /*macro to extract multiple pages */
71 options dlcreatedir;
72 %let htmldir = %sysfunc(getoption(WORK))/html;
73 libname html "&htmldir.";
SYMBOLGEN: Macro variable HTMLDIR resolves to /saswork/SAS_workB9DF00015C53_odaws01-apse1.oda.sas.com/SAS_work869700015C53_odaws01-apse1.oda.sas.com/html
NOTE: Library HTML was created.
NOTE: Libref HTML was successfully assigned as follows:
      Engine: V9
      Physical Name: /saswork/SAS_workB9DF00015C53_odaws01-apse1.oda.sas.com/SAS_work869700015C53_odaws01-apse1.oda.sas.com/html
74 libname html clear;
NOTE: Libref HTML has been deassigned.
75 %let page=0;
76 %macro getPages(num=);
77 %do i = 1 %to &num.;
78 %if i=1 %then %do;
79 %let off=&page;
80 %end;
81 %if i gt 1 %then %do;
82 %let off=%sysevalf(&&i-1)*21);
83 %end;
84 %let
84 ! url=https://www.dailymaverick.co.za/wp-admin/admin-ajax.php?action=ajaxCategoryPost%str(&)lazy_load_offset=&off%str(&)aja
84 ! x_count=&i%str(&)cat_name=dm168%str(&)taxonomy_name=section;
85 filename dest "&htmldir./page&i..html";
86 proc http
87 url = "&url."
88 out = dest
89 method = "GET" ;
90 run;
91 %put &url &off &i;
92 %end;
93 %mend;
94
95
96
97 /*calling the macro*/
98 %getPages(num=1);
MLOGIC(GETPAGES): Beginning execution.
MLOGIC(GETPAGES): Parameter NUM has value 1
SYMBOLGEN: Macro variable NUM resolves to 1
MLOGIC(GETPAGES): %DO loop beginning; index variable I; start value is 1; stop value is 1; by value is 1.
MLOGIC(GETPAGES): %IF condition i=1 is FALSE
MLOGIC(GETPAGES): %IF condition i gt 1 is TRUE
MLOGIC(GETPAGES): %LET (variable name is OFF)
SYMBOLGEN: && resolves to &.
SYMBOLGEN: Macro variable I resolves to 1
MLOGIC(GETPAGES): %LET (variable name is URL)
SYMBOLGEN: Macro variable OFF resolves to 0*21)
SYMBOLGEN: Macro variable I resolves to 1
SYMBOLGEN: Macro variable HTMLDIR resolves to /saswork/SAS_workB9DF00015C53_odaws01-apse1.oda.sas.com/SAS_work869700015C53_odaws01-apse1.oda.sas.com/html
SYMBOLGEN: Macro variable I resolves to 1
MPRINT(GETPAGES): filename dest "/saswork/SAS_workB9DF00015C53_odaws01-apse1.oda.sas.com/SAS_work869700015C53_odaws01-apse1.oda.sas.com/html/page1.html";
SYMBOLGEN: Macro variable URL resolves to https://www.dailymaverick.co.za/wp-admin/admin-ajax.php?action=ajaxCategoryPost&lazy_load_offset=0*21)&ajax_count=1&cat_name=dm168&taxonomy_name=section
SYMBOLGEN: Some characters in the above value which were subject to macro quoting have been unquoted for printing.
MPRINT(GETPAGES): proc http url = "https://www.dailymaverick.co.za/wp-admin/admin-ajax.php?action=ajaxCategoryPost&lazy_load_offset=0*21)&ajax_count=1&cat_name=dm168&taxonomy_name=section" out = dest method = "GET" ;
MPRINT(GETPAGES): run;
NOTE: PROCEDURE HTTP used (Total process time):
      real time 1.76 seconds
      user cpu time 0.03 seconds
      system cpu time 0.00 seconds
      memory 2158.90k
      OS Memory 25504.00k
      Timestamp 10/28/2021 03:29:38 PM
      Step Count 24 Switch Count 3
      Page Faults 0
      Page Reclaims 1188
      Page Swaps 0
      Voluntary Context Switches 35
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 104
MLOGIC(GETPAGES): %PUT &url &off &i
SYMBOLGEN: Macro variable URL resolves to https://www.dailymaverick.co.za/wp-admin/admin-ajax.php?action=ajaxCategoryPost&lazy_load_offset=0*21)&ajax_count=1&cat_name=dm168&taxonomy_name=section
SYMBOLGEN: Some characters in the above value which were subject to macro quoting have been unquoted for printing.
SYMBOLGEN: Macro variable OFF resolves to 0*21)
SYMBOLGEN: Macro variable I resolves to 1
https://www.dailymaverick.co.za/wp-admin/admin-ajax.php?action=ajaxCategoryPost&lazy_load_offset=0*21)&ajax_count=1&cat_name=dm168&taxonomy_name=section 0*21) 1
MLOGIC(GETPAGES): %DO loop index variable I is now 2; loop will not iterate again.
MLOGIC(GETPAGES): Ending execution.
99
100 /* DATA work.links; */
101 /* format date_searched ddmmyy10.; */
102 /* length headline $ 500 ; */
103 /* LENGTH string $ 550; */
104 /* INFILE "&htmldir./*.html" LENGTH = recLen LRECL = 32767; */
105 /* INPUT line $VARYING32767. recLen; */
106 /* p=prxparse('/<h1>/'); */
107 /* pos=prxmatch(p,line); */
108 /* if pos gt 0 then do; */
109 /* headline=substr(line,p); */
110 /* end; */
111 /* if pos not gt 0 then delete; */
112 /* OUTPUT; */
113 /* RUN; */
114 /* */
115 /* data work.decodelinks(drop=rx1); */
116 /* set work.links; */
117 /* retain rx1 rx2; */
118 /* rx1=prxparse("s/<.*?>//"); */
119 /* call prxchange(rx1,-1,headline); */
120 /* rx2=prxparse("s/&#\d+//"); */
121 /* call prxchange(rx2,-1,headline); */
122 /* run; */
123
124 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
SYMBOLGEN: Macro variable GRAPHTERM resolves to GOPTIONS NOACCESSIBLE;
134
Since you have already done the work of turning on the macro debugging options, show us the entire LOG for this code. Copy the log as text and paste it into the window that appears when you click on the </> icon here in SAS Communities. This will format the log properly and make it more readable.
Hi,
I have added the log.
Thanks and Regards,
Mohammad Umair Kazi
So you want a macro variable to use to set the value after &lazy_load_offset= in the URL?
Then start with a value of 0 and then increment by 21 each time through the loop.
%local off i ;
%let off=0;
%do i=1 %to #
....
%let off=%eval(&off + 21);
%end;
Why did you set OFF to this value?
SYMBOLGEN: Macro variable OFF resolves to 0*21)
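Your equation is wrong: the parenthesis after &&i-1 closes %SYSEVALF(), so the remaining *21) is appended to the result as plain text. Also, since you are doing integer arithmetic, you can use %EVAL().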
%let off=%eval( (&i-1) * 21 );
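To check the arithmetic without making any HTTP requests, here is a minimal sketch that only prints the offset for each page. The macro name showOffsets is made up for illustration and is not part of the original code. Note also that the log above reports "%IF condition i=1 is FALSE": without the ampersand, the test compares the literal letter i to 1. With the formula above no %IF is needed at all.

```sas
/* Hypothetical helper: prints the offset computed for each page. */
%macro showOffsets(num=);
  %local i off;
  %do i = 1 %to &num.;
    /* Page 1 starts at 0; each later page adds 21. */
    %let off = %eval((&i - 1) * 21);
    %put NOTE: page=&i offset=&off;
  %end;
%mend showOffsets;

%showOffsets(num=3)
```

For num=3 the log should show offsets 0, 21 and 42 for pages 1, 2 and 3.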
What is the purpose of the PAGE macro variable? Isn't that what the macro variable I already tracks?
%macro getPages(num=);
%local i off ;
%do i = 1 %to &num.;
%let off=%eval((&i-1)*21);
%let url=https://www.dailymaverick.co.za/wp-admin/admin-ajax.php?action=ajaxCategoryPost%str(&)lazy_load_offset=&off%str(&)ajax_count=&i%str(&)cat_name=dm168%str(&)taxonomy_name=section;
filename dest "&htmldir./page&i..html";
proc http
url = "&url."
out = dest
method = "GET"
;
run;
%put &=i &=off &=url ;
%end;
%mend;
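A usage sketch, assuming the macro above has been compiled in the same session: getPages reads the global macro variable HTMLDIR, so the directory setup from the original question has to run before the call.

```sas
/* Create a subdirectory of WORK to hold the downloaded pages */
/* (same setup as in the original question).                  */
options dlcreatedir;
%let htmldir = %sysfunc(getoption(WORK))/html;
libname html "&htmldir.";
libname html clear;

/* Fetch the first two pages: offsets 0 and 21. */
%getPages(num=2)
```

The %put statement inside the macro should then report off=0 for i=1 and off=21 for i=2, and the responses are written to page1.html and page2.html under &htmldir.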