Hello,
In my program I run the following code:
data days_1;
do day=1 to 10000;
output;
end;
run;
proc sql;
create table test as
select rent.date,day
from sashelp.rent, days_1
order by date,day;
quit;
In practice, my table has 2200 distinct dates rather than the 10 in the 'rent' table, so the run time is very long. I am looking for a way to shorten the run time, either without a Cartesian product or with some other solution.
Thanks.
You can do the cartesian crossing via a hash object in a data step:
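data want;
  set sashelp.rent days_1 (obs=0);
  if _n_=1 then do;
    /* load DAYS_1 once, keyed and ordered by DAY (its only variable) */
    declare hash h (dataset:'days_1',ordered:'a');
    h.definekey('day');
    h.definedata(all:'Y');
    h.definedone();
    declare hiter hi ('h');
  end;
  /* for each obs from sashelp.rent, output one row per item in h */
  do while (hi.next()=0);
    output;
  end;
run;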
This loads DAYS_1 into the hash object (think lookup table) named h, and stores it in ascending order by DAY (even if the original dataset was not sorted). This takes place during the first iteration of the data step ("if _n_=1"). The hash object is "retained" for all subsequent iterations.
Then for each obs in sashelp.rent, the program uses the hash iterator hi to step through (i.e. retrieve) every dataitem (think "row") in h, and outputs it.
The reason I used "obs=0" in the SET statement for the days_1 dataset is to force SAS to make provision for all the variables in days_1 in the program data vector (the hash object declare statement won't suffice), without actually reading in the data (which IS done in the declare statement).
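For comparison, here is a sketch of the same step using the "if 0 then set" idiom in place of "obs=0". It is only an illustration, not a replacement for the code above: the variable rc is a name I introduced, and I use an explicit FIRST/NEXT traversal so the restart of the iterator for each rent row is visible. I have not timed it against the original.

```sas
data want;
  /* Never executes at run time, but the compiler still adds DAY to the PDV. */
  if 0 then set days_1;
  set sashelp.rent;
  if _n_ = 1 then do;
    declare hash h (dataset:'days_1', ordered:'a');
    h.definekey('day');
    h.definedata(all:'Y');
    h.definedone();
    declare hiter hi ('h');
  end;
  /* rc is a hypothetical helper variable holding the iterator return code. */
  rc = hi.first();
  do while (rc = 0);
    output;
    rc = hi.next();
  end;
  drop rc;
run;
```

One visible difference: because DAY enters the program data vector first here, it becomes the first column of WANT.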
Not sure what your output should be but maybe something like this?
data days_1;
set sashelp.rent;
do day=1 to 10000;
output;
end;
run;
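If the result has to come out in date, day order as in the original ORDER BY, one option is to sort the small table before expanding it, so the large result never needs sorting. This is a sketch under two assumptions: only date and day are needed (as in the original SELECT), and rent_sorted is a hypothetical work table name.

```sas
/* Sort the small input by date (rent_sorted is a hypothetical name). */
proc sort data=sashelp.rent (keep=date) out=rent_sorted;
  by date;
run;

/* Expand each date into 10000 rows; output is already in date, day order. */
data test;
  set rent_sorted;
  do day = 1 to 10000;
    output;
  end;
run;
```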
> without Cartesian multiplication
Well, you are asking for a Cartesian product...
To speed up I/O in this very simple program, look at increasing BUFSIZE and BUFNO, and at using SGIO.
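As a rough sketch of where these settings go, BUFSIZE= and BUFNO= can be given as data set options on the output table. The values below are placeholders I picked for illustration, not recommendations; the right numbers depend on your storage and would need testing. As far as I know, SGIO is a Windows system option that is set when SAS starts, so it is not shown as code here.

```sas
/* bufsize= and bufno= values are illustrative guesses; tune for your system. */
data test (bufsize=128k bufno=64);
  set sashelp.rent (keep=date);
  do day = 1 to 10000;
    output;
  end;
run;
```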