BookmarkSubscribeRSS Feed
skatraga
Calcite | Level 5
I have two SAS data sets A and B. Data set A has about 20000 rows and data set B has about 30000 rows. I want to process each row in data set A against all rows in data set B. I know this can be done using Merge statement but the cartesian product results in 600 million rows in the final data set. This is blowing up my memory. Is there any way of doing this in a single data step without using the merge statement. I pretty sure there is one, but couldn't remember what I did several years ago.
10 REPLIES 10
art297
Opal | Level 21
You could use something like:
[pre]
data have;
input x y;
cards;
1 2
3 4
;
data want;
set have;
do i=1 to n;
set sashelp.class nobs=n point=i;
output;
end;
run;
[/pre]
HTH,
Art
---------
> I have two SAS data sets A and B. Data set A has
> about 20000 rows and data set B has about 30000 rows.
> I want to process each row in data set A against all
> rows in data set B. I know this can be done using
> Merge statement but the cartesian product results in
> 600 million rows in the final data set. This is
> blowing up my memory. Is there any way of doing this
> in a single data step without using the merge
> statement. I pretty sure there is one, but couldn't
> remember what I did several years ago. Modified to correct code that was posted prematurely and without any testing. Oops!


Message was edited by: art297
skatraga
Calcite | Level 5
Thanks for your help. Looks like this code processes only for the first row of Data set A.
art297
Opal | Level 21
Did you run the code? It should have given you 19 records for each record in the first file (i.e., all of the records from the first file combined with all of the records from the second file.

Art
----------
> Thanks for your help. Looks like this code processes
> only for the first row of Data set A.
skatraga
Calcite | Level 5
I ran the code. It gave me 19 records for the first records in the first file.
polingjw
Quartz | Level 8
The eof marker on the sashelp.class dataset is being reached before the second observation in the have dataset can be processed, causing the data step to terminate prematurely. You could try the following code instead, but I don’t see how this would be any more efficient than the SQL Cartesian product. [pre]
data have;
input x y;
cards;
1 2
3 4
;
data want;
set have;
do i=1 to n;
set sashelp.class nobs=n point=i;
output;
end;
run;
[/pre]
skatraga
Calcite | Level 5
Thanks for your help. This works on the example data set. I am running it on my large data set. Hope it runs.
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
From the OP: " I want to process each row in data set A against all rows in data set B"

How about explaining in more detail what you are trying to accomplish, specifically listing logic and/or SAS variables used, for what purpose, input/output, and the like.

There are many SAS features, some of which have been mentioned which may be more suitable.

Also, share what SAS code you have attempted for feedback thus far -- that will also help you get constructive and useful feedback and guidance.

Scott Barry
SBBWorks, Inc.
art297
Opal | Level 21
Sorry. The code I placed was, unfortunately, not tested. Fortunately, you now have the correct code.

Art
polingjw
Quartz | Level 8
If you have SAS 9.2, the data step hash object can sometimes be used to perform “SQL-like joins.” Here is a good paper on the topic.

http://support.sas.com/rnd/base/datastep/dot/better-hashing-sas92.pdf

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 881 views
  • 0 likes
  • 4 in conversation