BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Rickaroo
Calcite | Level 5


I'm merging two data files and creating a cartesian join or merge.   My end goal is to determine for each SELLER and BUYER combination, what products are matching

The Data looks as follows:

Dataset SELLERS

SELLERID          P1     P2     P3........    P30

123456                           1     1               1

234567               1                                  1

Dataset BUYERS

BUYERID          P1     P2     P3........    P30

ab23456                       1     1           

XS98765                      1  

If a SELLER sells a product a '1' is indicated and similar for BUYERS.    So in this case, SELLER 123456 sells P2,P3, and P30.    BUYER ab23456 is interested in P2 and P3.

I could see creating a cartesion product using:

PROC SQL;

   CREATE TABLE ALL as

     SELECT *

        FROM SELLERS  BUYERS;

QUIT;

This would produce duplicate names for the products.   I'm thinking some array processing could be useful.     

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

I would be very tempted to change the data to a long format

SellerId ProductId

123456     P2

123456     P3

123456     P30

234567     P1

etc for both the Seller an buyer data.

then

Proc sql;

     create table want as

     select distinct a.productId, a.sellerId, b.buyerid

     from SellerDatalong as a join BuyerDataLong as b

     on a.productId = b.productid;

quit;

View solution in original post

4 REPLIES 4
ballardw
Super User

I would be very tempted to change the data to a long format

SellerId ProductId

123456     P2

123456     P3

123456     P30

234567     P1

etc for both the Seller an buyer data.

then

Proc sql;

     create table want as

     select distinct a.productId, a.sellerId, b.buyerid

     from SellerDatalong as a join BuyerDataLong as b

     on a.productId = b.productid;

quit;

Rickaroo
Calcite | Level 5

thanks.  I'm wondering using this approach what the output would looklike (Note: I do not currently have the data to mess with) and am awaiting to have SAS installed on my machine

ballardw
Super User

The result would be a table like

ProductId SellerId      Buyer

P1              123456     ABCDEF

P1              123456     ADFAS

P1              123456     XYz23

you could add Order by ProductID, SellerId, BuyerId to force a specific order.

Each product would have every seller matched with every buyer of that product.

The distinct clause prevents getting multiples of any specific combination.

Haikuo
Onyx | Level 15

This is one way to realize 's idea:

data seller;

     input sellerid $ p1-p4;

     cards;

423456 . 1 1 1

234567 1 . . 1

;

data buyer;

     input buyerid $ p1-p4;

     cards;

ab3456 . 1 1 .

xs4567 . 1 . 1

;

proc sort data=seller;

     by sellerid;

run;

proc transpose data=seller out=s1 (where=(not missing(col1)));

     var p1-p4;

     by sellerid;

run;

proc sort data=buyer;

     by buyerid;

run;

proc transpose data=buyer out=b1 (where=(not missing(col1)));

     var p1-p4;

     by buyerid;

run;

proc sql;

     create table want (drop=col1) as

           select * from s1, b1

                where s1._name_=b1._name_;

quit;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1805 views
  • 4 likes
  • 3 in conversation