Combining data

Occasional Contributor
Posts: 17

Hi,

I need help. I have two input datasets containing the following data.

Input dataset 1

area  rating

1        0
1        1
1        2
1        3
1        4
1        5
2        0
2        1
2        2
2        3
2        4
2        5

Input dataset 2

area   user-id

1      G1
1      G2
2      G3 
2      G4 
2      G5

The output dataset needs to contain the data in the following form:

OUTPUT DATASET

area user-id rating

1 G1 0
1 G1 1
1 G1 2
1 G1 3
1 G1 4
1 G1 5
1 G2 0
1 G2 1
1 G2 2
1 G2 3
1 G2 4
1 G2 5
2 G3 0
2 G3 1
2 G3 2
2 G3 3
2 G3 4
2 G3 5

and so on.

Thanks in advance.

Regards,

Nitin

Respected Advisor
Posts: 3,892

Re: Combining data

Looks like homework to me. What have you already done? Post your code as well, even if it's not working yet.

Hint: Proc SQL, join using column "area"

In this exercise you're asked to combine data from two tables that have a many:many relationship. This is a case where a SAS DATA step merge and a SQL join return different results. It is very important for you to understand the difference and why it happens. And the best way to learn SAS coding is "to code".

Occasional Contributor
Posts: 17

Re: Combining data

Thanks Patrick.

Your hint helped me resolve the issue. I am new to SAS, so small things are still quite big for me. I had not thought of using SQL at all and was approaching the problem from a SET/MERGE perspective. There may be a solution that way too, but I was struggling to find it.

PROC SQL;
  CREATE TABLE ABCD AS
  SELECT *
  FROM ABC, XYZ
  WHERE ABC.A = XYZ.A;
QUIT;

PROC SORT DATA=ABCD NODUPKEY;
  BY LR B;
RUN;

Respected Advisor
Posts: 3,892

Re: Combining data

Whenever you want to create all combinations of rows from two tables (or all combinations with matching keys) and the relationship between the tables is many:many, use Proc SQL.
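To see why the two approaches differ, here is a minimal sketch on a cut-down version of the sample data. The table names ratings, users, merged and joined are made up for illustration. A DATA step merge pairs rows by position within each BY group, while the SQL join returns every matching pair.

```sas
/* Hypothetical cut-down tables: 3 ratings per area, 2 and 3 users */
data ratings;
  input area rating;
  datalines;
1 0
1 1
1 2
2 0
2 1
2 2
;

data users;
  input area user_id $;
  datalines;
1 G1
1 G2
2 G3
2 G4
2 G5
;

/* DATA step merge: within each area, row 1 is paired with row 1,  */
/* row 2 with row 2, and so on; the shorter side keeps its last    */
/* values. Both inputs must already be sorted by area.             */
data merged;
  merge ratings users;
  by area;
run;

/* SQL join: every rating row is combined with every user row */
/* that has the same area.                                     */
proc sql;
  create table joined as
    select u.area, u.user_id, r.rating
      from ratings r, users u
        where r.area = u.area
          order by u.area, u.user_id, r.rating
  ;
quit;
```

With this data, merged should end up with 6 rows (the larger row count per area) and the log should carry a note that the MERGE statement has more than one data set with repeats of BY values. joined should have 15 rows (3 x 2 for area 1 plus 3 x 3 for area 2), which is the shape you are after.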

Not sure why you need a Proc Sort Nodupkey in the code you've posted - but it appears to be dealing with slightly different data than what you've posted as sample.

data have1;
  infile datalines dlm=' ' truncover;
  input area rating;
  datalines;
1 0
1 1
1 2
1 3
1 4
1 5
2 0
2 1
2 2
2 3
2 4
2 5
;
run;

data have2;
  infile datalines dlm=' ' truncover;
  input area user_id $;
  datalines;
1 G1
1 G2
2 G3
2 G4
2 G5
;
run;

proc sql;
  create table want as
    select r.area, r.user_id, l.rating
      from have1 l, have2 r
        where l.area=r.area
  ;
quit;
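
Since you were looking at this from a SET/MERGE perspective: a DATA step can produce the same combinations, though not with a plain MERGE. The following is only a sketch of one possible approach, not a recommendation over Proc SQL. It reads each row of have2 and then loops over all rows of have1 using the POINT= and NOBS= options of the SET statement. The names want_ds and area1 are made up for illustration.

```sas
/* Same sample data as in the question; the trailing @@ lets one */
/* input line hold several observations.                          */
data have1;
  input area rating @@;
  datalines;
1 0 1 1 1 2 1 3 1 4 1 5
2 0 2 1 2 2 2 3 2 4 2 5
;

data have2;
  input area user_id $ @@;
  datalines;
1 G1 1 G2 2 G3 2 G4 2 G5
;

data want_ds;
  set have2;                  /* outer loop: one user row per iteration */
  do i = 1 to n;              /* n is filled in by NOBS= at compile time */
    /* direct access to row i of have1; area is renamed so it */
    /* does not overwrite the area value read from have2      */
    set have1(rename=(area=area1)) point=i nobs=n;
    if area1 = area then output;   /* keep only matching areas */
  end;
  drop area1;                 /* i and n are temporary and not written */
run;
```

For the sample data this should give 30 rows in the order area, user_id, rating. Every row of have1 is read once per row of have2, so this gets slow on large tables, which is one more reason to prefer the SQL join.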
