Hi All,
Is there an efficient way in PROC SQL to add all categories of one variable by ID to all IDs of another dataset?
For example, below I have two datasets, Data1, with 6 observations from only 2 IDs; and data2 with one record per ID (4 IDs); I want to get to dataset WANT where all categories of variable ITEM in data1 are added to all IDs in data2.
ID Item
1 A
1 B
1 C
2 A
2 B
2 C
Date 2
ID
1
2
3
4
WANT
ID Item
1 A
1 B
1 C
2 A
2 B
2 C
3 A
3 B
3 C
4 A
4 B
4 C
Any help is appreciated!
data data1; input ID Item $ count ; cards; 1 A 2 1 B 4 1 C 0 2 A . 2 B 3 2 C 1 ; data data2; input ID ; cards; 1 2 3 4 ; run; proc sql; create table want as select a.*,b.count from (select data2.id,data1.item from (select distinct item from data1 ), (select distinct id from data2 )) as a left join data1 as b on a.id=b.id and a.item=b.item; quit;
Xia Keshan
proc sql;
select *
from data1 a, data2 b
where a.id = b.id;
quit;
If an ID appears more than once in data2, you will get the cartesian product (eg 3 records for '1' in data1, 5 records for '1' in data2 = 15 records for '1' in want).
If data2 has only one record per ID, you can do it in a data step.
(assuming both datasets are sorted by ID)
data want;
merge
data1 (in=a)
data2 (in=b)
;
by id;
if a and b;
run;
What is your desired output if the dataset1 is like below and dataset2 remains same?...
ID Item
1 A
1 B
1 C
2 X
2 Y
2 Z
So that we can think for a better solution....
proc sql;
select distinct data2.id,item from data1,data2;
quit;
proc sql;
select * from (select distinct item from data1 ), data2;
quit;
Don't you need a where statement "where dat1.id = data2.id" if you are going to match to the "want" dataset described?
Thank you guys; this is really helpful.
I found stat@sas and Ksharp approach work, but is later is closer a bit to what I needed.
Let me add one piece of information to datasets, and please let me k now who would you approach it the best; so have and want data would be like below, with adding an new variable (count) to data1. thanks again.
ID Item count
1 A 2
1 B 4
1 C 0
2 A .
2 B 3
2 C 1
Date 2
ID
1
2
3
4
WANT
ID Item count
1 A 2
1 B 4
1 C 0
2 A .
2 B 3
2 C 1
3 A .
3 B .
3 C .
4 A .
4 B .
4 C .
From where do you get the list of possible values for "Item"? You need that to generate records when there is nothing in data1.
data data1; input ID Item $ count ; cards; 1 A 2 1 B 4 1 C 0 2 A . 2 B 3 2 C 1 ; data data2; input ID ; cards; 1 2 3 4 ; run; proc sql; create table want as select a.*,b.count from (select data2.id,data1.item from (select distinct item from data1 ), (select distinct id from data2 )) as a left join data1 as b on a.id=b.id and a.item=b.item; quit;
Xia Keshan
Thank you guys!
PROC SQL;
CREATE TABLE WANT AS
SELECT A.ID,A.ITEM,B.COUNT FROM
(SELECT B.ID,A.ITEM FROM (SELECT DISTINCT ITEM FROM DATA1) AS A,DATA2 AS B) AS A LEFT JOIN DATA1 AS B ON (A.ID=B.ID AND A.ITEM=B.ITEM);QUIT;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.