Hi All,
Is there an efficient way in PROC SQL to add all categories of one variable by ID to all IDs of another dataset?
For example, below I have two datasets, Data1, with 6 observations from only 2 IDs; and data2 with one record per ID (4 IDs); I want to get to dataset WANT where all categories of variable ITEM in data1 are added to all IDs in data2.
ID Item
1 A
1 B
1 C
2 A
2 B
2 C
Date 2
ID
1
2
3
4
WANT
ID Item
1 A
1 B
1 C
2 A
2 B
2 C
3 A
3 B
3 C
4 A
4 B
4 C
Any help is appreciated!
data data1; input ID Item $ count ; cards; 1 A 2 1 B 4 1 C 0 2 A . 2 B 3 2 C 1 ; data data2; input ID ; cards; 1 2 3 4 ; run; proc sql; create table want as select a.*,b.count from (select data2.id,data1.item from (select distinct item from data1 ), (select distinct id from data2 )) as a left join data1 as b on a.id=b.id and a.item=b.item; quit;
Xia Keshan
proc sql;
select *
from data1 a, data2 b
where a.id = b.id;
quit;
If an ID appears more than once in data2, you will get the cartesian product (eg 3 records for '1' in data1, 5 records for '1' in data2 = 15 records for '1' in want).
If data2 has only one record per ID, you can do it in a data step.
(assuming both datasets are sorted by ID)
data want;
merge
data1 (in=a)
data2 (in=b)
;
by id;
if a and b;
run;
What is your desired output if the dataset1 is like below and dataset2 remains same?...
ID Item
1 A
1 B
1 C
2 X
2 Y
2 Z
So that we can think for a better solution....
proc sql;
select distinct data2.id,item from data1,data2;
quit;
proc sql;
select * from (select distinct item from data1 ), data2;
quit;
Don't you need a where statement "where dat1.id = data2.id" if you are going to match to the "want" dataset described?
Thank you guys; this is really helpful.
I found stat@sas and Ksharp approach work, but is later is closer a bit to what I needed.
Let me add one piece of information to datasets, and please let me k now who would you approach it the best; so have and want data would be like below, with adding an new variable (count) to data1. thanks again.
ID Item count
1 A 2
1 B 4
1 C 0
2 A .
2 B 3
2 C 1
Date 2
ID
1
2
3
4
WANT
ID Item count
1 A 2
1 B 4
1 C 0
2 A .
2 B 3
2 C 1
3 A .
3 B .
3 C .
4 A .
4 B .
4 C .
From where do you get the list of possible values for "Item"? You need that to generate records when there is nothing in data1.
data data1; input ID Item $ count ; cards; 1 A 2 1 B 4 1 C 0 2 A . 2 B 3 2 C 1 ; data data2; input ID ; cards; 1 2 3 4 ; run; proc sql; create table want as select a.*,b.count from (select data2.id,data1.item from (select distinct item from data1 ), (select distinct id from data2 )) as a left join data1 as b on a.id=b.id and a.item=b.item; quit;
Xia Keshan
Thank you guys!
PROC SQL;
CREATE TABLE WANT AS
SELECT A.ID,A.ITEM,B.COUNT FROM
(SELECT B.ID,A.ITEM FROM (SELECT DISTINCT ITEM FROM DATA1) AS A,DATA2 AS B) AS A LEFT JOIN DATA1 AS B ON (A.ID=B.ID AND A.ITEM=B.ITEM);QUIT;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.