Hello,
I am trying to merge two data sets and so far unsuccessfully... I would appreciate any help!
Have: Data set #1 (students are nested within classes, which are in turn nested within teachers)
StudentID ClassID TeacherID
1 1 1
2 1 1
3 2 1
4 2 1
5 3 2
6 4 2
7 4 2
8 4 2
Data set #2: For this data set, 1 class per teacher was randomly selected from Data set #1, and the procedure was repeated 3 times.
ClassID TeacherID Replicate
1 1 1
1 1 2
2 1 3
3 2 1
4 2 2
4 2 3
Want: To add student data to Data set #2. So I need to repeat each row as many times as there are students in class, and add students' IDs.
StudentID ClassID TeacherID Replicate
1 1 1 1
2 1 1 1
1 1 1 2
2 1 1 2
3 2 1 3
4 2 1 3
5 3 2 1
6 4 2 2
7 4 2 2
8 4 2 2
6 4 2 3
7 4 2 3
8 4 2 3
I think this works
data new;
input Stu_ID ClassID TeacherID;
datalines;
1 1 1
2 1 1
3 2 1
4 2 1
5 3 2
6 4 2
7 4 2
8 4 2
;
data resampled;
input ClassID TeacherID Replicate;
datalines;
1 1 1
1 1 2
2 1 3
3 2 1
4 2 2
4 2 3
;
proc sql;
create table want as
select r.*, n.Stu_id
from resampled as r
full join new as n
on r.classId=n.classId
and r.teacherID=n.teacherID
order by 4, 1, 2, 3
;
quit;
Nope...
Here is what I tried:
data new;
input Stu_ID ClassID TeacherID;
datalines;
1 1 1
2 1 1
3 2 1
4 2 1
5 3 2
6 4 2
7 4 2
8 4 2
;
data resampled;
input ClassID TeacherID Replicate;
datalines;
1 1 1
1 1 2
2 1 3
3 2 1
4 2 2
4 2 3
;
proc sort data = new;
by ClassID TeacherID;
run;
proc sort data = resampled;
by ClassID TeacherID;
run;
data merged;
merge new resampled;
by ClassID TeacherID;
run;
proc print data = merged;
run;
Here is what I got:
Obs | Stu_ID | ClassID | TeacherID | Replicate |
1 | 1 | 1 | 1 | 1 |
2 | 2 | 1 | 1 | 2 |
3 | 3 | 2 | 1 | 3 |
4 | 4 | 2 | 1 | 3 |
5 | 5 | 3 | 2 | 1 |
6 | 6 | 4 | 2 | 2 |
7 | 7 | 4 | 2 | 3 |
8 | 8 | 4 | 2 | 3 |
Is that output exactly what you want? Or an example of it?
proc sql;
create table want as
select r.*, n.Stu_id
from resampled as r
cross join new as n
;
quit;
Your code doesn't produce what I need... The output in the starting post is exactly want I need, yes.
Your code produced this output:
Obs | ClassID | TeacherID | Replicate | Stu_ID |
1 | 1 | 1 | 1 | 1 |
2 | 1 | 1 | 2 | 1 |
3 | 2 | 1 | 3 | 1 |
4 | 3 | 2 | 1 | 1 |
5 | 4 | 2 | 2 | 1 |
6 | 4 | 2 | 3 | 1 |
7 | 1 | 1 | 1 | 2 |
8 | 1 | 1 | 2 | 2 |
9 | 2 | 1 | 3 | 2 |
10 | 3 | 2 | 1 | 2 |
11 | 4 | 2 | 2 | 2 |
12 | 4 | 2 | 3 | 2 |
13 | 1 | 1 | 1 | 3 |
14 | 1 | 1 | 2 | 3 |
15 | 2 | 1 | 3 | 3 |
16 | 3 | 2 | 1 | 3 |
17 | 4 | 2 | 2 | 3 |
18 | 4 | 2 | 3 | 3 |
19 | 1 | 1 | 1 | 4 |
20 | 1 | 1 | 2 | 4 |
21 | 2 | 1 | 3 | 4 |
22 | 3 | 2 | 1 | 4 |
23 | 4 | 2 | 2 | 4 |
24 | 4 | 2 | 3 | 4 |
25 | 1 | 1 | 1 | 5 |
26 | 1 | 1 | 2 | 5 |
27 | 2 | 1 | 3 | 5 |
28 | 3 | 2 | 1 | 5 |
29 | 4 | 2 | 2 | 5 |
30 | 4 | 2 | 3 | 5 |
31 | 1 | 1 | 1 | 6 |
32 | 1 | 1 | 2 | 6 |
33 | 2 | 1 | 3 | 6 |
34 | 3 | 2 | 1 | 6 |
35 | 4 | 2 | 2 | 6 |
36 | 4 | 2 | 3 | 6 |
37 | 1 | 1 | 1 | 7 |
38 | 1 | 1 | 2 | 7 |
39 | 2 | 1 | 3 | 7 |
40 | 3 | 2 | 1 | 7 |
41 | 4 | 2 | 2 | 7 |
42 | 4 | 2 | 3 | 7 |
43 | 1 | 1 | 1 | 8 |
44 | 1 | 1 | 2 | 8 |
45 | 2 | 1 | 3 | 8 |
46 | 3 | 2 | 1 | 8 |
47 | 4 | 2 | 2 | 8 |
48 | 4 | 2 | 3 | 8 |
But what I need is to add StudentID to each case (line) in Data set #2. So, if, say, a class had 3 students, then instead of one line from Data set #2 there will be three, and they will differ only in StudentID.
Thank you so much for helping!
I think this works
data new;
input Stu_ID ClassID TeacherID;
datalines;
1 1 1
2 1 1
3 2 1
4 2 1
5 3 2
6 4 2
7 4 2
8 4 2
;
data resampled;
input ClassID TeacherID Replicate;
datalines;
1 1 1
1 1 2
2 1 3
3 2 1
4 2 2
4 2 3
;
proc sql;
create table want as
select r.*, n.Stu_id
from resampled as r
full join new as n
on r.classId=n.classId
and r.teacherID=n.teacherID
order by 4, 1, 2, 3
;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.