Hello everyone
I have two tables. I need to join them and connect the columns. Below are the two source tables and the third table is the desired table.
In table 1 the first_term is the term that student began that program.
Table 1
Id | first_term | Program |
610 | 199601 | Libarts |
610 | 201001 | Medtran |
610 | 201102 | Photo |
In table 2, the "term" column has all the terms the student has registered for a class. if the starting term in the "term" column is greater than or equal to the "first_term" in table 1, then it means that class (CRN number) is tied to that program. For example, the first program the student began is LIBARTS in 199601 (table 1). In table 2, we can see after 199601, this student took courses till the term reaches to the beginning of the new program which is 201001. This means the first 5 observations belong to the LIBARTS program.
Table 2
Id | Term | CRN |
610 | 199602 | 211 |
610 | 199702 | 213 |
610 | 200501 | 102 |
610 | 200903 | 302 |
610 | 200903 | 321 |
610 | 201001 | 124 |
610 | 201001 | 153 |
610 | 201001 | 150 |
610 | 201001 | 142 |
610 | 201002 | 225 |
610 | 201002 | 242 |
610 | 201002 | 244 |
610 | 201003 | 316 |
610 | 201101 | 103 |
610 | 201101 | 121 |
610 | 201101 | 142 |
610 | 201101 | 145 |
610 | 201102 | 242 |
610 | 201102 | 242 |
610 | 201103 | 315 |
610 | 201201 | 140 |
610 | 201201 | 140 |
610 | 201201 | 150 |
610 | 201202 | 211 |
610 | 201202 | 241 |
610 | 201202 | 242 |
Basically, the joined table should look like below.
Id | Term | CRN | Program |
610 | 199602 | 211 | Libarts |
610 | 199702 | 213 | Libarts |
610 | 200501 | 102 | Libarts |
610 | 200903 | 302 | Libarts |
610 | 200903 | 321 | Libarts |
610 | 201001 | 124 | Medtran |
610 | 201001 | 153 | Medtran |
610 | 201001 | 150 | Medtran |
610 | 201001 | 142 | Medtran |
610 | 201002 | 225 | Medtran |
610 | 201002 | 242 | Medtran |
610 | 201002 | 244 | Medtran |
610 | 201003 | 316 | Medtran |
610 | 201101 | 103 | Medtran |
610 | 201101 | 121 | Medtran |
610 | 201101 | 142 | Medtran |
610 | 201101 | 145 | Medtran |
610 | 201102 | 242 | Photo |
610 | 201102 | 242 | Photo |
610 | 201103 | 315 | Photo |
610 | 201201 | 140 | Photo |
610 | 201201 | 140 | Photo |
610 | 201201 | 150 | Photo |
610 | 201202 | 211 | Photo |
610 | 201202 | 241 | Photo |
610 | 201202 | 242 | Photo |
Any help will be greatly appreciated. Thanks.
Hmm, I think it works when I run it 🙂
Does your code look like this:
data one;
length id 8 first_term $6 program$32;
input id first_term program;
datalines;
610 199601 Libarts
610 201001 Medtran
610 201102 Photo
510 199701 Libarts1
510 201101 Medtran1
510 201302 Photo1
;
run;
data two;
length Id 8 Term $6 CRN $32;
input id term crn;
datalines;
610 199602 211
610 199702 213
610 200501 102
610 200903 302
610 200903 321
610 201001 124
610 201001 153
610 201001 150
610 201001 142
610 201002 225
610 201002 242
610 201002 244
610 201003 316
610 201101 103
610 201101 121
610 201101 142
610 201101 145
610 201102 242
610 201102 242
610 201103 315
610 201201 140
610 201201 140
610 201201 150
610 201202 211
610 201202 241
610 201202 242
510 199702 211
510 199802 213
510 200601 102
510 200904 302
510 200905 321
510 201101 124
510 201101 153
510 201101 150
510 201101 142
510 201102 225
510 201102 242
510 201102 244
510 201103 316
510 201401 103
510 201401 121
;
run;
proc sort data=one;
by id descending first_term;
run;
data one;
set one;
by id descending first_term;
length fromdate todate $6;
fromdate = first_term;
todate = lag(first_term);
if first.id then
todate = '599901';
run;
proc sql;
create table tog as
select two.id, two.term, two.crn, one.program from one
inner join two
on one.id = two.id
and one.fromdate <= two.term < one.todate
;
run;
This might be one way to solve it 🙂
data one;
length id 8 first_term $6 program$32;
input id first_term program;
datalines;
610 199601 Libarts
610 201001 Medtran
610 201102 Photo
;
run;
data two;
length Id 8 Term $6 CRN $32;
input id term crn;
datalines;
610 199602 211
610 199702 213
610 200501 102
610 200903 302
610 200903 321
610 201001 124
610 201001 153
610 201001 150
610 201001 142
610 201002 225
610 201002 242
610 201002 244
610 201003 316
610 201101 103
610 201101 121
610 201101 142
610 201101 145
610 201102 242
610 201102 242
610 201103 315
610 201201 140
610 201201 140
610 201201 150
610 201202 211
610 201202 241
610 201202 242
;
run;
proc sort data=one;
by id descending first_term;
run;
data one;
set one;
length fromdate todate $6;
fromdate = first_term;
todate = lag(first_term);
if todate = '' then todate = '599901';
run;
proc sql;
create table tog as
select two.id, two.term, two.crn, one.program from one
inner join two
on one.id = two.id
and one.fromdate <= two.term < one.todate
;
run;
Thank you for your response. This code works if there is only one student. But I have many students. The 'todate' has to be sorted by id? Currently the 'lag' function picks the date from previous observation which means it overlaps to another student? Hope i am making sense?
Yes of, course, quick and wrong....:)
This is untested, but try:
data one;
set one;
by id descending first_term;
length fromdate todate $6;
fromdate = first_term;
todate = lag(first_term);
if first.id then todate = '599901';
run;
Thanks Fredrik. Tried that, did not work as it still picks the terms from previous ids.
Hmm, I think it works when I run it 🙂
Does your code look like this:
data one;
length id 8 first_term $6 program$32;
input id first_term program;
datalines;
610 199601 Libarts
610 201001 Medtran
610 201102 Photo
510 199701 Libarts1
510 201101 Medtran1
510 201302 Photo1
;
run;
data two;
length Id 8 Term $6 CRN $32;
input id term crn;
datalines;
610 199602 211
610 199702 213
610 200501 102
610 200903 302
610 200903 321
610 201001 124
610 201001 153
610 201001 150
610 201001 142
610 201002 225
610 201002 242
610 201002 244
610 201003 316
610 201101 103
610 201101 121
610 201101 142
610 201101 145
610 201102 242
610 201102 242
610 201103 315
610 201201 140
610 201201 140
610 201201 150
610 201202 211
610 201202 241
610 201202 242
510 199702 211
510 199802 213
510 200601 102
510 200904 302
510 200905 321
510 201101 124
510 201101 153
510 201101 150
510 201101 142
510 201102 225
510 201102 242
510 201102 244
510 201103 316
510 201401 103
510 201401 121
;
run;
proc sort data=one;
by id descending first_term;
run;
data one;
set one;
by id descending first_term;
length fromdate todate $6;
fromdate = first_term;
todate = lag(first_term);
if first.id then
todate = '599901';
run;
proc sql;
create table tog as
select two.id, two.term, two.crn, one.program from one
inner join two
on one.id = two.id
and one.fromdate <= two.term < one.todate
;
run;
Thanks Fredrik. This code worked. I swear I did the same way yesterday and it did not work. I am guessing I may have something wrong.
I created my own code but seems more complicated. Below is my code;
data sgbstdn1a (drop=x);
set sgbstdn;
by sgbstdn_pidm;
length fromdate todate $6;
fromdate = first_term;
x=lag(first_term);
if first.sgbstdn_pidm then do;
todate='999999';
end; else do;
todate=x;
end;
run;
The important thing is to sort in descending date order since first.term actually is the last one, and you need to go in reverse order to get the date intervals correct 🙂
//Fredrik
Thank you! Makes sense!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.