Hi, i am looking to create additional 3 columns from the input table below to give the earliest grade change if an ID falls into group 2.
For example for ID = a1, it was classified as Group 2 on 1 Feb 2015 and the earliest change in grade happens on 1 Apr 2015 when it changes to 4. I would also like to calculate the number of months it took to change from grade 3 to grade 4.
i have also attached the desired output table below.
Thanks!
Input table
ID | Date | Grade | Group |
a1 | 1-Jan-15 | 3 | 1 |
a1 | 1-Feb-15 | 3 | 2 |
a1 | 1-Mar-15 | 3 | 1 |
a1 | 1-Apr-15 | 4 | 2 |
b1 | 1-Jan-15 | 5 | 2 |
b1 | 1-Feb-15 | 5 | 2 |
b1 | 1-Mar-15 | 4 | 1 |
c1 | 1-Jan-15 | 7 | 2 |
c1 | 1-Feb-15 | 7 | 1 |
c1 | 1-Mar-15 | 7 | 1 |
c1 | 1-Apr-15 | 7 | 1 |
c1 | 1-Jan-18 | 8 | 1 |
c1 | 1-Feb-18 | 9 | 1 |
d1 | 1-Jan-15 | 1 | 1 |
d1 | 1-Feb-15 | 2 | 1 |
d1 | 1-Mar-15 | 2 | 2 |
d1 | 1-Apr-15 | 3 | 2 |
e1 | 1-Jan-15 | 10 | 2 |
e1 | 1-Feb-15 | 11 | 1 |
Output table
ID | Date | Grade | Group | New grade | Date of new grade | No. of months to get to new grade |
a1 | 1-Jan-15 | 3 | 1 | |||
a1 | 1-Feb-15 | 3 | 2 | 4 | 1-Apr-15 | 2 |
a1 | 1-Mar-15 | 3 | 1 | |||
a1 | 1-Apr-15 | 4 | 2 | |||
b1 | 1-Jan-15 | 5 | 2 | 4 | 1-Mar-15 | 2 |
b1 | 1-Feb-15 | 5 | 2 | |||
b1 | 1-Mar-15 | 4 | 1 | |||
c1 | 1-Jan-15 | 7 | 2 | 8 | 1-Jan-18 | 36 |
c1 | 1-Feb-15 | 7 | 1 | |||
c1 | 1-Mar-15 | 7 | 1 | |||
c1 | 1-Apr-15 | 7 | 1 | |||
c1 | 1-Jan-18 | 8 | 1 | |||
c1 | 1-Feb-18 | 9 | 1 | |||
d1 | 1-Jan-15 | 1 | 1 | |||
d1 | 1-Feb-15 | 2 | 1 | |||
d1 | 1-Mar-15 | 2 | 2 | 3 | 1-Apr-15 | 1 |
d1 | 1-Apr-15 | 3 | 2 | |||
e1 | 1-Jan-15 | 10 | 2 | 11 | 1-Feb-15 | 1 |
e1 | 1-Feb-15 | 11 | 1 |
Hi, i am looking to create additional 3 columns from the input table below to give the earliest grade change if an ID falls into group 2. For example for ID = a1, it was classified as Group 2 on 1 Feb 2015 and the earliest change in grade happens on 1 Apr 2015 when it changes to 4. I would also like to calculate the number of months it took to change from grade 3 to grade 4. i have also attached the desired output table below. Thanks!
Input table
ID | Date | Grade | Group |
a1 | 1 jan 2015 | 3 | 1 |
a1 | 1 feb 2015 | 3 | 2 |
a1 | 1 mar 2015 | 3 | 1 |
a1 | 1 apr 2015 | 4 | 1 |
b1 | 1 jan 2015 | 5 | 2 |
b1 | 1 feb 2015 | 5 | 1 |
b1 | 1 mar 2015 | 5 | 1 |
c1 | 1 jan 2015 | 5 | 1 |
c1 | 1 feb 2015 | 5 | 1 |
c1 | 1 mar 2015 | 5 | 1 |
c1 | 1 apr 2015 | 5 | 1 |
c1 | 1 jan 2018 | 6 | 2 |
c1 | 1 feb 2018 | 7 | 1 |
d1 | 1 jan 2015 | 4 | 2 |
d1 | 1 jan 2016 | 5 | 1 |
d1 | 1 feb 2016 | 5 | 2 |
d1 | 1 mar 2016 | 6 | 1 |
e1 | 1 jan 2015 | 7 | 2 |
e1 | 1 jan 2017 | 8 | 1 |
Output table
ID | Date | Grade | Group | New grade | Date of new grade | No. of mths to get to new grade |
a1 | 1 jan 2015 | 3 | 1 | |||
a1 | 1 feb 2015 | 3 | 2 | 4 | 1 apr 2015 | 2 |
a1 | 1 mar 2015 | 3 | 1 | |||
a1 | 1 apr 2015 | 4 | 1 | |||
b1 | 1 jan 2015 | 5 | 2 | |||
b1 | 1 feb 2015 | 5 | 1 | |||
b1 | 1 mar 2015 | 5 | 1 | |||
c1 | 1 jan 2015 | 5 | 1 | 6 | 1 jan 2018 | 36 |
c1 | 1 feb 2015 | 5 | 1 | |||
c1 | 1 mar 2015 | 5 | 1 | |||
c1 | 1 apr 2015 | 5 | 1 | |||
c1 | 1 jan 2018 | 6 | 2 | 7 | 1 feb 2018 | 1 |
c1 | 1 feb 2018 | 7 | 1 | |||
d1 | 1 jan 2015 | 4 | 2 | 5 | 1 jan 2016 | 12 |
d1 | 1 jan 2016 | 5 | 1 | |||
d1 | 1 feb 2016 | 4 | 2 | 6 | 1 mar 2016 | 1 |
d1 | 1 mar 2016 | 6 | 1 | |||
e1 | 1 jan 2015 | 7 | 2 | 8 | 1 jan 2017 | 24 |
e1 | 1 jan 2017 | 8 | 1 |
Identical posts combined.
The first expected entry for c1 does not match your description, because it is in group 1.
Otherwise, this does what your description says:
data have;
infile datalines dlm="09"x;
input ID $ Date :date11. Grade Group;
format date yymmdd10.;
datalines;
a1 1 jan 2015 3 1
a1 1 feb 2015 3 2
a1 1 mar 2015 3 1
a1 1 apr 2015 4 1
b1 1 jan 2015 5 2
b1 1 feb 2015 5 1
b1 1 mar 2015 5 1
c1 1 jan 2015 5 1
c1 1 feb 2015 5 1
c1 1 mar 2015 5 1
c1 1 apr 2015 5 1
c1 1 jan 2018 6 2
c1 1 feb 2018 7 1
d1 1 jan 2015 4 2
d1 1 jan 2016 5 1
d1 1 feb 2016 5 2
d1 1 mar 2016 6 1
e1 1 jan 2015 7 2
e1 1 jan 2017 8 1
;
proc sql;
create table want as
select
t1.id,
t1.date,
t1.grade,
t1.group,
case
when t1.group = 2 and t2.id is not missing
then t2.grade
else .
end as new_grade,
case
when t1.group = 2 and t2.id is not missing
then t2.date
else .
end as new_date format=yymmdd10.,
case
when t1.group = 2 and t2.id is not missing
then intck('month',t1.date,t2.date)
else .
end as diff_months
from have t1
left join have t2
on t1.id = t2.id and t2.grade > t1.grade and t2.date > t1.date
group by t1.id, t1.date, t1.grade, t1.group
having t2.date = min(t2.date)
;
quit;
Note how the source data is presented as a data step with datalines. This allows everybody to recreate the dataset with a simple copy/paste and submit. Please do so yourself in the future, as it will speed up the process of helping you.
See here:
left join have t2
I join the have dataset with itself, and for the second instance I use the alias t2 (the first is, of course, t1).
Are the dates sas-dates or strings?
The value of "group" seems to be ignored, when searching or a grade-change, right?
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.