Hi, i am looking to create additional 3 columns from the input table below to give the earliest grade change if an ID falls into group 2.
For example for ID = a1, it was classified as Group 2 on 1 Feb 2015 and the earliest change in grade happens on 1 Apr 2015 when it changes to 4. I would also like to calculate the number of months it took to change from grade 3 to grade 4.
i have also attached the desired output table below.
Thanks!
Input table
ID | Date | Grade | Group |
a1 | 1-Jan-15 | 3 | 1 |
a1 | 1-Feb-15 | 3 | 2 |
a1 | 1-Mar-15 | 3 | 1 |
a1 | 1-Apr-15 | 4 | 2 |
b1 | 1-Jan-15 | 5 | 2 |
b1 | 1-Feb-15 | 5 | 2 |
b1 | 1-Mar-15 | 4 | 1 |
c1 | 1-Jan-15 | 7 | 2 |
c1 | 1-Feb-15 | 7 | 1 |
c1 | 1-Mar-15 | 7 | 1 |
c1 | 1-Apr-15 | 7 | 1 |
c1 | 1-Jan-18 | 8 | 1 |
c1 | 1-Feb-18 | 9 | 1 |
d1 | 1-Jan-15 | 1 | 1 |
d1 | 1-Feb-15 | 2 | 1 |
d1 | 1-Mar-15 | 2 | 2 |
d1 | 1-Apr-15 | 3 | 2 |
e1 | 1-Jan-15 | 10 | 2 |
e1 | 1-Feb-15 | 11 | 1 |
Output table
ID | Date | Grade | Group | New grade | Date of new grade | No. of months to get to new grade |
a1 | 1-Jan-15 | 3 | 1 | |||
a1 | 1-Feb-15 | 3 | 2 | 4 | 1-Apr-15 | 2 |
a1 | 1-Mar-15 | 3 | 1 | |||
a1 | 1-Apr-15 | 4 | 2 | |||
b1 | 1-Jan-15 | 5 | 2 | 4 | 1-Mar-15 | 2 |
b1 | 1-Feb-15 | 5 | 2 | |||
b1 | 1-Mar-15 | 4 | 1 | |||
c1 | 1-Jan-15 | 7 | 2 | 8 | 1-Jan-18 | 36 |
c1 | 1-Feb-15 | 7 | 1 | |||
c1 | 1-Mar-15 | 7 | 1 | |||
c1 | 1-Apr-15 | 7 | 1 | |||
c1 | 1-Jan-18 | 8 | 1 | |||
c1 | 1-Feb-18 | 9 | 1 | |||
d1 | 1-Jan-15 | 1 | 1 | |||
d1 | 1-Feb-15 | 2 | 1 | |||
d1 | 1-Mar-15 | 2 | 2 | 3 | 1-Apr-15 | 1 |
d1 | 1-Apr-15 | 3 | 2 | |||
e1 | 1-Jan-15 | 10 | 2 | 11 | 1-Feb-15 | 1 |
e1 | 1-Feb-15 | 11 | 1 |
Hi, i am looking to create additional 3 columns from the input table below to give the earliest grade change if an ID falls into group 2. For example for ID = a1, it was classified as Group 2 on 1 Feb 2015 and the earliest change in grade happens on 1 Apr 2015 when it changes to 4. I would also like to calculate the number of months it took to change from grade 3 to grade 4. i have also attached the desired output table below. Thanks!
Input table
ID | Date | Grade | Group |
a1 | 1 jan 2015 | 3 | 1 |
a1 | 1 feb 2015 | 3 | 2 |
a1 | 1 mar 2015 | 3 | 1 |
a1 | 1 apr 2015 | 4 | 1 |
b1 | 1 jan 2015 | 5 | 2 |
b1 | 1 feb 2015 | 5 | 1 |
b1 | 1 mar 2015 | 5 | 1 |
c1 | 1 jan 2015 | 5 | 1 |
c1 | 1 feb 2015 | 5 | 1 |
c1 | 1 mar 2015 | 5 | 1 |
c1 | 1 apr 2015 | 5 | 1 |
c1 | 1 jan 2018 | 6 | 2 |
c1 | 1 feb 2018 | 7 | 1 |
d1 | 1 jan 2015 | 4 | 2 |
d1 | 1 jan 2016 | 5 | 1 |
d1 | 1 feb 2016 | 5 | 2 |
d1 | 1 mar 2016 | 6 | 1 |
e1 | 1 jan 2015 | 7 | 2 |
e1 | 1 jan 2017 | 8 | 1 |
Output table
ID | Date | Grade | Group | New grade | Date of new grade | No. of mths to get to new grade |
a1 | 1 jan 2015 | 3 | 1 | |||
a1 | 1 feb 2015 | 3 | 2 | 4 | 1 apr 2015 | 2 |
a1 | 1 mar 2015 | 3 | 1 | |||
a1 | 1 apr 2015 | 4 | 1 | |||
b1 | 1 jan 2015 | 5 | 2 | |||
b1 | 1 feb 2015 | 5 | 1 | |||
b1 | 1 mar 2015 | 5 | 1 | |||
c1 | 1 jan 2015 | 5 | 1 | 6 | 1 jan 2018 | 36 |
c1 | 1 feb 2015 | 5 | 1 | |||
c1 | 1 mar 2015 | 5 | 1 | |||
c1 | 1 apr 2015 | 5 | 1 | |||
c1 | 1 jan 2018 | 6 | 2 | 7 | 1 feb 2018 | 1 |
c1 | 1 feb 2018 | 7 | 1 | |||
d1 | 1 jan 2015 | 4 | 2 | 5 | 1 jan 2016 | 12 |
d1 | 1 jan 2016 | 5 | 1 | |||
d1 | 1 feb 2016 | 4 | 2 | 6 | 1 mar 2016 | 1 |
d1 | 1 mar 2016 | 6 | 1 | |||
e1 | 1 jan 2015 | 7 | 2 | 8 | 1 jan 2017 | 24 |
e1 | 1 jan 2017 | 8 | 1 |
Identical posts combined.
The first expected entry for c1 does not match your description, because it is in group 1.
Otherwise, this does what your description says:
data have;
infile datalines dlm="09"x;
input ID $ Date :date11. Grade Group;
format date yymmdd10.;
datalines;
a1 1 jan 2015 3 1
a1 1 feb 2015 3 2
a1 1 mar 2015 3 1
a1 1 apr 2015 4 1
b1 1 jan 2015 5 2
b1 1 feb 2015 5 1
b1 1 mar 2015 5 1
c1 1 jan 2015 5 1
c1 1 feb 2015 5 1
c1 1 mar 2015 5 1
c1 1 apr 2015 5 1
c1 1 jan 2018 6 2
c1 1 feb 2018 7 1
d1 1 jan 2015 4 2
d1 1 jan 2016 5 1
d1 1 feb 2016 5 2
d1 1 mar 2016 6 1
e1 1 jan 2015 7 2
e1 1 jan 2017 8 1
;
proc sql;
create table want as
select
t1.id,
t1.date,
t1.grade,
t1.group,
case
when t1.group = 2 and t2.id is not missing
then t2.grade
else .
end as new_grade,
case
when t1.group = 2 and t2.id is not missing
then t2.date
else .
end as new_date format=yymmdd10.,
case
when t1.group = 2 and t2.id is not missing
then intck('month',t1.date,t2.date)
else .
end as diff_months
from have t1
left join have t2
on t1.id = t2.id and t2.grade > t1.grade and t2.date > t1.date
group by t1.id, t1.date, t1.grade, t1.group
having t2.date = min(t2.date)
;
quit;
Note how the source data is presented as a data step with datalines. This allows everybody to recreate the dataset with a simple copy/paste and submit. Please do so yourself in the future, as it will speed up the process of helping you.
See here:
left join have t2
I join the have dataset with itself, and for the second instance I use the alias t2 (the first is, of course, t1).
Are the dates sas-dates or strings?
The value of "group" seems to be ignored, when searching or a grade-change, right?
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.