conditions with multiple record

Shilin · Posted 08-03-2021 10:06 AM

Hi, i am looking to create additional 3 columns from the input table below to give the earliest grade change if an ID falls into group 2.

For example for ID = a1, it was classified as Group 2 on 1 Feb 2015 and the earliest change in grade happens on 1 Apr 2015 when it changes to 4. I would also like to calculate the number of months it took to change from grade 3 to grade 4.

i have also attached the desired output table below.

Thanks!

Input table

ID	Date	Grade	Group
a1	1-Jan-15	3	1
a1	1-Feb-15	3	2
a1	1-Mar-15	3	1
a1	1-Apr-15	4	2
b1	1-Jan-15	5	2
b1	1-Feb-15	5	2
b1	1-Mar-15	4	1
c1	1-Jan-15	7	2
c1	1-Feb-15	7	1
c1	1-Mar-15	7	1
c1	1-Apr-15	7	1
c1	1-Jan-18	8	1
c1	1-Feb-18	9	1
d1	1-Jan-15	1	1
d1	1-Feb-15	2	1
d1	1-Mar-15	2	2
d1	1-Apr-15	3	2
e1	1-Jan-15	10	2
e1	1-Feb-15	11	1

Output table

ID	Date	Grade	Group	New grade	Date of new grade	No. of months to get to new grade
a1	1-Jan-15	3	1
a1	1-Feb-15	3	2	4	1-Apr-15	2
a1	1-Mar-15	3	1
a1	1-Apr-15	4	2
b1	1-Jan-15	5	2	4	1-Mar-15	2
b1	1-Feb-15	5	2
b1	1-Mar-15	4	1
c1	1-Jan-15	7	2	8	1-Jan-18	36
c1	1-Feb-15	7	1
c1	1-Mar-15	7	1
c1	1-Apr-15	7	1
c1	1-Jan-18	8	1
c1	1-Feb-18	9	1
d1	1-Jan-15	1	1
d1	1-Feb-15	2	1
d1	1-Mar-15	2	2	3	1-Apr-15	1
d1	1-Apr-15	3	2
e1	1-Jan-15	10	2	11	1-Feb-15	1
e1	1-Feb-15	11	1

Shilin · Posted 08-03-2021 10:25 AM

Hi, i am looking to create additional 3 columns from the input table below to give the earliest grade change if an ID falls into group 2. For example for ID = a1, it was classified as Group 2 on 1 Feb 2015 and the earliest change in grade happens on 1 Apr 2015 when it changes to 4. I would also like to calculate the number of months it took to change from grade 3 to grade 4. i have also attached the desired output table below. Thanks!

Input table

ID	Date	Grade	Group
a1	1 jan 2015	3	1
a1	1 feb 2015	3	2
a1	1 mar 2015	3	1
a1	1 apr 2015	4	1
b1	1 jan 2015	5	2
b1	1 feb 2015	5	1
b1	1 mar 2015	5	1
c1	1 jan 2015	5	1
c1	1 feb 2015	5	1
c1	1 mar 2015	5	1
c1	1 apr 2015	5	1
c1	1 jan 2018	6	2
c1	1 feb 2018	7	1
d1	1 jan 2015	4	2
d1	1 jan 2016	5	1
d1	1 feb 2016	5	2
d1	1 mar 2016	6	1
e1	1 jan 2015	7	2
e1	1 jan 2017	8	1

Output table

ID	Date	Grade	Group	New grade	Date of new grade	No. of mths to get to new grade
a1	1 jan 2015	3	1
a1	1 feb 2015	3	2	4	1 apr 2015	2
a1	1 mar 2015	3	1
a1	1 apr 2015	4	1
b1	1 jan 2015	5	2
b1	1 feb 2015	5	1
b1	1 mar 2015	5	1
c1	1 jan 2015	5	1	6	1 jan 2018	36
c1	1 feb 2015	5	1
c1	1 mar 2015	5	1
c1	1 apr 2015	5	1
c1	1 jan 2018	6	2	7	1 feb 2018	1
c1	1 feb 2018	7	1
d1	1 jan 2015	4	2	5	1 jan 2016	12
d1	1 jan 2016	5	1
d1	1 feb 2016	4	2	6	1 mar 2016	1
d1	1 mar 2016	6	1
e1	1 jan 2015	7	2	8	1 jan 2017	24
e1	1 jan 2017	8	1

ballardw · Posted 08-03-2021 10:35 AM

Identical posts combined.

Kurt_Bremser · Posted 08-04-2021 04:21 AM

The first expected entry for c1 does not match your description, because it is in group 1.

Otherwise, this does what your description says:

data have;
infile datalines dlm="09"x;
input ID $ Date	:date11. Grade Group;
format date yymmdd10.;
datalines;
a1	1 jan 2015	3	1
a1	1 feb 2015	3	2
a1	1 mar 2015	3	1
a1	1 apr 2015	4	1
b1	1 jan 2015	5	2
b1	1 feb 2015	5	1
b1	1 mar 2015	5	1
c1	1 jan 2015	5	1
c1	1 feb 2015	5	1
c1	1 mar 2015	5	1
c1	1 apr 2015	5	1
c1	1 jan 2018	6	2
c1	1 feb 2018	7	1
d1	1 jan 2015	4	2
d1	1 jan 2016	5	1
d1	1 feb 2016	5	2
d1	1 mar 2016	6	1
e1	1 jan 2015	7	2
e1	1 jan 2017	8	1
;

proc sql;
create table want as
  select
    t1.id,
    t1.date,
    t1.grade,
    t1.group,
    case
      when t1.group = 2 and t2.id is not missing
      then t2.grade
      else .
    end as new_grade,
    case
      when t1.group = 2 and t2.id is not missing
      then t2.date
      else .
    end as new_date format=yymmdd10.,
    case
      when t1.group = 2 and t2.id is not missing
      then intck('month',t1.date,t2.date)
      else .
    end as diff_months
  from have t1
  left join have t2
  on t1.id = t2.id and t2.grade > t1.grade and t2.date > t1.date
  group by t1.id, t1.date, t1.grade, t1.group
  having t2.date = min(t2.date)
;
quit;

Note how the source data is presented as a data step with datalines. This allows everybody to recreate the dataset with a simple copy/paste and submit. Please do so yourself in the future, as it will speed up the process of helping you.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Shilin · Posted 08-04-2021 05:12 AM

Hi thanks. May I know what is t2 here as I only have one input table

Kurt_Bremser · Posted 08-04-2021 05:25 AM

See here:

left join have t2

I join the have dataset with itself, and for the second instance I use the alias t2 (the first is, of course, t1).

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Shilin · Posted 08-04-2021 08:35 AM

Hi thanks much!

andreas_lds · Posted 08-04-2021 02:53 AM

Are the dates sas-dates or strings?

The value of "group" seems to be ignored, when searching or a grade-change, right?

Shilin · Posted 08-04-2021 03:38 AM

I haven’t convert the dates. Supposed to be SAS dates

The value of “group” is not ignored. I would like to find the earliest grade change and the associated date for that new grade if an ID falls into group 2

conditions with multiple record

loop through multiple record

Re: loop through multiple record

Re: loop through multiple record

Re: loop through multiple record

Re: loop through multiple record

Re: loop through multiple record

Re: conditions with multiple record

Re: conditions with multiple record

Registration is open