I have the below data. I want to create new variables with last aedtc date with corresponding max grade and where ever aeevnt is 1 I want to retain to the corresponding record.
data data;
input @1 seqno $8. @9 grade 1. @11 aedtc $11. @22 aeevnt 1.;
cards;
K-001 1 . 0
K-001 2 . 0
K-001 2 19DEC2019 1
K-001 3 19DEC2019 1
K-001 3 . 0
K-002 1 . 0
K-002 4 . 0
K-002 3 20APR2020 1
K-002 1 . 0
K-003 2 20JAN2020 1
K-003 2 20JAN2020 1
K-003 1 . 0
K-003 3 . 0
K-003 3 23JAN2020 1
K-003 3 . 0
K-003 1 . 0
K-003 . . 0
K-003 1 . 0
K-003 3 . 0
K-003 2 . 0
K-005 2 . 0
K-005 1 . 0
K-005 1 . 0
K-006 3 30APR2021 1
K-006 3 30APR2021 1
K-006 1 . 0
K-006 2 06APR2021 1
K-006 2 06APR2021 1
K-006 1 . 0
K-006 2 . 0
K-006 3 . 0
K-006 1 . 0
K-007 1 . 0
K-007 3 . 0
K-007 1 . 0
K-007 1 . 0
DK-001 2 . 0
DK-001 . . 0
DK-001 2 . 0
DK-001 . . 0
DK-001 1 . 0
DK-003 . . 0
DK-003 2 . 0
DK-003 . . 0
DK-003 2 . 0
DK-003 1 . 0
DK-003 1 . 0
DK-005 . . 0
DK-006 . . 0
DK-007 . . 0
CDK-001 2 . 0
CDK-001 3 . 0
CDK-001 2 . 0
CDK-001 3 2 4SEP2020 1
CDK-001 3 . 0
CDK-002 1 . 0
CDK-002 3 . 0
;
run;
My final output dataset should have only one record per pateint.
K-001 3 19DEC2019 1
K-002 3 20APR2020 1
K-003 3 23JAN2020 1
K-005 2 . 0
K-006 3 06APR2021 1
K-007 3 . 0
DK-001 2 . 0
DK-003 2 . 0
DK-005 . . 0
DK-006 . . 0
DK-007 . . 0
CDK-001 3 2 4SEP2020 1
CDK-002 3 . 0
@RAVI2000 wrote:
If you observe the data for grade variable with subject K-002 has max grade 4. It's not capturing the max grade here. I tried modifying but still not working.
The reason @novinosrin 's program does not generate a max grade 4 for K-002 is because neither your sample solution data, nor does your description suggest it should generate a 4. The 4 only occurs with a missing aedtc date. Your description presumably would rule that out:
"last aedtc date with corresponding max grade"
Here is a data step solution that retains the last record with a date (not necessarily the latest date), and retains the max grade for dated, and max grade for undated records. It then generates the results you specify:
data want (drop=_:);
set data;
by seqno notsorted;
retain _lastest_date _maxgrade_dated _maxgrade_undated ;
if first.seqno then call missing (of _:);
if aedtc^=. then do;
_lastest_date=aedtc;
_maxgrade_dated=max(_maxgrade_dated,grade);
end;
else if grade^=. then _maxgrade_undated=max(_maxgrade_undated,grade);
if last.seqno;
aedtc=_lastest_date;
grade=coalesce(_maxgrade_dated,_maxgrade_undated);
if aedtc^=. then aeevnt=1;
run;
data data;
input @1 seqno $8. @9 grade 1. @11 aedtc $11. @22 aeevnt 1.;
cards;
K-001 1 . 0
K-001 2 . 0
K-001 2 19DEC2019 1
K-001 3 19DEC2019 1
K-001 3 . 0
K-002 1 . 0
K-002 4 . 0
K-002 3 20APR2020 1
K-002 1 . 0
K-003 2 20JAN2020 1
K-003 2 20JAN2020 1
K-003 1 . 0
K-003 3 . 0
K-003 3 23JAN2020 1
K-003 3 . 0
K-003 1 . 0
K-003 . . 0
K-003 1 . 0
K-003 3 . 0
K-003 2 . 0
K-005 2 . 0
K-005 1 . 0
K-005 1 . 0
K-006 3 30APR2021 1
K-006 3 30APR2021 1
K-006 1 . 0
K-006 2 06APR2021 1
K-006 2 06APR2021 1
K-006 1 . 0
K-006 2 . 0
K-006 3 . 0
K-006 1 . 0
K-007 1 . 0
K-007 3 . 0
K-007 1 . 0
K-007 1 . 0
DK-001 2 . 0
DK-001 . . 0
DK-001 2 . 0
DK-001 . . 0
DK-001 1 . 0
DK-003 . . 0
DK-003 2 . 0
DK-003 . . 0
DK-003 2 . 0
DK-003 1 . 0
DK-003 1 . 0
DK-005 . . 0
DK-006 . . 0
DK-007 . . 0
CDK-001 2 . 0
CDK-001 3 . 0
CDK-001 2 . 0
CDK-001 3 04SEP2020 1
CDK-001 3 . 0
CDK-002 1 . 0
CDK-002 3 . 0
;
run;
proc sql number;
create table data2 as select * from data
group by seqno
having grade=max(grade)
order by seqno, aedtc;
quit;
data data3;
set data2;
by seqno aedtc;
if last.seqno then output;
run;
Using the date variable aedtc as numeric-
data data;
input @1 seqno $8. @9 grade 1. aedtc :date9. @22 aeevnt 1.;
format aedtc date9.;
cards;
K-001 1 . 0
K-001 2 . 0
K-001 2 19DEC2019 1
K-001 3 19DEC2019 1
K-001 3 . 0
K-002 1 . 0
K-002 4 . 0
K-002 3 20APR2020 1
K-002 1 . 0
K-003 2 20JAN2020 1
K-003 2 20JAN2020 1
K-003 1 . 0
K-003 3 . 0
K-003 3 23JAN2020 1
K-003 3 . 0
K-003 1 . 0
K-003 . . 0
K-003 1 . 0
K-003 3 . 0
K-003 2 . 0
K-005 2 . 0
K-005 1 . 0
K-005 1 . 0
K-006 3 30APR2021 1
K-006 3 30APR2021 1
K-006 1 . 0
K-006 2 06APR2021 1
K-006 2 06APR2021 1
K-006 1 . 0
K-006 2 . 0
K-006 3 . 0
K-006 1 . 0
K-007 1 . 0
K-007 3 . 0
K-007 1 . 0
K-007 1 . 0
DK-001 2 . 0
DK-001 . . 0
DK-001 2 . 0
DK-001 . . 0
DK-001 1 . 0
DK-003 . . 0
DK-003 2 . 0
DK-003 . . 0
DK-003 2 . 0
DK-003 1 . 0
DK-003 1 . 0
DK-005 . . 0
DK-006 . . 0
DK-007 . . 0
CDK-001 2 . 0
CDK-001 3 . 0
CDK-001 2 . 0
CDK-001 3 04SEP2020 1
CDK-001 3 . 0
CDK-002 1 . 0
CDK-002 3 . 0
;
run;
proc sql;
create table want as
select distinct *
from data
group by seqno
having aeevnt and max(grade*aeevnt)=grade
or not max(aeevnt) and max(grade)=grade;
quit;
seqno | grade | aedtc | aeevnt |
---|---|---|---|
CDK-001 | 3 | 04SEP2020 | 1 |
CDK-002 | 3 | . | 0 |
DK-001 | 2 | . | 0 |
DK-003 | 2 | . | 0 |
DK-005 | . | . | 0 |
DK-006 | . | . | 0 |
DK-007 | . | . | 0 |
K-001 | 3 | 19DEC2019 | 1 |
K-002 | 3 | 20APR2020 | 1 |
K-003 | 3 | 23JAN2020 | 1 |
K-005 | 2 | . | 0 |
K-006 | 3 | 30APR2021 | 1 |
K-007 | 3 | . | 0 |
@RAVI2000 wrote:
If you observe the data for grade variable with subject K-002 has max grade 4. It's not capturing the max grade here. I tried modifying but still not working.
The reason @novinosrin 's program does not generate a max grade 4 for K-002 is because neither your sample solution data, nor does your description suggest it should generate a 4. The 4 only occurs with a missing aedtc date. Your description presumably would rule that out:
"last aedtc date with corresponding max grade"
Here is a data step solution that retains the last record with a date (not necessarily the latest date), and retains the max grade for dated, and max grade for undated records. It then generates the results you specify:
data want (drop=_:);
set data;
by seqno notsorted;
retain _lastest_date _maxgrade_dated _maxgrade_undated ;
if first.seqno then call missing (of _:);
if aedtc^=. then do;
_lastest_date=aedtc;
_maxgrade_dated=max(_maxgrade_dated,grade);
end;
else if grade^=. then _maxgrade_undated=max(_maxgrade_undated,grade);
if last.seqno;
aedtc=_lastest_date;
grade=coalesce(_maxgrade_dated,_maxgrade_undated);
if aedtc^=. then aeevnt=1;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.