I have the below data. I want to create new variables with last aedtc date with corresponding max grade and where ever aeevnt is 1 I want to retain to the corresponding record.
data data;
input @1 seqno $8. @9 grade 1. @11 aedtc $11. @22 aeevnt 1.;
cards;
K-001 1 . 0
K-001 2 . 0
K-001 2 19DEC2019 1
K-001 3 19DEC2019 1
K-001 3 . 0
K-002 1 . 0
K-002 4 . 0
K-002 3 20APR2020 1
K-002 1 . 0
K-003 2 20JAN2020 1
K-003 2 20JAN2020 1
K-003 1 . 0
K-003 3 . 0
K-003 3 23JAN2020 1
K-003 3 . 0
K-003 1 . 0
K-003 . . 0
K-003 1 . 0
K-003 3 . 0
K-003 2 . 0
K-005 2 . 0
K-005 1 . 0
K-005 1 . 0
K-006 3 30APR2021 1
K-006 3 30APR2021 1
K-006 1 . 0
K-006 2 06APR2021 1
K-006 2 06APR2021 1
K-006 1 . 0
K-006 2 . 0
K-006 3 . 0
K-006 1 . 0
K-007 1 . 0
K-007 3 . 0
K-007 1 . 0
K-007 1 . 0
DK-001 2 . 0
DK-001 . . 0
DK-001 2 . 0
DK-001 . . 0
DK-001 1 . 0
DK-003 . . 0
DK-003 2 . 0
DK-003 . . 0
DK-003 2 . 0
DK-003 1 . 0
DK-003 1 . 0
DK-005 . . 0
DK-006 . . 0
DK-007 . . 0
CDK-001 2 . 0
CDK-001 3 . 0
CDK-001 2 . 0
CDK-001 3 2 4SEP2020 1
CDK-001 3 . 0
CDK-002 1 . 0
CDK-002 3 . 0
;
run;
My final output dataset should have only one record per pateint.
K-001 3 19DEC2019 1
K-002 3 20APR2020 1
K-003 3 23JAN2020 1
K-005 2 . 0
K-006 3 06APR2021 1
K-007 3 . 0
DK-001 2 . 0
DK-003 2 . 0
DK-005 . . 0
DK-006 . . 0
DK-007 . . 0
CDK-001 3 2 4SEP2020 1
CDK-002 3 . 0
@RAVI2000 wrote:
If you observe the data for grade variable with subject K-002 has max grade 4. It's not capturing the max grade here. I tried modifying but still not working.
The reason @novinosrin 's program does not generate a max grade 4 for K-002 is because neither your sample solution data, nor does your description suggest it should generate a 4. The 4 only occurs with a missing aedtc date. Your description presumably would rule that out:
"last aedtc date with corresponding max grade"
Here is a data step solution that retains the last record with a date (not necessarily the latest date), and retains the max grade for dated, and max grade for undated records. It then generates the results you specify:
data want (drop=_:);
set data;
by seqno notsorted;
retain _lastest_date _maxgrade_dated _maxgrade_undated ;
if first.seqno then call missing (of _:);
if aedtc^=. then do;
_lastest_date=aedtc;
_maxgrade_dated=max(_maxgrade_dated,grade);
end;
else if grade^=. then _maxgrade_undated=max(_maxgrade_undated,grade);
if last.seqno;
aedtc=_lastest_date;
grade=coalesce(_maxgrade_dated,_maxgrade_undated);
if aedtc^=. then aeevnt=1;
run;
data data;
input @1 seqno $8. @9 grade 1. @11 aedtc $11. @22 aeevnt 1.;
cards;
K-001 1 . 0
K-001 2 . 0
K-001 2 19DEC2019 1
K-001 3 19DEC2019 1
K-001 3 . 0
K-002 1 . 0
K-002 4 . 0
K-002 3 20APR2020 1
K-002 1 . 0
K-003 2 20JAN2020 1
K-003 2 20JAN2020 1
K-003 1 . 0
K-003 3 . 0
K-003 3 23JAN2020 1
K-003 3 . 0
K-003 1 . 0
K-003 . . 0
K-003 1 . 0
K-003 3 . 0
K-003 2 . 0
K-005 2 . 0
K-005 1 . 0
K-005 1 . 0
K-006 3 30APR2021 1
K-006 3 30APR2021 1
K-006 1 . 0
K-006 2 06APR2021 1
K-006 2 06APR2021 1
K-006 1 . 0
K-006 2 . 0
K-006 3 . 0
K-006 1 . 0
K-007 1 . 0
K-007 3 . 0
K-007 1 . 0
K-007 1 . 0
DK-001 2 . 0
DK-001 . . 0
DK-001 2 . 0
DK-001 . . 0
DK-001 1 . 0
DK-003 . . 0
DK-003 2 . 0
DK-003 . . 0
DK-003 2 . 0
DK-003 1 . 0
DK-003 1 . 0
DK-005 . . 0
DK-006 . . 0
DK-007 . . 0
CDK-001 2 . 0
CDK-001 3 . 0
CDK-001 2 . 0
CDK-001 3 04SEP2020 1
CDK-001 3 . 0
CDK-002 1 . 0
CDK-002 3 . 0
;
run;
proc sql number;
create table data2 as select * from data
group by seqno
having grade=max(grade)
order by seqno, aedtc;
quit;
data data3;
set data2;
by seqno aedtc;
if last.seqno then output;
run;
Using the date variable aedtc as numeric-
data data;
input @1 seqno $8. @9 grade 1. aedtc :date9. @22 aeevnt 1.;
format aedtc date9.;
cards;
K-001 1 . 0
K-001 2 . 0
K-001 2 19DEC2019 1
K-001 3 19DEC2019 1
K-001 3 . 0
K-002 1 . 0
K-002 4 . 0
K-002 3 20APR2020 1
K-002 1 . 0
K-003 2 20JAN2020 1
K-003 2 20JAN2020 1
K-003 1 . 0
K-003 3 . 0
K-003 3 23JAN2020 1
K-003 3 . 0
K-003 1 . 0
K-003 . . 0
K-003 1 . 0
K-003 3 . 0
K-003 2 . 0
K-005 2 . 0
K-005 1 . 0
K-005 1 . 0
K-006 3 30APR2021 1
K-006 3 30APR2021 1
K-006 1 . 0
K-006 2 06APR2021 1
K-006 2 06APR2021 1
K-006 1 . 0
K-006 2 . 0
K-006 3 . 0
K-006 1 . 0
K-007 1 . 0
K-007 3 . 0
K-007 1 . 0
K-007 1 . 0
DK-001 2 . 0
DK-001 . . 0
DK-001 2 . 0
DK-001 . . 0
DK-001 1 . 0
DK-003 . . 0
DK-003 2 . 0
DK-003 . . 0
DK-003 2 . 0
DK-003 1 . 0
DK-003 1 . 0
DK-005 . . 0
DK-006 . . 0
DK-007 . . 0
CDK-001 2 . 0
CDK-001 3 . 0
CDK-001 2 . 0
CDK-001 3 04SEP2020 1
CDK-001 3 . 0
CDK-002 1 . 0
CDK-002 3 . 0
;
run;
proc sql;
create table want as
select distinct *
from data
group by seqno
having aeevnt and max(grade*aeevnt)=grade
or not max(aeevnt) and max(grade)=grade;
quit;
seqno | grade | aedtc | aeevnt |
---|---|---|---|
CDK-001 | 3 | 04SEP2020 | 1 |
CDK-002 | 3 | . | 0 |
DK-001 | 2 | . | 0 |
DK-003 | 2 | . | 0 |
DK-005 | . | . | 0 |
DK-006 | . | . | 0 |
DK-007 | . | . | 0 |
K-001 | 3 | 19DEC2019 | 1 |
K-002 | 3 | 20APR2020 | 1 |
K-003 | 3 | 23JAN2020 | 1 |
K-005 | 2 | . | 0 |
K-006 | 3 | 30APR2021 | 1 |
K-007 | 3 | . | 0 |
@RAVI2000 wrote:
If you observe the data for grade variable with subject K-002 has max grade 4. It's not capturing the max grade here. I tried modifying but still not working.
The reason @novinosrin 's program does not generate a max grade 4 for K-002 is because neither your sample solution data, nor does your description suggest it should generate a 4. The 4 only occurs with a missing aedtc date. Your description presumably would rule that out:
"last aedtc date with corresponding max grade"
Here is a data step solution that retains the last record with a date (not necessarily the latest date), and retains the max grade for dated, and max grade for undated records. It then generates the results you specify:
data want (drop=_:);
set data;
by seqno notsorted;
retain _lastest_date _maxgrade_dated _maxgrade_undated ;
if first.seqno then call missing (of _:);
if aedtc^=. then do;
_lastest_date=aedtc;
_maxgrade_dated=max(_maxgrade_dated,grade);
end;
else if grade^=. then _maxgrade_undated=max(_maxgrade_undated,grade);
if last.seqno;
aedtc=_lastest_date;
grade=coalesce(_maxgrade_dated,_maxgrade_undated);
if aedtc^=. then aeevnt=1;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.