BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
RAVI2000
Lapis Lazuli | Level 10

I have the below data. I want to create new variables with last aedtc date with corresponding max grade and where ever aeevnt is 1 I want to retain to the corresponding record.

data data;
input @1 seqno $8.	@9 grade 1. @11 aedtc $11. @22 aeevnt 1.;
cards;
K-001   1 .          0
K-001   2 .          0
K-001   2 19DEC2019  1
K-001   3 19DEC2019  1
K-001   3 .          0
K-002   1 .          0
K-002   4 .          0
K-002   3 20APR2020  1
K-002   1 .	         0
K-003   2 20JAN2020	 1
K-003   2 20JAN2020	 1
K-003   1 .	         0
K-003   3 .	         0
K-003   3 23JAN2020	 1
K-003   3 .	         0
K-003   1 .	         0
K-003   . .	         0
K-003   1 .	         0
K-003   3 .	         0
K-003   2 .	         0
K-005   2 .	         0
K-005   1 .	         0
K-005   1 .	         0
K-006   3 30APR2021	 1
K-006   3 30APR2021	 1
K-006   1 .	         0
K-006   2 06APR2021	 1
K-006   2 06APR2021	 1
K-006   1 .	         0
K-006   2 .	         0
K-006   3 .	         0
K-006   1 .	         0
K-007   1 .	         0
K-007   3 .	         0
K-007   1 .	         0
K-007   1 .	         0
DK-001  2 .	         0
DK-001  . .	         0
DK-001  2 .	         0
DK-001  . .	         0
DK-001  1 .	         0
DK-003  . .	         0
DK-003  2 .	         0
DK-003  . .	         0
DK-003  2 .	         0
DK-003  1 .	         0
DK-003  1 .	         0
DK-005  . .	         0
DK-006  . .	         0
DK-007  . .	         0
CDK-001 2 .	         0
CDK-001 3 .	         0
CDK-001 2 .	         0
CDK-001 3 2 4SEP2020 1
CDK-001 3 .	         0
CDK-002 1 .	         0
CDK-002 3 .	         0
;
run;

My final output dataset should have only one record per pateint.
K-001 3 19DEC2019 1
K-002 3 20APR2020 1
K-003 3 23JAN2020 1
K-005 2 . 0
K-006 3 06APR2021 1
K-007 3 . 0
DK-001 2 . 0
DK-003 2 . 0
DK-005 . . 0
DK-006 . . 0
DK-007 . . 0
CDK-001 3 2 4SEP2020 1
CDK-002 3 . 0

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

@RAVI2000 wrote:
If you observe the data for grade variable with subject K-002 has max grade 4. It's not capturing the max grade here. I tried modifying but still not working.

The reason @novinosrin 's program does not generate a max grade 4 for K-002 is because neither your sample solution data, nor does your description suggest it should generate a 4.  The 4 only occurs with a missing aedtc date.  Your description presumably would rule that out:

     "last aedtc date with corresponding max grade"

 

Here is a data step solution that retains the last record with a date (not necessarily the latest date), and retains the max grade for dated, and max grade for undated records.  It then generates the results you specify:


data want (drop=_:);
  set data;
  by seqno notsorted;

  retain _lastest_date _maxgrade_dated  _maxgrade_undated ;
  if first.seqno then call missing (of _:);

  if aedtc^=. then do;
    _lastest_date=aedtc;
    _maxgrade_dated=max(_maxgrade_dated,grade);
  end;
  else if grade^=. then _maxgrade_undated=max(_maxgrade_undated,grade);

  if last.seqno;
  aedtc=_lastest_date;
  grade=coalesce(_maxgrade_dated,_maxgrade_undated);
  if aedtc^=. then aeevnt=1;
run;

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

6 REPLIES 6
tarheel13
Rhodochrosite | Level 12
data data;
input @1 seqno $8.	@9 grade 1. @11 aedtc $11. @22 aeevnt 1.;
cards;
K-001   1 .          0
K-001   2 .          0
K-001   2 19DEC2019  1
K-001   3 19DEC2019  1
K-001   3 .          0
K-002   1 .          0
K-002   4 .          0
K-002   3 20APR2020  1
K-002   1 .	         0
K-003   2 20JAN2020	 1
K-003   2 20JAN2020	 1
K-003   1 .	         0
K-003   3 .	         0
K-003   3 23JAN2020	 1
K-003   3 .	         0
K-003   1 .	         0
K-003   . .	         0
K-003   1 .	         0
K-003   3 .	         0
K-003   2 .	         0
K-005   2 .	         0
K-005   1 .	         0
K-005   1 .	         0
K-006   3 30APR2021	 1
K-006   3 30APR2021	 1
K-006   1 .	         0
K-006   2 06APR2021	 1
K-006   2 06APR2021	 1
K-006   1 .	         0
K-006   2 .	         0
K-006   3 .	         0
K-006   1 .	         0
K-007   1 .	         0
K-007   3 .	         0
K-007   1 .	         0
K-007   1 .	         0
DK-001  2 .	         0
DK-001  . .	         0
DK-001  2 .	         0
DK-001  . .	         0
DK-001  1 .	         0
DK-003  . .	         0
DK-003  2 .	         0
DK-003  . .	         0
DK-003  2 .	         0
DK-003  1 .	         0
DK-003  1 .	         0
DK-005  . .	         0
DK-006  . .	         0
DK-007  . .	         0
CDK-001 2 .	         0
CDK-001 3 .	         0
CDK-001 2 .	         0
CDK-001 3 04SEP2020  1
CDK-001 3 .	         0
CDK-002 1 .	         0
CDK-002 3 .	         0
;
run;

proc sql number;
	create table data2 as select * from data
		group by seqno
		having grade=max(grade)
		order by seqno, aedtc;
quit;

data data3;
	set data2;
	by seqno aedtc;
	if last.seqno then output;
run;
RAVI2000
Lapis Lazuli | Level 10
It works!! Can this also be explained in first. last. concept ?
tarheel13
Rhodochrosite | Level 12
If it’s the first seqno in a by group then first.seqno =1. Otherwise it’s 0. If it’s the last seqno in a group then last.seqno =1. You only wanted 1 row per subject so I did if last.seqno then output.
novinosrin
Tourmaline | Level 20

Using the date variable aedtc as numeric-

 

data data;
input @1 seqno $8.	@9 grade 1.  aedtc :date9. @22 aeevnt 1.;
format aedtc date9.;
cards;
K-001   1 .          0
K-001   2 .          0
K-001   2 19DEC2019  1
K-001   3 19DEC2019  1
K-001   3 .          0
K-002   1 .          0
K-002   4 .          0
K-002   3 20APR2020  1
K-002   1 .	         0
K-003   2 20JAN2020	 1
K-003   2 20JAN2020	 1
K-003   1 .	         0
K-003   3 .	         0
K-003   3 23JAN2020	 1
K-003   3 .	         0
K-003   1 .	         0
K-003   . .	         0
K-003   1 .	         0
K-003   3 .	         0
K-003   2 .	         0
K-005   2 .	         0
K-005   1 .	         0
K-005   1 .	         0
K-006   3 30APR2021	 1
K-006   3 30APR2021	 1
K-006   1 .	         0
K-006   2 06APR2021	 1
K-006   2 06APR2021	 1
K-006   1 .	         0
K-006   2 .	         0
K-006   3 .	         0
K-006   1 .	         0
K-007   1 .	         0
K-007   3 .	         0
K-007   1 .	         0
K-007   1 .	         0
DK-001  2 .	         0
DK-001  . .	         0
DK-001  2 .	         0
DK-001  . .	         0
DK-001  1 .	         0
DK-003  . .	         0
DK-003  2 .	         0
DK-003  . .	         0
DK-003  2 .	         0
DK-003  1 .	         0
DK-003  1 .	         0
DK-005  . .	         0
DK-006  . .	         0
DK-007  . .	         0
CDK-001 2 .	         0
CDK-001 3 .	         0
CDK-001 2 .	         0
CDK-001 3 04SEP2020  1
CDK-001 3 .	         0
CDK-002 1 .	         0
CDK-002 3 .	         0
;
run;

proc sql;
  create table want as
  select distinct *
  from data 
  group by seqno
  having aeevnt and max(grade*aeevnt)=grade
  or not max(aeevnt) and max(grade)=grade;
 quit;
 
seqno grade aedtc aeevnt
CDK-001 3 04SEP2020 1
CDK-002 3 . 0
DK-001 2 . 0
DK-003 2 . 0
DK-005 . . 0
DK-006 . . 0
DK-007 . . 0
K-001 3 19DEC2019 1
K-002 3 20APR2020 1
K-003 3 23JAN2020 1
K-005 2 . 0
K-006 3 30APR2021 1
K-007 3 . 0
RAVI2000
Lapis Lazuli | Level 10
If you observe the data for grade variable with subject K-002 has max grade 4. It's not capturing the max grade here. I tried modifying but still not working.
mkeintz
PROC Star

@RAVI2000 wrote:
If you observe the data for grade variable with subject K-002 has max grade 4. It's not capturing the max grade here. I tried modifying but still not working.

The reason @novinosrin 's program does not generate a max grade 4 for K-002 is because neither your sample solution data, nor does your description suggest it should generate a 4.  The 4 only occurs with a missing aedtc date.  Your description presumably would rule that out:

     "last aedtc date with corresponding max grade"

 

Here is a data step solution that retains the last record with a date (not necessarily the latest date), and retains the max grade for dated, and max grade for undated records.  It then generates the results you specify:


data want (drop=_:);
  set data;
  by seqno notsorted;

  retain _lastest_date _maxgrade_dated  _maxgrade_undated ;
  if first.seqno then call missing (of _:);

  if aedtc^=. then do;
    _lastest_date=aedtc;
    _maxgrade_dated=max(_maxgrade_dated,grade);
  end;
  else if grade^=. then _maxgrade_undated=max(_maxgrade_undated,grade);

  if last.seqno;
  aedtc=_lastest_date;
  grade=coalesce(_maxgrade_dated,_maxgrade_undated);
  if aedtc^=. then aeevnt=1;
run;

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 3814 views
  • 3 likes
  • 4 in conversation