I have a data as below and want to flag records with following criteria.
1. Max and minimum value within each labtest and if there are multiple records then flag record with latest date.
I am able to do that with following algorithm but looking for more efficient method.
1. get Max and min value with proc sql step
2. merge data from step 1 with "have" data and if value equals max and min value then flag record as 1.
3. subset data from above step to get only records with max=1. Then use data step by group processing to choose last.labtest and flag the max record with latest date as Y.
4. Repeat step 3 for flagging minimum value records.
DATA WORK.Book3;
LENGTH
Subject $ 3
Labtest $ 3
value 8
date 8 ;
FORMAT
Subject $CHAR3.
Labtest $CHAR3.
value BEST12.
date DATE9. ;
INFORMAT
Subject $CHAR3.
Labtest $CHAR3.
value BEST12.
date DATE9. ;
INFILE DATALINES4
DLM='7F'x
MISSOVER
DSD ;
INPUT
Subject : $CHAR3.
Labtest : $CHAR3.
value : BEST32.
date : BEST32. ;
DATALINES4;
001RBC1522397
001RBC1222398
001RBC1522399
001WBC922400
001WBC1022401
001ESR5522402
001ESR422403
001ESR5522404
;;;;
You can obviously do that, but what about using the MINID/MAXID options in PROC SUMMARY?
Examples of both here:
https://communities.sas.com/t5/SAS-Programming/Proc-SQL-MINID-equivalent/td-p/311628
@chetan3125 wrote:
I have a data as below and want to flag records with following criteria.
1. Max and minimum value within each labtest and if there are multiple records then flag record with latest date.
I am able to do that with following algorithm but looking for more efficient method.
1. get Max and min value with proc sql step
2. merge data from step 1 with "have" data and if value equals max and min value then flag record as 1.
3. subset data from above step to get only records with max=1. Then use data step by group processing to choose last.labtest and flag the max record with latest date as Y.
4. Repeat step 3 for flagging minimum value records.
DATA WORK.Book3; LENGTH Subject $ 3 Labtest $ 3 value 8 date 8 ; FORMAT Subject $CHAR3. Labtest $CHAR3. value BEST12. date DATE9. ; INFORMAT Subject $CHAR3. Labtest $CHAR3. value BEST12. date DATE9. ; INFILE DATALINES4 DLM='7F'x MISSOVER DSD ; INPUT Subject : $CHAR3. Labtest : $CHAR3. value : BEST32. date : BEST32. ; DATALINES4; 001RBC1522397 001RBC1222398 001RBC1522399 001WBC922400 001WBC1022401 001ESR5522402 001ESR422403 001ESR5522404 ;;;;
I think your methodology would work but it's also overly complex. A sort seems to work just fine and I think it replicates your logic fairly well. Does this work? If not, please explain how in detail with example cases.
PROC SORT DATA=BOOK3;
BY SUBJECT LABTEST DESCENDING VALUE DESCENDING DATE;
RUN;
DATA FLAG_MAX;
SET BOOK3;
BY SUBJECT LABTEST;
IF FIRST.LABTEST THEN FLAG='Y';
RUN;
PROC SORT DATA=FLAG_MAX;
BY SUBJECT LABTEST DATE;
RUN;
PROC PRINT DATA=FLAG_MAX;
RUN;
These are the results:
Obs Subject Labtest value date FLAG 1 001 ESR 55 02MAY2021 2 001 ESR 4 03MAY2021 3 001 ESR 55 04MAY2021 Y 4 001 RBC 15 27APR2021 5 001 RBC 12 28APR2021 6 001 RBC 15 29APR2021 Y 7 001 WBC 9 30APR2021 8 001 WBC 10 01MAY2021 Y
PROC SORT DATA=BOOK3;
BY SUBJECT LABTEST DESCENDING VALUE DESCENDING DATE;
RUN;
DATA FLAG_MAX;
SET BOOK3;
BY SUBJECT LABTEST;
IF FIRST.LABTEST THEN MAX_FLAG='Y';
RUN;
PROC SORT DATA=FLAG_MAX;
BY SUBJECT LABTEST VALUE DESCENDING DATE;
RUN;
DATA FLAG_MAX_MIN;
SET FLAG_MAX;
BY SUBJECT LABTEST;
IF FIRST.LABTEST THEN MIN_FLAG='Y';
RUN;
PROC SORT DATA=FLAG_MAX_MIN;
BY SUBJECT LABTEST DATE;
RUN;
PROC PRINT DATA=FLAG_MAX_MIN;
RUN;
Would you please show us the expected result from your (very nicely posted) example dataset?
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.