Hi Team,
I am working to assign flag values for below data ,as below are rules
> if three row values for same date , same visit and two values 'Normal' and one values 'Abnormal' then latest record will flag as 'Y'
> in three row records all are same then latest/last record will flag as 'Y'
> in three results first one is normal and second one is abnormal and third one is Normal then , third normal values will flag 'Y'
> if first two values are abnormal and last values is normal then second value abnormal will flag 'Y'
Current data :
Subjid | Visit | Date | id | Result |
101 | Visit1 | 10-Sep-24 | 1 | Normal |
101 | Visit1 | 10-Sep-24 | 2 | Normal |
101 | Visit1 | 10-Sep-24 | 3 | Abnormal |
102 | Visiti2 | 13-Jun-24 | 1 | Normal |
102 | Visiti2 | 13-Jun-24 | 2 | Abnormal |
102 | Visiti2 | 13-Jun-24 | 3 | Normal |
103 | Visit1 | 02-Mar-24 | 1 | Abnormal |
103 | Visit1 | 02-Mar-24 | 2 | Normal |
103 | Visit1 | 02-Mar-24 | 3 | Abnormal |
104 | Visit1 | 23-Feb-24 | 1 | Abnormal |
104 | Visit1 | 23-Feb-24 | 2 | Abnormal |
104 | Visit1 | 23-Feb-24 | 3 | Normal |
105 | Visit1 | 18-Jan-24 | 1 | Normal |
105 | Visit1 | 18-Jan-24 | 2 | Normal |
105 | Visit1 | 18-Jan-24 | 3 | Normal |
105 | Visit2 | 25-Jan-24 | 1 | Abnormal |
105 | Visit2 | 25-Jan-24 | 2 | Abnormal |
105 | Visit2 | 25-Jan-24 | 3 | Abnormal |
want assign flag values as below
Subjid | Visit | Date | id | Result | Flag |
101 | Visit1 | 10-Sep-24 | 1 | Normal | |
101 | Visit1 | 10-Sep-24 | 2 | Normal | Y |
101 | Visit1 | 10-Sep-24 | 3 | Abnormal | |
102 | Visiti2 | 13-Jun-24 | 1 | Normal | |
102 | Visiti2 | 13-Jun-24 | 2 | Abnormal | |
102 | Visiti2 | 13-Jun-24 | 3 | Normal | Y |
103 | Visit1 | 02-Mar-24 | 1 | Abnormal | |
103 | Visit1 | 02-Mar-24 | 2 | Normal | |
103 | Visit1 | 02-Mar-24 | 3 | Abnormal | Y |
104 | Visit1 | 23-Feb-24 | 1 | Abnormal | |
104 | Visit1 | 23-Feb-24 | 2 | Abnormal | Y |
104 | Visit1 | 23-Feb-24 | 3 | Normal | |
105 | Visit1 | 18-Jan-24 | 1 | Normal | |
105 | Visit1 | 18-Jan-24 | 2 | Normal | |
105 | Visit1 | 18-Jan-24 | 3 | Normal | Y |
105 | Visit2 | 25-Jan-24 | 1 | Abnormal | |
105 | Visit2 | 25-Jan-24 | 2 | Abnormal | |
105 | Visit2 | 25-Jan-24 | 3 | Abnormal | Y |
Thank you,
Raja.
First, I recreate the input data set:
data have;
infile datalines dsd dlm='09'x;
input Subjid:3. Visit:$6. Date:ANYDTDTE9. id:1. Result:$10.;
format Date YYMMDD10.;
datalines;
101 Visit1 10-Sep-24 1 Normal
101 Visit1 10-Sep-24 2 Normal
101 Visit1 10-Sep-24 3 Abnormal
102 Visit2 13-Jun-24 1 Normal
102 Visit2 13-Jun-24 2 Abnormal
102 Visit2 13-Jun-24 3 Normal
103 Visit1 02-Mar-24 1 Abnormal
103 Visit1 02-Mar-24 2 Normal
103 Visit1 02-Mar-24 3 Abnormal
104 Visit1 23-Feb-24 1 Abnormal
104 Visit1 23-Feb-24 2 Abnormal
104 Visit1 23-Feb-24 3 Normal
105 Visit1 18-Jan-24 1 Normal
105 Visit1 18-Jan-24 2 Normal
105 Visit1 18-Jan-24 3 Normal
105 Visit2 25-Jan-24 1 Abnormal
105 Visit2 25-Jan-24 2 Abnormal
105 Visit2 25-Jan-24 3 Abnormal
;
I'd do this using temporary arrays to store values for the whole group before deciding where to put the flag. Temp arrays don't use PDV variables to store values, so you don't have to worry about PDV initialization wiping them out. Like this:
data want;
/* Temporary arrays will retain values from all records in a group */
array i[3] _temporary_;
array r[3] $8 _temporary_;
set have;
by SubjID Visit Date;
Length Flag $1;
/* Start of new group - make array and flag values missing */
if first.Date then do;
call missing(of r[*],Flag);
end;
/* Stash current record values in the arrays */
i[id]=id;
r[id]=Result;
/* When we have them all, process and write output */
if last.Date then do;
/* First record is never flagged */
id=i[1];result=r[1];output;
/* If all 3 results are the same, flag the last */
if r[1]=r[2] and r[1]=r[3] then do;
id=i[2];result=r[2];output;
Flag='Y';
id=i[3];result=r[3];output;
end;
/* If results 1 and 2 are the same, flag the second */
else if r[1]=r[2] then do;
Flag='Y';
id=i[2];result=r[2];output;
call missing(Flag);
id=i[3];result=r[3];output;
end;
/* If results 1 and 3 are the same, flag the last */
else if r[1]=r[3] then do;
id=i[2];result=r[2];output;
Flag='Y';
id=i[3];result=r[3];output;
end;
end;
run;
And, violá!
Obs | Subjid | Visit | Date | id | Result | Flag |
---|---|---|---|---|---|---|
1 | 101 | Visit1 | 2024-09-10 | 1 | Normal | |
2 | 101 | Visit1 | 2024-09-10 | 2 | Normal | Y |
3 | 101 | Visit1 | 2024-09-10 | 3 | Abnormal | |
4 | 102 | Visit2 | 2024-06-13 | 1 | Normal | |
5 | 102 | Visit2 | 2024-06-13 | 2 | Abnormal | |
6 | 102 | Visit2 | 2024-06-13 | 3 | Normal | Y |
7 | 103 | Visit1 | 2024-03-02 | 1 | Abnormal | |
8 | 103 | Visit1 | 2024-03-02 | 2 | Normal | |
9 | 103 | Visit1 | 2024-03-02 | 3 | Abnormal | Y |
10 | 104 | Visit1 | 2024-02-23 | 1 | Abnormal | |
11 | 104 | Visit1 | 2024-02-23 | 2 | Abnormal | Y |
12 | 104 | Visit1 | 2024-02-23 | 3 | Normal | |
13 | 105 | Visit1 | 2024-01-18 | 1 | Normal | |
14 | 105 | Visit1 | 2024-01-18 | 2 | Normal | |
15 | 105 | Visit1 | 2024-01-18 | 3 | Normal | Y |
16 | 105 | Visit2 | 2024-01-25 | 1 | Abnormal | |
17 | 105 | Visit2 | 2024-01-25 | 2 | Abnormal | |
18 | 105 | Visit2 | 2024-01-25 | 3 | Abnormal | Y |
Hi ,
Thank you for resolution, however i am very poor to understanding the array in SAS.
Is any other way to do the same task with out array.
Thank you,
Raja.
Hi,
Alternatively, it can be solved via BY/GROUP processing. For that data should be sorted by subject, visit, date and result. When more than one results are the same for sorted variables, then LAST.RESULT is flagged.
Using data from @SASJedi .
proc sort data=have;
by subjid visit date result;
run;
data want;
set have;
by subjid visit date result;
if last.result and not first.result then Flag='Y';
run;
proc sort data=want;
by subjid visit date id result;
run;
Hi,
Thank you for solution , but in my data some subjects have only one record and those not flagged see below screen shot
Thank you,
Raja.
Then FIRST.variable and LAST.variable condition should be extended to related variables. If there is one record per subject, then if first.subject and last.subject condition works. If there are 2 visits per subject and each has one result, then if first.visit and last.visit condition works etc..
For your given example data in the screenshot, the first condition above works.
Try this:
if first.subjid and last.subjid then flag='Y';
else if first.visit and last.visit then flag='Y';
else if last.result and not first.result then flag='Y';
@raja777pharma wrote:
Hi,
Thank you for solution , but in my data some subjects have only one record and those not flagged
Please note that your original post does not provide any requirements or rules for what is to happen if a subject has only one visit, only two visits ore more than 3 visits.
So what are the rules for those?
@raja777pharma wrote:
Hi ,
Thank you for resolution, however i am very poor to understanding the array in SAS.
Is any other way to do the same task with out array.
Thank you,
Raja.
As arrays are among the most useful and important elements of DATA step programming, this simply means you have to learn their use. NOW.
See Maxim 13.
data have;
infile cards expandtabs;
input Subjid Visit $ Date :date11. id Result $;
format date date11.;
cards;
101 Visit1 10-Sep-24 1 Normal
101 Visit1 10-Sep-24 2 Normal
101 Visit1 10-Sep-24 3 Abnormal
102 Visiti2 13-Jun-24 1 Normal
102 Visiti2 13-Jun-24 2 Abnormal
102 Visiti2 13-Jun-24 3 Normal
103 Visit1 02-Mar-24 1 Abnormal
103 Visit1 02-Mar-24 2 Normal
103 Visit1 02-Mar-24 3 Abnormal
104 Visit1 23-Feb-24 1 Abnormal
104 Visit1 23-Feb-24 2 Abnormal
104 Visit1 23-Feb-24 3 Normal
105 Visit1 18-Jan-24 1 Normal
105 Visit1 18-Jan-24 2 Normal
105 Visit1 18-Jan-24 3 Normal
105 Visit2 25-Jan-24 1 Abnormal
105 Visit2 25-Jan-24 2 Abnormal
105 Visit2 25-Jan-24 3 Abnormal
;
data want;
do until(last.date);
set have;
by subjid date;
if id=1 and Result='Normal' then first_normal=1;
if id=1 and Result='Abnormal' then first_abnormal=1;
if id=2 and Result='Normal' then second_normal=1;
if id=2 and Result='Abnormal' then second_abnormal=1;
if id=3 and Result='Normal' then third_normal=1;
if id=3 and Result='Abnormal' then third_abnormal=1;
end;
do until(last.date);
set have;
by subjid date;
if first_normal and second_normal and third_abnormal and id=3 then flag='Y';
if ((first_normal and second_normal and third_normal) or
(first_abnormal and second_abnormal and third_abnormal)) and id=3 then flag='Y';
if first_normal and second_abnormal and third_normal and id=3 then flag='Y';
if first_abnormal and second_abnormal and third_normal and id=2 then flag='Y';
output;
call missing(flag);
end;
drop first: second: third:;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.