I have a mapping specification for an analysis flag
Flag any data required for safety summaries. Set ANL01FL to "Y" for: All scheduled visits from Week 1 to Week 16; Data recorded under the Week 20 visit. If multiple records satisfy these conditions within a single analysis visit, select only the latest occurring visit to set ANL01FL to "Y". Scheduled visits are the visits which will be included in the summaries for this particular type of data. |
This is a mock up sample of data for a subject
USUBJID | VISIT | VISITNUM | PARAMCD | PARAM |
1001 | Screening | 1 | BMI | Body Mass Index (kg/m2) |
1001 | Screening | 1 | DIABP | Diastolic Blood Pressure (mmHg) |
1001 | Day 1 | 2 | DIABP | Diastolic Blood Pressure (mmHg) |
1001 | Week 2 | 4 | DIABP | Diastolic Blood Pressure (mmHg) |
1001 | Week 4 | 5 | DIABP | Diastolic Blood Pressure (mmHg) |
1001 | Week 8 | 6 | DIABP | Diastolic Blood Pressure (mmHg) |
1001 | Week 12 | 7 | DIABP | Diastolic Blood Pressure (mmHg) |
1001 | Screening | 1 | HEIGHT | Height (cm) |
1001 | Screening | 1 | HR | Heart Rate (beats/min) |
1001 | Day 1 | 2 | HR | Heart Rate (beats/min) |
1001 | Week 2 | 4 | HR | Heart Rate (beats/min) |
1001 | Week 4 | 5 | HR | Heart Rate (beats/min) |
1001 | Week 8 | 6 | HR | Heart Rate (beats/min) |
1001 | Week 12 | 7 | HR | Heart Rate (beats/min) |
1002 | Screening | 1 | BMI | Body Mass Index (kg/m2) |
1002 | Screening | 1 | DIABP | Diastolic Blood Pressure (mmHg) |
1002 | Screening | 1 | HEIGHT | Height (cm) |
1002 | Screening | 1 | HR | Heart Rate (beats/min) |
1002 | Screening | 1 | RESP | Respiratory Rate (breaths/min) |
1002 | Screening | 1 | SYSBP | Systolic Blood Pressure (mmHg) |
1002 | Screening | 1 | TEMP | Temperature (C) |
1002 | Screening | 1 | WEIGHT | Weight (kg) |
I was trying to use a dow loop to assign a flag to the latest visit
proc sort data=advs out=stage1; by usubjid paramcd avalc visitnum; run;
data want(rename=(_anl01fl = anl01fl));
do _n_ = 1 by 1 until (last.visitnum);
set stage1;
by usubjid paramcd avalc visitnum;
length _anl01fl $2;
_anl01fl = 'y';
end;
drop anl01fl;
run;
Anyone have any efficient recommendations/alternatives that doesn't involve too many proc sort/data steps?
I think this is what you want
data have;
input USUBJID $ VISIT :$20. VISITNUM PARAMCD :$10. PARAM :$50.;
infile datalines dlm = '|';
datalines;
1001|Screening|1|BMI |Body Mass Index (kg/m2)
1001|Screening|1|DIABP |Diastolic Blood Pressure (mmHg)
1001|Day 1 |2|DIABP |Diastolic Blood Pressure (mmHg)
1001|Week 2 |4|DIABP |Diastolic Blood Pressure (mmHg)
1001|Week 4 |5|DIABP |Diastolic Blood Pressure (mmHg)
1001|Week 8 |6|DIABP |Diastolic Blood Pressure (mmHg)
1001|Week 12 |7|DIABP |Diastolic Blood Pressure (mmHg)
1001|Screening|1|HEIGHT|Height (cm)
1001|Screening|1|HR |Heart Rate (beats/min)
1001|Day 1 |2|HR |Heart Rate (beats/min)
1001|Week 2 |4|HR |Heart Rate (beats/min)
1001|Week 4 |5|HR |Heart Rate (beats/min)
1001|Week 8 |6|HR |Heart Rate (beats/min)
1001|Week 12 |7|HR |Heart Rate (beats/min)
1002|Screening|1|BMI |Body Mass Index (kg/m2)
1002|Screening|1|DIABP |Diastolic Blood Pressure (mmHg)
1002|Screening|1|HEIGHT|Height (cm)
1002|Screening|1|HR |Heart Rate (beats/min)
1002|Screening|1|RESP |Respiratory Rate (breaths/min)
1002|Screening|1|SYSBP |Systolic Blood Pressure (mmHg)
1002|Screening|1|TEMP |Temperature (C)
1002|Screening|1|WEIGHT|Weight (kg)
;
proc sort data = have;
by USUBJID VISITNUM;
run;
data want(drop = r week);
set have;
by USUBJID VISITNUM;
if first.VISITNUM then r = 0;
week = ifn(find(visit, 'Week'), compress(visit, , 'kd'), .);
if week in (1 : 16, 20) then r = 1;
if last.VISITNUM and r = 1 then ANL01FL = 'Y';
retain r;
run;
Result:
USUBJID VISIT VISITNUM PARAMCD PARAM ANL01FL 1001 Screening 1 BMI Body Mass Index (kg/m2) 1001 Screening 1 DIABP Diastolic Blood Pressure (mmHg) 1001 Screening 1 HEIGHT Height (cm) 1001 Screening 1 HR Heart Rate (beats/min) 1001 Day 1 2 DIABP Diastolic Blood Pressure (mmHg) 1001 Day 1 2 HR Heart Rate (beats/min) 1001 Week 2 4 DIABP Diastolic Blood Pressure (mmHg) 1001 Week 2 4 HR Heart Rate (beats/min) Y 1001 Week 4 5 DIABP Diastolic Blood Pressure (mmHg) 1001 Week 4 5 HR Heart Rate (beats/min) Y 1001 Week 8 6 DIABP Diastolic Blood Pressure (mmHg) 1001 Week 8 6 HR Heart Rate (beats/min) Y 1001 Week 12 7 DIABP Diastolic Blood Pressure (mmHg) 1001 Week 12 7 HR Heart Rate (beats/min) Y 1002 Screening 1 BMI Body Mass Index (kg/m2) 1002 Screening 1 DIABP Diastolic Blood Pressure (mmHg) 1002 Screening 1 HEIGHT Height (cm) 1002 Screening 1 HR Heart Rate (beats/min) 1002 Screening 1 RESP Respiratory Rate (breaths/min) 1002 Screening 1 SYSBP Systolic Blood Pressure (mmHg) 1002 Screening 1 TEMP Temperature (C) 1002 Screening 1 WEIGHT Weight (kg)
YOur project description talks about variables ANL01FL and ABLFL which don't exist. Mark me down as "confused".
Nevertheless, if you want to keep the last record where something appears, assign sequential numbers to the rows, then sort by DESCENDING sequential numbers and keep the first.
So, given this data, only visitnum 7 should be flagged for USUBJID 1001, right?
And should both obs for visitnum 7 be flagged?
I think this is what you want
data have;
input USUBJID $ VISIT :$20. VISITNUM PARAMCD :$10. PARAM :$50.;
infile datalines dlm = '|';
datalines;
1001|Screening|1|BMI |Body Mass Index (kg/m2)
1001|Screening|1|DIABP |Diastolic Blood Pressure (mmHg)
1001|Day 1 |2|DIABP |Diastolic Blood Pressure (mmHg)
1001|Week 2 |4|DIABP |Diastolic Blood Pressure (mmHg)
1001|Week 4 |5|DIABP |Diastolic Blood Pressure (mmHg)
1001|Week 8 |6|DIABP |Diastolic Blood Pressure (mmHg)
1001|Week 12 |7|DIABP |Diastolic Blood Pressure (mmHg)
1001|Screening|1|HEIGHT|Height (cm)
1001|Screening|1|HR |Heart Rate (beats/min)
1001|Day 1 |2|HR |Heart Rate (beats/min)
1001|Week 2 |4|HR |Heart Rate (beats/min)
1001|Week 4 |5|HR |Heart Rate (beats/min)
1001|Week 8 |6|HR |Heart Rate (beats/min)
1001|Week 12 |7|HR |Heart Rate (beats/min)
1002|Screening|1|BMI |Body Mass Index (kg/m2)
1002|Screening|1|DIABP |Diastolic Blood Pressure (mmHg)
1002|Screening|1|HEIGHT|Height (cm)
1002|Screening|1|HR |Heart Rate (beats/min)
1002|Screening|1|RESP |Respiratory Rate (breaths/min)
1002|Screening|1|SYSBP |Systolic Blood Pressure (mmHg)
1002|Screening|1|TEMP |Temperature (C)
1002|Screening|1|WEIGHT|Weight (kg)
;
proc sort data = have;
by USUBJID VISITNUM;
run;
data want(drop = r week);
set have;
by USUBJID VISITNUM;
if first.VISITNUM then r = 0;
week = ifn(find(visit, 'Week'), compress(visit, , 'kd'), .);
if week in (1 : 16, 20) then r = 1;
if last.VISITNUM and r = 1 then ANL01FL = 'Y';
retain r;
run;
Result:
USUBJID VISIT VISITNUM PARAMCD PARAM ANL01FL 1001 Screening 1 BMI Body Mass Index (kg/m2) 1001 Screening 1 DIABP Diastolic Blood Pressure (mmHg) 1001 Screening 1 HEIGHT Height (cm) 1001 Screening 1 HR Heart Rate (beats/min) 1001 Day 1 2 DIABP Diastolic Blood Pressure (mmHg) 1001 Day 1 2 HR Heart Rate (beats/min) 1001 Week 2 4 DIABP Diastolic Blood Pressure (mmHg) 1001 Week 2 4 HR Heart Rate (beats/min) Y 1001 Week 4 5 DIABP Diastolic Blood Pressure (mmHg) 1001 Week 4 5 HR Heart Rate (beats/min) Y 1001 Week 8 6 DIABP Diastolic Blood Pressure (mmHg) 1001 Week 8 6 HR Heart Rate (beats/min) Y 1001 Week 12 7 DIABP Diastolic Blood Pressure (mmHg) 1001 Week 12 7 HR Heart Rate (beats/min) Y 1002 Screening 1 BMI Body Mass Index (kg/m2) 1002 Screening 1 DIABP Diastolic Blood Pressure (mmHg) 1002 Screening 1 HEIGHT Height (cm) 1002 Screening 1 HR Heart Rate (beats/min) 1002 Screening 1 RESP Respiratory Rate (breaths/min) 1002 Screening 1 SYSBP Systolic Blood Pressure (mmHg) 1002 Screening 1 TEMP Temperature (C) 1002 Screening 1 WEIGHT Weight (kg)
That's great thank you so much!. The only thing I would change is that, to me and how I'm interpreting the specs, Day 1 is technically under Week 1.
Anytime. Should be easy to edit
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.