BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
awardell
Obsidian | Level 7

Hi All, 

 

I am working on a data set that has PatientID, Visit_Date, and Age. I need to check data quality by flagging those who have a decrease in age from an earlier visit_date to a later visit_date, as this is not possible. My data set looks like the following example: 

PatientIDVisit_DateAge
11/2/200325
12/4/200523
13/7/200626
23/8/200350
25/12/200451
25/13/200552
25/14/200653
34/6/202072
312/7/202065

 

where observations in PatientID 1 and 3 would need to be flagged. How should I go about doing this? Could I use the lag function? I am fairly unfamiliar with the lag function, so any help would be greatly appreciated! Thank you in advance! 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

One way using DIF function:

 

Data want;
   set have;
   ageflag = (dif(age)<0);
   by patientid;
   if first.patientid then ageflag=.;
run;

DIF returns current value - value on previous record. SAS will return 1 for true and 0 for false, so the comparison will set a flag of 1 when the value on the current record is less than for the previous.

Since you wouldn't want to compare one patient's age with another patient then the first patientid sets the flag to missing (could use 0 but missing is more appropriate in many cases).

The above solution assumes the data is sorted by patientid and date of visit.

If not sorted by patientid but grouped you could use NOTSORTED on the BY patientid statement.

 

DIF is closely related to LAG but since it is very common to do what you requested, the difference between current and previous, SAS supplies the DIF function as well.

View solution in original post

2 REPLIES 2
ballardw
Super User

One way using DIF function:

 

Data want;
   set have;
   ageflag = (dif(age)<0);
   by patientid;
   if first.patientid then ageflag=.;
run;

DIF returns current value - value on previous record. SAS will return 1 for true and 0 for false, so the comparison will set a flag of 1 when the value on the current record is less than for the previous.

Since you wouldn't want to compare one patient's age with another patient then the first patientid sets the flag to missing (could use 0 but missing is more appropriate in many cases).

The above solution assumes the data is sorted by patientid and date of visit.

If not sorted by patientid but grouped you could use NOTSORTED on the BY patientid statement.

 

DIF is closely related to LAG but since it is very common to do what you requested, the difference between current and previous, SAS supplies the DIF function as well.

awardell
Obsidian | Level 7

Thank you so much! This worked perfectly!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 260 views
  • 0 likes
  • 2 in conversation