BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
112211
Obsidian | Level 7

data vs ;
input patient cpevent$ vstest$ vsren vsorresu$;
cards ;
100 base pr 78 min
100 scrn hr 72 /min
100 week1 sbp 70 mmhg
100 week7 dbp 110 mm
100 week21 weight 75 kg

100 fwp   height 120 kg
100 base pr 78 min
100 scrn hr 79 /mn
100 week1 sbp 70 mmhg
100 week7 dbp 110 mm
100 week21 weight 80 pounds

100 fwp height 75 kg
;

Condition below 

If VSTEST having same VSREN and VSORRESU for 4 consecutive visits then highlight those with flag as 

"dulipcate record" 

 

Based on above condition  and data , create a flag 

Please anyone help me .....

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Below uses amended sample data so it contains at least one case with 4 repeated values. I've also added a variable visit_date. I do hope that your real data contains sufficient data to ensure your analysis runs against data in the required sort order.

It would also help if you show us the desired result as this will clarify which rows you would want flagged (all of them or only from the 4th repetition onwards). 

data have ;
  infile datalines dsd dlm=' ' truncover;
  input patient cpevent$ vstest$ vsren vsorresu$;
  visit_date='01jan2024'd+_n_;
  format visit_date date9.;
cards ;
100 base pr 78 min
100 base pr 78 min
100 scrn hr 72 /min
100 week1 sbp 70 mmhg
100 week7 dbp 110 mm
100 week21 weight 75 kg
100 fwp height 120 kg
100 base pr 78 min
100 base4 pr 78 min
100 base1 pr 78 min
100 base6 pr 78 min
100 base pr 78 min
100 scrn hr 79 /mn
100 week1 sbp 70 mmhg
100 week7 dbp 110 mm
100 week21 weight 80 pounds
100 fwp height 75 kg
;

data want;
  set have;
  by patient visit_date;
  if not first.patient and lag(vsren)=vsren and lag(vsorresu)=vsorresu then occurrence_cnt+1;
  else occurrence_cnt=1;
  multi_flg= (occurrence_cnt>=4);
run;

proc print data=want;
run;

View solution in original post

4 REPLIES 4
Quentin
Super User

Please update your question to show the output you would want from the sample data you posted. Also please show the code you have tried.  Both of those will help people help you.  You might want to consider updating your sample data as well.  If I'm understanding what you want, the sample data will not result in any duplicates being flagged.

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
112211
Obsidian | Level 7
data vs_flag;
set vs;
by patient;

retain count_vs ;
retain prev_vsresn .;
retain prev_vsorresu ;

if first.patient then do;
count_vs = 1;
prev_vsresn = vsresn;
prev_vsorresu = vsorresu;
end;
else do;
if vsresn = prev_vsresn and vsorresu = prev_vsorresu then count_vs + 1;
else do;
count_vs = 1;
prev_vsresn = vsresn;
prev_vsorresu = vsorresu;
end;
end;

if count_vs >= 3 then flag = "duplicate record";
else flag = "";

/* drop count_vs prev_vsresn prev_vsorresu;*/
run;
Patrick
Opal | Level 21

Below uses amended sample data so it contains at least one case with 4 repeated values. I've also added a variable visit_date. I do hope that your real data contains sufficient data to ensure your analysis runs against data in the required sort order.

It would also help if you show us the desired result as this will clarify which rows you would want flagged (all of them or only from the 4th repetition onwards). 

data have ;
  infile datalines dsd dlm=' ' truncover;
  input patient cpevent$ vstest$ vsren vsorresu$;
  visit_date='01jan2024'd+_n_;
  format visit_date date9.;
cards ;
100 base pr 78 min
100 base pr 78 min
100 scrn hr 72 /min
100 week1 sbp 70 mmhg
100 week7 dbp 110 mm
100 week21 weight 75 kg
100 fwp height 120 kg
100 base pr 78 min
100 base4 pr 78 min
100 base1 pr 78 min
100 base6 pr 78 min
100 base pr 78 min
100 scrn hr 79 /mn
100 week1 sbp 70 mmhg
100 week7 dbp 110 mm
100 week21 weight 80 pounds
100 fwp height 75 kg
;

data want;
  set have;
  by patient visit_date;
  if not first.patient and lag(vsren)=vsren and lag(vsorresu)=vsorresu then occurrence_cnt+1;
  else occurrence_cnt=1;
  multi_flg= (occurrence_cnt>=4);
run;

proc print data=want;
run;
Tom
Super User Tom
Super User

The only way this makes any sense if you process it BY the PATIENT and the VSTEST (vital sign test name).  Otherwise you might be comparing Systolic BP readings to Diastolic BP readings and flagging false repetitions.

 

So you will need to sort your data, which means you need a sortable variable.  Normally that would be some type of visit date or reading date for each observation.  For the purpose of an example let's just assume your example data is already in order and just number the observations to create an ordering variable, let's call it ROW.

data vs ;
  row+1;
  input patient cpevent$ vstest$ vsren vsorresu$;
cards ;
100 base pr 78 min
100 scrn hr 72 /min
100 week1 sbp 70 mmhg
100 week7 dbp 110 mm
100 week21 weight 75 kg
100 fwp   height 120 kg
100 base pr 78 min
100 scrn hr 79 /mn
100 week1 sbp 70 mmhg
100 week7 dbp 110 mm
100 week21 weight 80 pounds
100 fwp height 75 kg
;

Now we can sort the data into an order than can be used to find the repetitious values.

proc sort data=vs ;
  by patient vstest row ;
run;

Then we can use a double DOW loop to count how many repetitions, set your duplicate FLAG, and then re-read the data so we can output the FLAG on every observation in the group.

data want;
  do reps=1 by 1 until(last.vsorresu);
    set vs;
    by patient vstest vsren vsorresu notsorted;
  end;
  dup_flag = (reps>= 4);
  do _n_=1 to reps;
    set vs;
    output;
  end;
run;

None of the observations in your data meet the 4 repeated values criteria.  The most reps is only 2.  So if we change the criteria to look for 2 or more reps:

  dup_flag = (reps>= 2);

We get this result:

Tom_0-1715086780063.png

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 340 views
  • 0 likes
  • 4 in conversation