BookmarkSubscribeRSS Feed
RVA
Fluorite | Level 6 RVA
Fluorite | Level 6

Dear all

 

I have the following data.  What do I need to do to flag rows 1 and 2 as they indeed happen at the same days and time (overlapping)?  I have been thinking but still cannot come up with SAS scripts to flag those two rows.  Any help is much appreciated.

 

Course section Meeting_days Beginning_time Ending_time
CHEM 100 001 MWF 13:00 13:45
CHEM 100 007 MWF 13:15 13:45
CHEM 100 010 MWF 14:00 16:00
CHEM 100 021 MWF 9:00 10:00
6 REPLIES 6
novinosrin
Tourmaline | Level 20

Hi @RVA  Thank you for posting the input sample. What is the corresponding output sample for the input sample?

blueroad
Fluorite | Level 6

brute force method - here is your algorithm (no SAS code)

 

make a data structure for every minute for the days of interest

mo0001-mo2359

tu0000-tu2359

 

if your data is limited to say 800 to 1800 your minutes array can be smaller

 

set all values to zero

take each row - use the start and end time to add 100 to each minute value

 

once you have processed all your rows, any values of your minute values GT 100 have overlaps

how many rows do you expect ?  put an integer on each row and add that to your minute flags

 

if I get a free hour or so I may be able to come up with some code

I recommend not using "scripts" to describe your work in SAS - use programs or code.

IMHO it "cheapens" your efforts.

 

Bob B, Somerset NJ

 

 

KachiM
Rhodochrosite | Level 12

Hi @RVA ,

 

It looks that the comparison has to be made with all records to decide the match. Here is a way where the records are matched and if found to have overlapping records then their RID (Record IDs in terms of _N_) is output. I have modified your input data with some more overlapping records to verify my code. Here is the augmented data set and the program. The approach is to note the Beginning, Ending times and the RID for the first record. The subsequent records are then compared with the previous values and if there is an overlap then the corresponding RIDs are output. If there is no overlap, the current record will be stored as previous values, prevRid taking the current _N_.

 

CHEM100  001 MWF 13:00 13:45
CHEM100  007 MWF 13:15 13:45
CHEM100  007 MWF 13:35 13:40
CHEM100  010 MWF 14:00 16:00
CHEM100  010 MWF 15:00 15:30
CHEM100  021 MWF  9:00 10:00
CHEM100  025 MWF  9:30 10:00

data want;
   do i = 1 by 1 until(eof);
      set have end = eof;
      if prevB = . then do; 
         prevB = Beginning_time;
         prevE = Ending_time;
         prevRid = i; 
      end;
      else do;
         if prevB <= Beginning_time and prevE >= Ending_time then do;
            nextRid = i;
            output;
         end;
         else do;
            prevB = Beginning_time;
            prevE = Ending_time;
            prevRid = i;
         end;
      end;
   end;
run;

The output is:

 

Capture_01.JPG

 

KachiM
Rhodochrosite | Level 12

Hi @RVA,

 

My previous program assumes that the overlapping records are placed one-after the other. What if the overlapping records are randomly placed?

 

If the overlapping record happens to fall anywhere (not to next) then my previous code can be revised using POINT= option of SET statement. The algorithm is to to select the i-th record and compare it with all records starting from i+1, until the end of the data set. If there is a overlap, then output the Record IDs of both the records.

 

I have reordered some of the overlapping records as shown below:

 

CHEM100  001 MWF 13:00 13:45
CHEM100  007 MWF 13:35 13:40
CHEM100  010 MWF 14:00 16:00
CHEM100  021 MWF  9:00 10:00
CHEM100  025 MWF  9:30 10:00
CHEM100  007 MWF 13:15 13:45
CHEM100  010 MWF 15:00 15:30

The revised program:

 

data want;
   do i = 1 to n - 1;
      set have nobs = n point = i; 
         prevB = Beginning_time;
         prevE = Ending_time;
         prevRid = i; 

      do j = i + 1 to n;
         set have point = j;
         if prevB <= Beginning_time and prevE >= Ending_time then do;
            nextRid = j;
            output;
         end;
      end;
   end;
stop;
run;

The OUPUT Is:

 

Capture_01.JPG

 

ed_sas_member
Meteorite | Level 14

Hi @RVA 

 

Here is my attempt. Let me know!

data have;
	input Course_section $ 1-12 Meeting_days $ 14-16 Beginning_time Ending_time;
	informat Beginning_time Ending_time time10.;
	format Beginning_time Ending_time time10.;
	datalines;
CHEM 100 001 MWF 13:00 13:45
CHEM 100 007 MWF 13:15 13:45
CHEM 100 010 MWF 14:00 16:00
CHEM 100 021 MWF 9:00  10:00
; run; proc sort data=have; by Meeting_days Beginning_time; run; data have2; set have; by Meeting_days; format _lag time10.; _lag = lag(Ending_time); if first.Meeting_days then _lag=.; if _lag >= Beginning_time then flag = 1; else flag=0; drop _lag; run; data want; merge have2 have2 (firstobs=2 keep=flag rename=(flag=flag_lead)); if flag=0 and flag_lead=1 then flag=1; drop flag_lead; run;

Output:

 

Capture d’écran 2019-12-30 à 14.34.00.png

RVA
Fluorite | Level 6 RVA
Fluorite | Level 6

Dear all

 

Thank you so much for your help.  Let me plug your scripts in and run them on my data.  I'll get back if I have any questions.

 

Much appreciated.

 

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2043 views
  • 0 likes
  • 5 in conversation