BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Mercadja
Fluorite | Level 6

Hello Community,

I am attempting to solve this coding issue. I have the dataset below.  The status hour is set based on time in my dataset and then I need to count 24 hours from there to complete an analysis on just those set of hours once they go on status. I am attempting to count 24 hours from the point they go on status. The easiest way is for me to go to the source excel and add the '1' status for the subsequent 24 hours but it is not efficient as I will have many IDs I would have to do this for. So far my attempts count the spots that do not have a status.  Can I please have help either adding 1 to the 23 rows below when status appears or just counting 24 rows with the new variable 'Status hour'.

 

Here is a few things I have tried from other threads to this point but it is not what I am looking for. 

 

Please ask if I can clarify anything and thank you in advance for any and all help.

 

data Want;
set Have;
by ID;
if Status=1
then Status_Hour= 1;
else Status_Hour + 1;
run;


data Want;
set Have;
by ID Status notsorted;
if first.Status then Status_Hour= 0;
Status_Hour + 1;
run;

 

data Want;
set Have;
by ID;

if Status=1 then Status_Hour=1;
if first.ID_ then Status_Hour+1;

run;

 

Have:

IDDateHourStatusStatus Hour
22325726-Jul-231  
22325726-Jul-232  
22325726-Jul-233  
22325726-Jul-234  
22325726-Jul-235  
22325726-Jul-236  
22325726-Jul-237  
22325726-Jul-2381 
22325726-Jul-239  
22325726-Jul-2310  
22325726-Jul-2311  
22325726-Jul-2312  
22325726-Jul-2313  
22325726-Jul-2314  
23270626-Jul-231  
23270626-Jul-232  
23270626-Jul-233  
23270626-Jul-234  
23270626-Jul-235  
23270626-Jul-236  
23270626-Jul-237  
23270626-Jul-238  
23270626-Jul-239  
23270626-Jul-2310  
23270626-Jul-2311  
23270626-Jul-23121 
23270626-Jul-2313  
23270626-Jul-2314  
23270626-Jul-2315  
23270626-Jul-2316  

Want:

IDDateHourStatusStatus Hour
22325726-Jul-231  
22325726-Jul-232  
22325726-Jul-233  
22325726-Jul-234  
22325726-Jul-235  
22325726-Jul-236  
22325726-Jul-237  
22325726-Jul-23811
22325726-Jul-239 2
22325726-Jul-2310 3
22325726-Jul-2311 4
22325726-Jul-2312 5
22325726-Jul-2313 6
22325726-Jul-2314 7
23270626-Jul-231  
23270626-Jul-232  
23270626-Jul-233  
23270626-Jul-234  
23270626-Jul-235  
23270626-Jul-236  
23270626-Jul-237  
23270626-Jul-238  
23270626-Jul-239  
23270626-Jul-2310  
23270626-Jul-2311  
23270626-Jul-231211
23270626-Jul-2313 2
23270626-Jul-2314 3
23270626-Jul-2315 4
23270626-Jul-2316 5
1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Under the assumption you have no missing hours in your data and that I understood your requirement correctly below should work.

data want;
  set have;
  by id date hour;
  
  if first.id or status_hour=24 then status_hour=0;
  else if status=1 then status_hour=1;
  else if status_hour>0 then status_hour+1;

run;

View solution in original post

9 REPLIES 9
ballardw
Super User

Are those "hour" values of 1, 2, 3 actually used in your analysis or other process or are you just showing them as example counters to show the rows selected?

You also do not show a full set of 24. There are a couple ways "count 24 hours" might be interpreted and it helps to have an actual example.

 

Please provide example data in the form of working data step code pasted into a text box such as:

data have;
  infile datalines missover;
  input ID	Date :date10.	Hour	Status;
  format date date9.;
datalines;
223257	26-Jul-23	1	 	 
223257	26-Jul-23	2	 	 
223257	26-Jul-23	3	 	 
223257	26-Jul-23	4	 	 
223257	26-Jul-23	5	 	 
223257	26-Jul-23	6	 	 
223257	26-Jul-23	7	 	 
223257	26-Jul-23	8	1	 
223257	26-Jul-23	9	 	 
223257	26-Jul-23	10	 	 
223257	26-Jul-23	11	 	 
223257	26-Jul-23	12	 	 
223257	26-Jul-23	13	 	 
223257	26-Jul-23	14	 	 
232706	26-Jul-23	1	 	 
232706	26-Jul-23	2	 	 
232706	26-Jul-23	3	 	 
232706	26-Jul-23	4	 	 
232706	26-Jul-23	5	 	 
232706	26-Jul-23	6	 	 
232706	26-Jul-23	7	 	 
232706	26-Jul-23	8	 	 
232706	26-Jul-23	9	 	 
232706	26-Jul-23	10	 	 
232706	26-Jul-23	11	 	 
232706	26-Jul-23	12	1	 
232706	26-Jul-23	13	 	 
232706	26-Jul-23	14	 	 
232706	26-Jul-23	15	 	 
232706	26-Jul-23	16	 	 
;

Is that similar enough to your actual data to serve as a Have example?

 

Mercadja
Fluorite | Level 6

I felt that the post was really long and I shorten it. The data set per ID goes for days several hundreds of rows. The hour being named could be of use but not needed as I can get it there if instead of counting the hours this was solved simply by making the status repeat to the 23 rows below. Whatever is easiest honestly. 

Here is a dataset with all 24 hours included.

IDDateHourStatusStatus Hour
22325726-Jul-231  
22325726-Jul-232  
22325726-Jul-233  
22325726-Jul-234  
22325726-Jul-235  
22325726-Jul-236  
22325726-Jul-237  
22325726-Jul-23811
22325726-Jul-239 2
22325726-Jul-2310 3
22325726-Jul-2311 4
22325726-Jul-2312 5
22325726-Jul-2313 6
22325726-Jul-2314 7
22325726-Jul-2315 8
22325726-Jul-2316 9
22325726-Jul-2317 10
22325726-Jul-2318 11
22325726-Jul-2319 12
22325726-Jul-2320 13
22325726-Jul-2321 14
22325726-Jul-2322 15
22325726-Jul-2323 16
22325727-Jul-231 17
22325727-Jul-232 18
22325727-Jul-233 19
22325727-Jul-234 20
22325727-Jul-235 21
22325727-Jul-236 22
22325727-Jul-237 23
22325727-Jul-238 24
22325727-Jul-239  
22325727-Jul-2310  
223257  
23270626-Jul-231  
23270626-Jul-232  
23270626-Jul-233  
23270626-Jul-234  
23270626-Jul-235  
23270626-Jul-236  
23270626-Jul-237  
23270626-Jul-238  
23270626-Jul-239  
23270626-Jul-2310  
23270626-Jul-2311  
23270626-Jul-231211
23270626-Jul-2313 2
23270626-Jul-2314 3
23270626-Jul-2315 4
23270626-Jul-2316 5
23270626-Jul-2317 6
23270626-Jul-2318 7
23270626-Jul-2319 8
23270626-Jul-2320 9
23270626-Jul-2321 10
23270626-Jul-2322 11
23270626-Jul-2323 12
23270626-Jul-2324 13
23270627-Jul-231 14
23270627-Jul-232 15
23270627-Jul-233 16
23270627-Jul-234 17
23270627-Jul-235 18
23270627-Jul-236 19
23270627-Jul-237 20
23270627-Jul-238 21
23270627-Jul-239 22
23270627-Jul-2310 23
23270627-Jul-2311 24
23270627-Jul-2312  
23270627-Jul-2313  
23270627-Jul-2314  
Mercadja
Fluorite | Level 6

Sorry I am putting it into a dataset now one moment.

 

Here it is. 

data Testing;
input ID Date Hour Status;
format date date9.;
datalines;
223257	26-Jul-23	1		
223257	26-Jul-23	2		
223257	26-Jul-23	3		
223257	26-Jul-23	4		
223257	26-Jul-23	5		
223257	26-Jul-23	6		
223257	26-Jul-23	7		
223257	26-Jul-23	8	1	
223257	26-Jul-23	9		
223257	26-Jul-23	10		
223257	26-Jul-23	11		
223257	26-Jul-23	12		
223257	26-Jul-23	13		
223257	26-Jul-23	14		
223257	26-Jul-23	15		
223257	26-Jul-23	16		
223257	26-Jul-23	17		
223257	26-Jul-23	18		
223257	26-Jul-23	19		
223257	26-Jul-23	20		
223257	26-Jul-23	21		
223257	26-Jul-23	22		
223257	26-Jul-23	23		
223257	27-Jul-23	1		
223257	27-Jul-23	2		
223257	27-Jul-23	3		
223257	27-Jul-23	4		
223257	27-Jul-23	5		
223257	27-Jul-23	6		
223257	27-Jul-23	7		
223257	27-Jul-23	8		
223257	27-Jul-23	9		
223257	27-Jul-23	10		
232706	26-Jul-23	1		
232706	26-Jul-23	2		
232706	26-Jul-23	3		
232706	26-Jul-23	4		
232706	26-Jul-23	5		
232706	26-Jul-23	6		
232706	26-Jul-23	7		
232706	26-Jul-23	8		
232706	26-Jul-23	9		
232706	26-Jul-23	10		
232706	26-Jul-23	11		
232706	26-Jul-23	12	1	
232706	26-Jul-23	13		
232706	26-Jul-23	14		
232706	26-Jul-23	15		
232706	26-Jul-23	16		
232706	26-Jul-23	17		
232706	26-Jul-23	18		
232706	26-Jul-23	19		
232706	26-Jul-23	20		
232706	26-Jul-23	21		
232706	26-Jul-23	22		
232706	26-Jul-23	23		
232706	26-Jul-23	24		
232706	27-Jul-23	1		
232706	27-Jul-23	2		
232706	27-Jul-23	3		
232706	27-Jul-23	4		
232706	27-Jul-23	5		
232706	27-Jul-23	6		
232706	27-Jul-23	7		
232706	27-Jul-23	8		
232706	27-Jul-23	9		
232706	27-Jul-23	10		
232706	27-Jul-23	11		
232706	27-Jul-23	12		
232706	27-Jul-23	13		
232706	27-Jul-23	14		

;
run;

ballardw
Super User

Is there some reason you are excluding midnight from your data? I would expect to see one of

26-Jul-23 24    OR  27-Jul-23 0 as a midnight value for ID 223257?

 

The below code doesn't exactly match your "want" because you don't have a midnight for 223257 and do for the 232706 id. So the definition of "24 hours" differs a bit between the two. If there is data problem with midnight consistently I might look into that.  When midnight is not present your actual time interval between start and end is 25 hours. So something needs to be considered about either the problem requirement or the data condition when starting.

 

You do want to test your data step code before showing it. You didn't apply an informat for date so that throws invalid data messages and without the Infile datalines missover; (or truncover) only every  other line of the data is actually read sort of because of the missing value for STATUS on most of your lines. The option Missover or Truncover will stop reading when the input line runs out. Without it the ID value is read as STATUS for most of the lines and the reading pointer advances at that point.

 

Quite often a process that uses date and time is best done by creating a DATETIME value and appropriate functions with those values because then SAS will take care of those pesky change of date behaviors.

 

My take on your problem (note the corrected data step code).

data Testing;
   infile datalines missover;
input ID Date :date10. Hour Status;
format date date9.;
datalines;
223257	26-Jul-23	1		
223257	26-Jul-23	2		
223257	26-Jul-23	3		
223257	26-Jul-23	4		
223257	26-Jul-23	5		
223257	26-Jul-23	6		
223257	26-Jul-23	7		
223257	26-Jul-23	8	1	
223257	26-Jul-23	9		
223257	26-Jul-23	10		
223257	26-Jul-23	11		
223257	26-Jul-23	12		
223257	26-Jul-23	13		
223257	26-Jul-23	14		
223257	26-Jul-23	15		
223257	26-Jul-23	16		
223257	26-Jul-23	17		
223257	26-Jul-23	18		
223257	26-Jul-23	19		
223257	26-Jul-23	20		
223257	26-Jul-23	21		
223257	26-Jul-23	22		
223257	26-Jul-23	23		
223257	27-Jul-23	1		
223257	27-Jul-23	2		
223257	27-Jul-23	3		
223257	27-Jul-23	4		
223257	27-Jul-23	5		
223257	27-Jul-23	6		
223257	27-Jul-23	7		
223257	27-Jul-23	8		
223257	27-Jul-23	9		
223257	27-Jul-23	10		
232706	26-Jul-23	1		
232706	26-Jul-23	2		
232706	26-Jul-23	3		
232706	26-Jul-23	4		
232706	26-Jul-23	5		
232706	26-Jul-23	6		
232706	26-Jul-23	7		
232706	26-Jul-23	8		
232706	26-Jul-23	9		
232706	26-Jul-23	10		
232706	26-Jul-23	11		
232706	26-Jul-23	12	1	
232706	26-Jul-23	13		
232706	26-Jul-23	14		
232706	26-Jul-23	15		
232706	26-Jul-23	16		
232706	26-Jul-23	17		
232706	26-Jul-23	18		
232706	26-Jul-23	19		
232706	26-Jul-23	20		
232706	26-Jul-23	21		
232706	26-Jul-23	22		
232706	26-Jul-23	23		
232706	26-Jul-23	24		
232706	27-Jul-23	1		
232706	27-Jul-23	2		
232706	27-Jul-23	3		
232706	27-Jul-23	4		
232706	27-Jul-23	5		
232706	27-Jul-23	6		
232706	27-Jul-23	7		
232706	27-Jul-23	8		
232706	27-Jul-23	9		
232706	27-Jul-23	10		
232706	27-Jul-23	11		
232706	27-Jul-23	12		
232706	27-Jul-23	13		
232706	27-Jul-23	14		
;
run;

proc sql;
   create table want as 
   select b.id, b.date, b.hour ,b.datetime2 format datetime18.
   from (select id, date,hour,status, dhms(date,hour,0,0) as datetime
         from testing
         where status=1) as a
         left join
        (select id, date,hour,status, dhms(date,hour,0,0) as datetime2
         from testing) as b
        on a.id=b.id and
           a.datetime le b.datetime2 le intnx('hour',a.datetime,24,'b')
   order by b.id, b.date, b.hour
   ;
quit;
       

The Input statement wants some way to know that DATE is to be read as date value using an informat such as Date10. The :date10. on the input statement tells SAS to read the next value and attempt to use that format. You could also use an INFORMAT statement to assign a format to the variable .

 

Proc SQL is a way to match records. Select lists variables or calculated values. The From provides instructions of where data comes from. Where places constraints on what is selected.

Note that placing a select /from /where inside paraentheses as a sub query provides one group of values, in this case those that only have Status=1 and refers to the whole set with the alias A. Values from that group are addressed with a.<variable name>.

Note that I used the DHMS function to combine the date and hour into a single datetime value.

The Left JOIN says take every record from A and join on the second alias (all of the records in testing plus an added datetime variable) where the ID is the same and the second datetime variable is between the Status=1 datetime and , using the INTNX function to advance that time, up to + 24 hours.

Patrick
Opal | Level 21

Under the assumption you have no missing hours in your data and that I understood your requirement correctly below should work.

data want;
  set have;
  by id date hour;
  
  if first.id or status_hour=24 then status_hour=0;
  else if status=1 then status_hour=1;
  else if status_hour>0 then status_hour+1;

run;
Mercadja
Fluorite | Level 6

Thank you very much this works perfectly. I applied to my dataset and Status_hour = 0 for the hours before status = 1 and 0 after 24 hours are counted. Then it does it for every ID. Thank you so very much.

Patrick
Opal | Level 21

Actually: For this to also work as desired for cases where status=1 and hour=1 I believe you need to remove the first ELSE.

Patrick_0-1693480282636.png

 

Mercadja
Fluorite | Level 6

I ran it both ways and there is no difference in the result. Both programs identify the hour that status =1 appears then applies a new variable Status_Hour as 1 and continues counting down 24 rows.  Works perfectly either way.

 

Thank you for your time. 

Patrick
Opal | Level 21

If you don't remove the first ELSE then the code won't work properly if status=1 for the very first row of a new ID.

Please use below code to always get a correct result.

data want;
  set have;
  by id date hour;
  
  if first.id or status_hour=24 then status_hour=0;
  if status=1 then status_hour=1;
  else if status_hour>0 then status_hour+1;

run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1325 views
  • 0 likes
  • 3 in conversation