Hi there. This thread is further to the previous thread "Create Format to Display Time Interval" where I was told how to use cntlin to create a custom format to format a numeric field holding seconds as hours/minutes/seconds.
Can I do a similar thing to create an informat to take in the original time interval value in text from my input dataset and convert it to a numeric field holding seconds? I'm not clear how much of the Proc Format instructions apply to creating informats. What I need to do is create an informat to take in time interval values such as 1 hrs 0 mins 46 secs from a text field (or raw text file) and store them in seconds e.g. 3646 in a numeric field should be the result in this case.
The code I use is below. It takes the 'timespent' field and creates a timespent_seconds field from it. Probably handier ways to code it. But my questions is: Can this be transformation be done in a data step to create a fmtname and a permanent informat, in a similar way as the format was created in the thread: "Create Format to Display Time Interval" as advised by Geniz, solution 3. It would be ideal to hold all this code in an informat for re-use. Then I could do these kind of transformations in 1 or 2 lines for future datasets holding time intervals in this kind of format.
*3. Convert Time Interval field;
*******************************;
*3.1 Set up the Reg Expressions to be used for matching;
***************************************************;
*Put the prx setup in an if obs = 1 statement becaues doesn't need to keep being re-created..;
*Actually if don't do this prxparse runs millions of times and creates memory errors;
if _n_ = 1 then do;
reNum = "/\d+/";
reCNum = PRXPARSE(reNum);
retain reCNum;
reMins = "/\d+\s*mins/"; *builds the regular expression or search pattern. Will be
*looking for text containing char or varcher;
reCMins = PRXPARSE(reMins ); *compiles the expression for SAS, so called reText now;
retain reCMins;
reHours = "/\d+\s*hrs/";
reCHours = prxparse(reHours);
retain reCHours;
reSeconds = "/\d+\s*secs/";
reCSeconds = prxparse(reSeconds);
retain reCSeconds;
end;
*3.2 Extract the hours value;
******************************;
*put "************Matching reg expression" reCMins " against field TimeSpent";
If PRXMATCH(reCHours, timespent) then do;
call prxposn(reCHours, 0, pos, len);
Hours_text = substr(timespent, pos, len);
if prxmatch(reCNum, Hours_text) then do;
*Extracting just the numbers i.e. getting rid of the text characters like Hours or space;
call prxposn(reCNum, 0, pos, len);
Hours_num = substr(Hours_text, pos, len);
end;
end;
if Hours_num = . then Hours_num = 0;
*3.3 Extract the minutes value;
******************************;
*put "************Matching reg expression" reCMins " against field TimeSpent";
If PRXMATCH(reCMins, timespent) then do;
*put "Got a match on minutes for obs no: " _n_;
call prxposn(reCMins, 0, pos, len);
mins_text = substr(timespent, pos, len);
if prxmatch(reCNum, mins_text) then do;
*Extracting just the numbers i.e. getting rid of the text characters like mins or space;
call prxposn(reCNum, 0, pos, len);
mins_num = substr(mins_text, pos, len);
if mins_num = . then mins_num = 0;
end;
end;
if mins_num = . then mins_num = 0;
*3.4 Extract the seconds value;
******************************;
*put "************Matching reg expression" reCSeconds " against field TimeSpent";
If PRXMATCH(reCSeconds, timespent) then do;
*put "Got a match on minutes for obs no: " _n_;
call prxposn(reCSeconds, 0, pos, len);
Seconds_text = substr(timespent, pos, len);
if prxmatch(reCNum, Seconds_text) then do;
*Extracting just the numbers i.e. getting rid of the text characters like Seconds or space;
call prxposn(reCNum, 0, pos, len);
Seconds_num = substr(Seconds_text, pos, len);
end;
end;
if seconds_num = . then Seconds_num = 0;
*3.5 Add the values to get number of seconds in the field;
timespent_seconds = seconds_num + (mins_num * 60) + (hours_num * 60 * 60);