BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
fordcr2
Obsidian | Level 7

Hello, 

 

I need help parsing out this string variable into something usable: 

 

Mon Mar 25 2019 17:46:55 GMT+0000 (Coordinated Universal Time)

 

I do not need anything after 17:46:55, so I only really need the Mar 25 2019 and the 17:46:55 parts of this string. I would like this conversion to be in some kind of loop, as this will need to be done >200 times. 

 

Thanks!!

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

See this:

data have;
input invar $80.;
datalines;
Mon Mar 25 2019 17:46:55 GMT+0000 (Coordinated Universal Time)
Mon Mar 11 2019 19:27:44 GMT+0000 (Coordinated Universal Time)
Mon Mar 11 2019 19:28:29 GMT+0000 (Coordinated Universal Time)
;
run;

data want;
set have;
datestr = scan(invar,3) !! scan(invar,2) !! scan(invar,4);
dtvar = input(datestr,date9.);
timevar = input(scan(invar,5),time8.);
format dtvar date9. timevar time8.;
run;

I use invar in my example dataset as column name. If your column is named differently, just replace "invar" with your column name.

View solution in original post

9 REPLIES 9
andreas_lds
Jade | Level 19

A loop is required if you have multiple variables containing those strings. If you have multiple observations, you don't need to write a loop, because a data step is iterating automatically.

 

Can you post some more examples in usable form (data step with datalines)?

 

It could be a good idea to create one datetime-variable, but this depends on the way you want to use the extracted information.

fordcr2
Obsidian | Level 7

Hi thank you for the clarification. You are correct - it is just one variable with multiple observations. 

 

The variable name is Regimen_Date and here are some more examples of what the data looks like. All observations come in in this format as well. 

 

Mon Mar 25 2019 17:46:55 GMT+0000 (Coordinated Universal Time)

Mon Mar 11 2019 19:27:44 GMT+0000 (Coordinated Universal Time)

Mon Mar 11 2019 19:28:29 GMT+0000 (Coordinated Universal Time)

 

So I will need a date and time variable from this (separate). 

Kurt_Bremser
Super User

The string can quite easily be re-assembled into a valid SAS datetime input:

data have;
input invar $80.;
datalines;
Mon Mar 25 2019 17:46:55 GMT+0000 (Coordinated Universal Time)
;
run;

data want;
set have;
dtstr = scan(invar,3) !! scan(invar,2) !! scan(invar,4) !! ':' !! scan(invar,5);
dtvar = input(dtstr,datetime19.);
format dtvar datetime19.;
run;
fordcr2
Obsidian | Level 7

Thank you for your response. This works well, except I need the date and time variables to be separated and I need to be able to do this over and over for multiple observations under the variable name Regimen_Date.  I have very little experience with this so I am sorry if the explanation needs to be a little elementary, but is there a way to assign invar as the Regimen_Date variable I currently have?

Kurt_Bremser
Super User

See this:

data have;
input invar $80.;
datalines;
Mon Mar 25 2019 17:46:55 GMT+0000 (Coordinated Universal Time)
Mon Mar 11 2019 19:27:44 GMT+0000 (Coordinated Universal Time)
Mon Mar 11 2019 19:28:29 GMT+0000 (Coordinated Universal Time)
;
run;

data want;
set have;
datestr = scan(invar,3) !! scan(invar,2) !! scan(invar,4);
dtvar = input(datestr,date9.);
timevar = input(scan(invar,5),time8.);
format dtvar date9. timevar time8.;
run;

I use invar in my example dataset as column name. If your column is named differently, just replace "invar" with your column name.

fordcr2
Obsidian | Level 7

That worked wonderfully! Thank you so much!

ballardw
Super User

A slightly different approach that does require a very fixed layout of the variable:

data have;
   input invar $80.;
   dt = input (substr(invar,5,25),anydtdtm21.);
   date=datepart(dt);
   time=timepart(dt);
   format dt datetime20. date date9. time time8.;
datalines;
Mon Mar 25 2019 17:46:55 GMT+0000 (Coordinated Universal Time)
Mon Mar 11 2019 19:27:44 GMT+0000 (Coordinated Universal Time)
Mon Mar 11 2019 19:28:29 GMT+0000 (Coordinated Universal Time)
;
run;
Cynthia_sas
Diamond | Level 26

Hi:

  "Some kind of loop" is a bit vague. Do you have 200 rows of data, with a date like this on each row? Or, do you have data with 200 variables or columns on each row and each of the columns is a date like this?

 

  What code have you tried? What is your end result? Do you want to have a SAS date value and a SAS time value, as 2 separate values? Do you want 2 character strings that you can display? Splitting out the information your want is fairly easy. Is there always a GMT and a + in the value; could there be a GMT and - sign in the value? Are there always parentheses in the value? Is the day of the week at the beginning always 3 characters?

 

  Can you describe what you have and what you need in more detail?

 

Cynthia

 

 

fordcr2
Obsidian | Level 7

Hello, 

 

What I meant is that it needs to be repeatable (this is a dataset that is updated often with medical device data and there will be thousands of observations that need to be converted this way - right now there are about 200). The data comes in under the variable Regimen_Date and the observations are as follows:

 

Mon Mar 25 2019 17:46:55 GMT+0000 (Coordinated Universal Time)

Mon Mar 11 2019 19:27:44 GMT+0000 (Coordinated Universal Time)

Mon Mar 11 2019 19:28:29 GMT+0000 (Coordinated Universal Time)

 

I want the date and time to be separate variables - all of the observations come in the same format, but I would need a date variable and a time variable separately parsed out from these strings.