Esteemed Advisers;
I'm trying to import a csv file with datetime data in the following format:
yyyy-mm-dd hh:mm:ss UTC
I can't find an informat that works and need some advice on how to proceed.
Below is a sample of what I've tried but isn't working--(thanks in advance!):
data have;
/*INFORMAT*/
informat created_at $19.;
/*FORMAT*/
format created_UTC datetime16. ;
/**/
input
created_at
;
datalines;
2022-04-01 00:01:20 UTC
2022-04-01 00:03:20 UTC
2022-04-01 00:05:20 UTC
run;
data want
created_UTC
01APR22:00:01:20
01APR22:00:03:20
01APR22:00:04:20
You cannot use a numeric format with a character variable. In particular to use the DATETIME format to display a variable it needs to be a numeric variable with datetime values (number of seconds since 1960).
So if you have this data:
data have;
input created_at $23.;
datalines;
2022-04-01 00:01:20 UTC
2022-04-01 00:03:20 UTC
2022-04-01 00:05:20 UTC
;
You can try to convert it to a datetime value by ignoring the timezone. You could then use the TZONEOFF() function to try to adjust to a different timezone if you want.
data want;
set have;
dt = input(created_at,anydtdtm19.);
timezone = scan(created_at,-1,' ');
offset = tzoneoff(timezone);
local_offset = tzoneoff();
local_dt = dt - offset + local_offset;
format dt local_dt datetime19. offset local_offset time6.;
run;
Results:
data have;
format created_at datetime21. ;
input
created_at ANYDTDTM19.
;
datalines;
2022-04-01 00:01:20 UTC
2022-04-01 00:03:20 UTC
2022-04-01 00:05:20 UTC
run;
proc print;
run;
You cannot use a numeric format with a character variable. In particular to use the DATETIME format to display a variable it needs to be a numeric variable with datetime values (number of seconds since 1960).
So if you have this data:
data have;
input created_at $23.;
datalines;
2022-04-01 00:01:20 UTC
2022-04-01 00:03:20 UTC
2022-04-01 00:05:20 UTC
;
You can try to convert it to a datetime value by ignoring the timezone. You could then use the TZONEOFF() function to try to adjust to a different timezone if you want.
data want;
set have;
dt = input(created_at,anydtdtm19.);
timezone = scan(created_at,-1,' ');
offset = tzoneoff(timezone);
local_offset = tzoneoff();
local_dt = dt - offset + local_offset;
format dt local_dt datetime19. offset local_offset time6.;
run;
Results:
Tom:
Thanks for the quick response. I tried your code but am not getting the result that you show. Namely, the dt and local_dt variable columns are empty. My code is below:
data have;
input created_at $23.;
datalines;
2022-04-01 00:01:20 UTC
2022-04-01 00:03:20 UTC
2022-04-01 00:05:20 UTC
run;
data want;
set have;
dt=input(created_at,anytdtdtm19.);
timezone=scan(created_at,-1,' ');
offset=tzoneoff(timezone);
local_offset=tzoneoff();
local_dt=dt - offset + local_offset;
format dt local_dt datetime19. offset local_offset time6.;
run;
proc print;
run;
An extra T got inserted into the informat name in the INPUT() function call.
1421 set have; 1422 dt=input(created_at,anytdtdtm19.); ------------ 48 ERROR 48-59: The informat ANYTDTDTM was not found or could not be loaded.
ANYDTDTM is the name of the informat.
If the extra T is not in your code then change that the spaces in CREATED_AT are actually spaces and not tabs ('09'x) or non-breaking spaces ('A0'x) or some other invisible character.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.