I have a file that is in CSV format. The time stamps are in the format
2011-12-30T05:00:00Z
2012-04-09T04:00:00Z
2014-03-10T04:00:00Z
When I import them into SAS using the import wizard, they all show up as the same
01JAN60:00:00:00
01JAN60:00:00:00
01JAN60:00:00:00
I have discovered that the problem is the portion "T05:00:00Z". If I delete this from the observations I can import the dates correctly, though it drops the time portion of the date/time variable. This is not a viable solution for two reasons: (1) The file is too large to even open in excel so I do not know how to find and delete all the time stamps from the date/time variable and (2) although this example is daily observations (ie. they all have the same time portion of the date/time stamp), I have another file that is intraday so the time stamp is important.
Any help would be greatly appreciated.
Thanks!
I would simply run that code with one modification (and removing the unnecessry lines if desired). i.e.,
data WORK.ABC ; infile 'E:\Research\Data\filename.csv' delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ; informat SOURCE $10. ; informat SYMBOL $4. ; informat TIMESTAMP_UTC e8601dz. ; informat BULLISH_INTENSITY best32. ; informat BEARISH_INTENSITY best32. ; informat BULL_MINUS_BEAR best32. ; informat BULL_SCORED_MESSAGES best32. ; informat BEAR_SCORED_MESSAGES best32. ; informat BULL_BEAR_MSG_RATIO best32. ; informat TOTAL_SCANNED_MESSAGES best32. ; format SOURCE $10. ; format SYMBOL $4. ; format TIMESTAMP_UTC datetime. ; format BULLISH_INTENSITY best12. ; format BEARISH_INTENSITY best12. ; format BULL_MINUS_BEAR best12. ; format BULL_SCORED_MESSAGES best12. ; format BEAR_SCORED_MESSAGES best12. ; format BULL_BEAR_MSG_RATIO best12. ; format TOTAL_SCANNED_MESSAGES best12. ; input SOURCE $ SYMBOL $ TIMESTAMP_UTC BULLISH_INTENSITY BEARISH_INTENSITY BULL_MINUS_BEAR BULL_SCORED_MESSAGES BEAR_SCORED_MESSAGES BULL_BEAR_MSG_RATIO TOTAL_SCANNED_MESSAGES ; run;
You may have to change thee8601dz, informat to include the width and number of decimal places desired.
HTH,
Art, CEO, AnalystFinder.com
If you're using proc import, why not simply grab the code it is using (as shown in your log) and change the informat for that(those) variable(s) to
see: http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a003065430.htm
HTH,
Art, CEO, AnalystFinder.com
Thanks for the reply. My import code looks like:
PROC IMPORT OUT= WORK.a
DATAFILE= "E:\Research\....file.csv"
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;
I'm not entirely sure how you are suggesting to change the code. Could you provide an example given what is above?
Thanks!
After you run that code, open and copy (ctrl-c) your log and paste (ctrl-v) it here.
Art, CEO, AnalystFinder.com
60 data WORK.ABC ;
61 %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
62 infile 'E:\Research\Data\filename.csv' delimiter =
62 ! ',' MISSOVER DSD lrecl=32767 firstobs=2 ;
63 informat SOURCE $10. ;
64 informat SYMBOL $4. ;
65 informat TIMESTAMP_UTC anydtdtm40. ;
66 informat BULLISH_INTENSITY best32. ;
67 informat BEARISH_INTENSITY best32. ;
68 informat BULL_MINUS_BEAR best32. ;
69 informat BULL_SCORED_MESSAGES best32. ;
70 informat BEAR_SCORED_MESSAGES best32. ;
71 informat BULL_BEAR_MSG_RATIO best32. ;
72 informat TOTAL_SCANNED_MESSAGES best32. ;
73 format SOURCE $10. ;
74 format SYMBOL $4. ;
75 format TIMESTAMP_UTC datetime. ;
76 format BULLISH_INTENSITY best12. ;
77 format BEARISH_INTENSITY best12. ;
78 format BULL_MINUS_BEAR best12. ;
79 format BULL_SCORED_MESSAGES best12. ;
80 format BEAR_SCORED_MESSAGES best12. ;
81 format BULL_BEAR_MSG_RATIO best12. ;
82 format TOTAL_SCANNED_MESSAGES best12. ;
83 input
84 SOURCE $
85 SYMBOL $
86 TIMESTAMP_UTC
87 BULLISH_INTENSITY
88 BEARISH_INTENSITY
89 BULL_MINUS_BEAR
90 BULL_SCORED_MESSAGES
91 BEAR_SCORED_MESSAGES
92 BULL_BEAR_MSG_RATIO
93 TOTAL_SCANNED_MESSAGES
94 ;
95 if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
96 run;
NOTE: The infile 'E:\Research\Data\filename.csv' is:
Filename=E:\Research\Data\filename.csv,
RECFM=V,LRECL=32767,File Size (bytes)=6326951,
Last Modified=03Jan2017:19:17:20,
Create Time=18Jan2017:11:47:25
NOTE: 102660 records were read from the infile
'E:\Research\Data\filename.csv'.
The minimum record length was 56.
The maximum record length was 97.
NOTE: The data set WORK.ABC has 102660 observations and 10 variables.
NOTE: DATA statement used (Total process time):
real time 0.56 seconds
cpu time 0.53 seconds
102660 rows created in WORK.ABC from
E:\Research\Data\filename.csv.
NOTE: WORK.ABC data set was successfully created.
NOTE: The data set WORK.ABC has 102660 observations and 10 variables.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 0.67 seconds
cpu time 0.57 seconds
I would simply run that code with one modification (and removing the unnecessry lines if desired). i.e.,
data WORK.ABC ; infile 'E:\Research\Data\filename.csv' delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ; informat SOURCE $10. ; informat SYMBOL $4. ; informat TIMESTAMP_UTC e8601dz. ; informat BULLISH_INTENSITY best32. ; informat BEARISH_INTENSITY best32. ; informat BULL_MINUS_BEAR best32. ; informat BULL_SCORED_MESSAGES best32. ; informat BEAR_SCORED_MESSAGES best32. ; informat BULL_BEAR_MSG_RATIO best32. ; informat TOTAL_SCANNED_MESSAGES best32. ; format SOURCE $10. ; format SYMBOL $4. ; format TIMESTAMP_UTC datetime. ; format BULLISH_INTENSITY best12. ; format BEARISH_INTENSITY best12. ; format BULL_MINUS_BEAR best12. ; format BULL_SCORED_MESSAGES best12. ; format BEAR_SCORED_MESSAGES best12. ; format BULL_BEAR_MSG_RATIO best12. ; format TOTAL_SCANNED_MESSAGES best12. ; input SOURCE $ SYMBOL $ TIMESTAMP_UTC BULLISH_INTENSITY BEARISH_INTENSITY BULL_MINUS_BEAR BULL_SCORED_MESSAGES BEAR_SCORED_MESSAGES BULL_BEAR_MSG_RATIO TOTAL_SCANNED_MESSAGES ; run;
You may have to change thee8601dz, informat to include the width and number of decimal places desired.
HTH,
Art, CEO, AnalystFinder.com
That fixed it, thanks so much!
Using the
e8601dz.
informat causes the script to malfunction when there are blank timestamps:
NOTE: Invalid data for
TIMESTAMP_UTC
Is there any workaround for this?
Any help would be appreciated.
Add 2 question marks after the variable is specified in the input statement. i.e.:
input SOURCE $ SYMBOL $ TIMESTAMP_UTC ?? BULLISH_INTENSITY
That will cause SAS to ignore the error, continue processing, and simply leave the variable with a missing value.
HTH,
Art, CEO, AnalystFinder.com
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.