Hello, I am trying to capture a time stamp while importing a CSV file. The raw data looks like:
the time stamp i am trying to capture is the "Tue Mar 02 2021 14:44:45 GMT-0700" from line 2. But I don't want another information from the first three lines. I need the data to start at line 5. Ideally, I would like to add a new variable called 'TimeStamp' that captures the time stamp from line 2.
M current code looks like:
data vacc_ltc ;
infile "filepath" dsd truncover firstobs=5;
input
State :$20.
Total_LTC_Doses_Adminstered :20.
Dose1 :20.
Dose2 :20.
;
run;
Any advice on how to add in that time stamp?
Please copy/paste the real contents of the csv file into a window opened with the </> button. A picture of what Excel thinks is in the file is no good base to work on.
COVID-19 Vaccinations in Long Term Care Facilities | |||
Date generated: Tue Mar 02 2021 14:44:45 GMT-0700 (Mountain Standard Time) | |||
State/Territory | Total LTC Doses Adminstered | People in LTC with 1+ Doses | People in LTC with 2 Doses |
Alabama | 77434 | 48746 | 28403 |
Alaska | 6346 | 4192 | 2131 |
American Samoa | N/A | N/A | N/A |
Arizona | 102230 | 75263 | 26624 |
Arkansas | 28084 | 18858 | 9129 |
California | 743328 | 469515 | 270387 |
Colorado | 124448 | 81007 | 42850 |
Connecticut | 128606 | 77042 | 50792 |
Delaware | 35133 | 24151 | 10767 |
District of Columbia | 7678 | 5274 | 2295 |
Federated States of Micronesia | N/A | N/A | N/A |
Florida | 385199 | 234959 | 147605 |
Georgia | 116804 | 71915 | 44195 |
Guam | N/A | N/A | N/A |
Hawaii | 21657 | 13875 | 7768 |
Idaho | 26324 | 16282 | 9963 |
Illinois | 390482 | 274369 | 114992 |
Indiana | 123351 | 79432 | 43647 |
Iowa | 134635 | 89229 | 44581 |
Kansas | 87541 | 54817 | 32493 |
Kentucky | 92607 | 62913 | 29262 |
Louisiana | 105803 | 73451 | 31893 |
Maine | 36308 | 23189 | 13038 |
Marshall Islands | N/A | N/A | N/A |
Maryland | 159639 | 108419 | 50136 |
Massachusetts | 300678 | 181899 | 117069 |
Michigan | 253356 | 172716 | 79545 |
Minnesota | 155038 | 104952 | 49524 |
Mississippi | 50049 | 31749 | 18108 |
Missouri | 142315 | 93900 | 47998 |
Montana | 20684 | 13563 | 7087 |
Nebraska | 56505 | 37532 | 18817 |
Nevada | 71827 | 51292 | 20295 |
New Hampshire | 46428 | 27909 | 18332 |
New Jersey | 230338 | 143660 | 85498 |
New Mexico | 36874 | 25124 | 11503 |
New York State | 388955 | 248250 | 136929 |
North Carolina | 203098 | 141124 | 60888 |
North Dakota | 6461 | 3616 | 2815 |
Northern Mariana Islands | N/A | N/A | N/A |
Ohio | 349179 | 226999 | 119284 |
Oklahoma | 63105 | 43196 | 19580 |
Oregon | 108865 | 70605 | 38066 |
Pennsylvania | 389781 | 232318 | 155200 |
Puerto Rico | 70055 | 52079 | 17942 |
Republic of Palau | N/A | N/A | N/A |
Rhode Island | 47554 | 29677 | 17699 |
South Carolina | 96850 | 63177 | 33061 |
South Dakota | 12416 | 8486 | 3919 |
Tennessee | 95696 | 66131 | 29281 |
Texas | 448036 | 280652 | 164563 |
Utah | 54398 | 34541 | 19605 |
Vermont | 20370 | 13036 | 7255 |
Virgin Islands | N/A | N/A | N/A |
Virginia | 197850 | 120465 | 75517 |
Washington | 145539 | 98350 | 46488 |
West Virginia | N/A | N/A | N/A |
Wisconsin | 149935 | 110821 | 38762 |
Wyoming | 6652 | 3876 | 2755 |
What happened to the commas? Do NOT open the CSV with EXCEL (or any other spreadsheet program) as that might change some of the text in file based on what the spreadsheet program thinks the text means. Instead open it with a text editor. The SAS program editor will even work. or just open a command window and cat (or type) the text of the file to the screen.
That is NOT a CSV file.
A csv file is pure text (no graphics), and the columns are separated by commas.
I have attached the original file from the CDC website.
THIS is how to post text data:
COVID-19 Vaccinations in Long Term Care Facilities Date generated: Tue Mar 02 2021 14:44:45 GMT-0700 (Mountain Standard Time) State/Territory,Total LTC Doses Adminstered,People in LTC with 1+ Doses,People in LTC with 2 Doses Alabama,77434,48746,28403 Alaska,6346,4192,2131 American Samoa,N/A,N/A,N/A Arizona,102230,75263,26624 Arkansas,28084,18858,9129 California,743328,469515,270387 Colorado,124448,81007,42850 Connecticut,128606,77042,50792 Delaware,35133,24151,10767 District of Columbia,7678,5274,2295
Instead of having the INFILE statement skip the headers read the headers under your control. Then you can read the information in them.
This will read the datetime from the second line and skip the others. The / moves to next line. The @ moves to column on the line. The test (_n_=1) means that it only runs the first time data step iterates.
It will use the RETAIN so that the datetime is kept on all observations. Note that it is ignoring the offset from GMT. If that is important then figure out how to read that also.
I also added logic to convert the N/A into . so that SAS will treat it as missing without throwing an error message.
data want;
infile "filename" dsd truncover ;
if _n_=1 then input / @20 timestamp anydtdtm20. //;
retain timestamp;
format timestamp datetime19.;
input @;
_infile_=tranwrd(_infile_,'N/A','.');
input state :$20. total dose1 dose2;
run;
Obs timestamp state total dose1 dose2 1 02MAR2021:14:44:04 Alabama 77434 48746 28403 2 02MAR2021:14:44:04 Alaska 6346 4192 2131 3 02MAR2021:14:44:04 American Samoa . . . 4 02MAR2021:14:44:04 Arizona 102230 75263 26624 5 02MAR2021:14:44:04 Arkansas 28084 18858 9129
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.