BookmarkSubscribeRSS Feed
marleeakerson
Calcite | Level 5

Hello, I am trying to capture a time stamp while importing a CSV file. The raw data looks like: 

marleeakerson_0-1614966926376.png

the time stamp i am trying to capture is the "Tue Mar 02 2021 14:44:45 GMT-0700" from line 2. But I don't want another information from the first three lines. I need the data to start at line 5. Ideally, I would like to add a new variable called 'TimeStamp' that captures the time stamp from line 2. 

 

M current code looks like:

data vacc_ltc ;
  infile "filepath" dsd truncover firstobs=5;
  input 
State :$20. 
Total_LTC_Doses_Adminstered	:20.
Dose1 :20.
Dose2 :20.
 ;
run;

Any advice on how to add in that time stamp?

8 REPLIES 8
Kurt_Bremser
Super User

Please copy/paste the real contents of the csv file into a window opened with the </> button. A picture of what Excel thinks is in the file is no good base to work on.

marleeakerson
Calcite | Level 5
COVID-19 Vaccinations in Long Term Care Facilities
Date generated: Tue Mar 02 2021 14:44:45 GMT-0700 (Mountain Standard Time)
       
State/Territory Total LTC Doses Adminstered People in LTC with 1+ Doses People in LTC with 2 Doses
Alabama 77434 48746 28403
Alaska 6346 4192 2131
American Samoa N/A N/A N/A
Arizona 102230 75263 26624
Arkansas 28084 18858 9129
California 743328 469515 270387
Colorado 124448 81007 42850
Connecticut 128606 77042 50792
Delaware 35133 24151 10767
District of Columbia 7678 5274 2295
Federated States of Micronesia N/A N/A N/A
Florida 385199 234959 147605
Georgia 116804 71915 44195
Guam N/A N/A N/A
Hawaii 21657 13875 7768
Idaho 26324 16282 9963
Illinois 390482 274369 114992
Indiana 123351 79432 43647
Iowa 134635 89229 44581
Kansas 87541 54817 32493
Kentucky 92607 62913 29262
Louisiana 105803 73451 31893
Maine 36308 23189 13038
Marshall Islands N/A N/A N/A
Maryland 159639 108419 50136
Massachusetts 300678 181899 117069
Michigan 253356 172716 79545
Minnesota 155038 104952 49524
Mississippi 50049 31749 18108
Missouri 142315 93900 47998
Montana 20684 13563 7087
Nebraska 56505 37532 18817
Nevada 71827 51292 20295
New Hampshire 46428 27909 18332
New Jersey 230338 143660 85498
New Mexico 36874 25124 11503
New York State 388955 248250 136929
North Carolina 203098 141124 60888
North Dakota 6461 3616 2815
Northern Mariana Islands N/A N/A N/A
Ohio 349179 226999 119284
Oklahoma 63105 43196 19580
Oregon 108865 70605 38066
Pennsylvania 389781 232318 155200
Puerto Rico 70055 52079 17942
Republic of Palau N/A N/A N/A
Rhode Island 47554 29677 17699
South Carolina 96850 63177 33061
South Dakota 12416 8486 3919
Tennessee 95696 66131 29281
Texas 448036 280652 164563
Utah 54398 34541 19605
Vermont 20370 13036 7255
Virgin Islands N/A N/A N/A
Virginia 197850 120465 75517
Washington 145539 98350 46488
West Virginia N/A N/A N/A
Wisconsin 149935 110821 38762
Wyoming 6652 3876 2755
Tom
Super User Tom
Super User

What happened to the commas?  Do NOT open the CSV with EXCEL (or any other spreadsheet program) as that might change some of the text in file based on what the spreadsheet program thinks the text means.  Instead open it with a text editor.  The SAS program editor will even work.  or just open a command window and cat (or type) the text of the file to the screen.

 

 

marleeakerson
Calcite | Level 5

I have attached the original file from the CDC website. 

Kurt_Bremser
Super User

THIS is how to post text data:

COVID-19 Vaccinations in Long Term Care Facilities
Date generated: Tue Mar 02 2021 14:44:45 GMT-0700 (Mountain Standard Time)

State/Territory,Total LTC Doses Adminstered,People in LTC with 1+ Doses,People in LTC with 2 Doses
Alabama,77434,48746,28403
Alaska,6346,4192,2131
American Samoa,N/A,N/A,N/A
Arizona,102230,75263,26624
Arkansas,28084,18858,9129
California,743328,469515,270387
Colorado,124448,81007,42850
Connecticut,128606,77042,50792
Delaware,35133,24151,10767
District of Columbia,7678,5274,2295
marleeakerson
Calcite | Level 5
Thank you! Given this information - do you know how to add a timestamp variable?
Tom
Super User Tom
Super User

Instead of having the INFILE statement skip the headers read the headers under your control. Then you can read the information in them.  

This will read the datetime from the second line and skip the others.  The / moves to next line. The @ moves to column on the line.  The test (_n_=1) means that it only runs the first time data step iterates.

 

It will use the RETAIN so that the datetime is kept on all observations.  Note that it is ignoring the offset from GMT.  If that is important then figure out how to read that also.

 

I also added logic to convert the N/A into . so that SAS will treat it as missing without throwing an error message.

data want;
  infile "filename" dsd truncover ;
  if _n_=1 then input / @20 timestamp anydtdtm20. //;
  retain timestamp;
  format timestamp datetime19.;
  input @;
  _infile_=tranwrd(_infile_,'N/A','.');
  input state :$20. total dose1 dose2;
run;
Obs              timestamp    state              total    dose1    dose2

 1      02MAR2021:14:44:04    Alabama            77434    48746    28403
 2      02MAR2021:14:44:04    Alaska              6346     4192     2131
 3      02MAR2021:14:44:04    American Samoa         .        .        .
 4      02MAR2021:14:44:04    Arizona           102230    75263    26624
 5      02MAR2021:14:44:04    Arkansas           28084    18858     9129

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 886 views
  • 0 likes
  • 3 in conversation