I have some oddly formatted data from an excel spread sheet I'm trying to read in. A sample of the excel file I'm trying to read in is below. So in particular there are two challenges. I would like to ignore the title and start reading in and get names from the top header: Date, A, B, C, D, E, F, Time/Date. This is an sample of the spread sheet:
| TITLE | |||||||
| Date | A | B | C | D | E | F | Time |
| Date | |||||||
| January | |||||||
| 1/20/2004 | Company A | 2:00 p.m. | Connecticut | NA | 4,000 | 400,00 | 1/26/04, 9:00 a.m. |
| February | |||||||
| 2/12/2004 | Company B | 11:32 a.m. | San Francisco | NA | 1,000 | 340,000 | 3/01/04, 8:50 a.m. |
| 2/28/2004 | CompanyC | 1:00 pm | New York | NA | NA | NA | |
| None | |||||||
The Date column is particularly tricky for me because it has January and February, etc followed by MM/DD/YYYY format beneath it. How can I just read in the MM/DD/YYYY, and ignore the January, February, March, etc or place it side by side while reading in such as January 1/20/2004, in separate columns, and then the same for February, etc. So When reading in I would like to see something like
| Month | Date | A | B | C | D | E | F | Time/Date |
| January | 1/20/2004 | Company A | 2:00 p.m. | Connecticut | NA | 4,000 | 400,00 | 1/26/04, 9:00 a.m. |
| February | 2/12/2004 | Company B | 11:32 a.m. | San Francisco | NA | 1,000 | 340,000 | 3/01/04, 8:50 a.m. |
| February | 2/28/2004 | CompanyC | 1:00 pm | New York | NA | NA | NA |
Any suggestions?
I'm using this code to begin reading the file in:
PROC IMPORT OUT= WORK.Out
DATAFILE= "C:\Users\Desktop\Data\Excel1.xls"
DBMS=EXCEL REPLACE;
RANGE="Sheet1$";
GETNAMES=YES;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
I don't think there's an easy way in any language ![]()
There are ways though..
Reading Excel files like this is like reading reports.
The best option if available is to try to get a more clean layout from your source (this is to done on a regular basis right?, otherwise fix the Excel manually first).
Is the month name in a separate cell/row?
Either way, import, and then do some data step programming to fix it in the way you like it.
Yeah, I don't mind fixing the excel manually except. I have a several files that I would have to do so, it seems being able to read in the native format with sas would be the best strategy. The Time and Date are in a separate cell/row.
Is the data the exact same in each file, ie same row/same column? If so you could do it "manually" once using DDE and then run for each file.
Still a pain though.
So I guess there's no easy ways to do this in sas?
I don't think there's an easy way in any language ![]()
There are ways though..
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.