DATA Step, Macro, Functions and more

read in data

Accepted Solution Solved
Reply
Contributor
Posts: 32
Accepted Solution

read in data

Hello,

 

I am trying to read in a file in Excel however it won't accept date formats when I use a PROC IMPORT.

 

Of course INFILE does now work with XLSX.

 

How can I best import an external Excel file whereby I can read all character fields with NO TRUNCATION and also be able to specify data formats for the date fields?

 

Thank you


Accepted Solutions
Solution
‎05-08-2018 11:50 PM
Super User
Posts: 13,358

Re: read in data


@markc wrote:

Hello,

 

I am trying to read in a file in Excel however it won't accept date formats when I use a PROC IMPORT.

 

Of course INFILE does now work with XLSX.

 

How can I best import an external Excel file whereby I can read all character fields with NO TRUNCATION and also be able to specify data formats for the date fields?

 

Thank you


One of the biggest potential headaches, IMHO, is that way too many Excel files have manually entered/modified data. And since columns of data in Excel may have some cells as numeric valued dates (with a date display format in Excel) and others as text entered with '01Mar2018 or similar then anything reading such garbage has problems.

 

So I tend to turn any XLSX file into a CSV where I can see what sort of stuff exists and use the tools for dealing with delimited text.

Often the exported CSV file will show the mixed types of columns as some having values in quotes (that look like dates) and other values as numeric. Better results may be possible by highlighting a column in Excel and setting the same display type for the entire column before saving to CSV.

 

Truncation and other issues may be an issue because of the (not friendly for data interchange) of having the "alt-enter" to create vertical breaks inside a data cell.

View solution in original post


All Replies
PROC Star
Posts: 2,319

Re: read in data

[ Edited ]

1. What is your truncation problem? See here.

2. Just run proc datasets after importing to change the formats or any other metadata (variable names, labels, etc).

Contributor
Posts: 32

Re: read in data

[ Edited ]

Thank you ChrisNZ!

 

I actually wasn't sure if I was having a truncation problem, I just wanted to be 100% certain that I wasn't creating something which would truncate somehow moving forward.    Based on a reply within your link, it seems that using XLSX will look forward through the entire file to determine the max character length of any field to prevent inadvertent truncation, would this be true?

 

With regards to the date issue, when I use PROC IMPORT it seems to be reading the data in format DD/MM/YYYY as $CHAR5 so when I try to use the PROC DATASETS to MODIFY the file, it is not recognising DDMMYY10., in other words when I use:

 

format Effective_Date ddmmyy10.;

 

Do you know what I am doing wrong?

 

Many thanks!

Mark

 

 

PROC Star
Posts: 2,319

Re: read in data

[ Edited ]

> would this be true?

I thankfully hardly ever import excel data, but it seems to be the case.

 

 

> Do you know what I am doing wrong?

If the dates are imported as strings, you need to create new *numeric* variables.

This is done in a data step: it involves more than metadata changes and requires creating a new table.

And you use the input function to read the string into a SAS date.

For further details, examples of issues you encounter should be provided.

Solution
‎05-08-2018 11:50 PM
Super User
Posts: 13,358

Re: read in data


@markc wrote:

Hello,

 

I am trying to read in a file in Excel however it won't accept date formats when I use a PROC IMPORT.

 

Of course INFILE does now work with XLSX.

 

How can I best import an external Excel file whereby I can read all character fields with NO TRUNCATION and also be able to specify data formats for the date fields?

 

Thank you


One of the biggest potential headaches, IMHO, is that way too many Excel files have manually entered/modified data. And since columns of data in Excel may have some cells as numeric valued dates (with a date display format in Excel) and others as text entered with '01Mar2018 or similar then anything reading such garbage has problems.

 

So I tend to turn any XLSX file into a CSV where I can see what sort of stuff exists and use the tools for dealing with delimited text.

Often the exported CSV file will show the mixed types of columns as some having values in quotes (that look like dates) and other values as numeric. Better results may be possible by highlighting a column in Excel and setting the same display type for the entire column before saving to CSV.

 

Truncation and other issues may be an issue because of the (not friendly for data interchange) of having the "alt-enter" to create vertical breaks inside a data cell.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 142 views
  • 2 likes
  • 3 in conversation