BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
kajal_30
Quartz | Level 8

Hi there,

 

I have a dataset and few columns have numeric values but when I reading that excel using the proc import SAS is reading them as character. Is there a way if I can retain the type while proc importing? 

for eg : sd_id has values as 3,4,5,6,7,8 similarly I have other few columns causing the same issue but sas is reading it as char. so currently we can see the specific columns showing this behavior but when it goes to production environment we are not even sure if other numeric columns are start getting read as char. Is there a way we can handle this issue.

 

Regards

Kajal

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

@kajal_30 wrote:

one more concern while creating the csv file from dataset I can see there are 2 empty lines after every record. but actual excel file doesn't have any empty lines.

- importing xlsx as dataset

- exporting dataset into a csv file

May I know why am I seeing these empty lines.

regards

Kajal


If the XLSX file has end of line characters in it then that will make the CSV file unusable.  Remove them.

data _null_:
  set have;
  file csv dsd ;
  array _character_ _character_ ;
  do over _character_;
    _character_=translate(_character_,'  ','0D0A'x);
  end;
  put (_all_) (+0);
run;

View solution in original post

26 REPLIES 26
ballardw
Super User

@kajal_30 wrote:

Hi there,

 

I have a dataset and few columns have numeric values but when I reading that excel using the proc import SAS is reading them as character. Is there a way if I can retain the type while proc importing? 

for eg : sd_id has values as 3,4,5,6,7,8 similarly I have other few columns causing the same issue but sas is reading it as char. so currently we can see the specific columns showing this behavior but when it goes to production environment we are not even sure if other numeric columns are start getting read as char. Is there a way we can handle this issue.

 

Regards

Kajal

 


Since EXCEL does not actually have the concept of a variable there is no TYPE involved at all.

Proc Import expects simple data layouts such as at maximum one row of column headings that might be treated as variable names. I see way too many Excel workbooks with multiple rows of header information. So the second (and or 3rd/4th/etc rows) are not actually numeric and the column gets treated as character because there are character values.

 

In a production environment you really don't want to use import. Not just because of this issue but variable lengths can change because SAS makes the decision for such separately for each file read. If you data source is sloppy you may also have issues with variable names changing as well as type.

 

The most reliable way to get specific properties for "columns" is to 1) save the file to a text format such as CSV and 2) use a data step to read that resulting file with a data step. One of the options in the INFILE used for reading with a data step allows you to specify which is the first line of actual data.

DrAbhijeetSafai
Lapis Lazuli | Level 10

Very informative @ballardw ! Thanks!

 

I am facing the reverse issue. SAS is reading character values as numeric because of the above issue in proc import and now I get that proc import is not a good idea but reading file by data step. Thanks again. 

 

- Dr. Abhijeet Safai

Dr. Abhijeet Safai
Certified Base and Clinical SAS Programmer
Associate Data Analyst
Actu-Real
Kurt_Bremser
Super User

"Production" and "Excel" are mutually exclusive. Excel files are simply useless as a medium for data transfer.

Save your data as text files (e.g. csv), which you can read with a custom written data step, so you have full control and get consistent results.

Tom
Super User Tom
Super User

There are two situations that can happen.

1) There is some non numeric cell in that column.

2) All of the cells are empty.  In that case PROC IMPORT will create the variable as character of length 1 since that will take 7 bytes less storage in the SAS dataset than a numeric variable.

 

In either case you have choices.

1) Fix the Excel file so the column only contains numeric values and at least one of them is not empty.

2) Save the data from the Excel into a different format.  Either one that enforces variable types, like a DBF file.  Or a text file that you can read with a data step.

3) Post process the SAS dataset to try to convert the character values to a numeric variable.

kajal_30
Quartz | Level 8

We will have no control over the excel file as it will be directly be provided by business to the process. so no manual intervention will be allowed.

Kurt_Bremser
Super User

@kajal_30 wrote:

We will have no control over the excel file as it will be directly be provided by business to the process. so no manual intervention will be allowed.


You can convert the useless Excel files to csv in batch, e.g. with LibreOffice. The Excel files stay as they are, and the csv files allow consistent reading into SAS.

 

kajal_30
Quartz | Level 8

Thanks @Kurt_Bremser  actually we only base sas available. do we have a base sas code for playing the same role as libre office?

Kurt_Bremser
Super User

@kajal_30 wrote:

Thanks @Kurt_Bremser  actually we only base sas available. do we have a base sas code for playing the same role as libre office?


Not really. XLSX files are ZIP archives of XML files, so they would be quite complex to read. You may find utilities for batch MS Office conversion on the internet.

So if you don't want to (or can't) install the necessary software, request data in a reasonable format in the first place. I am quite appalled that a "professional" shop uses Excel files as a transport medium.

 

 

Reeza
Super User

Here's a VBS routine, SAS generated/called that converts an XLSX file to CSV. 

https://gist.github.com/statgeek/878e585102c14e01581f55dbe972d27e

 

 

Tom
Super User Tom
Super User

You can make the check/fix step as complex as you need.

Here is a simple way, but it might still have trouble with somethings, like DATE fields.

 

Import the Excel sheet into a dataset.

Write the data from that dataset into a text file.

Read the text file into a dataset with the expected format.  If you have an existing file that has the variables defined the way you want and in the same order as the columns in the XLSX file it is very simple.

proc import datafile='myfile.xlsx' dbms=xlsx out=step1 replace;
run;
filename csv temp;
data _null_;
  set step1;
  file csv dsd ;
  put (_all_) (+0);
run;
data want;
  if 0 then set template ;
  infile csv dsd truncover;
  input (_all_) (+0);
run;
kajal_30
Quartz | Level 8
Hi Tom, where do we have this dataset template coming from ?
kajal_30
Quartz | Level 8

Hi Tom,

 

AS I am reimporting the csv to a dataset I can see 0 variables . Is there a way we can retain column names while importing from a csv file ? refraining from using input statement as in future business might change the sequence of the columns in the sheet

 

Regards

Kajal

Kurt_Bremser
Super User

If we're talking about production, a change in file structure must be documented before it is implemented, so your codes can be adapted. So the INPUT statement is a non-issue.

kajal_30
Quartz | Level 8

ok I can take it as a fair statement. So while we are doing the import from csv file the imported data without columns will follow the same sequence as we have in the file or it can be different.

for eg in csv file we have 

cust_no region 

100        USA

so for imported dataset can I put 

input cust_no region should be ok ? 

 

regards

Jagdeep

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 26 replies
  • 4895 views
  • 4 likes
  • 6 in conversation