BookmarkSubscribeRSS Feed
superbibi
Obsidian | Level 7

Hi Friends;

I have a data set in excel format. When I imported it to sas studio, I converted it to "window comma seperated csv" format. But some of the numeric variables were imported as string, such as hba1c. 

 

 

I looked up the log and corrected the informat statement, and I can fix the problem. But I hope to find an efficient way that can avoid this problem.

 

Also, when I import the excel table to STATA, I did not find the format problem.

 

Thanks.

 

 

 

          data WORK.IMPORT1    ;
 97             %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
 98             infile REFFILE delimiter = ',' MISSOVER DSD  firstobs=2 ;
....
 113               informat eos best32. ;
 114               informat baso best32. ;
 115               informat glucose best32. ;
 116               informat hba1c $1. ;
 117               informat crt best32. ;
 118               informat egfr best32. ;
 119               informat na best32. ;
........

 

data I have:

hpbpidhctwbcrbchba1c
19270818FM-B0.446.64.55 
19270818FM-B0.385.24.415.9
19270818FM-B0.387.34.335.9
19270818FM-B0.425.74.985.8
19270818FM-B0.39114.495.8
19270818FM-B0.4184.845.8
6 REPLIES 6
error_prone
Barite | Level 11
The information is not necessary for numeric variables, containing no such things as fancy decimal separator etc. If your file has only numeric content all you need are infile and input in a data-step. If. You have char-columns defining their length with the length-statement is all you need (use length before input).
Tom
Super User Tom
Super User

I am confused about whether you are importing an EXCEL file or a delimited text file?

If you have an EXCEL file with mixed types in a column then SAS will make the column as character.  

What did you expect it to do?

Make sure the missing cells are actually empty instead of contain a character string in then EXCEL file.

 

If you have a text file then PROC IMPORT should ignore empty cells and properly figure out that your column is numeric.  You might need to increase the GUESSINGROWS option.

 

But the best is to save your data as a text file (CSV or other delimited) and just write you own data step to read it.  You can write a clearer, shorter, and easier to maintain data step than PROC IMPORT will generate.

data want;
  infile REFFILE dsd truncover firstobs=2;
  length hpbpid $12 hct wbc rbc hba1c 8;
  input hpbpid hct wbc rbc hba1c;
run;

 

 

superbibi
Obsidian | Level 7

Hi @Tom,

 

Thank you for the response.

 

I used CSV format to do the import, and I used the import function in SAS studio. The code above is in the log after I ran the code of proc import from SAS studio. You are right the empty cell might cause the problem. But I am wondering why STATA does not have that problem and treats those blank cells as missing?

 

My data set has around 30 variables. If I import with the data step you showed , it requires a lot of work. 

 

Thanks.

Tom
Super User Tom
Super User

@superbibi wrote:

Hi @Tom,

... 

 

My data set has around 30 variables. If I import with the data step you showed , it requires a lot of work. 

 

Thanks.


It is not really any harder to write the code to import 30 variables than 4 variables. Just copy the header row and turn it into the LENGTH statement. Make your own guess at how to define the variables based on your interpretation of their names. I bet you will do a much better approximation than any computer algorithm. Or just define them all as character strings and look at the resulting dataset using the power of SAS. Then it is trivial to call the data step code back and modify it based on what you discover.

 

This is especially important if this data file is just one example of a series of data files you are going to get for this project. Letting SAS guess at how to define the variables based on one sample will usually result in a structure that is incompatible with the other samples.

ballardw
Super User

@Tom wrote:

I am confused about whether you are importing an EXCEL file or a delimited text file?

If you have an EXCEL file with mixed types in a column then SAS will make the column as character.  

What did you expect it to do?

Make sure the missing cells are actually empty instead of contain a character string in then EXCEL file.

 

If you have a text file then PROC IMPORT should ignore empty cells and properly figure out that your column is numeric.  You might need to increase the GUESSINGROWS option.

 

But the best is to save your data as a text file (CSV or other delimited) and just write you own data step to read it.  You can write a clearer, shorter, and easier to maintain data step than PROC IMPORT will generate.

data want;
  infile REFFILE dsd truncover firstobs=2;
  length hpbpid $12 hct wbc rbc hba1c 8;
  input hpbpid hct wbc rbc hba1c;
run;

 

 


Different programs => different defaults, different algorithms, different storage, …

Perhaps STATA assumes everything is numeric and ignore mixed data in column (I've never used STATA, so just comments). Or defaults to numeric when data is mixed (which is real bad for some sorts of data that has values that sometime look like numbers but aren't such as accounts, phone numbers, inventory stock codes and many others).

Tom
Super User Tom
Super User

What does your CSV file actually look like?  You can the Insert Code icon (looks like lowercase letter i in curly brackets) to get a popup window to let you paste some of the lines in the file into your forum post without the forum editor messing with the content.

 

Perhaps there are some "invisible" characters that are confusing PROC IMPORT?

 

If you do not have a tool to check the file you can just ask SAS to look at it for you.  For example to see the first 5 lines from the file in the SAS log run this little program.

data _null_;
  infile REFFILE obs=5;
  input;
  list;
run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1791 views
  • 0 likes
  • 4 in conversation