Help using Base SAS procedures

looking for suggestions importing an excel table with 5000 columns

Accepted Solution Solved
Reply
Contributor
Posts: 57
Accepted Solution

looking for suggestions importing an excel table with 5000 columns

Hi,

I've been trying to import an excel table with around 5000 columns (variables) and 100 rows (observations) into SAS 9.2. This is directly not possible, while SAS can only import 255 variables at a time.

I tried converting my table into .txt data with Tab delimited. Somehow in this process, some variable names went wrong. I am guessing because in my original table there are cells that contain a lot of characters and texts. So if i read this .txt data into SAS, I only got some of the variable names right. the others are renamed by SAS to VAR. I looked up this in internet, and it says it might be caused, because some names contain multiple spaces. I can copy the log and do the correction per hand, but with the number of variables i am having, this will take long and since i probably only need to import the data once (not once every month or sth. ), this is really not my first option.

I thought about transposing the excel data first and in that case, i only have 100 variables. SAS can read all of them at a time and then i can transpose them back, but the problem is, some characters were showed as 'missing values' after the import, which i dont quite understand why.

Can someone please share his/her experience with me reading such a dataset into SAS? I am very grateful for any kind of tipps or suggestions.

BR  Dingdang


Accepted Solutions
Solution
‎11-12-2013 11:33 AM
Super User
Posts: 11,343

Re: looking for suggestions importing an excel table with 5000 columns

When you say:

I tried converting my table into .txt data with Tab delimited. Somehow in this process, some variable names went wrong. I am guessing because in my original table there are cells that contain a lot of characters and texts.

You are probably pointing to why things worked better with multiple files. When building variable names from a text file using proc import, SAS takes the first 32 characters, replacing non-letter and non-numeric with underscores. If there are ties in the resulting values the second and subsequent variables will be named sequentially: Var1 Var2.

If this process is going to be repeated it is worth importing the single file as CSV or tab delimited and cleaning up the code generated by proc import to specify variable names SAS can use. Assign the current variable names as LABELs to the shorter variable names you generate. Also you can correct the variable types such as numeric codes that should be strings, or those columns that contain text and numbers should actually be strings.

I have to do this routinely as it seems that way too many databases are being created, or there export process, with column headings like "If yes special needs or developmental delay describe" for each of multiple children in a family. So I create shorter variables which indicate which child number and create a label with the descriptive text and the child number. Yes it can be a much work. But if this going to be done every week/month/quarter or what have then the effort usually pays off with the second or third processing period.

Hint: make sting variables about 10% longer than the longest in the first data example, or if you have access to a data dictionary for the source data use the lengths specified to reduce data loss.

And you're on your own for free form basically unlimited length text entry fields.

View solution in original post


All Replies
Super User
Posts: 5,424

Re: looking for suggestions importing an excel table with 5000 columns

Why do you want to store the data with so many columns at all? What do you intend to so with it in SAS? Transposing sounds like a good idea to me. You have to be more specific with you problem with missing values, maybe show some sample data.

Data never sleeps
Contributor
Posts: 57

Re: looking for suggestions importing an excel table with 5000 columns

hi LinusH,

we are doing a kind of survey and would like somehow to bild a databank in SAS and we have around 5000 data points that we queried.

I think the problem with the missing values come from the formats, like BallardW guessed. this is difficult to solve with transpose, because originally i have columns with texts and columns with numbers. when i tranpose the data, SAS cannot make the right guess any more.

thanks for your answer. I am working on the CSV solution and try to see how that comes out. I will probably upload some data later if i cannot solve it myself Smiley Happy

have a nice day.  Dingdang

Super User
Posts: 11,343

Re: looking for suggestions importing an excel table with 5000 columns

Try saving the Excel as CSV and import that. After import either with wizard or proc import, us either the F4 key to bring submitted datastep code created by import into the editor. If you don't like the data types adjust the informats.

The missing values on your transposed data were likely original values containing characters that the import thought/decided were numeric.

Super User
Posts: 19,768

Re: looking for suggestions importing an excel table with 5000 columns

To add on to BallardW suggestion there's a macro code on Microsoft website that will export a quoted CSV file if you have any text fields that may have comma's within them.

A second option if this is a one off - Import to Access then to SAS directly.

Contributor
Posts: 57

Re: looking for suggestions importing an excel table with 5000 columns

thank you for your help Smiley Happy

I kind of solved the problem by saving the big excel table into 4 .txt data and read into SAS and the "renaming" problem didnt come up any more. somehow when there are less than 1900 variables in a txt data, SAS can read all the variable names corretly. I dont know why, but anyway, this finishes the task.

If anyone knows the trick in this, please let me know. otherweise, this remains another SAS myth for me Smiley Happy

BR  Dingdang

Solution
‎11-12-2013 11:33 AM
Super User
Posts: 11,343

Re: looking for suggestions importing an excel table with 5000 columns

When you say:

I tried converting my table into .txt data with Tab delimited. Somehow in this process, some variable names went wrong. I am guessing because in my original table there are cells that contain a lot of characters and texts.

You are probably pointing to why things worked better with multiple files. When building variable names from a text file using proc import, SAS takes the first 32 characters, replacing non-letter and non-numeric with underscores. If there are ties in the resulting values the second and subsequent variables will be named sequentially: Var1 Var2.

If this process is going to be repeated it is worth importing the single file as CSV or tab delimited and cleaning up the code generated by proc import to specify variable names SAS can use. Assign the current variable names as LABELs to the shorter variable names you generate. Also you can correct the variable types such as numeric codes that should be strings, or those columns that contain text and numbers should actually be strings.

I have to do this routinely as it seems that way too many databases are being created, or there export process, with column headings like "If yes special needs or developmental delay describe" for each of multiple children in a family. So I create shorter variables which indicate which child number and create a label with the descriptive text and the child number. Yes it can be a much work. But if this going to be done every week/month/quarter or what have then the effort usually pays off with the second or third processing period.

Hint: make sting variables about 10% longer than the longest in the first data example, or if you have access to a data dictionary for the source data use the lengths specified to reduce data loss.

And you're on your own for free form basically unlimited length text entry fields.

Contributor
Posts: 57

Re: looking for suggestions importing an excel table with 5000 columns

hi Ballardw,

thanks for your explanation and hint. Now I understand Proc Import better Smiley Happy

BR  Dingdang

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 583 views
  • 7 likes
  • 4 in conversation