BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Dingdang
Fluorite | Level 6

Hi,

I've been trying to import an excel table with around 5000 columns (variables) and 100 rows (observations) into SAS 9.2. This is directly not possible, while SAS can only import 255 variables at a time.

I tried converting my table into .txt data with Tab delimited. Somehow in this process, some variable names went wrong. I am guessing because in my original table there are cells that contain a lot of characters and texts. So if i read this .txt data into SAS, I only got some of the variable names right. the others are renamed by SAS to VAR. I looked up this in internet, and it says it might be caused, because some names contain multiple spaces. I can copy the log and do the correction per hand, but with the number of variables i am having, this will take long and since i probably only need to import the data once (not once every month or sth. ), this is really not my first option.

I thought about transposing the excel data first and in that case, i only have 100 variables. SAS can read all of them at a time and then i can transpose them back, but the problem is, some characters were showed as 'missing values' after the import, which i dont quite understand why.

Can someone please share his/her experience with me reading such a dataset into SAS? I am very grateful for any kind of tipps or suggestions.

BR  Dingdang

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

When you say:

I tried converting my table into .txt data with Tab delimited. Somehow in this process, some variable names went wrong. I am guessing because in my original table there are cells that contain a lot of characters and texts.

You are probably pointing to why things worked better with multiple files. When building variable names from a text file using proc import, SAS takes the first 32 characters, replacing non-letter and non-numeric with underscores. If there are ties in the resulting values the second and subsequent variables will be named sequentially: Var1 Var2.

If this process is going to be repeated it is worth importing the single file as CSV or tab delimited and cleaning up the code generated by proc import to specify variable names SAS can use. Assign the current variable names as LABELs to the shorter variable names you generate. Also you can correct the variable types such as numeric codes that should be strings, or those columns that contain text and numbers should actually be strings.

I have to do this routinely as it seems that way too many databases are being created, or there export process, with column headings like "If yes special needs or developmental delay describe" for each of multiple children in a family. So I create shorter variables which indicate which child number and create a label with the descriptive text and the child number. Yes it can be a much work. But if this going to be done every week/month/quarter or what have then the effort usually pays off with the second or third processing period.

Hint: make sting variables about 10% longer than the longest in the first data example, or if you have access to a data dictionary for the source data use the lengths specified to reduce data loss.

And you're on your own for free form basically unlimited length text entry fields.

View solution in original post

7 REPLIES 7
LinusH
Tourmaline | Level 20

Why do you want to store the data with so many columns at all? What do you intend to so with it in SAS? Transposing sounds like a good idea to me. You have to be more specific with you problem with missing values, maybe show some sample data.

Data never sleeps
Dingdang
Fluorite | Level 6

hi LinusH,

we are doing a kind of survey and would like somehow to bild a databank in SAS and we have around 5000 data points that we queried.

I think the problem with the missing values come from the formats, like BallardW guessed. this is difficult to solve with transpose, because originally i have columns with texts and columns with numbers. when i tranpose the data, SAS cannot make the right guess any more.

thanks for your answer. I am working on the CSV solution and try to see how that comes out. I will probably upload some data later if i cannot solve it myself Smiley Happy

have a nice day.  Dingdang

ballardw
Super User

Try saving the Excel as CSV and import that. After import either with wizard or proc import, us either the F4 key to bring submitted datastep code created by import into the editor. If you don't like the data types adjust the informats.

The missing values on your transposed data were likely original values containing characters that the import thought/decided were numeric.

Reeza
Super User

To add on to BallardW suggestion there's a macro code on Microsoft website that will export a quoted CSV file if you have any text fields that may have comma's within them.

A second option if this is a one off - Import to Access then to SAS directly.

Dingdang
Fluorite | Level 6

thank you for your help Smiley Happy

I kind of solved the problem by saving the big excel table into 4 .txt data and read into SAS and the "renaming" problem didnt come up any more. somehow when there are less than 1900 variables in a txt data, SAS can read all the variable names corretly. I dont know why, but anyway, this finishes the task.

If anyone knows the trick in this, please let me know. otherweise, this remains another SAS myth for me Smiley Happy

BR  Dingdang

ballardw
Super User

When you say:

I tried converting my table into .txt data with Tab delimited. Somehow in this process, some variable names went wrong. I am guessing because in my original table there are cells that contain a lot of characters and texts.

You are probably pointing to why things worked better with multiple files. When building variable names from a text file using proc import, SAS takes the first 32 characters, replacing non-letter and non-numeric with underscores. If there are ties in the resulting values the second and subsequent variables will be named sequentially: Var1 Var2.

If this process is going to be repeated it is worth importing the single file as CSV or tab delimited and cleaning up the code generated by proc import to specify variable names SAS can use. Assign the current variable names as LABELs to the shorter variable names you generate. Also you can correct the variable types such as numeric codes that should be strings, or those columns that contain text and numbers should actually be strings.

I have to do this routinely as it seems that way too many databases are being created, or there export process, with column headings like "If yes special needs or developmental delay describe" for each of multiple children in a family. So I create shorter variables which indicate which child number and create a label with the descriptive text and the child number. Yes it can be a much work. But if this going to be done every week/month/quarter or what have then the effort usually pays off with the second or third processing period.

Hint: make sting variables about 10% longer than the longest in the first data example, or if you have access to a data dictionary for the source data use the lengths specified to reduce data loss.

And you're on your own for free form basically unlimited length text entry fields.

Dingdang
Fluorite | Level 6

hi Ballardw,

thanks for your explanation and hint. Now I understand Proc Import better Smiley Happy

BR  Dingdang

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1478 views
  • 7 likes
  • 4 in conversation