SAS Office Analytics, SAS Add-In for Microsoft Office, and other integrations

importing xlsx files with difficult names

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

importing xlsx files with difficult names

I am trying to import an xlsx file. Unfortunately the variable names will contain special characters and blanks (there are many such columns). The data is being downloaded from a third party provider to this xlsx sheet and I have no control over their naming convention. Furthermore, we will want to periodically update this spreadsheet (possibly adding more variables and switching the order of columns) and rerun the import statement for analysis purposes. Because of these concerns it would not be practical to hard code the names of each variable upon import.

I would like to strip out all of the special characters and spaces upon import (or replace them with underscores). This used to be feasible with the GETNAMES option for xls files but I do not believe that it is supported for xlsx file types (we need the xlsx file type because there is to much data for an xls).

Does anyone have any thoughts about how to rename these variables upon import (without changing the excel file)?


Accepted Solutions
Solution
‎04-22-2014 10:41 AM
Valued Guide
Posts: 3,206

Re: importing xlsx files with difficult names

Than do it in two steps....
At first just reading the column headers.  Do not forget options like validvarname=V7 obs= 

Do your inputvalidation on that dataset checking the unique=-names, after input-validation is passing than process further.

Remembering the rootcause of heart-bleed. The coder did not check/validate the incoming request. Just forgot it.

Fool/full-proof coding is hard work.

---->-- ja karman --<-----

View solution in original post


All Replies
Valued Guide
Posts: 3,206

Re: importing xlsx files with difficult names

Bypass the column names row and use your own naming

---->-- ja karman --<-----
New Contributor
Posts: 3

Re: importing xlsx files with difficult names

It is important that I know which data column corresponds to which variable since the analysis that I need to run is unique for each column. If I use a standardized naming scale at the import level I am concerned that if the order of the columns changes in the excel spreadsheet (the next time I download the sheet from our data provider) it will not be noticed causing misleading results. This is why I suspect I need the variable names to be imported as valid sas variable names, but unique names that will not depend on ordering of columns.

Solution
‎04-22-2014 10:41 AM
Valued Guide
Posts: 3,206

Re: importing xlsx files with difficult names

Than do it in two steps....
At first just reading the column headers.  Do not forget options like validvarname=V7 obs= 

Do your inputvalidation on that dataset checking the unique=-names, after input-validation is passing than process further.

Remembering the rootcause of heart-bleed. The coder did not check/validate the incoming request. Just forgot it.

Fool/full-proof coding is hard work.

---->-- ja karman --<-----
New Contributor
Posts: 3

Re: importing xlsx files with difficult names

Thank you for your help. The option validvarname=v7 worked very nicely and I will be sure to validate the names.

Grand Advisor
Posts: 16,358

Re: importing xlsx files with difficult names

SAS will generally do this by default, assuming you don't have the option validvarname=any set.

Post a Question
Discussion Stats
  • 5 replies
  • 617 views
  • 0 likes
  • 3 in conversation