I am trying to import an xlsx file. Unfortunately the variable names will contain special characters and blanks (there are many such columns). The data is being downloaded from a third party provider to this xlsx sheet and I have no control over their naming convention. Furthermore, we will want to periodically update this spreadsheet (possibly adding more variables and switching the order of columns) and rerun the import statement for analysis purposes. Because of these concerns it would not be practical to hard code the names of each variable upon import.
I would like to strip out all of the special characters and spaces upon import (or replace them with underscores). This used to be feasible with the GETNAMES option for xls files but I do not believe that it is supported for xlsx file types (we need the xlsx file type because there is to much data for an xls).
Does anyone have any thoughts about how to rename these variables upon import (without changing the excel file)?
Than do it in two steps....
At first just reading the column headers. Do not forget options like validvarname=V7 obs=
Do your inputvalidation on that dataset checking the unique=-names, after input-validation is passing than process further.
Remembering the rootcause of heart-bleed. The coder did not check/validate the incoming request. Just forgot it.
Fool/full-proof coding is hard work.
Bypass the column names row and use your own naming
It is important that I know which data column corresponds to which variable since the analysis that I need to run is unique for each column. If I use a standardized naming scale at the import level I am concerned that if the order of the columns changes in the excel spreadsheet (the next time I download the sheet from our data provider) it will not be noticed causing misleading results. This is why I suspect I need the variable names to be imported as valid sas variable names, but unique names that will not depend on ordering of columns.
Than do it in two steps....
At first just reading the column headers. Do not forget options like validvarname=V7 obs=
Do your inputvalidation on that dataset checking the unique=-names, after input-validation is passing than process further.
Remembering the rootcause of heart-bleed. The coder did not check/validate the incoming request. Just forgot it.
Fool/full-proof coding is hard work.
Thank you for your help. The option validvarname=v7 worked very nicely and I will be sure to validate the names.
SAS will generally do this by default, assuming you don't have the option validvarname=any set.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!