BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
HerdingDog
Fluorite | Level 6

Hello,

I would like to streamline the CSV import process in SAS. I am looking for a way to automatically pull in column names  AND I would like to automatically remove spaces from the column names in the same or one extra step. I've looked at using the compress function, and that's more work than just editing the CSV, and I end up typing all the variable names anyway. Othe solutions have involved macros, etc, but don't include the code for the macro. 

Any help would be appreciated!

Thanks

Dave

1 ACCEPTED SOLUTION
7 REPLIES 7
Reeza
Super User
options VALIDVARNAME=V7 forces spaces to underscore.

You can set it in your settings in SAS Studio under Preferences>Tables>Policies>SAS variable name policy - change it to V7.

HerdingDog
Fluorite | Level 6

Thank you for your help. Can you clarify your "guessing of import" statement?

HerdingDog
Fluorite | Level 6
Also, if I use a data step, do I not still need to manually type the column names?
Reeza
Super User
When you use PROC IMPORT it doesn't know for sure what things are so it has to guess. If your data is well formatted, always have the same structure this can be quite inefficient. So if that's the case its better to do it once and get it right, even if it's manual. Note that when you use PROC IMPORT it prints the code in the log so I usually use that, modify as needed and include that code in my production process, not the PROC IMPORT.

If your data structure is changing each time then this isn't feasible.

Note that as your SAS version changes, PROC IMPORT also changes as the algorithm behind it is updated and can behave differently based on the different system options. Its theoretically possible for it to guess wrong as well if the order of data changes. For example, if a mixed column has all numeric in one version by chance (or the first 10,000 rows are) but is typically a mixed character/character field it can guess wrong which means that each time you import your data could have different types. Then you cannot combine the different data sets

So using a data step avoids these issues. But if you're only doing this once then it doesn't matter.

ballardw
Super User

@HerdingDog wrote:

Thank you for your help. Can you clarify your "guessing of import" statement?


If a variable (column in the CSV) has values such as account "numbers" where the leading 0 are significant,

i.e. account 000345 is different than 0000345 you won't know with Proc Import results as the Import will "guess" that the variable is supposed to be numeric and yield 345 for both. Some types of values with dashes or slashes might be interpreted as dates or values with colon as times.

 

Yes proc import will get column names.

It will also create a data step to read CSV files. Which can be copied from the log to the editor and change the INFORMAT statements to match what you need if there are any questions about types of variables. (Remove the format statements for $ variables). The code may be a tad ugly but is easy to deal with.

 

Once you have a proper data step then the next file of that structure you need to read you can use the data step and change the input file and output data set names. All the files read would have the same variable names, types and lengths (not likely from using Proc Import on multiple files).

If you don't have a document describing the expected variable types and lengths I suggest the initial file should be read with the option Guessingrows= max (or at least a large number of rows). Proc Import otherwise will default to setting properties after looking at 20 rows. Which means variables that are not always present may end up as character variables of length one and likely not what you want.

 

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 6426 views
  • 7 likes
  • 4 in conversation