DATA Step, Macro, Functions and more

var names change after import

Reply
Contributor
Posts: 53

var names change after import

Hi,

 

When I am importing a CSV file, it is changing some of the variables name from say e.g 01/01/00 to var1. Please note that my variables are all dates. Although some of the variables names are imported as _01/01/00 which is fine but the rest as var xyz. Any suggestions on how to handle this issue?

Super Contributor
Posts: 339

Re: var names change after import

Hi,

 

Please share the code you are using for importing the CSV and a representative sample of the data, including the first record.

 

Regards,

Amir.

Contributor
Posts: 53

Re: var names change after import

Please find below,

proc import datafile="C:\Users\amalik\Desktop\dailydata\HAVE.csv"
        out=have1
        dbms=csv
        replace;
		GETNAMES=YES;
		run;
;Untitled.png
Super User
Posts: 13,500

Re: var names change after import


@Amalik wrote:

Please find below,

proc import datafile="C:\Users\amalik\Desktop\dailydata\HAVE.csv"
        out=have1
        dbms=csv
        replace;
		GETNAMES=YES;
		run;
;Untitled.png

One column per date is likely going to present other difficulties later in processing this data.

Also getting variable names like VAR3634 indicates that you have thousands of variables. While SAS will handle such it may well be worth considering what kind of code you will have to write to handle thousands of variable names.

 

Super User
Super User
Posts: 9,599

Re: var names change after import

How are you importing the csv file?  I guess you are using proc import yes.  That is not the best idea as proc import is a guessing procedure.  Now 01/01/00 is not a valid variable name hence why sas adds a underscore to the name.  However, it is likely not guessing all your variables correctly.  To handle this properly (well, I will get onto "properly" shortly) take the code that is generated - you will find it in the log after running the proc import, then put this in a code window and change the names to what they should be.  Then run that datastep import.  Datastep import allows full control over the file coming in, how the data is to be read, and what the output should look like - this is what proc import guesses at, but as you know your data better, you should code this.

 

Now, lets get onto to "properly" importing the data.  Having variable names like 01/01/00 is a really bad idea for a number of reasons.  Firstly it is not easily readable by software - which is your issue here.  Secondly it is very hard to write good code which can process a non-fixed datastructure (i.e. more dates could be added, or removed etc.).  It is also not good to put partial data anywhere, 01/01/00, could mean any number of things - dates are an abstract concept.  Finally, data as columns will really cause you headaches.  How will you program with these possibly hundreds or thousands of columns?  Yes, you will write endless macro code to try to work out variable names and loop over these, wasting your time, and storage/processing time.  A simple change to the data from:

... 01/0/1/00  02/0/1/00  03/01/00...

... a               b               c...

...

 

To:

...   Date_var    result

...   01Jan2000 a

...   02Jan2000 b

...   03Jan2000 c

...

Will result in  a far simpler import, far simpler programming to work with it, and most likely will be a smaller file to work with.

That is my suggestion on "how to handle this issue".  In fact it is my suggestion whenever talking about data from Excel.

Super User
Posts: 10,766

Re: var names change after import

add system option before running PROC IMPORT.

 

options validvarname=any ;

proc import ..........

 

and refer to these special variable name via '01/12/2018'n  

Super User
Super User
Posts: 9,599

Re: var names change after import

Doesn't that just ignore the initial problem, and then increase the future coding even more - i.e. having to refer to '01/12/18'n each time, rather than a proper SAS name?  Just thinking, no lists or shortcuts then, possible unmatched quotes, impossible to read code etc.

Super User
Posts: 10,766

Re: var names change after import

Yeah. You are right . But you are standing on English language side .

If your mother language is not English, like me (from China), make a variable name as a Chinese name that would make code more readable due to most Chinese can barely read English .

Super User
Posts: 10,211

Re: var names change after import


@Amalik wrote:

Hi,

 

When I am importing a CSV file, it is changing some of the variables name from say e.g 01/01/00 to var1. Please note that my variables are all dates. Although some of the variables names are imported as _01/01/00 which is fine but the rest as var xyz. Any suggestions on how to handle this issue?


That's because 01/01/00 is not a a valid SAS name. Read the SAS documentation about that (google "SAS valid names").

Keeping data (dates) in structure (variable names) is to be avoided. After import, transpose your dataset and use the _name_ field for converting into valid SAS dates:

data have;
input '_11/27/09'n '_11/28/09'n '_11/29/09'n;
cards;
-2.17 0 0
-2.26 0 0
-2.64 0 0
;
run;

proc transpose data=have out=int;
var _numeric_;
run;

data want;
format date yymmddd10.;
set int;
date = input(substr(_name_,2),mmddyy9.);
drop _name_;
run;

Add by-group processing to the transpose step to create separate rows instead of COL1-Coln.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Ask a Question
Discussion stats
  • 8 replies
  • 153 views
  • 5 likes
  • 6 in conversation