Importing a .csv file with 314 variables having names of more than 50 characters.

Accepted Solution Solved
Reply
Contributor
Posts: 21
Accepted Solution

Importing a .csv file with 314 variables having names of more than 50 characters.

Hi All,

I have a clinical data .csv file containing 2000 rows.314 variables in total.The names of these variables are very long,more than  sas permits for a variable name.

example- Primary_Patient_Information_Diabetes_Associations_Associations Of Diabetes

So if i import the file in SAS and Getnames  = yes,then SAS truncates the names and gives it ,it's own names such as VAR1,VAR20,etc

If i write Getnames = No then the variables are named as var1-var314.

Is there any way that i can import the variable names as it is?Else i'll have to rename all the variables accordingly.

Regards,

Shaheen


Accepted Solutions
Solution
‎08-08-2012 09:14 AM
Respected Advisor
Posts: 3,889

Re: Importing a .csv file with 314 variables having names of more than 50 characters.

SAS variable names are limited to 32 characters. There is no way around this.

You get a mixture of truncated strings as variabel name and names like VAR2, VAR3 because  the truncation to 32 characters can result in multiple identical strings. The first time SAS uses this truncated string as variable name, the next times it creates a variable name of VAR2, VAR3... and so on.

You have now different options:

1. Manually modify your source data and give the fields explanatory names of max 32 characters in length.

2. Skip the first line (getnames=no) and deal with variable names VAR1 to VARn.

3. Like option 2 but then write some code which adds labels to the variable names of VAR1 to VARn. Depending on your data that could be quite handy as it allows you to address the variables as lists like VAR1-VAR20, but the labels allow you to create report using the labels.

Below some sample code illustrating how option 3 could be implemented:

/* create some sample data */
filename source temp lrecl=32000;
data _null_;
  file source;
  put 'This is variable name has more than 32 character,and this is another variable name telling a whole story,Reasonable Name';
  put '1,2,3';
run;

/* read data section (starting with 2nd line) */
PROC import
  datafile=source
  out=work.test
  DBMS=DLM REPLACE;
  DELIMITER=',';
  DATAROW=2;
  getNAMES=no;
RUN;

/* create code to add first line text as labels to variables */
filename code temp;
data _null_;
  file code;
  infile source obs=1 ;
  input;
  put 'attrib ';
  do i=1 to count(_infile_,',')+1;
    label=scan(_infile_,i,',');
    put 'var' i " label='" label +(-1) "'";
  end;
run;

/* apply labels */

options source2;
proc datasets lib=work nolist;
  modify test;
    %include code;;
  run;
quit;

/* list results */
title 'List Variable Attributes';
proc contents data=work.test;
quit;

title 'Print Variables using Labels as Headers';
proc print data=work.test label;
  var var1-var3;
run;
title;

View solution in original post


All Replies
New Contributor
Posts: 4

Re: Importing a .csv file with 314 variables having names of more than 50 characters.

Hi,

Can you please show me the error  when the getnames=yes?

Thanks & Regards,

Senthil.

Contributor
Posts: 21

Re: Importing a .csv file with 314 variables having names of more than 50 characters.

Hi,

SAS is truncating some of the names and to others it is giving its own names such as VAR1,VAR2 etc.(Not an error in the log,sorry for the confusion) I have edited the query.

Any other way i can import the names as it is?

Thanks ,

Shaheen

Solution
‎08-08-2012 09:14 AM
Respected Advisor
Posts: 3,889

Re: Importing a .csv file with 314 variables having names of more than 50 characters.

SAS variable names are limited to 32 characters. There is no way around this.

You get a mixture of truncated strings as variabel name and names like VAR2, VAR3 because  the truncation to 32 characters can result in multiple identical strings. The first time SAS uses this truncated string as variable name, the next times it creates a variable name of VAR2, VAR3... and so on.

You have now different options:

1. Manually modify your source data and give the fields explanatory names of max 32 characters in length.

2. Skip the first line (getnames=no) and deal with variable names VAR1 to VARn.

3. Like option 2 but then write some code which adds labels to the variable names of VAR1 to VARn. Depending on your data that could be quite handy as it allows you to address the variables as lists like VAR1-VAR20, but the labels allow you to create report using the labels.

Below some sample code illustrating how option 3 could be implemented:

/* create some sample data */
filename source temp lrecl=32000;
data _null_;
  file source;
  put 'This is variable name has more than 32 character,and this is another variable name telling a whole story,Reasonable Name';
  put '1,2,3';
run;

/* read data section (starting with 2nd line) */
PROC import
  datafile=source
  out=work.test
  DBMS=DLM REPLACE;
  DELIMITER=',';
  DATAROW=2;
  getNAMES=no;
RUN;

/* create code to add first line text as labels to variables */
filename code temp;
data _null_;
  file code;
  infile source obs=1 ;
  input;
  put 'attrib ';
  do i=1 to count(_infile_,',')+1;
    label=scan(_infile_,i,',');
    put 'var' i " label='" label +(-1) "'";
  end;
run;

/* apply labels */

options source2;
proc datasets lib=work nolist;
  modify test;
    %include code;;
  run;
quit;

/* list results */
title 'List Variable Attributes';
proc contents data=work.test;
quit;

title 'Print Variables using Labels as Headers';
proc print data=work.test label;
  var var1-var3;
run;
title;

Contributor
Posts: 21

Re: Importing a .csv file with 314 variables having names of more than 50 characters.

Thank you !!

Regards,

Shaheen

Super User
Posts: 10,500

Re: Importing a .csv file with 314 variables having names of more than 50 characters.

FYI, when you use the getnames=yes option, you are getting the VAR1 and such because there are variable names that are identical through the 32 characters that SAS allows.

Are you going to be reading a file with this structure multiple times? If so I would offer an option 4 to Patrick's list.

After running the proc import code examine the log. You will find data step code to read the data. Copy that out and paste into the editor window. Remove any line numbers and edit the infile statement to make sure it is executeable (there is likely to be an ancient continuation character for longer than 80 character lines).

You will have a bunch of INFORMAT and INFORMAT statements associated with the SAS assigned variable names. When you see a variable name you don't like do a find and replace with one you like within the 32 character limit. This is also a good time to see if you agree with numeric vs string assignments. You may want to change numeric codes you won't do arithematic with, usually with an INFORMAT like BEST12. to strings using $12. Make sure that the INPUT statement agrees for string variables.

Add a block of code to assign Labels. Patrick's code is a start but since you can have many more characters for a label, even though many procs only display 40 or so, make them meaningful and more attractive by use of sentence case and punctuation.

This may seem a tad tedious but in the case of weekly or monthly files the added work by controlling how things are read pays off in the long run. I go through this process several times monthly will files varying from 30 to 1000+ variables.

Contributor
Posts: 21

Re: Importing a .csv file with 314 variables having names of more than 50 characters.

Thank you for the helpful guidance.Smiley Happy

regards,

Shaheen

New Contributor
Posts: 2

Re: Importing a .csv file with 314 variables having names of more than 50 characters.

@ShaheenRanalvi

could u please provide me that csv format dataset. i really need it. kindly help me. it's urgent please. i'm doing work on diabetes.

New Contributor
Posts: 2

Re: Importing a .csv file with 314 variables having names of more than 50 characters.

could u please provide me that csv format dataset. i really need it. kindly help me. it's urgent please.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 6867 views
  • 6 likes
  • 5 in conversation