BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ShaheenRanalvi
Calcite | Level 5

Hi All,

I have a clinical data .csv file containing 2000 rows.314 variables in total.The names of these variables are very long,more than  sas permits for a variable name.

example- Primary_Patient_Information_Diabetes_Associations_Associations Of Diabetes

So if i import the file in SAS and Getnames  = yes,then SAS truncates the names and gives it ,it's own names such as VAR1,VAR20,etc

If i write Getnames = No then the variables are named as var1-var314.

Is there any way that i can import the variable names as it is?Else i'll have to rename all the variables accordingly.

Regards,

Shaheen

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

SAS variable names are limited to 32 characters. There is no way around this.

You get a mixture of truncated strings as variabel name and names like VAR2, VAR3 because  the truncation to 32 characters can result in multiple identical strings. The first time SAS uses this truncated string as variable name, the next times it creates a variable name of VAR2, VAR3... and so on.

You have now different options:

1. Manually modify your source data and give the fields explanatory names of max 32 characters in length.

2. Skip the first line (getnames=no) and deal with variable names VAR1 to VARn.

3. Like option 2 but then write some code which adds labels to the variable names of VAR1 to VARn. Depending on your data that could be quite handy as it allows you to address the variables as lists like VAR1-VAR20, but the labels allow you to create report using the labels.

Below some sample code illustrating how option 3 could be implemented:

/* create some sample data */
filename source temp lrecl=32000;
data _null_;
  file source;
  put 'This is variable name has more than 32 character,and this is another variable name telling a whole story,Reasonable Name';
  put '1,2,3';
run;

/* read data section (starting with 2nd line) */
PROC import
  datafile=source
  out=work.test
  DBMS=DLM REPLACE;
  DELIMITER=',';
  DATAROW=2;
  getNAMES=no;
RUN;

/* create code to add first line text as labels to variables */
filename code temp;
data _null_;
  file code;
  infile source obs=1 ;
  input;
  put 'attrib ';
  do i=1 to count(_infile_,',')+1;
    label=scan(_infile_,i,',');
    put 'var' i " label='" label +(-1) "'";
  end;
run;

/* apply labels */

options source2;
proc datasets lib=work nolist;
  modify test;
    %include code;;
  run;
quit;

/* list results */
title 'List Variable Attributes';
proc contents data=work.test;
quit;

title 'Print Variables using Labels as Headers';
proc print data=work.test label;
  var var1-var3;
run;
title;

View solution in original post

8 REPLIES 8
Senthilnathan_SAS
Calcite | Level 5

Hi,

Can you please show me the error  when the getnames=yes?

Thanks & Regards,

Senthil.

ShaheenRanalvi
Calcite | Level 5

Hi,

SAS is truncating some of the names and to others it is giving its own names such as VAR1,VAR2 etc.(Not an error in the log,sorry for the confusion) I have edited the query.

Any other way i can import the names as it is?

Thanks ,

Shaheen

Patrick
Opal | Level 21

SAS variable names are limited to 32 characters. There is no way around this.

You get a mixture of truncated strings as variabel name and names like VAR2, VAR3 because  the truncation to 32 characters can result in multiple identical strings. The first time SAS uses this truncated string as variable name, the next times it creates a variable name of VAR2, VAR3... and so on.

You have now different options:

1. Manually modify your source data and give the fields explanatory names of max 32 characters in length.

2. Skip the first line (getnames=no) and deal with variable names VAR1 to VARn.

3. Like option 2 but then write some code which adds labels to the variable names of VAR1 to VARn. Depending on your data that could be quite handy as it allows you to address the variables as lists like VAR1-VAR20, but the labels allow you to create report using the labels.

Below some sample code illustrating how option 3 could be implemented:

/* create some sample data */
filename source temp lrecl=32000;
data _null_;
  file source;
  put 'This is variable name has more than 32 character,and this is another variable name telling a whole story,Reasonable Name';
  put '1,2,3';
run;

/* read data section (starting with 2nd line) */
PROC import
  datafile=source
  out=work.test
  DBMS=DLM REPLACE;
  DELIMITER=',';
  DATAROW=2;
  getNAMES=no;
RUN;

/* create code to add first line text as labels to variables */
filename code temp;
data _null_;
  file code;
  infile source obs=1 ;
  input;
  put 'attrib ';
  do i=1 to count(_infile_,',')+1;
    label=scan(_infile_,i,',');
    put 'var' i " label='" label +(-1) "'";
  end;
run;

/* apply labels */

options source2;
proc datasets lib=work nolist;
  modify test;
    %include code;;
  run;
quit;

/* list results */
title 'List Variable Attributes';
proc contents data=work.test;
quit;

title 'Print Variables using Labels as Headers';
proc print data=work.test label;
  var var1-var3;
run;
title;

ballardw
Super User

FYI, when you use the getnames=yes option, you are getting the VAR1 and such because there are variable names that are identical through the 32 characters that SAS allows.

Are you going to be reading a file with this structure multiple times? If so I would offer an option 4 to Patrick's list.

After running the proc import code examine the log. You will find data step code to read the data. Copy that out and paste into the editor window. Remove any line numbers and edit the infile statement to make sure it is executeable (there is likely to be an ancient continuation character for longer than 80 character lines).

You will have a bunch of INFORMAT and INFORMAT statements associated with the SAS assigned variable names. When you see a variable name you don't like do a find and replace with one you like within the 32 character limit. This is also a good time to see if you agree with numeric vs string assignments. You may want to change numeric codes you won't do arithematic with, usually with an INFORMAT like BEST12. to strings using $12. Make sure that the INPUT statement agrees for string variables.

Add a block of code to assign Labels. Patrick's code is a start but since you can have many more characters for a label, even though many procs only display 40 or so, make them meaningful and more attractive by use of sentence case and punctuation.

This may seem a tad tedious but in the case of weekly or monthly files the added work by controlling how things are read pays off in the long run. I go through this process several times monthly will files varying from 30 to 1000+ variables.

ShaheenRanalvi
Calcite | Level 5

Thank you for the helpful guidance.Smiley Happy

regards,

Shaheen

fairyjee
Calcite | Level 5

@ShaheenRanalvi

could u please provide me that csv format dataset. i really need it. kindly help me. it's urgent please. i'm doing work on diabetes.

fairyjee
Calcite | Level 5

could u please provide me that csv format dataset. i really need it. kindly help me. it's urgent please.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 16739 views
  • 6 likes
  • 5 in conversation