I have a dataset (let's call it "OLDDATA") with two variables, "Var1" and "Var2". I am trying to create a new dataset ("NEWDATA") with Var1, Var2, and a third variable called Var3. Var3 is to be calculated by summing all values of Var1 and multiplying each value of Var2 by that total. I was told somewhere that the way to do it is as follows:
proc summary data = sasdata.olddata;
    var var1;
    output out=column1_summary sum=total_var1;
run;

data sasdata.newdata;
    set sasdata.olddata;
    if _n_=1 then set column1_summary;
    var3 = var2 * total_var1;
run;
The code runs without errors and produces the Var3 variable I expect. However, the resulting dataset, NEWDATA, has two columns named Var1, one from OLDDATA and the other populated exclusively by the output of PROC SUMMARY, i.e., that same value for every observation. I feel that having duplicate column names would not be helpful in later calculations. What is the syntax for suppressing the second Var1 column, or giving it some other name to avoid confusion?
There are not two variables with the same name in any dataset, as the SAS system does not allow this. What you have is two variables which have the same label.
Run PROC CONTENTS and compare the variable names with the variable labels: the labels will be the same, but the names will not. If you still think the names are duplicated, post the PROC CONTENTS output.
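A minimal sketch of that check, assuming the dataset and library names from the question (sasdata.newdata). The work.newdata_vars dataset name is made up for illustration:

```sas
/* Print the full variable listing, including names and labels. */
proc contents data=sasdata.newdata;
run;

/* Optional: keep just the name and label of each variable in a small dataset. */
/* work.newdata_vars is a hypothetical name chosen for this example.           */
proc contents data=sasdata.newdata
              out=work.newdata_vars(keep=name label)
              noprint;
run;

proc print data=work.newdata_vars;
run;
```

If two rows show the same LABEL but different NAME values, the columns only look alike in a viewer that displays labels.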
I have not run the code, but from memory the original variable is kept and the new variables are created with an additional prefix or suffix (you can probably change that through options as well). Open the dataset and, assuming you use SAS rather than UE, VA, or something else, go to View->Column Names to see what each variable is actually called.
You are right. I ran PROC CONTENTS on the resulting dataset and saw that, while labels are the same, names are not. Am I right, then, that it does not matter if labels are the same as long as names are different?
Yes, that is correct. Column names have to be unique within a dataset. Column labels can be anything you want.
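If the matching labels are still confusing, one possible variation on the original DATA step is sketched below. It assumes the names used in the question (sasdata.olddata, column1_summary, total_var1); the label text is only an example. The step gives the total its own label and leaves out the automatic _TYPE_ and _FREQ_ variables that PROC SUMMARY writes to its output dataset:

```sas
data sasdata.newdata;
    set sasdata.olddata;

    /* Read the one-row summary once; its values are retained for every observation. */
    if _n_ = 1 then set column1_summary(drop=_type_ _freq_);

    var3 = var2 * total_var1;

    /* Give the total a distinct label so it no longer displays like Var1. */
    label total_var1 = 'Sum of Var1 over all observations';

    /* Alternatively, uncomment the next line to leave the total out of NEWDATA. */
    /* drop total_var1; */
run;
```

The DROP statement only affects which variables are written to the output dataset, so var3 can still be computed from total_var1 inside the step.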