dvtarasov
Obsidian | Level 7

I have a dataset (let's call it "OLDDATA") with two variables, "Var1" and "Var2". I am trying to create a new dataset ("NEWDATA") with Var1, Var2, and a third variable called Var3. Var3 is to be calculated by summing all values of Var1 and multiplying each value of Var2 by that total. I was told somewhere that the way to do it is as follows:

 

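/* Sum Var1 over all observations and write the total to column1_summary */
proc summary data = sasdata.olddata;
  var var1;
  output out=column1_summary sum=total_var1;
run;

/* Read the one-row summary once; its values are retained for every observation */
data sasdata.newdata;
  set sasdata.olddata;
  if _n_=1 then set column1_summary;
  var3 = var2 * total_var1;
run;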

 

The code runs without errors and produces the Var3 variable I expect. However, the resulting dataset, NEWDATA, has two columns named Var1, one from OLDDATA and the other populated exclusively by the output of PROC SUMMARY, i.e., that same value for every observation. I feel that having duplicate column names would not be helpful in later calculations. What is the syntax for suppressing the second Var1 column, or giving it some other name to avoid confusion?

1 ACCEPTED SOLUTION

RW9
Diamond | Level 26

There are not two variables with the same name in any dataset, as the SAS system does not allow this.  What you have is two variables which have the same label.  

 

Run PROC CONTENTS and compare each variable's name and label: the labels will be the same, but the names will not. If you still think two variables share a name, post the PROC CONTENTS output.
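A minimal sketch of that check, assuming the dataset name from the question (sasdata.newdata). The output dataset name work.newdata_vars is a hypothetical choice for illustration; NAME and LABEL are the variables PROC CONTENTS writes to its OUT= dataset.

```sas
/* Write the variable metadata to a dataset instead of the listing */
proc contents data=sasdata.newdata
              out=work.newdata_vars(keep=name label)  /* hypothetical output name */
              noprint;
run;

/* Show each variable's name next to its label */
proc print data=work.newdata_vars;
run;
```

If two rows show the same LABEL but different NAME values, the dataset has two distinct variables that merely share a label.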


dvtarasov
Obsidian | Level 7
I knew the system wouldn't allow duplicate names, so that, too, surprised me. Is it safe, then, to perform a calculation that calls my original Var1 by name--the system won't mix it up with the other variable that it labeled Var1?
RW9
Diamond | Level 26

I haven't run the code, but from memory the original variable remains and new variables are created with an additional prefix/suffix (you can probably alter that through options as well). Just open the dataset and, assuming you use SAS rather than UE or VA or something, go to View->Column Names; you will see what each variable is called.
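For the code in the question, the name of the summary variable comes from the OUTPUT statement itself: sum=total_var1 names it total_var1. By default that statistic likely inherits the label of Var1, which would explain two columns showing the same label. A hedged sketch of the same step, with the automatic _TYPE_ and _FREQ_ variables dropped so they do not carry over into NEWDATA:

```sas
proc summary data=sasdata.olddata;
  var var1;
  /* sum=total_var1 names the output variable explicitly;      */
  /* drop= removes the automatic _TYPE_ and _FREQ_ variables   */
  output out=column1_summary(drop=_type_ _freq_) sum=total_var1;
run;
```

If the name is left off (sum= with nothing after it), the AUTONAME option on the OUTPUT statement generates a suffixed name such as var1_Sum, which is probably the prefix/suffix behaviour remembered above.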

dvtarasov
Obsidian | Level 7

You are right. I ran PROC CONTENTS on the resulting dataset and saw that, while labels are the same, names are not. Am I right, then, that it does not matter if labels are the same as long as names are different?

RW9
Diamond | Level 26

Yes, that is correct.  Column names have to be unique within a dataset.  Column labels can be anything you want.
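Since labels are free text, one way to avoid the confusion from the original question is to give the total its own label, or to drop it once Var3 has been computed. A sketch under the assumption that the datasets are named as in the question; the label texts are illustrative choices, not anything SAS requires:

```sas
data sasdata.newdata;
  set sasdata.olddata;
  if _n_=1 then set column1_summary;
  var3 = var2 * total_var1;

  /* Give the total and the new variable distinct labels (illustrative text) */
  label total_var1 = "Total of Var1"
        var3       = "Var2 times total of Var1";

  /* Uncomment to leave the total out of NEWDATA entirely */
  /* drop total_var1; */
run;
```

Calculations always refer to variables by name (var1, total_var1), so a shared label does not affect the results either way.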
