Calcite | Level 5

Hello our esteemed advisors,



I want to compare two data sets that have the same variables but different IDs. I want a test of significance for whether the variables have equal variances in both data sets. The tests I would like to use include the t-test or the Mann-Whitney test. I have both continuous and categorical variables.


I have tried PROC COMPARE, but since the IDs are different, the procedure doesn't seem to work.


id IDnum;
VAR x y z;


I will be glad to get some advice.


Diamond | Level 26 RW9

First, don't code all in upper case, and use a code window - it's the {i} icon above the post area.

Second, post test data in the form of a datastep so that we can see what you are working with.

Third, if the data has no columns which match the other table, what is the logic to match them?

Fourth, describe the problem accurately. Your sentence about wanting to use a t-test or Mann-Whitney test on continuous and categorical variables, for instance, makes no sense in terms of PROC COMPARE. PROC COMPARE merely compares two datasets; t-tests and the like are statistical analyses of the data, something totally different.

Calcite | Level 5


The data sets have exactly the same structure. I just split the main data into two sets, one for model development and the second for model validation.


I want to check whether there is a difference in the distribution of the variables after splitting the data. The original data had 4800 observations; after splitting, data1 has 3200 and data2 has 1600 observations.


For example, checking whether the mean of a variable like body weight is the same in the two data sets.



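data new;
  /* sex is a character variable, hence the $ */
  input ID sex $ age weight height;
  /* sample rows read inline; the full data come from the analysis fileref */
  datalines;
1   male    36  78  167
2   female  20  67  156
3   female  36  79  169
14  male    36  78  167
;
run;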



The data is in that format.





Super User

Rename the ID variables so that both data sets have the same ID variable name.


PROC COMPARE BASE=data1(rename=(data1_id=IDnum)) COMPARE=data2(rename=(data2_id=IDnum)) ALLSTATS MAXPRINT=(3,6);
  ID IDnum;
  VAR x y z;
RUN;


Calcite | Level 5

The variables are already similar and the data sets have exactly the same variables.

I have one concern: I want to compare the variables in the two data sets not in terms of data structure but in terms of descriptive statistics, e.g. is the mean of weight in data1 equal to the mean of weight in data2?


Within a single data set I can use PROC TTEST to get the results, but in this case I want to compare the two data sets.


I will be glad to be advised if there is any procedure available.
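For readers with the same question, one possible approach is to stack the two splits into a single data set with a group indicator and then run the usual two-sample procedures with a CLASS statement. The following is only a minimal sketch: the data set name both and the indicator variable split are hypothetical, and the variable names sex, age, weight and height are assumed from the INPUT statement posted earlier in the thread.

```sas
/* Stack the two splits and tag each row with its source.       */
/* "both" and "split" are hypothetical names for illustration.  */
data both;
  length split $8;
  set data1(in=in1) data2;
  if in1 then split = 'develop';
  else split = 'validate';
run;

/* Continuous variables: two-sample t test. The output also     */
/* includes a folded F test for equality of variances.          */
proc ttest data=both;
  class split;
  var age weight height;
run;

/* Rank-based alternative: Wilcoxon rank-sum (Mann-Whitney).    */
proc npar1way data=both wilcoxon;
  class split;
  var age weight height;
run;

/* Categorical variables: chi-square test of split by sex.      */
proc freq data=both;
  tables split*sex / chisq;
run;
```

Because each variable gets its own test, some small p-values can be expected by chance alone when many variables are checked this way.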

Jade | Level 19



Recently I came across an article which might be applicable to what you're planning to do: The author discusses arguments for and against such tests and suggests an omnibus test of joint orthogonality as opposed to univariate comparisons. So, this might come down to PROC LOGISTIC or PROC PROBIT rather than (multiple runs of) PROC TTEST -- if you're still convinced that you need a significance test.
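As a rough illustration of the omnibus idea, one could regress split membership on all the covariates at once and look at the global test that every coefficient is zero. This is a sketch under assumptions, not the article's exact method: the names both and split are hypothetical, and the covariates are assumed from the INPUT statement posted earlier in the thread.

```sas
/* Stack the two splits with a hypothetical indicator "split".  */
data both;
  length split $8;
  set data1(in=in1) data2;
  if in1 then split = 'develop';
  else split = 'validate';
run;

/* Model split membership on all covariates jointly. The        */
/* "Testing Global Null Hypothesis: BETA=0" table in the output */
/* gives a single test that no covariate predicts the split.    */
proc logistic data=both;
  class sex / param=ref;
  model split(event='validate') = sex age weight height;
run;
```

A large p-value on the global test is consistent with the two splits being balanced on these covariates, although it does not prove that they are.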


