MUKASADAVID
Calcite | Level 5

Hello our esteemed advisors,

 

 

I want to compare two data sets that have the same variables but different IDs. I want a test of significance for whether the variables have equal variances in both data sets. The test i desire to use include ttest or Mann Whitneys test. I have both continuous and categorical variables. 

 

I have tried PROC COMPARE, but since the IDs are different, the procedure doesn't seem to work.

 


PROC COMPARE BASE=data1 COMPARE=data2 ALLSTATS MAXPRINT = (3,6);
id IDnum;
VAR x y z;
RUN;

 

I would be glad to get some advice.

 

5 REPLIES
RW9
Diamond | Level 26

First, don't code all in upper case, and use a code window - it's the {i} icon above the post area.

Second, post test data in the form of a datastep so that we can see what you are working with:
https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

Third, if the data has no columns which match the other table, what is the logic to match them?

Fourth, describe the problem accurately. This sentence, for instance: "The test i desire to use include ttest or Mann Whitneys test. I have both continuous and categorical variables. " - makes no sense in terms of a proc compare. Proc compare merely compares two datasets; ttest and the like are statistical tests on the data, something totally different.

MUKASADAVID
Calcite | Level 5

 

The data sets have exactly the same variables; I just split the main data into two sets, one for model development and the second for model validation.

 

I want to check whether the distribution of the variables differs after splitting the data. The original data had 4800 observations; after splitting, data1 has 3200 and data2 has 1600 observations.

 

For example, checking whether the mean of a variable such as body weight is the same in the two data sets.

 

 

data new;
  /* Read the inline records below; sex is character, so it needs the $ modifier */
  infile datalines;
  input ID sex $ age weight height;
  datalines;
1 male 36 78 167
2 female 20 67 156
3 female 36 79 169
14 male 36 78 167
;
run;

 

 

The data is in that format.

 

 

Thanks

 

Ksharp
Super User

Rename the ID variables so that both data sets have the same ID variable name.

 

PROC COMPARE BASE=data1(rename=(data1_id=IDnum)) COMPARE=data2(rename=(data2_id=IDnum)) ALLSTATS MAXPRINT = (3,6);
id IDnum;
VAR x y z;
RUN;

 

MUKASADAVID
Calcite | Level 5

The variables are already similar and the data sets have exactly the same variables.

I have one concern: I want to compare the two data sets not in terms of data structure but in terms of descriptive statistics, e.g. is the mean of weight in data1 equal to the mean of weight in data2?

 

In a single data set I can use PROC TTEST to get the results, but in this case I want to compare the two data sets.

 

I would be glad to be advised if there is any procedure available.
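
For readers with the same question, one common way to run a two-sample test across two separate data sets is to stack them into a single data set with a grouping variable and then use that variable in a CLASS statement. The following is only a minimal sketch, assuming the data set names data1 and data2 and the variables sex, age, weight and height from the posts above; the names both, from1 and source are hypothetical and chosen for illustration.

```sas
/* Stack the two splits and record which data set each row came from. */
/* "both", "from1" and "source" are illustrative names, not from the thread. */
data both;
  length source $5;
  set data1(in=from1) data2;
  if from1 then source = 'data1';
  else source = 'data2';
run;

/* Two-sample t tests for the continuous variables. The output also */
/* includes a folded F test for equality of variances. */
proc ttest data=both;
  class source;
  var age weight height;
run;

/* Wilcoxon rank-sum test, the equivalent of the Mann-Whitney U test. */
proc npar1way data=both wilcoxon;
  class source;
  var age weight height;
run;

/* Chi-square test of association for a categorical variable. */
proc freq data=both;
  tables source*sex / chisq;
run;
```

With many variables, this produces one test per variable, so some small p-values are to be expected by chance alone.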

FreelanceReinh
Jade | Level 19

Hi @MUKASADAVID,

 

Recently I came across an article which might be applicable to what you're planning to do: https://blogs.worldbank.org/impactevaluations/should-we-require-balance-t-tests-baseline-observables.... The author discusses arguments for and against such tests and suggests an omnibus test of joint orthogonality as opposed to univariate comparisons. So, this might come down to PROC LOGISTIC or PROC PROBIT rather than (multiple runs of) PROC TTEST -- if you're still convinced that you need a significance test.
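
A rough sketch of what such an omnibus test might look like with PROC LOGISTIC is given below. It assumes the data set names data1 and data2 and the variables sex, age, weight and height from the earlier posts; the names both, from1 and in_dev are hypothetical. The idea is to model membership in the development split as a function of all covariates and read the "Testing Global Null Hypothesis: BETA=0" table, which tests all covariates jointly.

```sas
/* Stack the two splits with a 0/1 indicator for the development set. */
/* "both", "from1" and "in_dev" are illustrative names, not from the thread. */
data both;
  set data1(in=from1) data2;
  in_dev = from1;
run;

/* Regress split membership on all covariates at once. The global */
/* null hypothesis table (likelihood ratio, score and Wald tests) */
/* then serves as a single joint test of balance. */
proc logistic data=both;
  class sex / param=ref;
  model in_dev(event='1') = sex age weight height;
run;
```

A large p-value in the global test is consistent with the covariates not predicting which split an observation landed in, which is what a random split should produce.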
