Programming the statistical procedures from SAS

Splitting dataset into two groups

Occasional Contributor
Posts: 13

Hi,

I have two separate datasets that I would like to compare. I concatenated them so that I could run t-tests and chi-square tests on them, but I'm not sure how to split the new dataset back into two groups. Neither group has any distinguishing feature; each observation only has a different ID number.

Grand Advisor
Posts: 16,916

Re: Splitting dataset into two groups

So what differentiates the data? The source data sets? If so, use the INDSNAME= option on the SET statement to record the source of each observation when you concatenate.

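data want;
    set data1 data2 indsname=source;  /* source is a temporary variable holding the name of the contributing data set */
    indata = source;                  /* copy it to a regular variable so it is kept in WANT */
run;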
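
One detail worth knowing: INDSNAME= stores the full two-level name in uppercase (for example WORK.DATA1). If you would rather have just the data set name as the group label, something like the sketch below should work. The variable name group and the $32 length are my own choices for illustration, not anything from your data.

```sas
data want;
    length group $32;                 /* hypothetical variable name; 32 is the maximum data set name length */
    set data1 data2 indsname=source;
    group = scan(source, 2, '.');     /* keep the member name, drop the libref */
run;
```

Either version gives you a two-level variable that can serve as the grouping variable in later procedures.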

Occasional Contributor
Posts: 13

Re: Splitting dataset into two groups

Hi Reeza,

So, basically there was a larger dataset initially, and a random sample was taken from it. This random sample has 70 people. I want to compare features of these 70 people with features of the observations that weren't randomly selected (n=472) to assess representativeness. Does that make more sense?

Thanks.

Grand Advisor
Posts: 16,916

Re: Splitting dataset into two groups

That's a standard comparison: checking whether the sample is similar to the 'population'.

The method above will identify the source of each observation, and you can then use that variable as the class (grouping) variable for the comparison.

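data want;
    set pop sample indsname=source;  /* source is temporary and names the contributing data set */
    datain = source;                 /* keep the source name as the grouping variable */
run;

proc freq data=want;
    table datain*<variable of interest> / chisq;  /* chi-square test of group by the chosen variable */
run;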
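
For the continuous variables, the same grouping variable can go on a CLASS statement in PROC TTEST. A minimal sketch, assuming a numeric variable called age (hypothetical; substitute your own):

```sas
proc ttest data=want;
    class datain;   /* must have exactly two levels: the two source data sets */
    var age;        /* hypothetical numeric variable to compare between groups */
run;
```

The output reports both the pooled and Satterthwaite results along with an equality-of-variances test, so you can pick the one that fits your data.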

Grand Advisor
Posts: 10,055

Re: Splitting dataset into two groups

Hopefully the original data sets, or the source files needed to recreate them, still exist. If the data sets from before the concatenation are gone, re-reading the source data files may be the best option.
