Hi, I am trying to run a t-test on a variable that is present in 2 large datasets that are already created as SAS datasets.
Example: trying to run a t-test between 2 "Age" groups in 2 different files
File 1: n = 5,000
File 2: n = 150,000
Is there a way to easily run the t-test between these files without actually combining the files? I've googled and all I find is how to do the t-test within one file or with the infile approach.
Thanks.
With a SAS view, you can have SAS create the combined dataset only for the duration of the proc ttest execution. For example:
/* Split sashelp.class into two datasets */
data M;
set sashelp.class;
if sex="M";
run;
data F;
set sashelp.class;
if sex="F";
run;
/* Define a SAS view. This does not create a dataset */
data both / view=both;
set M F indsname=ds;
group = ds;
run;
/* Call the view as input to proc ttest */
proc ttest data=both;
class group;
var age;
run;
General rule of thumb, if your data fits in Excel it's not large 🙂
You can also create a summary data set from each source, combine the summaries, and use the combined data set in Proc TTest, provided the summaries are structured correctly. There is an example in the Proc TTest documentation that uses summary statistics from a single data source, and the steps should not be too difficult to extend. Your description is pretty vague, but I think you might want to use the name of your data sources as a class variable if there is not an obvious existing variable to define groups.
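A minimal sketch of that summary approach, assuming the two sources are the M and F data sets built in the earlier reply and the analysis variable is age. The names sumM, sumF and summaries are made up for illustration. It relies on Proc TTest reading summary input when the data set has a character _STAT_ variable with N, MEAN and STD rows, which is the layout the default Proc Means OUTPUT data set produces.

```sas
/* Summarize each source on its own; the default OUTPUT data set has one
   row per statistic, named in the character variable _STAT_ */
proc means data=M noprint;
   var age;
   output out=sumM;
run;

proc means data=F noprint;
   var age;
   output out=sumF;
run;

/* Stack the two small summaries, keep only the statistics the t-test
   needs, and use the source data set name as the class variable */
data summaries;
   set sumM sumF indsname=ds;
   where _STAT_ in ('N', 'MEAN', 'STD');
   group = ds;
   keep group _STAT_ age;
run;

/* Two-sample t-test computed from the summary rows */
proc ttest data=summaries;
   class group;
   var age;
run;
```

Only the two Proc Means steps read the full data, so the combined data set passed to Proc TTest has just six rows regardless of how large the sources are.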