Hi,
I am trying to merge 2 datasets with all identical attributes but a different number of observations. The datasets are information regarding students in 2 different classes. None of the variables contain the names or any identifying information, so the only way I would be able to find the students that are in both classes (datasets) is by identifying specific attributes that characterize each student and are the same across both datasets.
I know the R code for this, but not quite sure how to do this merge in Sas. Please help!
The R code is:
d1=read.table("student-mat.csv",sep=";",header=TRUE) d2=read.table("student-por.csv",sep=";",header=TRUE) d3=merge(d1,d2,by=c("school","sex","age","address","famsize","Pstatus","Medu","Fedu","Mjob","Fjob","reason","nursery","internet")) print(nrow(d3)) # 382 students
as in, there are 382 students in both classes.
If it helps, here is my SAS code so far:
data work.mathproject;
infile 'X:\STATS 521 DATA\PROJECT\student-mat.csv' dlm=',' firstobs=2;
input school $ sex $ age address $ famsize $ Pstatus $ Medu Fedu Mjob $ Fjob $ reason $
guardian $ traveltime studytime failures schoolsup $ famsup $ paid $
activities $ nursery $ higher $ internet $ romantic $ famrel freetime
goout Dalc Walc health absences G1 G2 G3;
run;
data work.portproject;
infile 'X:\STATS 521 DATA\PROJECT\student-por.csv' dlm=',' firstobs=2;
input school $ sex $ age address $ famsize $ Pstatus $ Medu Fedu Mjob $ Fjob $ reason $
guardian $ traveltime studytime failures schoolsup $ famsup $ paid $
activities $ nursery $ higher $ internet $ romantic $ famrel freetime
goout Dalc Walc health absences G1 G2 G3;
run;
proc sort data=work.mathproject
out=work.math;
by age;
run;
proc sort data=work.portproject
out=work.port;
by age;
run;
If you merge by all of these in R,
school sex age address famsize Pstatus Medu Fedu Mjob Fjob reason nursery internet
you're going to need to sort before you merge them in SAS.
I'm not sure if this is what you need or what kind of merge you're trying to do, but I'm assuming you're looking to keep records that appear in both data sets.
proc sort data=work.mathproject
out=work.math;
by school sex age address famsize Pstatus Medu Fedu Mjob Fjob reason nursery internet;
run;
proc sort data=work.portproject
out=work.port;
by school sex age address famsize Pstatus Medu Fedu Mjob Fjob reason nursery internet;
run;
data want;
merge math (in=a)
port (in=b);
by school sex age address famsize Pstatus Medu Fedu Mjob Fjob reason;
if a and b; * if you want to keep records that match on both sides ;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.