Both the new tables were giving me that output.

My first original file of students (Data.master_After2001) has 5,000 students. My second original file of grades (Data.grades) has over 1,000,000 lines, as each student has more than one grade. I am looking to convert the grades to A=4, B=3, C=2, D=1, and F=0, sum them up for each student, then divide by the number of courses taken (how many grades are supplied for each student) to find the GPA for each student. In total the file should have 5,000 rows, but it currently has over 1,000,000 with the code provided:

data Data.grades_recoded;
  set Data.grades;
  if grade='F' then grade_value = 0;
  else if grade='D' then grade_value = 1;
  else if grade='C' then grade_value = 2;
  else if grade='B' then grade_value = 3;
  else if grade='A' then grade_value = 4;
RUN;

PROC SORT DATA=Data.grades_recoded;
  by studentID;
RUN;

PROC SORT DATA=Data.master_After2001;
  by studentID;
RUN;

data Data.master_GPA;
  merge Data.grades_recoded Data.master_After2001 (keep = studentID gender);
  by studentID;
RUN;

PROC MEANS DATA=Data.master_GPA STACKODS N MEAN MAX MEDIAN MIN;
  class gender;
  var grade_value;
  ods output summary=SummaryStats;
RUN;
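
For reference, the calculation described above (sum of grade points divided by the number of grades, one row per student) could be sketched roughly as follows. This is only a sketch under assumptions: it reuses Data.grades_recoded and Data.master_After2001 from the code above, while the output table work.student_gpa and the columns n_courses and gpa are hypothetical names chosen for illustration. The merge above keeps one row per grade, so the grades are collapsed to one row per student before being joined to the student file.

```sas
/* Sketch: one row per student with GPA = mean of grade_value.        */
/* work.student_gpa, n_courses and gpa are hypothetical names.        */
proc sql;
  create table work.student_gpa as
  select m.studentID,
         m.gender,
         g.n_courses,
         g.gpa
  from Data.master_After2001 as m
  left join
       /* Collapse the grades to one row per student first */
       (select studentID,
               count(grade_value) as n_courses,  /* non-missing grades only */
               avg(grade_value)   as gpa         /* sum of points / n_courses */
        from Data.grades_recoded
        group by studentID) as g
  on m.studentID = g.studentID;
quit;

/* Summarise the per-student GPA by gender */
proc means data=work.student_gpa n mean max median min;
  class gender;
  var gpa;
run;
```

With a left join from the student file, the result should have one row per student in Data.master_After2001, assuming studentID is unique there. Students with no recoded grades would get a missing gpa rather than being dropped.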