I am new to SAS and have been working on some exercises, but don't know if my code is correct. I have three data sets and need to combine all three into one and then create a new one with only unique records.
Here are the variable definitions and my code:
SAS Dataset Name: STUDENTS Column Name Type Format First_name Text $30. Last_name Text $30. DOB Text mmddyy10. StudentID Text $5. SAS Dataset Name: COURSES Column Name Type Format StudentID Text $5. CourseID Text $8. Course_name Text $30. Begin_date Number mmddyy10. End_date Number mmddyy10. TestID Text $10. SAS Dataset Name: GRADES Column Name Type Format TestID Text $10. Test_type Text $5. Grade Number +3. Grade_date Number Mmddyy10. // Put all results from the three tables into one table PROC SQL; CREATE TABLE All_Grades AS SELECT Students.*, Courses.*, Grades.* FROM Students, Courses, Grades WHERE Students.StudentID = Courses.StudentID AND Courses.TestID = Grades.TestID; quit; // From the previous combined table, select unique student IDs with their the earliest test grade expressed as a year PROC SQL; CREATE TABLE Earliest_Test AS SELECT DISTINCT StudentID, TestID, MIN(Test_date) num AS test_year format=YEAR4 FROM All_Grades ORDER BY StudentID ; quit;
To me, it looks like that last step should be
PROC SQL;
CREATE TABLE Earliest_Grade AS
SELECT StudentID, TestID, grade, year(Grade_date) AS grade_year
FROM All_Grades
GROUP BY StudentID
HAVING grade_date = min(grade_date);
quit;
and the comments syntax should be converted into proper SAS.
So what is your question? Do you have some reason to believe the code is incorrect? Are you getting errors or issues with the results?
Here's some improvement suggestions but I suspect your answer is fine.
Given lack of data or context, I'm guessing a lot here.
@jrsm21 wrote:
I am new to SAS and have been working on some exercises, but don't know if my code is correct. I have three data sets and need to combine all three into one and then create a new one with only unique records.
Here are the variable definitions and my code:
SAS Dataset Name: STUDENTS Column Name Type Format First_name Text $30. Last_name Text $30. DOB Text mmddyy10. StudentID Text $5. SAS Dataset Name: COURSES Column Name Type Format StudentID Text $5. CourseID Text $8. Course_name Text $30. Begin_date Number mmddyy10. End_date Number mmddyy10. TestID Text $10. SAS Dataset Name: GRADES Column Name Type Format TestID Text $10. Test_type Text $5. Grade Number +3. Grade_date Number Mmddyy10. // Put all results from the three tables into one table PROC SQL; CREATE TABLE All_Grades AS SELECT Students.*, Courses.*, Grades.* FROM Students, Courses, Grades WHERE Students.StudentID = Courses.StudentID AND Courses.TestID = Grades.TestID; quit; // From the previous combined table, select unique student IDs with their the earliest test grade expressed as a year PROC SQL; CREATE TABLE Earliest_Test AS SELECT DISTINCT StudentID, TestID, MIN(Test_date) num AS test_year format=YEAR4 FROM All_Grades ORDER BY StudentID ; quit;
To me, it looks like that last step should be
PROC SQL;
CREATE TABLE Earliest_Grade AS
SELECT StudentID, TestID, grade, year(Grade_date) AS grade_year
FROM All_Grades
GROUP BY StudentID
HAVING grade_date = min(grade_date);
quit;
and the comments syntax should be converted into proper SAS.
Thanks!
Only one course record for each student. It should be the test record with the earliest test_date.
Only the variables StudentID, CourseID, Test_date, and Grade should be included in Earliest_Test data table. The test_year variable should contain the test_date year.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.