DATA Step, Macro, Functions and more

Group by Query Produces Larger Numbers with More Variables

Accepted Solution Solved
Reply
Super Contributor
Posts: 497
Accepted Solution

Group by Query Produces Larger Numbers with More Variables

I’m running into a puzzling group by query.  When I run the query with the columns not commented out I receive done number for the sum of ret_1 when I run the same query with the additional columns I receive a bigger number.  Thoughts on why?

 


proc sql;

create table ret_deg4test2 as
select a.cohort, a.person_uid, a.gender, a.gender_desc, a.residency, a.RESIDENCY_DESC, a.SCHEV_ETHNICITY_DESC,
b.college, b.COLLEGE_DESC_DISPLAY as COLLEGE_DESC,
a.pell_eligible, a.pell_recipient,

/*a.schev_student_status,
a.FIRST_GENERATION, a.HS_GPA, a.HS_GPA_RANGE, a.HS_GPA_RANGE_ID, a.POST_SEC_SCHOOL_GPA, a.SAT_MATH, a.SAT_RANGE, a.SAT_TOTAL,
a.SAT_TOTAL_RANGE_ID, a.SAT_VERBAL, a.SECONDARY_SCHOOL_PERCENT_RANGE, a.SECONDARY_SCHOOL_PERCENTILE,
a.academic_origin, a.academic_origin_id,*/

/*Annual fall to fall measures*/
sum(a.ret_1) as ret_1

from ret_deg3 a
left join ODSSTU.DAR_CAMPUS b on a.first_college=b.COLLEGE

group by a.cohort, a.person_uid, a.gender, a.gender_desc, a.residency, a.RESIDENCY_DESC, a.SCHEV_ETHNICITY_DESC,
b.college, b.COLLEGE_DESC_DISPLAY, a.pell_eligible, a.pell_recipient/*,

a.schev_student_status,
a.FIRST_GENERATION, a.HS_GPA, a.HS_GPA_RANGE, a.HS_GPA_RANGE_ID, a.POST_SEC_SCHOOL_GPA, a.SAT_MATH, a.SAT_RANGE, a.SAT_TOTAL,
a.SAT_TOTAL_RANGE_ID, a.SAT_VERBAL, a.SECONDARY_SCHOOL_PERCENT_RANGE, a.SECONDARY_SCHOOL_PERCENTILE*/;

quit;


Accepted Solutions
Solution
‎08-17-2017 02:01 PM
Super Contributor
Posts: 497

Re: Group by Query Produces Larger Numbers with More Variables

Posted in reply to DavidPhillips2

The difference is because a.academic_origin, a.academic_origin_id was missing from the group by.  There is a bug where the whole sql statment script should have errored out please fix in furture versions.

View solution in original post


All Replies
Super Contributor
Posts: 497

Re: Group by Query Produces Larger Numbers with More Variables

Posted in reply to DavidPhillips2

I noticed that when I take the join out that the number is correct not sure why the number of variables is causing this though.

Solution
‎08-17-2017 02:01 PM
Super Contributor
Posts: 497

Re: Group by Query Produces Larger Numbers with More Variables

Posted in reply to DavidPhillips2

The difference is because a.academic_origin, a.academic_origin_id was missing from the group by.  There is a bug where the whole sql statment script should have errored out please fix in furture versions.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 2 replies
  • 64 views
  • 0 likes
  • 1 in conversation