DATA Step, Macro, Functions and more

Percentage greater than a variable sorted by groups

Accepted Solution Solved
Reply
New Contributor
Posts: 2
Accepted Solution

Percentage greater than a variable sorted by groups

[ Edited ]

Hello! I have a data set with individuals' weights over time and state of residence.

 

Name   Year   Weight   State

Ann       2017   150       MD

Ann       2016   160       MD 

Bob       2017   200       MD

Matt      2016   175        TX

etc.

 

What I want to do is create a new variable that tells me the percentage of people (not including the current person) that have a higher weight than the current individual in their state for a given year.

If it was EXCEL I would do =COUNTIFS(B:B,B2,D: D,D2,C:C,">"&C2)/(COUNTIFS(B:B,B2,D: D,D2,)-1).


Accepted Solutions
Solution
4 weeks ago
Super User
Posts: 2,512

Re: Percentage greater than a variable sorted by groups

Posted in reply to Acct_Prof

Another way, with percentages (I missed this part of the question)


proc sql;
  select unique h1.*
               ,sum (h1.WEIGHT < h2.WEIGHT)            as NB_HEAVIER
               ,sum (h1.WEIGHT < h2.WEIGHT)/ count(*)  as PCT_HEAVIER
  from HAVE h1
       left join
       HAVE h2
       on  h1.STATE  = h2.STATE
       and h1.YEAR   = h2.YEAR
  group by h1.NAME, h1.YEAR, h1.STATE;
quit;

 

Name Year Weight State NB_HEAVIER PCT_HEAVIER
Ann 2017 150 MD 3 0.75
Ann 2018 250 MD 0 0
Bob 2017 200 MD 1 0.25
Bob 2018 200 MD 2 0.5
Chris 2017 210 MD 0 0
Chris 2018 210 MD 1 0.25
Matt 2017 160 MD 2 0.5
Matt 2018 160 MD 3 0.75

 

View solution in original post


All Replies
Trusted Advisor
Posts: 1,394

Re: Percentage greater than a variable sorted by groups

Posted in reply to Acct_Prof

Since the within-state-percentile of each individual's weight is the percent that are lighter than the individual, then you want 1 minus that percentile, right?

 

If so, just use proc rank, with the DESCENDING and PERCENT options, and a BY statement (assuming your data are sorted by state). 

Super User
Posts: 2,512

Re: Percentage greater than a variable sorted by groups

Posted in reply to Acct_Prof

Like this?

data HAVE;
input Name$  Year   Weight   State $;
cards;
Ann       2017   150      MD
Matt      2017   160      MD 
Bob       2017   200      MD
Chris     2017   210      MD
Ann       2018   250      MD
Matt      2018   160      MD 
Bob       2018   200      MD
Chris     2018   210      MD
run;

proc sql;
  select unique h1.*
               ,count (h2.NAME) as NB_HEAVIER
  from HAVE h1
       left join
       HAVE h2
       on  h1.STATE  = h2.STATE
       and h1.YEAR   = h2.YEAR
       and h1.WEIGHT < h2.WEIGHT
  group by h1.NAME, h1.YEAR, h1.STATE;
quit;

SAS Output

Name Year Weight State NB_HEAVIER
Ann 2017 150 MD 3
Ann 2018 250 MD 0
Bob 2017 200 MD 1
Bob 2018 200 MD 2
Chris 2017 210 MD 0
Chris 2018 210 MD 1
Matt 2017 160 MD 2
Matt 2018 160 MD 3

 

Solution
4 weeks ago
Super User
Posts: 2,512

Re: Percentage greater than a variable sorted by groups

Posted in reply to Acct_Prof

Another way, with percentages (I missed this part of the question)


proc sql;
  select unique h1.*
               ,sum (h1.WEIGHT < h2.WEIGHT)            as NB_HEAVIER
               ,sum (h1.WEIGHT < h2.WEIGHT)/ count(*)  as PCT_HEAVIER
  from HAVE h1
       left join
       HAVE h2
       on  h1.STATE  = h2.STATE
       and h1.YEAR   = h2.YEAR
  group by h1.NAME, h1.YEAR, h1.STATE;
quit;

 

Name Year Weight State NB_HEAVIER PCT_HEAVIER
Ann 2017 150 MD 3 0.75
Ann 2018 250 MD 0 0
Bob 2017 200 MD 1 0.25
Bob 2018 200 MD 2 0.5
Chris 2017 210 MD 0 0
Chris 2018 210 MD 1 0.25
Matt 2017 160 MD 2 0.5
Matt 2018 160 MD 3 0.75

 

New Contributor
Posts: 2

Re: Percentage greater than a variable sorted by groups

This solution works great. Thank you. 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 80 views
  • 1 like
  • 3 in conversation