Hello All,
I have a dataset that follows active-duty parents over a 15 month period and for several covariates the value , such as rank, may change. What I want to do is determine what rank they held for the majority of those 15 months. Does anyone have an elegant way of doing that?
Thanks,
Brian
P.S. Please see attached example
assuming i understand your req or clarify once you test the below
data have;
input PPID $ Date $ Rank $;
cards;
123456 Jan-18 ES
123456 Feb-18 ES
123456 Mar-18 ES
123456 Apr-18 ES
123456 May-18 ES
123456 Jun-18 ES
123456 Jul-18 EJ
123456 Aug-18 EJ
123456 Sep-18 EJ
123456 Oct-18 EJ
123456 Nov-18 EJ
123456 Dec-18 EJ
123456 Jan-19 EJ
123456 Feb-19 EJ
123456 Mar-19 EJ
654321 Jan-18 OJ
654321 Feb-18 OJ
654321 Mar-18 OJ
654321 Apr-18 OJ
654321 May-18 OJ
654321 Jun-18 OJ
654321 Jul-18 OJ
654321 Aug-18 OJ
654321 Sep-18 OJ
654321 Oct-18 OJ
654321 Nov-18 OJ
654321 Dec-18 OJ
654321 Jan-19 OS
654321 Feb-19 OS
654321 Mar-19 OS
;
proc sql;
create table want as
select ppid,rank
from (select ppid,rank,count(rank) as c from have group by ppid,rank)
group by ppid
having max(c)=c;
quit;
I don't see any attached data?
draycut,
When you have a free moment please see the attached data set.
Thanks, Brian
PROC FREQ will tell you how many months each patient held a specific rank. Then from there you can determine the most frequent rank.
Please post sample Data of what you HAVE and what you WANT
novinosrin,
When you have a free moment please see the attached data set.
Thanks, Brian
assuming i understand your req or clarify once you test the below
data have;
input PPID $ Date $ Rank $;
cards;
123456 Jan-18 ES
123456 Feb-18 ES
123456 Mar-18 ES
123456 Apr-18 ES
123456 May-18 ES
123456 Jun-18 ES
123456 Jul-18 EJ
123456 Aug-18 EJ
123456 Sep-18 EJ
123456 Oct-18 EJ
123456 Nov-18 EJ
123456 Dec-18 EJ
123456 Jan-19 EJ
123456 Feb-19 EJ
123456 Mar-19 EJ
654321 Jan-18 OJ
654321 Feb-18 OJ
654321 Mar-18 OJ
654321 Apr-18 OJ
654321 May-18 OJ
654321 Jun-18 OJ
654321 Jul-18 OJ
654321 Aug-18 OJ
654321 Sep-18 OJ
654321 Oct-18 OJ
654321 Nov-18 OJ
654321 Dec-18 OJ
654321 Jan-19 OS
654321 Feb-19 OS
654321 Mar-19 OS
;
proc sql;
create table want as
select ppid,rank
from (select ppid,rank,count(rank) as c from have group by ppid,rank)
group by ppid
having max(c)=c;
quit;
novinosrin,
That's beautiful! However, when I run the code the Rank variable comes up as "."
Brian
I trust your sample is a good representative of you real? if not, can you repost a more representative one ? Do you have any missing values?
novinosrin,
I just made that data up. There may be missing values.
Brian
I am going with assumptions by adding one row with a new id and missing rank
and notice the where condition in the inner query that filters missing
data have;
input PPID $ Date $ Rank $;
cards;
123456 Jan-18 ES
123456 Feb-18 ES
123456 Mar-18 ES
123456 Apr-18 ES
123456 May-18 ES
123456 Jun-18 ES
123456 Jul-18 EJ
123456 Aug-18 EJ
123456 Sep-18 EJ
123456 Oct-18 EJ
123456 Nov-18 EJ
123456 Dec-18 EJ
123456 Jan-19 EJ
123456 Feb-19 EJ
123456 Mar-19 EJ
654321 Jan-18 OJ
654321 Feb-18 OJ
654321 Mar-18 OJ
654321 Apr-18 OJ
654321 May-18 OJ
654321 Jun-18 OJ
654321 Jul-18 OJ
654321 Aug-18 OJ
654321 Sep-18 OJ
654321 Oct-18 OJ
654321 Nov-18 OJ
654321 Dec-18 OJ
654321 Jan-19 OS
654321 Feb-19 OS
654321 Mar-19 OS
654322 Mar-19 .
;
proc sql;
create table want as
select ppid,rank
from (select ppid,rank,count(rank) as c from have(where=(rank is not missing)) group by ppid,rank)
group by ppid
having max(c)=c;
quit;
PROC MEANS and the MODE function.
First time I think I've known of an actual usage of the mode function that makes sense.
EDIT: if you need multiple modes with ties, then I think you need PROC UNIVARIATE with the ODS output instead.
@BTAinRVA wrote:
Hello All,
I have a dataset that follows active-duty parents over a 15 month period and for several covariates the value , such as rank, may change. What I want to do is determine what rank they held for the majority of those 15 months. Does anyone have an elegant way of doing that?
Thanks,
Brian
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.