Hello All,
I have a dataset that follows active-duty parents over a 15 month period and for several covariates the value , such as rank, may change. What I want to do is determine what rank they held for the majority of those 15 months. Does anyone have an elegant way of doing that?
Thanks,
Brian
P.S. Please see attached example
assuming i understand your req or clarify once you test the below
data have;
input PPID $	Date $	Rank $;
cards;
123456	Jan-18	ES
123456	Feb-18	ES
123456	Mar-18	ES
123456	Apr-18	ES
123456	May-18	ES
123456	Jun-18	ES
123456	Jul-18	EJ
123456	Aug-18	EJ
123456	Sep-18	EJ
123456	Oct-18	EJ
123456	Nov-18	EJ
123456	Dec-18	EJ
123456	Jan-19	EJ
123456	Feb-19	EJ
123456	Mar-19	EJ
654321	Jan-18	OJ
654321	Feb-18	OJ
654321	Mar-18	OJ
654321	Apr-18	OJ
654321	May-18	OJ
654321	Jun-18	OJ
654321	Jul-18	OJ
654321	Aug-18	OJ
654321	Sep-18	OJ
654321	Oct-18	OJ
654321	Nov-18	OJ
654321	Dec-18	OJ
654321	Jan-19	OS
654321	Feb-19	OS
654321	Mar-19	OS
;
proc sql;
create table want as
select ppid,rank
from (select ppid,rank,count(rank) as c from have group by ppid,rank)
group by ppid
having max(c)=c;
quit;I don't see any attached data?
draycut,
When you have a free moment please see the attached data set.
Thanks, Brian
PROC FREQ will tell you how many months each patient held a specific rank. Then from there you can determine the most frequent rank.
Please post sample Data of what you HAVE and what you WANT
novinosrin,
When you have a free moment please see the attached data set.
Thanks, Brian
assuming i understand your req or clarify once you test the below
data have;
input PPID $	Date $	Rank $;
cards;
123456	Jan-18	ES
123456	Feb-18	ES
123456	Mar-18	ES
123456	Apr-18	ES
123456	May-18	ES
123456	Jun-18	ES
123456	Jul-18	EJ
123456	Aug-18	EJ
123456	Sep-18	EJ
123456	Oct-18	EJ
123456	Nov-18	EJ
123456	Dec-18	EJ
123456	Jan-19	EJ
123456	Feb-19	EJ
123456	Mar-19	EJ
654321	Jan-18	OJ
654321	Feb-18	OJ
654321	Mar-18	OJ
654321	Apr-18	OJ
654321	May-18	OJ
654321	Jun-18	OJ
654321	Jul-18	OJ
654321	Aug-18	OJ
654321	Sep-18	OJ
654321	Oct-18	OJ
654321	Nov-18	OJ
654321	Dec-18	OJ
654321	Jan-19	OS
654321	Feb-19	OS
654321	Mar-19	OS
;
proc sql;
create table want as
select ppid,rank
from (select ppid,rank,count(rank) as c from have group by ppid,rank)
group by ppid
having max(c)=c;
quit;novinosrin,
That's beautiful! However, when I run the code the Rank variable comes up as "."
Brian
I trust your sample is a good representative of you real? if not, can you repost a more representative one ? Do you have any missing values?
novinosrin,
I just made that data up. There may be missing values.
Brian
I am going with assumptions by adding one row with a new id and missing rank
and notice the where condition in the inner query that filters missing
data have;
input PPID $	Date $	Rank $;
cards;
123456	Jan-18	ES
123456	Feb-18	ES
123456	Mar-18	ES
123456	Apr-18	ES
123456	May-18	ES
123456	Jun-18	ES
123456	Jul-18	EJ
123456	Aug-18	EJ
123456	Sep-18	EJ
123456	Oct-18	EJ
123456	Nov-18	EJ
123456	Dec-18	EJ
123456	Jan-19	EJ
123456	Feb-19	EJ
123456	Mar-19	EJ
654321	Jan-18	OJ
654321	Feb-18	OJ
654321	Mar-18	OJ
654321	Apr-18	OJ
654321	May-18	OJ
654321	Jun-18	OJ
654321	Jul-18	OJ
654321	Aug-18	OJ
654321	Sep-18	OJ
654321	Oct-18	OJ
654321	Nov-18	OJ
654321	Dec-18	OJ
654321	Jan-19	OS
654321	Feb-19	OS
654321	Mar-19	OS
654322	Mar-19	.
;
proc sql;
create table want as
select ppid,rank
from (select ppid,rank,count(rank) as c from have(where=(rank is not missing)) group by ppid,rank)
group by ppid
having max(c)=c;
quit;
PROC MEANS and the MODE function.
First time I think I've known of an actual usage of the mode function that makes sense.
EDIT: if you need multiple modes with ties, then I think you need PROC UNIVARIATE with the ODS output instead.
@BTAinRVA wrote:
Hello All,
I have a dataset that follows active-duty parents over a 15 month period and for several covariates the value , such as rank, may change. What I want to do is determine what rank they held for the majority of those 15 months. Does anyone have an elegant way of doing that?
Thanks,
Brian
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
