BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
BTAinRVA
Quartz | Level 8

Hello All,

 

I have a dataset that follows active-duty parents over a 15 month period and for several covariates the value , such as rank, may change. What I want to do is determine what rank they held for the majority of those 15 months. Does anyone have an elegant way of doing that?

 

 

Thanks,

 

Brian

 

P.S. Please see attached example

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

assuming i understand your req or clarify once you test the below

 

data have;
input PPID $	Date $	Rank $;
cards;
123456	Jan-18	ES
123456	Feb-18	ES
123456	Mar-18	ES
123456	Apr-18	ES
123456	May-18	ES
123456	Jun-18	ES
123456	Jul-18	EJ
123456	Aug-18	EJ
123456	Sep-18	EJ
123456	Oct-18	EJ
123456	Nov-18	EJ
123456	Dec-18	EJ
123456	Jan-19	EJ
123456	Feb-19	EJ
123456	Mar-19	EJ
654321	Jan-18	OJ
654321	Feb-18	OJ
654321	Mar-18	OJ
654321	Apr-18	OJ
654321	May-18	OJ
654321	Jun-18	OJ
654321	Jul-18	OJ
654321	Aug-18	OJ
654321	Sep-18	OJ
654321	Oct-18	OJ
654321	Nov-18	OJ
654321	Dec-18	OJ
654321	Jan-19	OS
654321	Feb-19	OS
654321	Mar-19	OS
;

proc sql;
create table want as
select ppid,rank
from (select ppid,rank,count(rank) as c from have group by ppid,rank)
group by ppid
having max(c)=c;
quit;

View solution in original post

11 REPLIES 11
PeterClemmensen
Tourmaline | Level 20

I don't see any attached data?

BTAinRVA
Quartz | Level 8

draycut,

 

When you have a free moment please see the attached data set.

 

Thanks,  Brian

PaigeMiller
Diamond | Level 26

PROC FREQ will tell you how many months each patient held a specific rank. Then from there you can determine the most frequent rank.

--
Paige Miller
novinosrin
Tourmaline | Level 20

Please post sample Data of what you HAVE and what you WANT

BTAinRVA
Quartz | Level 8

novinosrin,

 

When you have a free moment please see the attached data set.

 

Thanks,  Brian

novinosrin
Tourmaline | Level 20

assuming i understand your req or clarify once you test the below

 

data have;
input PPID $	Date $	Rank $;
cards;
123456	Jan-18	ES
123456	Feb-18	ES
123456	Mar-18	ES
123456	Apr-18	ES
123456	May-18	ES
123456	Jun-18	ES
123456	Jul-18	EJ
123456	Aug-18	EJ
123456	Sep-18	EJ
123456	Oct-18	EJ
123456	Nov-18	EJ
123456	Dec-18	EJ
123456	Jan-19	EJ
123456	Feb-19	EJ
123456	Mar-19	EJ
654321	Jan-18	OJ
654321	Feb-18	OJ
654321	Mar-18	OJ
654321	Apr-18	OJ
654321	May-18	OJ
654321	Jun-18	OJ
654321	Jul-18	OJ
654321	Aug-18	OJ
654321	Sep-18	OJ
654321	Oct-18	OJ
654321	Nov-18	OJ
654321	Dec-18	OJ
654321	Jan-19	OS
654321	Feb-19	OS
654321	Mar-19	OS
;

proc sql;
create table want as
select ppid,rank
from (select ppid,rank,count(rank) as c from have group by ppid,rank)
group by ppid
having max(c)=c;
quit;
BTAinRVA
Quartz | Level 8

novinosrin,

 

That's beautiful! However, when I run the code the Rank variable comes up as "."

 

Brian

novinosrin
Tourmaline | Level 20

I trust your sample is a good representative of you real? if not, can you repost a more representative one ? Do you have any missing values?

BTAinRVA
Quartz | Level 8

novinosrin,

 

I just made that data up. There may be missing values.

 

Brian

novinosrin
Tourmaline | Level 20

I am going with assumptions by adding one row with a new id and missing rank

and notice the where condition in the inner query that filters missing

 

data have;
input PPID $	Date $	Rank $;
cards;
123456	Jan-18	ES
123456	Feb-18	ES
123456	Mar-18	ES
123456	Apr-18	ES
123456	May-18	ES
123456	Jun-18	ES
123456	Jul-18	EJ
123456	Aug-18	EJ
123456	Sep-18	EJ
123456	Oct-18	EJ
123456	Nov-18	EJ
123456	Dec-18	EJ
123456	Jan-19	EJ
123456	Feb-19	EJ
123456	Mar-19	EJ
654321	Jan-18	OJ
654321	Feb-18	OJ
654321	Mar-18	OJ
654321	Apr-18	OJ
654321	May-18	OJ
654321	Jun-18	OJ
654321	Jul-18	OJ
654321	Aug-18	OJ
654321	Sep-18	OJ
654321	Oct-18	OJ
654321	Nov-18	OJ
654321	Dec-18	OJ
654321	Jan-19	OS
654321	Feb-19	OS
654321	Mar-19	OS
654322	Mar-19	.
;

proc sql;
create table want as
select ppid,rank
from (select ppid,rank,count(rank) as c from have(where=(rank is not missing)) group by ppid,rank)
group by ppid
having max(c)=c;
quit;

Reeza
Super User

PROC MEANS and the MODE function. 

First time I think I've known of an actual usage of the mode function that makes sense.

 

EDIT: if you need multiple modes with ties, then I think you need PROC UNIVARIATE with the ODS output instead. 

 


@BTAinRVA wrote:

Hello All,

 

I have a dataset that follows active-duty parents over a 15 month period and for several covariates the value , such as rank, may change. What I want to do is determine what rank they held for the majority of those 15 months. Does anyone have an elegant way of doing that?

 

 

Thanks,

 

Brian


 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 2131 views
  • 2 likes
  • 5 in conversation