BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
asinusdk
Calcite | Level 5

In a longitudinal data, the frequency can be calculated by proc freq, but what if I want to know whether a subject has a particular response or not over the time in a long format. I think it will be necessary to have a summarized table. 

For example, I have a long format data like below.

 

timeidv1
11no
21no
31yes
41yes
12no
22no
32no
42no
13yes
23no
33no
43

no

 from this table, I want to know how many subjects (id) have at least one response (yes) for v1 over time. 

In this case v1 is shown in id 1 and id 3 so that will be 2. Is it necessary to have a table like below to obtain the results? If it's the case, could you please let me know the programming code? Thanks.

 

idtotal_v1v1_freq
1yes2
2no0
3yes1
1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Thank you @asinusdk for clarifying

 


data have;
input time	id	v1 $;
cards;
1	1	no
2	1	no
3	1	yes
4	1	yes
1	2	no
2	2	no
3	2	no
4	2	no
1	3	yes
2	3	no
3	3	no
4	3	no
;
proc sort data=have out=_have;
by id v1;
run;


data want;
set _have;
by id v1 ;
if first.v1 then v1_freq=1;
else v1_freq+1;
if last.id ;
v1_freq=ifn(v1='yes',v1_freq,0);
keep id v1:;
run;

View solution in original post

13 REPLIES 13
novinosrin
Tourmaline | Level 20
data have;
input time	id	v1 $;
cards;
1	1	no
2	1	no
3	1	yes
4	1	yes
1	2	no
2	2	no
3	2	no
4	2	no
1	3	yes
2	3	no
3	3	no
4	3	no
;

data want;
do v1_freq=1 by 1  until(last.v1);
set have;
by id v1 notsorted;
end;
v1_freq=ifn(v1='yes',v1_freq,0);
keep id v1:;
run;

asinusdk
Calcite | Level 5

I tried the code and got the result like this:

 

idtotal_v1v1_freq
1yes2
1no0
2no0
3yes1
3no0

 

Rather, what I wanted looks like this.

 

idtotal_v1v1_freq
1yes2
2no0
3yes1

 

nodupkey didn't work here.... 

novinosrin
Tourmaline | Level 20

Ok, kindly clarify this.

 

Should a by group(id) have both no and yes, do you want only the counts of yes and ignore no?

asinusdk
Calcite | Level 5
That's true. What I want to know is the proportion of a subject that has at least one YES in v1 over the time period. Thanks.
novinosrin
Tourmaline | Level 20

Thank you @asinusdk for clarifying

 


data have;
input time	id	v1 $;
cards;
1	1	no
2	1	no
3	1	yes
4	1	yes
1	2	no
2	2	no
3	2	no
4	2	no
1	3	yes
2	3	no
3	3	no
4	3	no
;
proc sort data=have out=_have;
by id v1;
run;


data want;
set _have;
by id v1 ;
if first.v1 then v1_freq=1;
else v1_freq+1;
if last.id ;
v1_freq=ifn(v1='yes',v1_freq,0);
keep id v1:;
run;
asinusdk
Calcite | Level 5
Thank you so much. You saved my life.
novinosrin
Tourmaline | Level 20

That was a very minor change. I hope you are able to understand the change and follow the code. The sorting of NO and YES made it easy as YES in collating sequence will always be 2nd to NO and so when you process last.id (last obs of the group), it would always be YES if there were YES and NO in a by group and if it's only NO, it;s of course NO and it;s count.

 

The IFN function simply resets the NO count to zero to meet your requirement. Anyways, feel free to write back to us if you need any help. Cheers!

asinusdk
Calcite | Level 5
Thanks for the explanation! I'm trying to understand the logic.
Reeza
Super User

@asinusdk wrote:
That's true. What I want to know is the proportion of a subject that has at least one YES in v1 over the time period. Thanks.

So you want to be able to say 2/3 patients had at least yes at one point over the time period? If that's the case, you're better off using an indicator flag rather than a count. 

 

data have;
input time	id	v1 $;
cards;
1	1	no
2	1	no
3	1	yes
4	1	yes
1	2	no
2	2	no
3	2	no
4	2	no
1	3	yes
2	3	no
3	3	no
4	3	no
;
proc sort data=have out=_have;
by id descending v1;
run;


data want;
set _have;
by id descending v1 ;
flag=0;
if first.id and v1='yes' then flag=1;
if first.id then output;
run;

proc means data=want mean;
var flag;
run;

The output will be 66.67%, ie 2/3 have yes.

novinosrin
Tourmaline | Level 20

I wanted to have some fun with proc sql

 

data have;
input time	id	v1 $;
cards;
1	1	no
2	1	no
3	1	yes
4	1	yes
1	2	no
2	2	no
3	2	no
4	2	no
1	3	yes
2	3	no
3	3	no
4	3	no
;
proc sql;
create table want as
select distinct id ,v1, sum(v1='yes') as v1_freq
from have
group by id
having count(distinct v1)>=1 and v1='yes' or count(distinct v1)=1 and v1='no';
quit;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 3031 views
  • 0 likes
  • 3 in conversation