Hello all,
I want to data that follows teams across many years of data. I want to calculate both (a) when new members join the team and (b) when members leave the team. For example, if in the year 2007, 2 members are new to the data set that did not exist in the year 2006, I want to count those 2 members. Additionally, if in the year 2007, 1 member is no longer a part of the data set that did exist in the year 2006, I want to count that as well. I have two character variables that I can use to calculate this but I'm unsure of how to write the code, so I'm hoping you can help me.
My data is like so:
TeamID Year TeamMemberName(char var) TeamMemberID(char var)
1 2006 TimJones 11
1 2006 JoeSmith 22
1 2006 BobBell 33
1 2007 TimJones 11
1 2007 JoeSmith 22
1 2007 JayWatt 44
1 2007 JohnJay 55
1 2008 TimJones 11
1 2008 JoeSmith 22
1 2008 JayWatt 44
(except I have the data for many teams)
I want:
TeamID Year NumberofNewMembers NumberthatLeft
1 2007 2 1
1 2008 0 1
Could someone please advise with the code?
Thanks
Ok @r4321 I'm afraid I did this in a hurry as I am running for my midterm exam at my college. This should work, however feel free if there is any changes, I will review later in the evening. Have fun!
data have;
input TeamID Year TeamMemberName $ TeamMemberID $;
datalines;
1 2006 TimJones 11
1 2006 JoeSmith 22
1 2006 BobBell 33
1 2007 TimJones 11
1 2007 JoeSmith 22
1 2007 JayWatt 44
1 2007 JohnJay 55
1 2008 TimJones 11
1 2008 JoeSmith 22
1 2008 JayWatt 44
;
data want;
if _n_=1 then do;
if 0 then set have;
dcl hash h();
h.definekey('teamid','TeamMemberID');
h.definedata('teamid','TeamMemberID');
h.definedone();
end;
_n=0;
do until(last.teamid);
do _n_=1 by 1 until(last.year);
set have end=last;
by teamid year TeamMemberID;
array t(100) $;
if first.year then do;call missing(of t(*)); _n+1;NumberofNewMembers=0;NumberthatLeft=0;end;
if _n=1 then h.add();
else if _n>1 then
do;
if h.check()=0 then do;t(_n_)=TeamMemberID; h.remove();end;
else if h.check() ne 0 then do; NumberofNewMembers+1;t(_n_)=TeamMemberID;end;
if last.year then
do;
NumberthatLeft = h.num_items;h.clear();
do _i=1 to _n_;
h.add(key:teamid,key: t(_i),data:teamid,data:t(_i));
end;
output;
end;
end;
end;
end;
keep TeamID Year NumberofNewMembers NumberthatLeft;
run;
Here's some rough logic that should get you started.
@r4321 wrote:
Hello all,
I want to data that follows teams across many years of data. I want to calculate both (a) when new members join the team and (b) when members leave the team. For example, if in the year 2007, 2 members are new to the data set that did not exist in the year 2006, I want to count those 2 members. Additionally, if in the year 2007, 1 member is no longer a part of the data set that did exist in the year 2006, I want to count that as well. I have two character variables that I can use to calculate this but I'm unsure of how to write the code, so I'm hoping you can help me.
My data is like so:
TeamID Year TeamMemberName(char var) TeamMemberID(char var)
1 2006 TimJones 11
1 2006 JoeSmith 22
1 2006 BobBell 33
1 2007 TimJones 11
1 2007 JoeSmith 22
1 2007 JayWatt 44
1 2007 JohnJay 55
1 2008 TimJones 11
1 2008 JoeSmith 22
1 2008 JayWatt 44
(except I have the data for many teams)
I want:
TeamID Year NumberofNewMembers NumberthatLeft
1 2007 2 1
1 2008 0 1
Could someone please advise with the code?
Thanks
You don't. You identify each persons first year with the FIRST logic.
If you're not familiar with FIRST/LAST logic then you should look into it. Basically it will identify the first/last record for a group automatically, which is essentially what you're doing, if you think about it slightly differently. Because if it's not first or last year, then that means they were here last year and are here next year. But you do need to think about how to report the last year, because everyone in the last year will be identified as last.
Quick demo of first/last
proc sort data=sashelp.class out=class;
by sex age;
run;
data youngest_by_sex;
set class;
by sex;
if first.sex then output;
run;
data oldest_by_sex;
set class;
by sex;
if last.sex then output;
run;
The documentation has an entire chapter that is quite informative.
If you post what you have so far I'm happy to help.
Ok @r4321 I'm afraid I did this in a hurry as I am running for my midterm exam at my college. This should work, however feel free if there is any changes, I will review later in the evening. Have fun!
data have;
input TeamID Year TeamMemberName $ TeamMemberID $;
datalines;
1 2006 TimJones 11
1 2006 JoeSmith 22
1 2006 BobBell 33
1 2007 TimJones 11
1 2007 JoeSmith 22
1 2007 JayWatt 44
1 2007 JohnJay 55
1 2008 TimJones 11
1 2008 JoeSmith 22
1 2008 JayWatt 44
;
data want;
if _n_=1 then do;
if 0 then set have;
dcl hash h();
h.definekey('teamid','TeamMemberID');
h.definedata('teamid','TeamMemberID');
h.definedone();
end;
_n=0;
do until(last.teamid);
do _n_=1 by 1 until(last.year);
set have end=last;
by teamid year TeamMemberID;
array t(100) $;
if first.year then do;call missing(of t(*)); _n+1;NumberofNewMembers=0;NumberthatLeft=0;end;
if _n=1 then h.add();
else if _n>1 then
do;
if h.check()=0 then do;t(_n_)=TeamMemberID; h.remove();end;
else if h.check() ne 0 then do; NumberofNewMembers+1;t(_n_)=TeamMemberID;end;
if last.year then
do;
NumberthatLeft = h.num_items;h.clear();
do _i=1 to _n_;
h.add(key:teamid,key: t(_i),data:teamid,data:t(_i));
end;
output;
end;
end;
end;
end;
keep TeamID Year NumberofNewMembers NumberthatLeft;
run;
Very interesting Question.
data have;
input TeamID Year TeamMemberName $ TeamMemberID $;
datalines;
1 2006 TimJones 11
1 2006 JoeSmith 22
1 2006 BobBell 33
1 2007 TimJones 11
1 2007 JoeSmith 22
1 2007 JayWatt 44
1 2007 JohnJay 55
1 2008 TimJones 11
1 2008 JoeSmith 22
1 2008 JayWatt 44
;
data have;
set have;
by teamid year;
if first.teamid then group=0;
group+first.year;
run;
data want;
if _n_=1 then do;
if 0 then set have;
declare hash h();
h.definekey('teammemberid');
h.definedone();
end;
NumberofNewMembers=0;
do until(last.year);
set have ;
by teamid year;
if h.check() ne 0 then NumberofNewMembers+1;
rc=h.remove();
end;
if group ne 1 then do;
NumberthatLeft=h.num_items;
output;
end;
h.clear();
do until(last.year);
set have ;
by teamid year;
h.ref();
end;
keep TeamID Year NumberofNewMembers NumberthatLeft;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.