BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
r4321
Pyrite | Level 9

Hello all, 

 

I want to data that follows teams across many years of data. I want to calculate both (a) when new members join the team and (b) when members leave the team. For example, if in the year 2007, 2 members are new to the data set that did not exist in the year 2006, I want to count those 2 members. Additionally, if in the year 2007, 1 member is no longer a part of the data set that did exist in the year 2006, I want to count that as well. I have two character variables that I can use to calculate this but I'm unsure of how to write the code, so I'm hoping you can help me. 

 

My data is like so: 

 

TeamID        Year            TeamMemberName(char var)      TeamMemberID(char var)

1                  2006            TimJones                                      11

1                  2006            JoeSmith                                       22

1                  2006            BobBell                                         33

1                  2007            TimJones                                      11

1                  2007            JoeSmith                                       22

1                  2007            JayWatt                                         44

1                  2007            JohnJay                                         55

1                  2008            TimJones                                      11

1                  2008            JoeSmith                                       22

1                  2008            JayWatt                                         44

 

(except I have the data for many teams)

 

 

I want: 

 

TeamID   Year     NumberofNewMembers     NumberthatLeft

1             2007         2                                            1          

 

1             2008         0                                            1

 

 

 

 

Could someone please advise with the code?

 

Thanks

 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Ok @r4321 I'm afraid I did this in a  hurry as I am running for my midterm exam at my college. This should work, however feel free if there is any changes, I will review later in the evening. Have fun!

data have;
input TeamID        Year            TeamMemberName $     TeamMemberID $;
datalines;
1                  2006            TimJones                                      11
1                  2006            JoeSmith                                       22
1                  2006            BobBell                                         33
1                  2007            TimJones                                      11
1                  2007            JoeSmith                                       22
1                  2007            JayWatt                                         44
1                  2007            JohnJay                                         55
1                  2008            TimJones                                      11
1                  2008            JoeSmith                                       22
1                  2008            JayWatt                                         44
;

data want;
if _n_=1 then do;
if 0 then set have;
  dcl hash h();
  h.definekey('teamid','TeamMemberID');
  h.definedata('teamid','TeamMemberID');
  h.definedone();
  end;
  _n=0;
  do until(last.teamid);
  	do _n_=1 by 1 until(last.year);
  		set have end=last;
  		by teamid year TeamMemberID;
  		array t(100) $;
  		if first.year then do;call missing(of t(*)); _n+1;NumberofNewMembers=0;NumberthatLeft=0;end;
  		if _n=1 then h.add();
  		else if _n>1 then 
			do;
  				if h.check()=0 then do;t(_n_)=TeamMemberID; h.remove();end;
  				else if h.check() ne 0 then do; NumberofNewMembers+1;t(_n_)=TeamMemberID;end;
				if last.year then 
				do; 
					NumberthatLeft = h.num_items;h.clear();
					do _i=1 to _n_;
					h.add(key:teamid,key: t(_i),data:teamid,data:t(_i));
					end;
					output;
				end;
			end;
	end;
  end;
keep TeamID   Year     NumberofNewMembers     NumberthatLeft;
run;

View solution in original post

5 REPLIES 5
Reeza
Super User
  • Sort by Team Member and Year
  • Flag each persons first year using FIRST. type logic
  • Otherwise, person is flagged as returning. 
  • Flag each persons last year as LEFT (how to deal with last year of data?)
  • Summarize values using PROC MEANS or SQL. 

 

Here's some rough logic that should get you started. 

 


@r4321 wrote:

Hello all, 

 

I want to data that follows teams across many years of data. I want to calculate both (a) when new members join the team and (b) when members leave the team. For example, if in the year 2007, 2 members are new to the data set that did not exist in the year 2006, I want to count those 2 members. Additionally, if in the year 2007, 1 member is no longer a part of the data set that did exist in the year 2006, I want to count that as well. I have two character variables that I can use to calculate this but I'm unsure of how to write the code, so I'm hoping you can help me. 

 

My data is like so: 

 

TeamID        Year            TeamMemberName(char var)      TeamMemberID(char var)

1                  2006            TimJones                                      11

1                  2006            JoeSmith                                       22

1                  2006            BobBell                                         33

1                  2007            TimJones                                      11

1                  2007            JoeSmith                                       22

1                  2007            JayWatt                                         44

1                  2007            JohnJay                                         55

1                  2008            TimJones                                      11

1                  2008            JoeSmith                                       22

1                  2008            JayWatt                                         44

 

(except I have the data for many teams)

 

 

I want: 

 

TeamID   Year     NumberofNewMembers     NumberthatLeft

1             2007         2                                            1          

 

1             2008         0                                            1

 

 

 

 

Could someone please advise with the code?

 

Thanks

 


 

r4321
Pyrite | Level 9
I can follow the logic, but the main thing I'm having difficulty with specifically is how to write the code that compares the current year to the previous year to find (a) the number of new team members and (b) the number of team members that left.

Thanks
Reeza
Super User

You don't. You identify each persons first year with the FIRST logic. 

 

If you're not familiar with FIRST/LAST logic then you should look into it. Basically it will identify the first/last record for a group automatically, which is essentially what you're doing, if you think about it slightly differently. Because if it's not first or last year, then that means they were here last year and are here next year. But you do need to think about how to report the last year, because everyone in the last year will be identified as last. 

 

Quick demo of first/last

 

proc sort data=sashelp.class out=class;
by sex age;
run;

data youngest_by_sex;
set class;
by sex;

if first.sex then output;
run;

data oldest_by_sex;
set class;
by sex;

if last.sex then output;
run;

The documentation has an entire chapter that is quite informative.

http://documentation.sas.com/?docsetId=lrcon&docsetTarget=n138da4gme3zb7n1nifpfhqv7clq.htm&docsetVer...

 

If you post what you have so far I'm happy to help. 

 

 

novinosrin
Tourmaline | Level 20

Ok @r4321 I'm afraid I did this in a  hurry as I am running for my midterm exam at my college. This should work, however feel free if there is any changes, I will review later in the evening. Have fun!

data have;
input TeamID        Year            TeamMemberName $     TeamMemberID $;
datalines;
1                  2006            TimJones                                      11
1                  2006            JoeSmith                                       22
1                  2006            BobBell                                         33
1                  2007            TimJones                                      11
1                  2007            JoeSmith                                       22
1                  2007            JayWatt                                         44
1                  2007            JohnJay                                         55
1                  2008            TimJones                                      11
1                  2008            JoeSmith                                       22
1                  2008            JayWatt                                         44
;

data want;
if _n_=1 then do;
if 0 then set have;
  dcl hash h();
  h.definekey('teamid','TeamMemberID');
  h.definedata('teamid','TeamMemberID');
  h.definedone();
  end;
  _n=0;
  do until(last.teamid);
  	do _n_=1 by 1 until(last.year);
  		set have end=last;
  		by teamid year TeamMemberID;
  		array t(100) $;
  		if first.year then do;call missing(of t(*)); _n+1;NumberofNewMembers=0;NumberthatLeft=0;end;
  		if _n=1 then h.add();
  		else if _n>1 then 
			do;
  				if h.check()=0 then do;t(_n_)=TeamMemberID; h.remove();end;
  				else if h.check() ne 0 then do; NumberofNewMembers+1;t(_n_)=TeamMemberID;end;
				if last.year then 
				do; 
					NumberthatLeft = h.num_items;h.clear();
					do _i=1 to _n_;
					h.add(key:teamid,key: t(_i),data:teamid,data:t(_i));
					end;
					output;
				end;
			end;
	end;
  end;
keep TeamID   Year     NumberofNewMembers     NumberthatLeft;
run;
Ksharp
Super User

Very interesting Question.

 

data have;
input TeamID        Year            TeamMemberName $     TeamMemberID $;
datalines;
1                  2006            TimJones                                      11
1                  2006            JoeSmith                                       22
1                  2006            BobBell                                         33
1                  2007            TimJones                                      11
1                  2007            JoeSmith                                       22
1                  2007            JayWatt                                         44
1                  2007            JohnJay                                         55
1                  2008            TimJones                                      11
1                  2008            JoeSmith                                       22
1                  2008            JayWatt                                         44
;
data have;
 set have;
 by teamid year;
 if first.teamid then group=0;
 group+first.year;
run;
data want;
 if _n_=1 then do;
  if 0 then set have;
  declare hash h();
  h.definekey('teammemberid');
  h.definedone();
 end;
 

NumberofNewMembers=0;
do until(last.year);
 set have ;
 by teamid year;
 if  h.check() ne 0 then NumberofNewMembers+1;    
 rc=h.remove();
end;

if group ne 1 then do;
 NumberthatLeft=h.num_items;
 output;
end;
h.clear();

do until(last.year);
 set have ;
 by teamid year;
 h.ref();
end;
 keep TeamID   Year     NumberofNewMembers     NumberthatLeft;
run;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1287 views
  • 2 likes
  • 4 in conversation