Hi all,
I have a (probably) simple question.
I havea panel data that looks like:
ID | Year |
1 | 2004 |
1 | 2005 |
1 | 2006 |
1 | 2007 |
2 | 2004 |
2 | 2005 |
2 | 2006 |
2 | 2007 |
2 | 2008 |
2 | 2009 |
What I want to do is... I want to ignore all panelists that had year 2009.
Can someone please tell me how I can do this?
Basically, I want SAS to go through each panelist and indicate (or remove) those that have 2009 year.
Thanks,
C
There are a number of ways to do it. I happen to like to use the datastep. One way to do it with a datastep would be:
data panel;
input ID Year income;
cards;
1 2004 50000
1 2005 60000
1 2006 70000
1 2007 80000
2 2004 60000
2 2005 70000
2 2006 80000
2 2007 90000
2 2008 100000
2 2009 110000
3 2004 60000
3 2005 70000
3 2006 80000
3 2007 90000
3 2008 100000
;
data want;
do until (last.id);
set panel;
by id;
if first.id then do;
n=0;
average_salary=0;
end;
if year eq 2009 then drop=1;
average_salary+income;
n+1;
end;
do until (last.id);
set panel;
by id;
if drop ne 1 then do;
if first.id then
average_salary=average_salary/n;
output;
end;
end;
run;
second question-
let's say I have a variable "income" for each panlist, and it takes a different value each year for each panelist. how can i calculate average income, across years, for each panelist?
thanks,
C
Can any panelist have a year greater than 2009?
yes. as long as 2009 is not there.
There are a number of ways to do it. I happen to like to use the datastep. One way to do it with a datastep would be:
data panel;
input ID Year income;
cards;
1 2004 50000
1 2005 60000
1 2006 70000
1 2007 80000
2 2004 60000
2 2005 70000
2 2006 80000
2 2007 90000
2 2008 100000
2 2009 110000
3 2004 60000
3 2005 70000
3 2006 80000
3 2007 90000
3 2008 100000
;
data want;
do until (last.id);
set panel;
by id;
if first.id then do;
n=0;
average_salary=0;
end;
if year eq 2009 then drop=1;
average_salary+income;
n+1;
end;
do until (last.id);
set panel;
by id;
if drop ne 1 then do;
if first.id then
average_salary=average_salary/n;
output;
end;
end;
run;
thank you, sir!
SQL is easy for this situation.
data panel; input ID Year income; cards; 1 2004 50000 1 2005 60000 1 2006 70000 1 2007 80000 2 2004 60000 2 2005 70000 2 2006 80000 2 2007 90000 2 2008 100000 2 2009 110000 3 2004 60000 3 2005 70000 3 2006 80000 3 2007 90000 3 2008 100000 ; run; proc sql; create table want as select *,avg(income) as avg_income from panel group by id having sum(year=2009)=0;quit;
Ksharp
thanks, Ksharp. this almost did the job. except, I need to take average including .'s. In other words, the average is taken only using non-blank values. I want to divide total by (#blank + #non-blank).
thanks,
C
You could still use KSharp's suggested method, but you would have to include the calculation of average. e.g.,
data panel;
input ID Year income;
cards;
1 2004 50000
1 2005 .
1 2006 .
1 2007 50000
2 2004 60000
2 2005 70000
2 2006 80000
2 2007 90000
2 2008 100000
2 2009 110000
3 2004 60000
3 2005 70000
3 2006 80000
3 2007 90000
3 2008 100000
;
run;
proc sql;
create table want as
select *,sum(income)/count(*) as avg_income
from panel
group by id
having sum(year=2009)=0;
quit;
I am not sure I understand this requirement. Are you saying that if a panelist did not report an income for a year you want to treat that as $0?
Another method that I like to use that requires fewer copies of the data is create a selection criteria that can be used in a where statement.
So for your example I would get a list of id values and use it to subset the data.
proc sql noprint ;
select distinct id into :id2009 separate by ' ' from have where year=2009;
quit;
proc means data=have;
where id not in (&id2009);
var income;
run;
Art has already given you the correct code.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.