Hello,
I have a large panel data set, I would like to find a new value in VAR compared with all prior observations within-group BY FIRM, YEAR
DATA HAVE;
INPUT (FIRM YEAR VAR) (:$8.);
CARDS;
A 2000 dog
A 2001 dog
A 2001 cat
A 2002 dog
A 2002 bird
A 2003 dog
A 2003 chicken
A 2004 chicken
A 2004 mice
A 2005 pig
A 2005 mice
A 2006 raccoon
A 2006 pig
B 2000 fries
B 2001 burger
B 2001 steak
B 2002 burger
B 2002 fries
B 2003 steak
B 2003 pasta
B 2004 pasta
B 2004 cheese
B 2005 rice
B 2005 meatloaf
B 2006 pickle
B 2006 chili
;
run;
This is what I want:
DATA WANT;
INPUT (FIRM YEAR VAR) (:$8.);
CARDS;
A 2000 dog
A 2001 cat
A 2002 bird
A 2003 chicken
A 2004 mice
A 2005 pig
A 2006 raccoon
B 2000 fries
B 2001 burger
B 2001 steak
B 2003 pasta
B 2004 cheese
B 2005 rice
B 2005 meatloaf
B 2006 pickle
B 2006 chili
;
run;
So, for each FIRM in a given YEAR, I want to find the new value in VAR.
How to define it as 'new'? Compare VAR in a given YEAR with all observations of VAR in all prior YEAR
For example, to find the new VAR for A in 2004, I will compare VAR in 2004 with VAR in 2003, 2002, 2001, and 2000.
to find the new values of VAR for A in 2006, I will compare VAR in 2006 with VAR in 2005, 2004, 2003, 2002, 2001, and 2000.
Same logic for B.
I wonder can I do this with SQL and without a loop macro?
Thank you very much. Appreciate your help.
What role does the Firm variable play in this process? You do not mention it at all.
Hi,
FIRM is the ID variable, YEAR is the time variable, VAR is the variable I look for new value
If you don't insist on the order in "want", try
proc sort data=have out=want nodupkey;
by firm var;
run;
Then the first step: How do I get the dataset WANT?
Hello @yanshuai,
With PROC SQL you can do something like this:
proc sql;
create table want as
select firm, min(year) as year, var
from have
group by 1,3
order by 1,2;
quit;
Note, however, that the ORDER BY clause does not necessarily maintain the sort order within a (FIRM, YEAR) BY group, for example, the order of "burger" and "steak" for FIRM="B" and YEAR="2001". To maintain this sort order you could introduce a temporary sequential number and use it as a sort key:
data _tmp / view=_tmp;
set have;
by firm year;
_seqno=_n_;
run;
proc sql;
create table want(drop=_s) as
select firm, min(year) as year, var, min(_seqno) as _s
from _tmp
group by 1,3
order by 1,4;
drop view _tmp;
quit;
Interesting.
Your code should be what I expect. Very beautiful. How could I not think of this? lol
Thank you very much
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.