Hello,
I am working on a dataset (see below). The dataset list the components of a product and other relevant information.
Dataset
FirmID ProductID BeginYear MarketYear Components
10001 AB259T 1996 1998 B30 | A
10001 AB259T 1996 1998 V55 | U
10001 AA386G 1999 2001 T92 | B
10001 AA386G 1999 2001 R26 | L
10001 AA386G 1999 2001 V55 | U
10002 CT746B 2007 2009 U38 | P
10002 CT746B 2007 2009 X48 | O
10002 CT746B 2007 2009 K93 | E
10002 RU239F 2008 2010 O28 | L
10002 RU239F 2008 2010 A47 | W
10003 TY569I 2008 2010 Q15 | P
10003 GC992M 2009 2011 W91 | O
10003 GC992M 2009 2011 Q15 | P
All the components of each firm consist of a component base. This component base is updated annually as firms introduce more products. For example, the component base for Firm 10001 in 1996 only includes two components ( B30 | A and V55 | U) and it has four components in 1999 ( B30 | A, V55 | U, T92 | B and R26 | L).
I would like to know how many incremental products and radical products each firm has for each year.
If all the components of a new product do not overlap with the existing component base, it will be considered as radical products. For example, RU239F is a radical product for firm 10002 because none of its components overlaps with its existing component base.
If one of the components of a new product overlaps with the existing component base, it will be considered as incremental products. For example, AA386G is an incremental product for firm 10001 because one of its components (V55 | U) already exists in its component base. So as GC992M for firm 10003.
Can anyone tell me what code do I need to use? Thanks.
Assuming I understand your question and Data has been sorted by FirmID ProductID BeginYear ;
data have;
input
FirmID ProductID $ BeginYear MarketYear Components & $20. ;
cards;
10001 AB259T 1996 1998 B30 | A
10001 AB259T 1996 1998 V55 | U
10001 AA386G 1999 2001 T92 | B
10001 AA386G 1999 2001 R26 | L
10001 AA386G 1999 2001 V55 | U
10002 CT746B 2007 2009 U38 | P
10002 CT746B 2007 2009 X48 | O
10002 CT746B 2007 2009 K93 | E
10002 RU239F 2008 2010 O28 | L
10002 RU239F 2008 2010 A47 | W
10003 TY569I 2008 2010 Q15 | P
10003 GC992M 2009 2011 W91 | O
10003 GC992M 2009 2011 Q15 | P
;
data want;
if _n_=1 then do;
if 0 then set have;
declare hash h();
h.definekey(' Components');
h.definedone();
end;
do until(last.BeginYear);
set have;
by FirmID ProductID BeginYear notsorted;
if first.FirmID then h.clear();
if h.check()=0 then found=1;
if not missing(Components) then h.replace();
end;
do until(last.BeginYear);
set have;
by FirmID ProductID BeginYear notsorted;
flag=ifc(found,'incremental','radical ');
output;
end;
run;
Do you have a license for SAS/OR?
PROC BOM may be helpful here.
I do not think I have the license for SAS/OR.
Is there any other way to get the result? It is unnecessary to use loop. Any code is fine as long as it can get the expected result.
If you don't mind SQL:
proc sql;
create table intro as
select
*,
min(beginYear) as introYear
from have
group by firmId, components;
create table want as
select
firmId,
beginYear,
count(distinct case when beginYear > introYear then productId else "" end) as nbIncremental,
count(distinct productId) - calculated nbIncremental as nbRadical
from intro
group by firmId, beginYear;
quit;
Expected results for the example data would certainly help comfort my understanding of OP's question. My understanding is that if at least one of its components has been introduced before a product begins then the product is incremental; otherwise, the product is radical. That's what my query is counting.
Thanks for your answer. Your understanding is correct.
Hi PG,
Thanks for your code. I have one more question.
When a component is first used and this component shows up more than once in the same year, I would like to compare the number in the firmID. The product with the smallest number will be considered as radical and the remainings will be considered as incremental. The format of the product ID is consistent such as AA#########B. The digits of the number are multiple.
For example, suppose firm 10001 have two more products (AA425T and AA427B) in 1999 and D42 | U is a new component. Because 425 is less than 427, AA425T is regarded as radical innovation while AA427B is considered as incremental innovation.
FirmID ProductID BeginYear MarketYear Components
10001 AB259T 1996 1998 B30 | A
10001 AB259T 1996 1998 V55 | U
10001 AA386G 1999 2001 T92 | B
10001 AA386G 1999 2001 R26 | L
10001 AA386G 1999 2001 V55 | U
10001 AA425T 1999 2001 D42 | U
10001 AA427B 1999 2002 D42 | U
10002 CT746B 2007 2009 U38 | P
10002 CT746B 2007 2009 X48 | O
10002 CT746B 2007 2009 K93 | E
10002 RU239F 2008 2010 O28 | L
10002 RU239F 2008 2010 A47 | W
10003 TY569I 2008 2010 Q15 | P
10003 GC992M 2009 2011 W91 | O
10003 GC992M 2009 2011 Q15 | P
What should I add in the current code? Thanks!
Assuming I understand your question and Data has been sorted by FirmID ProductID BeginYear ;
data have;
input
FirmID ProductID $ BeginYear MarketYear Components & $20. ;
cards;
10001 AB259T 1996 1998 B30 | A
10001 AB259T 1996 1998 V55 | U
10001 AA386G 1999 2001 T92 | B
10001 AA386G 1999 2001 R26 | L
10001 AA386G 1999 2001 V55 | U
10002 CT746B 2007 2009 U38 | P
10002 CT746B 2007 2009 X48 | O
10002 CT746B 2007 2009 K93 | E
10002 RU239F 2008 2010 O28 | L
10002 RU239F 2008 2010 A47 | W
10003 TY569I 2008 2010 Q15 | P
10003 GC992M 2009 2011 W91 | O
10003 GC992M 2009 2011 Q15 | P
;
data want;
if _n_=1 then do;
if 0 then set have;
declare hash h();
h.definekey(' Components');
h.definedone();
end;
do until(last.BeginYear);
set have;
by FirmID ProductID BeginYear notsorted;
if first.FirmID then h.clear();
if h.check()=0 then found=1;
if not missing(Components) then h.replace();
end;
do until(last.BeginYear);
set have;
by FirmID ProductID BeginYear notsorted;
flag=ifc(found,'incremental','radical ');
output;
end;
run;
Thanks for your reply, Ksharp!
Based on your code, it only shows whether the product is incremental or radical. How can I count the number of incremental or radical products for each firm in each year? Thanks.
Can you post the output you need ?
FirmID BeginYear nb_Incremental nb_Radical
10001 1996 1 1
10001 1999 1 1
10002 2007 0 1
10002 2008 0 1
10003 2008 0 1
10003 2009 1 0
nb_Incremental = number of Incremental Products
nb_Radical = number of Radical Products
Thank you, Ksharp!
I don't understand why the first obs have both Incremental and Radical product ? since it is only have one product .
Sorry, it is a typo. The number of incremental products should be 0 in 1996 for Firm 10001. The output below is the correct one.
FirmID BeginYear nb_Incremental nb_Radical
10001 1996 0 1
10001 1999 1 0
10002 2007 0 1
10002 2008 0 1
10003 2008 0 1
10003 2009 1 0
nb_Incremental = number of Incremental Products
nb_Radical = number of Radical Products
Actually you could tweak my code a little bit and get what you want.
data have;
input
FirmID ProductID $ BeginYear MarketYear Components & $20. ;
cards;
10001 AB259T 1996 1998 B30 | A
10001 AB259T 1996 1998 V55 | U
10001 AA386G 1999 2001 T92 | B
10001 AA386G 1999 2001 R26 | L
10001 AA386G 1999 2001 V55 | U
10002 CT746B 2007 2009 U38 | P
10002 CT746B 2007 2009 X48 | O
10002 CT746B 2007 2009 K93 | E
10002 RU239F 2008 2010 O28 | L
10002 RU239F 2008 2010 A47 | W
10003 TY569I 2008 2010 Q15 | P
10003 GC992M 2009 2011 W91 | O
10003 GC992M 2009 2011 Q15 | P
;
data want;
if _n_=1 then do;
if 0 then set have;
declare hash h();
h.definekey(' Components');
h.definedone();
end;
do until(last.BeginYear);
set have;
by FirmID ProductID BeginYear notsorted;
if first.FirmID then h.clear();
if h.check()=0 then found=1;
if not missing(Components) then h.replace();
end;
do until(last.BeginYear);
set have;
by FirmID ProductID BeginYear notsorted;
flag=ifc(found,'incremental','radical ');
output;
end;
run;
data want2;
set want;
by firmid beginyear notsorted;
if first.beginyear;
v=1;
keep firmid beginyear flag v;
run;
proc transpose data=want2 out=want3(drop=_:);
by firmid beginyear notsorted;
id flag;
var v;
run;
proc stdize data=want3 out=want4 missing=0 reponly;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.