BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
dapenDaniel
Obsidian | Level 7

Hello,

 

I have two datasets. The first one records the products that companies produce each year and the components for each product ("Categories" column is only for convenience and it is not in my real dataset). The second file is a component base for each year. It is updated annually as firms introduce more products. The second file has been sorted by year and components.

 

If all the components of a new product do not overlap with the existing component base, it will be considered as radical products. For example, Firm 10001 introduced AC449P in 1993 and its components were Y38 and I92, which do not exist in the component base before 1993 in file 2. Thus AC449P is a radical product.

 

If one of the components of a new product overlaps with the existing component base, it will be considered as incremental products. For example, Firm 10001 introduced AA386G in 1992 and its components were B30, V55, and T92. B30 and V55 exist in the component base before 1992 (both are in the component base of 1991). So AA386G is an incremental product.

 

I would like to know how many incremental products and radical products that each firm has for each year. The output that I want is below. In case you need the file, I also attached the excel file.

 

 

File 1  

FirmIDProductIDBeginYearComponentsCategories
10001AA386G1992B30Incremental
10001AA386G1992V55
10001AA386G1992T92
10001AB259T1993R26Incremental
10001AB259T1993V55
10001AB259T1993U38
10001AC449P1993Y38radical
10001AC449P1993I92
10002CT746B1991X48radical
10002CT746B1991K93
10002CT746B1991O28
10002RU239F1992A47Incremental
10002RU239F1992Q15
10002DH469E1993W91Incremental
10002DH469E1993Q15
10003CN726Q1991E55radical
10003SZ116R1992E55Incremental
10003SZ116R1992W87
10003TU692E1992N67Incremental
10003TU692E1992I92


File 2

YearComponents
1990H32
1990N67
1990P99
1990R88
1991A47
1991B30
1991E55
1991K93
1991O28
1991Q15
1991V55
1991W87
1991X48
1992C11
1992T92
1992W91
1992Y67
1993I92
1993R26
1993S23
1993U38
1993Y38

 

Output

 

FirmIDBeginYearnb_Incrementalnb_Radical
10001199210
10001199311
10002199101
10002199210
10002199310
10003199101
10003199220
1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data products;
  infile datalines truncover dlm=' ';
  input FirmID $ ProductID $ BeginYear Components $;
  datalines;
10001 AA386G 1992 B30
10001 AA386G 1992 V55
10001 AA386G 1992 T92
10001 AB259T 1993 R26
10001 AB259T 1993 V55
10001 AB259T 1993 U38
10001 AC449P 1993 Y38
10001 AC449P 1993 I92
10002 CT746B 1991 X48
10002 CT746B 1991 K93
10002 CT746B 1991 O28
10002 RU239F 1992 A47
10002 RU239F 1992 Q15
10002 DH469E 1993 W91
10002 DH469E 1993 Q15
10003 CN726Q 1991 E55
10003 SZ116R 1992 E55
10003 SZ116R 1992 W87
10003 TU692E 1992 N67
10003 TU692E 1992 I92
;

data components;
  infile datalines truncover dlm=' ';
  input Year Components $;
  datalines;
1990 H32
1990 N67
1990 P99
1990 R88
1991 A47
1991 B30
1991 E55
1991 K93
1991 O28
1991 Q15
1991 V55
1991 W87
1991 X48
1992 C11
1992 T92
1992 W91
1992 Y67
1993 I92
1993 R26
1993 S23
1993 U38
1993 Y38
;
proc sql noprint;
 select min(year) into : start from  components;
quit;

data temp;
if _n_=1 then do;
 if 0 then set components;
 declare hash h(dataset:'components',hashexp:20);
 h.definekey('Year',' Components');
 h.definedone();
end;
set products;
do year=&start to BeginYear-1;
 if h.check()=0 then do;found=1;leave;end;
end;
drop year;
run;

proc sql;
create table temp2 as
select 	FirmID,BeginYear,ProductID,ifn(sum(found)>0,1,0) as flag
 from temp
  group by FirmID,BeginYear,ProductID ;

create table want as
select FirmID,BeginYear,sum(flag) as  nb_Incremental,sum(flag=0) as nb_Radical
 from temp2
  group by FirmID,BeginYear	;
quit;

View solution in original post

3 REPLIES 3
Patrick
Opal | Level 21

Here an approach using SQL.

data products;
  infile datalines truncover dlm=' ';
  input FirmID $ ProductID $ BeginYear Components $;
  datalines;
10001 AA386G 1992 B30
10001 AA386G 1992 V55
10001 AA386G 1992 T92
10001 AB259T 1993 R26
10001 AB259T 1993 V55
10001 AB259T 1993 U38
10001 AC449P 1993 Y38
10001 AC449P 1993 I92
10002 CT746B 1991 X48
10002 CT746B 1991 K93
10002 CT746B 1991 O28
10002 RU239F 1992 A47
10002 RU239F 1992 Q15
10002 DH469E 1993 W91
10002 DH469E 1993 Q15
10003 CN726Q 1991 E55
10003 SZ116R 1992 E55
10003 SZ116R 1992 W87
10003 TU692E 1992 N67
10003 TU692E 1992 I92
;

data components;
  infile datalines truncover dlm=' ';
  input Year Components $;
  datalines;
1990 H32
1990 N67
1990 P99
1990 R88
1991 A47
1991 B30
1991 E55
1991 K93
1991 O28
1991 Q15
1991 V55
1991 W87
1991 X48
1992 C11
1992 T92
1992 W91
1992 Y67
1993 I92
1993 R26
1993 S23
1993 U38
1993 Y38
;

proc sql;
  select 
    FirmID,
    BeginYear,
    sum(radical_flg=0) as nb_Incremental,
    sum(radical_flg) as nb_Radical
  from
    (
      select 
        p.FirmID, 
        p.ProductID,
        p.BeginYear,
        sum(missing(c.components))=count(*) as radical_flg
      from 
        products p left join components c
        on p.components=c.components and p.BeginYear > c.Year
      group by FirmID, BeginYear, ProductID
    )
  group by FirmID, BeginYear
  ;
quit;

Capture.JPG

Ksharp
Super User
data products;
  infile datalines truncover dlm=' ';
  input FirmID $ ProductID $ BeginYear Components $;
  datalines;
10001 AA386G 1992 B30
10001 AA386G 1992 V55
10001 AA386G 1992 T92
10001 AB259T 1993 R26
10001 AB259T 1993 V55
10001 AB259T 1993 U38
10001 AC449P 1993 Y38
10001 AC449P 1993 I92
10002 CT746B 1991 X48
10002 CT746B 1991 K93
10002 CT746B 1991 O28
10002 RU239F 1992 A47
10002 RU239F 1992 Q15
10002 DH469E 1993 W91
10002 DH469E 1993 Q15
10003 CN726Q 1991 E55
10003 SZ116R 1992 E55
10003 SZ116R 1992 W87
10003 TU692E 1992 N67
10003 TU692E 1992 I92
;

data components;
  infile datalines truncover dlm=' ';
  input Year Components $;
  datalines;
1990 H32
1990 N67
1990 P99
1990 R88
1991 A47
1991 B30
1991 E55
1991 K93
1991 O28
1991 Q15
1991 V55
1991 W87
1991 X48
1992 C11
1992 T92
1992 W91
1992 Y67
1993 I92
1993 R26
1993 S23
1993 U38
1993 Y38
;
proc sql noprint;
 select min(year) into : start from  components;
quit;

data temp;
if _n_=1 then do;
 if 0 then set components;
 declare hash h(dataset:'components',hashexp:20);
 h.definekey('Year',' Components');
 h.definedone();
end;
set products;
do year=&start to BeginYear-1;
 if h.check()=0 then do;found=1;leave;end;
end;
drop year;
run;

proc sql;
create table temp2 as
select 	FirmID,BeginYear,ProductID,ifn(sum(found)>0,1,0) as flag
 from temp
  group by FirmID,BeginYear,ProductID ;

create table want as
select FirmID,BeginYear,sum(flag) as  nb_Incremental,sum(flag=0) as nb_Radical
 from temp2
  group by FirmID,BeginYear	;
quit;

Ksharp
Super User

For your second question.


data temp;
if _n_=1 then do;
 if 0 then set components;
 declare hash h(dataset:'components',hashexp:20);
 h.definekey('Year',' Components');
 h.definedone();
end;
set products;
do year=BeginYear-5 to BeginYear-1;   /*<------------*/
 if h.check()=0 then do;found=1;leave;end;
end;
drop year;
run;

proc sql;
create table temp2 as
select 	FirmID,BeginYear,ProductID,ifn(sum(found)>0,1,0) as flag
 from temp
  group by FirmID,BeginYear,ProductID ;

create table want as
select FirmID,BeginYear,sum(flag) as  nb_Incremental,sum(flag=0) as nb_Radical
 from temp2
  group by FirmID,BeginYear	;
quit;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 717 views
  • 0 likes
  • 3 in conversation