Hello,
I would like to pick up the top 40% innovative company of each 'PERSON_CTRY_CODE'.
By using the example,
1. If the data obtained has a decimal point, then add one. For example, the AD country includes 14 companies, so the top40% is 5.6 companies. in this case, I would like to includes 6 companies rather than 5.
2. For ZA country, which includes 10 companies. The 4th company has 8116 innovations which is as same as the number of innovation as the 5th company. so I expect to obtain both of them.
3. I expect to have a Flag for picked company.
the result is expected to look like following.
| PERSON_CTRY_CODE | company | sum_innovation | Seq | Top40% | Flag_Top40 | note | 
| AD | L'OREAL | 7925 | 1 | 1 | ||
| AD | VOLKSWAGEN | 4909 | 2 | 1 | ||
| AD | BAKER HUGHES | 4546 | 3 | 1 | ||
| AD | ABB (ASEA BROWN BOVERI) | 2822 | 4 | 1 | ||
| AD | CARL FREUDENBERG | 1481 | 5 | 1 | ||
| AD | ASTRA | 587 | 6 | 0 | ||
| AD | GRUENENTHAL | 435 | 7 | 0 | ||
| AD | KOLON INDUSTRIES | 305 | 8 | 0 | ||
| AD | BRITISH AMERICAN TOBACCO (INVESTMENTS) | 233 | 9 | 0 | ||
| AD | EVONIK ROEHM | 221 | 10 | 0 | ||
| AD | SEIKAGAKU CORPORATION | 106 | 11 | 0 | ||
| AD | BAYER ANIMAL HEALTH | 23 | 12 | 0 | ||
| AD | M-U-T MASCHINEN-UMWELTTECHNIK-TRANSPORTANLAGEN | 15 | 13 | 0 | ||
| AD | CONTROLLED THERAPEUTICS (SCOTLAND) | 6 | 14 | 5.6 | 0 | 5.6=14*0.4, I would like to have 6 rather than 5 | 
| ZA | IBM (INTERNATIONAL BUSINESS MACHINES CORPORATION) | 143820 | 1 | 1 | ||
| ZA | SIEMENS | 34929 | 2 | 1 | ||
| ZA | TELEFONAKTIEBOLAGET LM ERICSSON (PUBL) | 14629 | 3 | 1 | ||
| ZA | 3M COMPANY (MINNESOTA MINING AND MANUFACTURING COMPANY) | 8116 | 4 | 1 | ||
| ZA | BAYER | 8116 | 5 | 1 | I would like to keep both of them if the 5th company has the same number of sum_innovation as the 4 th company. | |
| ZA | YAZAKI CORPORATION | 5307 | 6 | 0 | ||
| ZA | ROHM & HAAS COMPANY | 3816 | 7 | 0 | ||
| ZA | UNILEVER | 3380 | 8 | 0 | ||
| ZA | L'AIR LIQUIDE SOCIETE ANONYME POUR L'ETUDE ET L'EXPLOITATION DES PROCEDES GEORGES CLAUDE | 2403 | 9 | 0 | ||
| ZA | COLGATE-PALMOLIVE COMPANY | 1934 | 10 | 4 | 0 | 
Could you please give me some suggestion about this?
Thanks in advance.
Since your input is already sorted by CODE desc SUM_INN, you can use a nested DoW-loop to count SUM_INN dupes as one:
data have ;                                                                                                                                                                                                                                                     
  input code $ comp $ sum_inn ;                                                                                                                                                                                                                                 
  cards ;                                                                                                                                                                                                                                                       
AD   LOREAL        7925                                                                                                                                                                                                                                         
AD   VW            4909                                                                                                                                                                                                                                         
AD   BAKER         4546                                                                                                                                                                                                                                         
AD   ABB           2822                                                                                                                                                                                                                                         
AD   CARL          1481                                                                                                                                                                                                                                         
AD   ASTRA          587                                                                                                                                                                                                                                         
AD   GRUEN          435                                                                                                                                                                                                                                         
AD   KOLON          305                                                                                                                                                                                                                                         
AD   BAT            233                                                                                                                                                                                                                                         
AD   EVONIK         221                                                                                                                                                                                                                                         
AD   SEIKA          106                                                                                                                                                                                                                                         
AD   BAYER           23                                                                                                                                                                                                                                         
AD   M-U-T           15                                                                                                                                                                                                                                         
AD   CTHERA           6                                                                                                                                                                                                                                         
ZA   IBM         143820                                                                                                                                                                                                                                         
ZA   SIEMENS      34929                                                                                                                                                                                                                                         
ZA   TELEFON      14629                                                                                                                                                                                                                                         
ZA   3MCOMP        8116                                                                                                                                                                                                                                         
ZA   BAYER         8116                                                                                                                                                                                                                                         
ZA   YAZAKI        5307                                                                                                                                                                                                                                         
ZA   ROHMHAAS      3816                                                                                                                                                                                                                                         
ZA   UNILEVER      3380                                                                                                                                                                                                                                         
ZA   LAIR          2403                                                                                                                                                                                                                                         
ZA   COLGATE       1934                                                                                                                                                                                                                                         
;                                                                                                                                                                                                                                                               
run ;                                                                                                                                                                                                                                                           
                                                                                                                                                                                                                                                                
data want (drop = _:) ;                                                                                                                                                                                                                                         
  do _q = 1 by 1 until (last.code) ;                                                                                                                                                                                                                            
    do until (last.sum_inn) ;                                                                                                                                                                                                                                   
      set have ;                                                                                                                                                                                                                                                
      by code descending sum_inn ;                                                                                                                                                                                                                              
    end ;                                                                                                                                                                                                                                                       
  end ;                                                                                                                                                                                                                                                         
  top40 = ceil (_q * 0.4) ;                                                                                                                                                                                                                                     
  do _n_ = 1 to _q ;                                                                                                                                                                                                                                            
    flag = _n_ <= top40 ;                                                                                                                                                                                                                                       
    do until (last.sum_inn) ;                                                                                                                                                                                                                                   
      set have ;                                                                                                                                                                                                                                                
      by code descending sum_inn ;                                                                                                                                                                                                                              
      output ;                                                                                                                                                                                                                                                  
    end ;                                                                                                                                                                                                                                                       
  end ;                                                                                                                                                                                                                                                         
run ;                                                     
Kind regards
Paul D.
Since your input is already sorted by CODE desc SUM_INN, you can use a nested DoW-loop to count SUM_INN dupes as one:
data have ;                                                                                                                                                                                                                                                     
  input code $ comp $ sum_inn ;                                                                                                                                                                                                                                 
  cards ;                                                                                                                                                                                                                                                       
AD   LOREAL        7925                                                                                                                                                                                                                                         
AD   VW            4909                                                                                                                                                                                                                                         
AD   BAKER         4546                                                                                                                                                                                                                                         
AD   ABB           2822                                                                                                                                                                                                                                         
AD   CARL          1481                                                                                                                                                                                                                                         
AD   ASTRA          587                                                                                                                                                                                                                                         
AD   GRUEN          435                                                                                                                                                                                                                                         
AD   KOLON          305                                                                                                                                                                                                                                         
AD   BAT            233                                                                                                                                                                                                                                         
AD   EVONIK         221                                                                                                                                                                                                                                         
AD   SEIKA          106                                                                                                                                                                                                                                         
AD   BAYER           23                                                                                                                                                                                                                                         
AD   M-U-T           15                                                                                                                                                                                                                                         
AD   CTHERA           6                                                                                                                                                                                                                                         
ZA   IBM         143820                                                                                                                                                                                                                                         
ZA   SIEMENS      34929                                                                                                                                                                                                                                         
ZA   TELEFON      14629                                                                                                                                                                                                                                         
ZA   3MCOMP        8116                                                                                                                                                                                                                                         
ZA   BAYER         8116                                                                                                                                                                                                                                         
ZA   YAZAKI        5307                                                                                                                                                                                                                                         
ZA   ROHMHAAS      3816                                                                                                                                                                                                                                         
ZA   UNILEVER      3380                                                                                                                                                                                                                                         
ZA   LAIR          2403                                                                                                                                                                                                                                         
ZA   COLGATE       1934                                                                                                                                                                                                                                         
;                                                                                                                                                                                                                                                               
run ;                                                                                                                                                                                                                                                           
                                                                                                                                                                                                                                                                
data want (drop = _:) ;                                                                                                                                                                                                                                         
  do _q = 1 by 1 until (last.code) ;                                                                                                                                                                                                                            
    do until (last.sum_inn) ;                                                                                                                                                                                                                                   
      set have ;                                                                                                                                                                                                                                                
      by code descending sum_inn ;                                                                                                                                                                                                                              
    end ;                                                                                                                                                                                                                                                       
  end ;                                                                                                                                                                                                                                                         
  top40 = ceil (_q * 0.4) ;                                                                                                                                                                                                                                     
  do _n_ = 1 to _q ;                                                                                                                                                                                                                                            
    flag = _n_ <= top40 ;                                                                                                                                                                                                                                       
    do until (last.sum_inn) ;                                                                                                                                                                                                                                   
      set have ;                                                                                                                                                                                                                                                
      by code descending sum_inn ;                                                                                                                                                                                                                              
      output ;                                                                                                                                                                                                                                                  
    end ;                                                                                                                                                                                                                                                       
  end ;                                                                                                                                                                                                                                                         
run ;                                                     
Kind regards
Paul D.
Hi Paul,
Interesting technique. I may be wrong, but my understanding was that the fifth record in code 'ZA' should be flagged only because its SUM_INN value is tied with the last observation satisfying the top-40 criterion. For example, if the first three records were tied as well, say
ZA IBM 34929 ZA SIEMENS 34929 ZA TELEFON 34929
your code would count only _q=7 distinct SUM_INN values, compute top40=3 (rather than 4) and flag the first 6 observations (rather than 5) from code 'ZA', including 'YAZAKI' (SUM_INN=5307). So, YAZAKI's top-40 flag would be earned on the basis of accidental coincidences of numbers related to other companies.
Without nested loops this could be avoided:
data want (drop = _:) ;
  do _q = 1 by 1 until (last.code) ;
    set have ;
    by code descending sum_inn ;
  end ;
  _top40 = ceil (_q * 0.4) ;
  flag_top40 = 1 ;
  do _n_ = 1 to _q ;
    set have ;
    by code descending sum_inn ;
    output ;
    if _n_ >= _top40 & last.sum_inn then flag_top40 = 0 ;
  end ;
run ;
					
				
			
			
				
			
			
			
			
			
			
			
		Hi @FreelanceReinh,
You may very well be right, in which case your variant does the real trick.
My understanding was that the OP wanted the ties to be counted as one but I can be wrong ... if the OP would like to clarify, I hope we'll see a note to this effect here.
Kind regards
Paul D.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.