BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Alexxxxxxx
Pyrite | Level 9

Hello,

 

I would like to pick up the top 40% innovative company of each 'PERSON_CTRY_CODE'.

 

By using the example,

 

1. If the data obtained has a decimal point, then add one. For example, the AD country includes 14 companies, so the top40% is 5.6 companies. in this case, I would like to includes 6 companies rather than 5.

2. For ZA country, which includes 10 companies. The 4th company has 8116 innovations which is as same as the number of innovation as the 5th company. so I expect to obtain both of them.

3. I expect to have a Flag for picked company.

 

the result is expected to look like following.

PERSON_CTRY_CODEcompanysum_innovationSeqTop40%Flag_Top40note
ADL'OREAL79251 1 
ADVOLKSWAGEN49092 1 
ADBAKER HUGHES45463 1 
ADABB (ASEA BROWN BOVERI)28224 1 
ADCARL FREUDENBERG14815 1 
ADASTRA5876 0 
ADGRUENENTHAL4357 0 
ADKOLON INDUSTRIES3058 0 
ADBRITISH AMERICAN TOBACCO (INVESTMENTS)2339 0 
ADEVONIK ROEHM22110 0 
ADSEIKAGAKU CORPORATION10611 0 
ADBAYER ANIMAL HEALTH2312 0 
ADM-U-T MASCHINEN-UMWELTTECHNIK-TRANSPORTANLAGEN1513 0 
ADCONTROLLED THERAPEUTICS (SCOTLAND)6145.605.6=14*0.4, I would like to have 6 rather than 5
ZAIBM (INTERNATIONAL BUSINESS MACHINES CORPORATION)1438201 1 
ZASIEMENS349292 1 
ZATELEFONAKTIEBOLAGET LM ERICSSON (PUBL)146293 1 
ZA3M COMPANY (MINNESOTA MINING AND MANUFACTURING COMPANY)81164 1 
ZABAYER81165 1I would like to keep both of them if the 5th company has the same number of sum_innovation as the 4 th company.
ZAYAZAKI CORPORATION53076 0 
ZAROHM & HAAS COMPANY38167 0 
ZAUNILEVER33808 0 
ZAL'AIR LIQUIDE SOCIETE ANONYME POUR L'ETUDE ET L'EXPLOITATION DES PROCEDES GEORGES CLAUDE24039 0 
ZACOLGATE-PALMOLIVE COMPANY19341040 

 

Could you please give me some suggestion about this?

Thanks in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
hashman
Ammonite | Level 13

@Alexxxxxxx:

Since your input is already sorted by CODE desc SUM_INN, you can use a nested DoW-loop to count SUM_INN dupes as one:

data have ;                                                                                                                                                                                                                                                     
  input code $ comp $ sum_inn ;                                                                                                                                                                                                                                 
  cards ;                                                                                                                                                                                                                                                       
AD   LOREAL        7925                                                                                                                                                                                                                                         
AD   VW            4909                                                                                                                                                                                                                                         
AD   BAKER         4546                                                                                                                                                                                                                                         
AD   ABB           2822                                                                                                                                                                                                                                         
AD   CARL          1481                                                                                                                                                                                                                                         
AD   ASTRA          587                                                                                                                                                                                                                                         
AD   GRUEN          435                                                                                                                                                                                                                                         
AD   KOLON          305                                                                                                                                                                                                                                         
AD   BAT            233                                                                                                                                                                                                                                         
AD   EVONIK         221                                                                                                                                                                                                                                         
AD   SEIKA          106                                                                                                                                                                                                                                         
AD   BAYER           23                                                                                                                                                                                                                                         
AD   M-U-T           15                                                                                                                                                                                                                                         
AD   CTHERA           6                                                                                                                                                                                                                                         
ZA   IBM         143820                                                                                                                                                                                                                                         
ZA   SIEMENS      34929                                                                                                                                                                                                                                         
ZA   TELEFON      14629                                                                                                                                                                                                                                         
ZA   3MCOMP        8116                                                                                                                                                                                                                                         
ZA   BAYER         8116                                                                                                                                                                                                                                         
ZA   YAZAKI        5307                                                                                                                                                                                                                                         
ZA   ROHMHAAS      3816                                                                                                                                                                                                                                         
ZA   UNILEVER      3380                                                                                                                                                                                                                                         
ZA   LAIR          2403                                                                                                                                                                                                                                         
ZA   COLGATE       1934                                                                                                                                                                                                                                         
;                                                                                                                                                                                                                                                               
run ;                                                                                                                                                                                                                                                           
                                                                                                                                                                                                                                                                
data want (drop = _:) ;                                                                                                                                                                                                                                         
  do _q = 1 by 1 until (last.code) ;                                                                                                                                                                                                                            
    do until (last.sum_inn) ;                                                                                                                                                                                                                                   
      set have ;                                                                                                                                                                                                                                                
      by code descending sum_inn ;                                                                                                                                                                                                                              
    end ;                                                                                                                                                                                                                                                       
  end ;                                                                                                                                                                                                                                                         
  top40 = ceil (_q * 0.4) ;                                                                                                                                                                                                                                     
  do _n_ = 1 to _q ;                                                                                                                                                                                                                                            
    flag = _n_ <= top40 ;                                                                                                                                                                                                                                       
    do until (last.sum_inn) ;                                                                                                                                                                                                                                   
      set have ;                                                                                                                                                                                                                                                
      by code descending sum_inn ;                                                                                                                                                                                                                              
      output ;                                                                                                                                                                                                                                                  
    end ;                                                                                                                                                                                                                                                       
  end ;                                                                                                                                                                                                                                                         
run ;                                                     

Kind regards

Paul D. 

View solution in original post

3 REPLIES 3
hashman
Ammonite | Level 13

@Alexxxxxxx:

Since your input is already sorted by CODE desc SUM_INN, you can use a nested DoW-loop to count SUM_INN dupes as one:

data have ;                                                                                                                                                                                                                                                     
  input code $ comp $ sum_inn ;                                                                                                                                                                                                                                 
  cards ;                                                                                                                                                                                                                                                       
AD   LOREAL        7925                                                                                                                                                                                                                                         
AD   VW            4909                                                                                                                                                                                                                                         
AD   BAKER         4546                                                                                                                                                                                                                                         
AD   ABB           2822                                                                                                                                                                                                                                         
AD   CARL          1481                                                                                                                                                                                                                                         
AD   ASTRA          587                                                                                                                                                                                                                                         
AD   GRUEN          435                                                                                                                                                                                                                                         
AD   KOLON          305                                                                                                                                                                                                                                         
AD   BAT            233                                                                                                                                                                                                                                         
AD   EVONIK         221                                                                                                                                                                                                                                         
AD   SEIKA          106                                                                                                                                                                                                                                         
AD   BAYER           23                                                                                                                                                                                                                                         
AD   M-U-T           15                                                                                                                                                                                                                                         
AD   CTHERA           6                                                                                                                                                                                                                                         
ZA   IBM         143820                                                                                                                                                                                                                                         
ZA   SIEMENS      34929                                                                                                                                                                                                                                         
ZA   TELEFON      14629                                                                                                                                                                                                                                         
ZA   3MCOMP        8116                                                                                                                                                                                                                                         
ZA   BAYER         8116                                                                                                                                                                                                                                         
ZA   YAZAKI        5307                                                                                                                                                                                                                                         
ZA   ROHMHAAS      3816                                                                                                                                                                                                                                         
ZA   UNILEVER      3380                                                                                                                                                                                                                                         
ZA   LAIR          2403                                                                                                                                                                                                                                         
ZA   COLGATE       1934                                                                                                                                                                                                                                         
;                                                                                                                                                                                                                                                               
run ;                                                                                                                                                                                                                                                           
                                                                                                                                                                                                                                                                
data want (drop = _:) ;                                                                                                                                                                                                                                         
  do _q = 1 by 1 until (last.code) ;                                                                                                                                                                                                                            
    do until (last.sum_inn) ;                                                                                                                                                                                                                                   
      set have ;                                                                                                                                                                                                                                                
      by code descending sum_inn ;                                                                                                                                                                                                                              
    end ;                                                                                                                                                                                                                                                       
  end ;                                                                                                                                                                                                                                                         
  top40 = ceil (_q * 0.4) ;                                                                                                                                                                                                                                     
  do _n_ = 1 to _q ;                                                                                                                                                                                                                                            
    flag = _n_ <= top40 ;                                                                                                                                                                                                                                       
    do until (last.sum_inn) ;                                                                                                                                                                                                                                   
      set have ;                                                                                                                                                                                                                                                
      by code descending sum_inn ;                                                                                                                                                                                                                              
      output ;                                                                                                                                                                                                                                                  
    end ;                                                                                                                                                                                                                                                       
  end ;                                                                                                                                                                                                                                                         
run ;                                                     

Kind regards

Paul D. 

FreelanceReinh
Jade | Level 19

@hashman:

Hi Paul,

Interesting technique. I may be wrong, but my understanding was that the fifth record in code 'ZA' should be flagged only because its SUM_INN value is tied with the last observation satisfying the top-40 criterion. For example, if the first three records were tied as well, say

ZA   IBM          34929
ZA   SIEMENS      34929
ZA   TELEFON      34929

your code would count only _q=7 distinct SUM_INN values, compute top40=3 (rather than 4) and flag the first 6 observations (rather than 5) from code 'ZA', including 'YAZAKI' (SUM_INN=5307). So, YAZAKI's top-40 flag would be earned on the basis of accidental coincidences of numbers related to other companies.

 

Without nested loops this could be avoided:

data want (drop = _:) ;
  do _q = 1 by 1 until (last.code) ;
    set have ;
    by code descending sum_inn ;
  end ;
  _top40 = ceil (_q * 0.4) ;
  flag_top40 = 1 ;
  do _n_ = 1 to _q ;
    set have ;
    by code descending sum_inn ;
    output ;
    if _n_ >= _top40 & last.sum_inn then flag_top40 = 0 ;
  end ;
run ;
hashman
Ammonite | Level 13

Hi @FreelanceReinh,

 

You may very well be right, in which case your variant does the real trick.

My understanding was that the OP wanted the ties to be counted as one but I can be wrong ... if the OP would like to clarify, I hope we'll see a note to this effect here. 

 

Kind regards

Paul D.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 518 views
  • 1 like
  • 3 in conversation