BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Cruise
Ammonite | Level 13

Hi guys,  

 

My data HAVE has variables to indicate the number of certain categories of facilities found in each ZIP-code area. For instance, num_PCB means number of facilities known to discharge PCBs to an environment for each ZIP codes and same applies to the rest of num_: variables. My objective is to count the number of different categories of facilities per ZIP-code. The desired variable is 'n_diff_cat'. 

 

Data shows that num_V, num_B, num_c exist in ZIP-code 10013 but not num_PCB and num_P. Therefore, 'n_diff_cat' would take value 3.     

 

DATA Z.HAVE; 
INPUT zip num_PCB num_P num_V num_B num_C n_diff_cat;
CARDS;
10013 0 0 7 1 4 3 
10031 0 0 1 0 0 1
10037 0 0 1 0 1 2
10040 0 0 3 0 3 2
10301 1 1 1 0 1 4
10309 1 1 1 0 0 3
10312 0 0 2 1 1 3
10314 1 1 3 0 2 4
10461 1 1 0 0 0 2
10474 0 0 6 0 1 2
10475 0 0 3 1 1 3
10504 0 0 1 0 1 2
10505 0 0 2 0 2 2
10506 0 0 6 1 5 3
10509 0 0 3 0 2 2
10512 0 0 3 0 3 2
10516 0 0 0 0 0 0
10519 1 1 0 0 0 2
10520 0 0 3 0 2 2
10543 0 0 5 0 3 2
;

 Thanks,

I appreciate your time!

1 ACCEPTED SOLUTION

Accepted Solutions
gamotte
Rhodochrosite | Level 12

Hello,

 

data have; 
    input zip num_PCB num_P num_V num_B num_C;
    array nums(*) num_:;

    n_diff_cat=0;

    do i=1 to dim(nums);
        n_diff_cat=n_diff_cat+(nums(i)>0);
    end;

    drop i;

    cards;
10013 0 0 7 1 4 
10031 0 0 1 0 0
10037 0 0 1 0 1
10040 0 0 3 0 3
10301 1 1 1 0 1
10309 1 1 1 0 0
10312 0 0 2 1 1
10314 1 1 3 0 2
10461 1 1 0 0 0
10474 0 0 6 0 1
10475 0 0 3 1 1
10504 0 0 1 0 1
10505 0 0 2 0 2
10506 0 0 6 1 5
10509 0 0 3 0 2
10512 0 0 3 0 3
10516 0 0 0 0 0
10519 1 1 0 0 0
10520 0 0 3 0 2
10543 0 0 5 0 3
;
run;

View solution in original post

16 REPLIES 16
gamotte
Rhodochrosite | Level 12

Hello,

 

data have; 
    input zip num_PCB num_P num_V num_B num_C;
    array nums(*) num_:;

    n_diff_cat=0;

    do i=1 to dim(nums);
        n_diff_cat=n_diff_cat+(nums(i)>0);
    end;

    drop i;

    cards;
10013 0 0 7 1 4 
10031 0 0 1 0 0
10037 0 0 1 0 1
10040 0 0 3 0 3
10301 1 1 1 0 1
10309 1 1 1 0 0
10312 0 0 2 1 1
10314 1 1 3 0 2
10461 1 1 0 0 0
10474 0 0 6 0 1
10475 0 0 3 1 1
10504 0 0 1 0 1
10505 0 0 2 0 2
10506 0 0 6 1 5
10509 0 0 3 0 2
10512 0 0 3 0 3
10516 0 0 0 0 0
10519 1 1 0 0 0
10520 0 0 3 0 2
10543 0 0 5 0 3
;
run;
Cruise
Ammonite | Level 13
THANK YOU VERY MUCH. WORKED OUT.
gamotte
Rhodochrosite | Level 12

Less readable but shorter :

 

data have; 
    input zip num_PCB num_P num_V num_B num_C;
    array nums(*) num_:;

    n_diff_cat=dim(nums)-count(cats(of nums(*)),'0');

    cards;
10013 0 0 7 1 4 
10031 0 0 1 0 0
10037 0 0 1 0 1
10040 0 0 3 0 3
10301 1 1 1 0 1
10309 1 1 1 0 0
10312 0 0 2 1 1
10314 1 1 3 0 2
10461 1 1 0 0 0
10474 0 0 6 0 1
10475 0 0 3 1 1
10504 0 0 1 0 1
10505 0 0 2 0 2
10506 0 0 6 1 5
10509 0 0 3 0 2
10512 0 0 3 0 3
10516 0 0 0 0 0
10519 1 1 0 0 0
10520 0 0 3 0 2
10543 0 0 5 0 3
;
novinosrin
Tourmaline | Level 20
data have_without_loop; 
    input zip num_PCB num_P num_V num_B num_C;
    array nums(*) num_:;
	n_diff_cat=lengthn(compress(cats(of nums(*)),'0 '));
    cards;
10013 0 0 7 1 4 
10031 0 0 1 0 0
10037 0 0 1 0 1
10040 0 0 3 0 3
10301 1 1 1 0 1
10309 1 1 1 0 0
10312 0 0 2 1 1
10314 1 1 3 0 2
10461 1 1 0 0 0
10474 0 0 6 0 1
10475 0 0 3 1 1
10504 0 0 1 0 1
10505 0 0 2 0 2
10506 0 0 6 1 5
10509 0 0 3 0 2
10512 0 0 3 0 3
10516 0 0 0 0 0
10519 1 1 0 0 0
10520 0 0 3 0 2
10543 0 0 5 0 3
;
run;
gamotte
Rhodochrosite | Level 12

I tried such a solution but was stumped by the unit length of empty strings.

novinosrin
Tourmaline | Level 20

🙂  if may i suggest a small nit for the assignment vs sum in your loop, the below is an equivalent 

 

 n_diff_cat=0;

    do i=1 to dim(nums);
        n_diff_cat+(nums(i)>0);
    end;

 

Cruise
Ammonite | Level 13
i really like this solution.
gamotte
Rhodochrosite | Level 12

Yes, but there is an implict retain to be aware of.

I usually use the longer form when i don't take advantage of the retain.

novinosrin
Tourmaline | Level 20

yes and true, however in your case implicit retain is obsolete due to your initial assignment 

 n_diff_cat=0;

for every iteration of the datastep

 

 

FreelanceReinh
Jade | Level 19

Hi @gamotte and @novinosrin,

 

I wouldn't know why "the number of certain categories of facilities found in" a ZIP-code area cannot be greater than 9. In this case, however, both alternative suggestions involving the CATS function would fail in general:

  • dim(nums)-count(cats(of nums(*)),'0') subtracts too much for values such as 10, 20, etc.
  • lengthn(compress(cats(of nums(*)),'0 ')) counts, e.g., 11, 12, etc. as two values.

The "APP function" approach does not seem to have this weakness.

novinosrin
Tourmaline | Level 20

Nice catch boss @FreelanceReinh  My oh my , never expected that level of diligence. Added to my notes!!! and Thanks a lot indeed

gamotte
Rhodochrosite | Level 12

You raise a good point 😉

novinosrin
Tourmaline | Level 20

Sir @FreelanceReinh Logic comes to mind in  weird locations as  I am away from sas right now I am at the Depaul cafe and so haven't tested this, but since it struck my mind, I just want to get it out there. Will test when I go back to the lab  🙂

 

n_diff_cat=countw(compress(catx(' ',of nums(*)),'0'));

novinosrin
Tourmaline | Level 20
data have_without_loop(drop=_t); 
    input zip num_PCB num_P num_V num_B num_C;
    _t = put (0, rb8.) ;
    array nums(*) num_:;
    n_diff_cat = n(of nums(*))-count(put(peekclong (addrlong(nums[1]), 40), $40.),_t) ;
    cards;
10013 0 0 7 1 4 
10031 0 0 1 0 0
10037 0 0 1 0 1
10040 0 0 3 0 3
10301 1 1 1 0 1
10309 1 1 1 0 0
10312 0 0 2 1 1
10314 1 1 3 0 2
10461 1 1 0 0 0
10474 0 0 6 0 1
10475 0 0 3 1 1
10504 0 0 1 0 1
10505 0 0 2 0 2
10506 0 0 6 1 5
10509 0 0 3 0 2
10512 0 0 3 0 3
10516 0 0 0 0 0
10519 1 1 0 0 0
10520 0 0 3 0 2
10543 0 0 5 0 3
;
run;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 16 replies
  • 4170 views
  • 12 likes
  • 4 in conversation