BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Cruise
Ammonite | Level 13

Hi guys,  

 

My data HAVE has variables to indicate the number of certain categories of facilities found in each ZIP-code area. For instance, num_PCB means number of facilities known to discharge PCBs to an environment for each ZIP codes and same applies to the rest of num_: variables. My objective is to count the number of different categories of facilities per ZIP-code. The desired variable is 'n_diff_cat'. 

 

Data shows that num_V, num_B, num_c exist in ZIP-code 10013 but not num_PCB and num_P. Therefore, 'n_diff_cat' would take value 3.     

 

DATA Z.HAVE; 
INPUT zip num_PCB num_P num_V num_B num_C n_diff_cat;
CARDS;
10013 0 0 7 1 4 3 
10031 0 0 1 0 0 1
10037 0 0 1 0 1 2
10040 0 0 3 0 3 2
10301 1 1 1 0 1 4
10309 1 1 1 0 0 3
10312 0 0 2 1 1 3
10314 1 1 3 0 2 4
10461 1 1 0 0 0 2
10474 0 0 6 0 1 2
10475 0 0 3 1 1 3
10504 0 0 1 0 1 2
10505 0 0 2 0 2 2
10506 0 0 6 1 5 3
10509 0 0 3 0 2 2
10512 0 0 3 0 3 2
10516 0 0 0 0 0 0
10519 1 1 0 0 0 2
10520 0 0 3 0 2 2
10543 0 0 5 0 3 2
;

 Thanks,

I appreciate your time!

1 ACCEPTED SOLUTION

Accepted Solutions
gamotte
Rhodochrosite | Level 12

Hello,

 

data have; 
    input zip num_PCB num_P num_V num_B num_C;
    array nums(*) num_:;

    n_diff_cat=0;

    do i=1 to dim(nums);
        n_diff_cat=n_diff_cat+(nums(i)>0);
    end;

    drop i;

    cards;
10013 0 0 7 1 4 
10031 0 0 1 0 0
10037 0 0 1 0 1
10040 0 0 3 0 3
10301 1 1 1 0 1
10309 1 1 1 0 0
10312 0 0 2 1 1
10314 1 1 3 0 2
10461 1 1 0 0 0
10474 0 0 6 0 1
10475 0 0 3 1 1
10504 0 0 1 0 1
10505 0 0 2 0 2
10506 0 0 6 1 5
10509 0 0 3 0 2
10512 0 0 3 0 3
10516 0 0 0 0 0
10519 1 1 0 0 0
10520 0 0 3 0 2
10543 0 0 5 0 3
;
run;

View solution in original post

16 REPLIES 16
gamotte
Rhodochrosite | Level 12

Hello,

 

data have; 
    input zip num_PCB num_P num_V num_B num_C;
    array nums(*) num_:;

    n_diff_cat=0;

    do i=1 to dim(nums);
        n_diff_cat=n_diff_cat+(nums(i)>0);
    end;

    drop i;

    cards;
10013 0 0 7 1 4 
10031 0 0 1 0 0
10037 0 0 1 0 1
10040 0 0 3 0 3
10301 1 1 1 0 1
10309 1 1 1 0 0
10312 0 0 2 1 1
10314 1 1 3 0 2
10461 1 1 0 0 0
10474 0 0 6 0 1
10475 0 0 3 1 1
10504 0 0 1 0 1
10505 0 0 2 0 2
10506 0 0 6 1 5
10509 0 0 3 0 2
10512 0 0 3 0 3
10516 0 0 0 0 0
10519 1 1 0 0 0
10520 0 0 3 0 2
10543 0 0 5 0 3
;
run;
Cruise
Ammonite | Level 13
THANK YOU VERY MUCH. WORKED OUT.
gamotte
Rhodochrosite | Level 12

Less readable but shorter :

 

data have; 
    input zip num_PCB num_P num_V num_B num_C;
    array nums(*) num_:;

    n_diff_cat=dim(nums)-count(cats(of nums(*)),'0');

    cards;
10013 0 0 7 1 4 
10031 0 0 1 0 0
10037 0 0 1 0 1
10040 0 0 3 0 3
10301 1 1 1 0 1
10309 1 1 1 0 0
10312 0 0 2 1 1
10314 1 1 3 0 2
10461 1 1 0 0 0
10474 0 0 6 0 1
10475 0 0 3 1 1
10504 0 0 1 0 1
10505 0 0 2 0 2
10506 0 0 6 1 5
10509 0 0 3 0 2
10512 0 0 3 0 3
10516 0 0 0 0 0
10519 1 1 0 0 0
10520 0 0 3 0 2
10543 0 0 5 0 3
;
novinosrin
Tourmaline | Level 20
data have_without_loop; 
    input zip num_PCB num_P num_V num_B num_C;
    array nums(*) num_:;
	n_diff_cat=lengthn(compress(cats(of nums(*)),'0 '));
    cards;
10013 0 0 7 1 4 
10031 0 0 1 0 0
10037 0 0 1 0 1
10040 0 0 3 0 3
10301 1 1 1 0 1
10309 1 1 1 0 0
10312 0 0 2 1 1
10314 1 1 3 0 2
10461 1 1 0 0 0
10474 0 0 6 0 1
10475 0 0 3 1 1
10504 0 0 1 0 1
10505 0 0 2 0 2
10506 0 0 6 1 5
10509 0 0 3 0 2
10512 0 0 3 0 3
10516 0 0 0 0 0
10519 1 1 0 0 0
10520 0 0 3 0 2
10543 0 0 5 0 3
;
run;
gamotte
Rhodochrosite | Level 12

I tried such a solution but was stumped by the unit length of empty strings.

novinosrin
Tourmaline | Level 20

🙂  if may i suggest a small nit for the assignment vs sum in your loop, the below is an equivalent 

 

 n_diff_cat=0;

    do i=1 to dim(nums);
        n_diff_cat+(nums(i)>0);
    end;

 

Cruise
Ammonite | Level 13
i really like this solution.
gamotte
Rhodochrosite | Level 12

Yes, but there is an implict retain to be aware of.

I usually use the longer form when i don't take advantage of the retain.

novinosrin
Tourmaline | Level 20

yes and true, however in your case implicit retain is obsolete due to your initial assignment 

 n_diff_cat=0;

for every iteration of the datastep

 

 

FreelanceReinh
Jade | Level 19

Hi @gamotte and @novinosrin,

 

I wouldn't know why "the number of certain categories of facilities found in" a ZIP-code area cannot be greater than 9. In this case, however, both alternative suggestions involving the CATS function would fail in general:

  • dim(nums)-count(cats(of nums(*)),'0') subtracts too much for values such as 10, 20, etc.
  • lengthn(compress(cats(of nums(*)),'0 ')) counts, e.g., 11, 12, etc. as two values.

The "APP function" approach does not seem to have this weakness.

novinosrin
Tourmaline | Level 20

Nice catch boss @FreelanceReinh  My oh my , never expected that level of diligence. Added to my notes!!! and Thanks a lot indeed

gamotte
Rhodochrosite | Level 12

You raise a good point 😉

novinosrin
Tourmaline | Level 20

Sir @FreelanceReinh Logic comes to mind in  weird locations as  I am away from sas right now I am at the Depaul cafe and so haven't tested this, but since it struck my mind, I just want to get it out there. Will test when I go back to the lab  🙂

 

n_diff_cat=countw(compress(catx(' ',of nums(*)),'0'));

novinosrin
Tourmaline | Level 20
data have_without_loop(drop=_t); 
    input zip num_PCB num_P num_V num_B num_C;
    _t = put (0, rb8.) ;
    array nums(*) num_:;
    n_diff_cat = n(of nums(*))-count(put(peekclong (addrlong(nums[1]), 40), $40.),_t) ;
    cards;
10013 0 0 7 1 4 
10031 0 0 1 0 0
10037 0 0 1 0 1
10040 0 0 3 0 3
10301 1 1 1 0 1
10309 1 1 1 0 0
10312 0 0 2 1 1
10314 1 1 3 0 2
10461 1 1 0 0 0
10474 0 0 6 0 1
10475 0 0 3 1 1
10504 0 0 1 0 1
10505 0 0 2 0 2
10506 0 0 6 1 5
10509 0 0 3 0 2
10512 0 0 3 0 3
10516 0 0 0 0 0
10519 1 1 0 0 0
10520 0 0 3 0 2
10543 0 0 5 0 3
;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 16 replies
  • 2124 views
  • 12 likes
  • 4 in conversation