BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
touwen_k
Quartz | Level 8

dear community,

I have a quesiton if you would like to select only specific parks (ACAD and ARCH) and count them, and the other parks not. What is wrong in this code below? pls advise.

touwen_k_0-1643642330071.png

regards, 

Karolina

1 ACCEPTED SOLUTION

Accepted Solutions
smantha
Lapis Lazuli | Level 10

As Paige has explained first. or last. on a by statement returns a 0 or 1 indicating true or false. If your objective is to count the records as you are reading the data set you can use the code below. However if your objective is to get counts of each by group then you can use any of the procs that SAS has.

 

data new_ds;
set species_sort end=last;
by parkcode;
retain count;
if first.parkcode then do; count=0; end;
count+1;
if last.parkcode then do;
put parkcode= count=;
if last then do;
put _n_=;
end;
run;

View solution in original post

5 REPLIES 5
PaigeMiller
Diamond | Level 26
if first.parkcode in ("ACAD","ARCH") then ...

will not work, because the special variables that begin with first. are always either 0 or 1. They are zero when it is not the first observation of a BY variable, and it is 1 if it is the first obsesrvation of a BY variable. So whether it is 0 or 1, it will never be in ("ACAD","ARCH")

 

What you really want to do is this:

 

proc freq data=pg2.np_species (where=(parkcode in ('ACAD','ARCH')));
    tables parkcode;
run;

 

Advice: don't try to write your own DATA step code for common activities like counting, computing sums and means and standard deviations, finding minimums and maximums, and so forth. SAS has written many PROCs where they have already programmed these things, then they debug the code, verify that it works, and these PROCs have been proven in a bazillion and 14 real world applications. If you write your own DATA step code, then you have to vouch for the correctness of the results, and as you can see, you also have to overcome the difficulties of programming these things in a DATA step.

 

From now on, please do not post your code as a screen capture, post your code as text, in the code box as I have done, by clicking on the "little running man" icon. Also, please, from now on, when your code doesn't work, show us the entire LOG for the step that isn't working (not selected parts of the log) as text, by clicking on the </> icon and pasting your log text into that window.

 

Insert Log Icon in SAS Communities.png

--
Paige Miller
smantha
Lapis Lazuli | Level 10

As Paige has explained first. or last. on a by statement returns a 0 or 1 indicating true or false. If your objective is to count the records as you are reading the data set you can use the code below. However if your objective is to get counts of each by group then you can use any of the procs that SAS has.

 

data new_ds;
set species_sort end=last;
by parkcode;
retain count;
if first.parkcode then do; count=0; end;
count+1;
if last.parkcode then do;
put parkcode= count=;
if last then do;
put _n_=;
end;
run;

touwen_k
Quartz | Level 8

Thank you all for your advices. I have now code that works, it counts the occurences for parks ACAD and ARCH and for other parks the count is zero.

data count;
set species_sort;
by parkcode;
if first.parkcode and parkcode in ('ACAD', 'ARCH') then
count=0;
count+1;
if parkcode not in ('ACAD', 'ARCH') then count=0;
if last.parkcode then output;
run;
PaigeMiller
Diamond | Level 26

@touwen_k again I recommend you use PROC FREQ here instead of struggling with DATA step code.

--
Paige Miller
jbilenas
Obsidian | Level 7

Karolina,

You don't need to sort the data or count the parks you are looking for in a data step to get totals for (ACAD and ARCH) when you use reporting procedures and include a where clause you have all the info. PROC FREQ maybe easier as shown in PaigeMiller reply.

 

I like to use PROC TABULATE. What I did was to make a small data set with park names and ran the tabulate:

 

PROC TABULATE DATA=parks (WHERE=(park IN("ACAD" "ARCH"))) NOSEPS MISSING;
  CLASS park;
  TABLE PARK ALL
        ,
        N
        /RTS=20 ROW=FLOAT MISSTEXT=' ';
RUN;  

OUTPUT:

jbilenas_0-1643650792359.png

I am a Tabulate and FORMAT geek so I also created the following code that includes all parks; results for ACAD, ARCH, and all other parks combined:

 

PROC FORMAT;
  VALUE $PARK (MULTILABEL NOTSORTED)
    'ARCH', 'ACAD' = 'TOTAL for ARCH and ACAD'
    'ARCH' = 'ARCH'
    'ACAD' = 'ACAD'
     OTHER = 'OTHER PARKS'
  ;
  PICTURE pct(round) 
    low -<0   = '009.00%' (PREFIX='-')
    0   -high = '009.00%' (PREFIX=' ')
  ; 
RUN;

PROC TABULATE DATA=parks NOSEPS MISSING;
  CLASS park/MLF ORDER=DATA;
  FORMAT park $PARK.;
  TABLE PARK ALL
        ,
        N
        PCTN = '%'*F=pct.
        /RTS=30 ROW=FLOAT MISSTEXT=' ';
RUN; 

 

OUTPUT:

jbilenas_1-1643651149603.png

Full code should be attached.

 

Jonas V. Bilenas

Jonas V. Bilenas

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 628 views
  • 3 likes
  • 4 in conversation