BookmarkSubscribeRSS Feed
ryanhupfeld
Calcite | Level 5

I am a fisheries biologist and am new to programming and SAS but would like to become more proficient at this.  One analysis that we use often is Catch Per Unit of Effort (CPUE) for each species for numerous different efforts (time).  I am having trouble figuring out how to add zero catches.  

 

As you can see it is pretty simple calculation CPUE= # fish caught in each species / minutes.

However, as you can see below in effort(barcodeID) A1234 a sauger was caught, but in B1234 and C1234 one was not caught.  So when we calculate a mean CPUE of all the runs together the zero catches need to be included into the analysis. I need some help with developing code to include a catch of zero for each effort(barcodeID) and species when it is not present in the data.  I would also like to calculate CPUE based on different sized fish as well.  So CPUE of Walleye and Sauger <200 and CPUE of Walleye and Sauger >200.

 

Hopefully that is clear enough.

Thanks in advance!

 

Here is how my data is set up:

BarcodeID       Date                Minutes        Species      Length  

A1234             5/15/2020         15.00          Walleye        180

A1234             5/15/2020         15.00          Walleye        350

A1234             5/15/2020         15.00          Sauger         175

B1234             5/15/2020         15.00          Walleye        400

B1234             5/15/2020         15.00          Walleye        500

C1234             5/15/2020         15.00          NoFish           .           

2 REPLIES 2
PGStats
Opal | Level 21

You could expand your data to include zero catches this way:

 

data have;
input BarcodeID $ Date :mmddyy. Minutes Species :$20. Length;
format date yymmdd10.;
datalines;  
A1234             5/15/2020         15.00          Walleye        180
A1234             5/15/2020         15.00          Walleye        350
A1234             5/15/2020         15.00          Sauger         175
B1234             5/15/2020         15.00          Walleye        400
B1234             5/15/2020         15.00          Walleye        500
C1234             5/15/2020         15.00          NoFish           .   
;

proc sql;
select 
    a.barCodeId,
    a.date,
    a.minutes,
    b.Species,
    c.length,
    not missing(c.length) as number
from 
    (select unique barcodeId, date, minutes from have) as a cross join
    (select unique species from have where species ne "NoFish") as b left join
    have as c on a.barcodeId=c.barcodeId and b.species=c.species
order by barcodeId, date, species, length;
quit;
BarcodeID Date Minutes Species Length number
A1234 2020-05-15 15 Sauger 175 1
A1234 2020-05-15 15 Walleye 180 1
A1234 2020-05-15 15 Walleye 350 1
B1234 2020-05-15 15 Sauger . 0
B1234 2020-05-15 15 Walleye 400 1
B1234 2020-05-15 15 Walleye 500 1
C1234 2020-05-15 15 Sauger . 0
C1234 2020-05-15 15 Walleye . 0
PG
mkeintz
PROC Star

If the data are sorted by barcodeid, then this data step will also work:

 

data have;
  input BarcodeID :$5.   Date :mmddyy10.  Minutes Species :$7.   Length  ;
datalines;
A1234             5/15/2020         15.00          Walleye        180
A1234             5/15/2020         15.00          Walleye        350
A1234             5/15/2020         15.00          Sauger         175
B1234             5/15/2020         15.00          Walleye        400
B1234             5/15/2020         15.00          Walleye        500
C1234             5/15/2020         15.00          NoFish           .
run;

data want (drop=_:);
  array species_array {2} $7 _temporary_ ("Walleye", "Sauger");
  _species_list=catx(' ',of species_array{*});

  /* Read and output catches, and pare down _species_list accordingly*/
  do until (last.barcodeid);
    set have;
    by barcodeid;
    _species_list=left(tranwrd(_species_list,trim(species),' '));
	count=1;
    if species^='NoFish' then output;
  end;

  /* At end of barcodeid, use _species_list to output needed dummy records */
  count=0;
  do while (_species_list^=' ');
    length=.;
	species=scan(_species_list,1,' ');
	output;
	_species_list=left(tranwrd(_species_list,trim(species),' '));
  end;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 639 views
  • 2 likes
  • 3 in conversation