BookmarkSubscribeRSS Feed
ryanhupfeld
Calcite | Level 5

I am a fisheries biologist and am new to programming and SAS but would like to become more proficient at this.  One analysis that we use often is Catch Per Unit of Effort (CPUE) for each species for numerous different efforts (time).  I am having trouble figuring out how to add zero catches.  

 

As you can see it is pretty simple calculation CPUE= # fish caught in each species / minutes.

However, as you can see below in effort(barcodeID) A1234 a sauger was caught, but in B1234 and C1234 one was not caught.  So when we calculate a mean CPUE of all the runs together the zero catches need to be included into the analysis. I need some help with developing code to include a catch of zero for each effort(barcodeID) and species when it is not present in the data.  I would also like to calculate CPUE based on different sized fish as well.  So CPUE of Walleye and Sauger <200 and CPUE of Walleye and Sauger >200.

 

Hopefully that is clear enough.

Thanks in advance!

 

Here is how my data is set up:

BarcodeID       Date                Minutes        Species      Length  

A1234             5/15/2020         15.00          Walleye        180

A1234             5/15/2020         15.00          Walleye        350

A1234             5/15/2020         15.00          Sauger         175

B1234             5/15/2020         15.00          Walleye        400

B1234             5/15/2020         15.00          Walleye        500

C1234             5/15/2020         15.00          NoFish           .           

2 REPLIES 2
PGStats
Opal | Level 21

You could expand your data to include zero catches this way:

 

data have;
input BarcodeID $ Date :mmddyy. Minutes Species :$20. Length;
format date yymmdd10.;
datalines;  
A1234             5/15/2020         15.00          Walleye        180
A1234             5/15/2020         15.00          Walleye        350
A1234             5/15/2020         15.00          Sauger         175
B1234             5/15/2020         15.00          Walleye        400
B1234             5/15/2020         15.00          Walleye        500
C1234             5/15/2020         15.00          NoFish           .   
;

proc sql;
select 
    a.barCodeId,
    a.date,
    a.minutes,
    b.Species,
    c.length,
    not missing(c.length) as number
from 
    (select unique barcodeId, date, minutes from have) as a cross join
    (select unique species from have where species ne "NoFish") as b left join
    have as c on a.barcodeId=c.barcodeId and b.species=c.species
order by barcodeId, date, species, length;
quit;
BarcodeID Date Minutes Species Length number
A1234 2020-05-15 15 Sauger 175 1
A1234 2020-05-15 15 Walleye 180 1
A1234 2020-05-15 15 Walleye 350 1
B1234 2020-05-15 15 Sauger . 0
B1234 2020-05-15 15 Walleye 400 1
B1234 2020-05-15 15 Walleye 500 1
C1234 2020-05-15 15 Sauger . 0
C1234 2020-05-15 15 Walleye . 0
PG
mkeintz
PROC Star

If the data are sorted by barcodeid, then this data step will also work:

 

data have;
  input BarcodeID :$5.   Date :mmddyy10.  Minutes Species :$7.   Length  ;
datalines;
A1234             5/15/2020         15.00          Walleye        180
A1234             5/15/2020         15.00          Walleye        350
A1234             5/15/2020         15.00          Sauger         175
B1234             5/15/2020         15.00          Walleye        400
B1234             5/15/2020         15.00          Walleye        500
C1234             5/15/2020         15.00          NoFish           .
run;

data want (drop=_:);
  array species_array {2} $7 _temporary_ ("Walleye", "Sauger");
  _species_list=catx(' ',of species_array{*});

  /* Read and output catches, and pare down _species_list accordingly*/
  do until (last.barcodeid);
    set have;
    by barcodeid;
    _species_list=left(tranwrd(_species_list,trim(species),' '));
	count=1;
    if species^='NoFish' then output;
  end;

  /* At end of barcodeid, use _species_list to output needed dummy records */
  count=0;
  do while (_species_list^=' ');
    length=.;
	species=scan(_species_list,1,' ');
	output;
	_species_list=left(tranwrd(_species_list,trim(species),' '));
  end;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1204 views
  • 2 likes
  • 3 in conversation