I am a fisheries biologist and am new to programming and SAS but would like to become more proficient at this. One analysis that we use often is Catch Per Unit of Effort (CPUE) for each species for numerous different efforts (time). I am having trouble figuring out how to add zero catches.
As you can see it is pretty simple calculation CPUE= # fish caught in each species / minutes.
However, as you can see below in effort(barcodeID) A1234 a sauger was caught, but in B1234 and C1234 one was not caught. So when we calculate a mean CPUE of all the runs together the zero catches need to be included into the analysis. I need some help with developing code to include a catch of zero for each effort(barcodeID) and species when it is not present in the data. I would also like to calculate CPUE based on different sized fish as well. So CPUE of Walleye and Sauger <200 and CPUE of Walleye and Sauger >200.
Hopefully that is clear enough.
Thanks in advance!
Here is how my data is set up:
BarcodeID Date Minutes Species Length
A1234 5/15/2020 15.00 Walleye 180
A1234 5/15/2020 15.00 Walleye 350
A1234 5/15/2020 15.00 Sauger 175
B1234 5/15/2020 15.00 Walleye 400
B1234 5/15/2020 15.00 Walleye 500
C1234 5/15/2020 15.00 NoFish .
You could expand your data to include zero catches this way:
data have;
input BarcodeID $ Date :mmddyy. Minutes Species :$20. Length;
format date yymmdd10.;
datalines;
A1234 5/15/2020 15.00 Walleye 180
A1234 5/15/2020 15.00 Walleye 350
A1234 5/15/2020 15.00 Sauger 175
B1234 5/15/2020 15.00 Walleye 400
B1234 5/15/2020 15.00 Walleye 500
C1234 5/15/2020 15.00 NoFish .
;
proc sql;
select
a.barCodeId,
a.date,
a.minutes,
b.Species,
c.length,
not missing(c.length) as number
from
(select unique barcodeId, date, minutes from have) as a cross join
(select unique species from have where species ne "NoFish") as b left join
have as c on a.barcodeId=c.barcodeId and b.species=c.species
order by barcodeId, date, species, length;
quit;
BarcodeID | Date | Minutes | Species | Length | number |
---|---|---|---|---|---|
A1234 | 2020-05-15 | 15 | Sauger | 175 | 1 |
A1234 | 2020-05-15 | 15 | Walleye | 180 | 1 |
A1234 | 2020-05-15 | 15 | Walleye | 350 | 1 |
B1234 | 2020-05-15 | 15 | Sauger | . | 0 |
B1234 | 2020-05-15 | 15 | Walleye | 400 | 1 |
B1234 | 2020-05-15 | 15 | Walleye | 500 | 1 |
C1234 | 2020-05-15 | 15 | Sauger | . | 0 |
C1234 | 2020-05-15 | 15 | Walleye | . | 0 |
If the data are sorted by barcodeid, then this data step will also work:
data have;
input BarcodeID :$5. Date :mmddyy10. Minutes Species :$7. Length ;
datalines;
A1234 5/15/2020 15.00 Walleye 180
A1234 5/15/2020 15.00 Walleye 350
A1234 5/15/2020 15.00 Sauger 175
B1234 5/15/2020 15.00 Walleye 400
B1234 5/15/2020 15.00 Walleye 500
C1234 5/15/2020 15.00 NoFish .
run;
data want (drop=_:);
array species_array {2} $7 _temporary_ ("Walleye", "Sauger");
_species_list=catx(' ',of species_array{*});
/* Read and output catches, and pare down _species_list accordingly*/
do until (last.barcodeid);
set have;
by barcodeid;
_species_list=left(tranwrd(_species_list,trim(species),' '));
count=1;
if species^='NoFish' then output;
end;
/* At end of barcodeid, use _species_list to output needed dummy records */
count=0;
do while (_species_list^=' ');
length=.;
species=scan(_species_list,1,' ');
output;
_species_list=left(tranwrd(_species_list,trim(species),' '));
end;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.