I am trying to emulate R functions in SAS so I know how to manipulate data in both. Below is the code to get to the data before applying the function I want to emulate.
In R here is what it does: Takes the factors, in this case Seafood types, checks the total value of all the data, in this case Production, and changes all except the highest n factor levels to "Other". I included a picture from R at the end of this post to show. At the start there are 7 different Seafood types. The function changes them to Freshwater, Pelagic, Demersal, Other, Other, Other, Other. Since it is only 7 I could do this manually relatively easily, but I am sure I will run into a case where there are too many to do manually.
Is there a succinct way to do this in SAS?
* Get data;
filename test1234 url "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-10-12/seafood-and-fish-production-thousand-tonnes.csv";
data production ;
infile test1234 dsd truncover firstobs=2 ;
input entity :$40. code :$8. year
Pelagic Crustaceans Cephalopods Demersal Freshwater Molluscs Other_Marine;
run;
*clean up and filter;
proc sql;
create table production3 as
select *
from production
where ENTITY not in ('Entity', 'World') and not missing(Code)
having year=max(year);
quit;
* pivot_longer;
proc transpose data=production3 out=long_production (rename = (_name_ = Seafood col1=Production));
by Entity Year Code;
var Crustaceans--Other_Marine;
run;
* Remove non zero;
proc sql;
create table production_case2 as
select *
from long_production
where Production > 0;
quit;
Top table is before the function (note the different Seafood levels), bottom table is after (note all the "other")
Showing totals to help understanding of the function I am trying to emulate. Freshwater, Pelagic, Demersal are the top 3 when looking at total production. All others should be changed to "Other"
I think I figured out the random digit underlining, no commas or separator for thousands, millions and such. 🤔
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.