I am trying to emulate R functions in SAS so that I know how to manipulate data in both. The code below builds the data set up to the point where I would apply the function I want to emulate.
Here is what the R function does: it takes the levels of a factor (here, the Seafood types), totals a variable across all the data (here, Production), and changes every level except the n with the highest totals to "Other". I included a picture from R at the end of this post to illustrate. At the start there are 7 different Seafood types. Keeping the top 3, the function leaves Freshwater, Pelagic and Demersal as they are and changes the remaining four to "Other". With only 7 levels I could do this manually fairly easily, but I am sure I will run into a case where there are too many to handle by hand.
Is there a succinct way to do this in SAS?
* Get data;
filename test1234 url "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-10-12/seafood-and-fish-production-thousand-tonnes.csv";
data production ;
infile test1234 dsd truncover firstobs=2 ;
input entity :$40. code :$8. year
Pelagic Crustaceans Cephalopods Demersal Freshwater Molluscs Other_Marine;
run;
*clean up and filter;
proc sql;
create table production3 as
select *
from production
where ENTITY not in ('Entity', 'World') and not missing(Code)
having year=max(year);
quit;
* pivot_longer;
proc transpose data=production3 out=long_production (rename = (_name_ = Seafood col1=Production));
by Entity Year Code;
var Pelagic--Other_Marine;
run;
* Keep only rows with positive production (drops zeros and missing values);
proc sql;
create table production_case2 as
select *
from long_production
where Production > 0;
quit;
The top table is before the function (note the different Seafood levels); the bottom table is after (note all the "Other" values).
The totals are shown to help explain the function I am trying to emulate. Freshwater, Pelagic and Demersal are the top 3 by total production. All others should be changed to "Other".
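The lumping step described above could be sketched in SAS roughly as follows. This is only one possible approach, not a tested or official solution. It assumes the `production_case2` table built above; the macro variable `n` and the names `seafood_totals`, `top_seafood`, `production_lumped` and `Seafood_lumped` are made up for illustration.

```sas
* Sketch: keep the top n Seafood levels by total Production and lump the rest into "Other";
%let n = 3;  /* assumed number of levels to keep */

proc sql;
  /* Total production per seafood type, largest first */
  create table seafood_totals as
  select Seafood, sum(Production) as total_production
  from production_case2
  group by Seafood
  order by total_production desc;
quit;

* Keep only the first n rows, i.e. the n levels with the highest totals;
data top_seafood;
  set seafood_totals (obs=&n);
  keep Seafood;
run;

proc sql;
  /* Levels without a match in top_seafood are recoded to "Other" */
  create table production_lumped as
  select a.Entity, a.Year, a.Code,
         case when b.Seafood is null then 'Other'
              else a.Seafood
         end as Seafood_lumped length=32,
         a.Production
  from production_case2 as a
  left join top_seafood as b
    on a.Seafood = b.Seafood;
quit;
```

Changing `n` changes how many levels are kept, so nothing has to be recoded by hand when there are many levels. If two levels tie at the cutoff, which one is kept depends on the sort order and is effectively arbitrary in this sketch.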
Accepted Solutions
I think I figured out the random digit underlining: no commas or other separators for thousands, millions and such. 🤔