BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Indescribled
Obsidian | Level 7

I am trying to emulate R functions in SAS so I know how to manipulate data in both. Below is the code to get to the data before applying the function I want to emulate.

 

In R here is what it does: Takes the factors, in this case Seafood types, checks the total value of all the data, in this case Production, and changes all except the highest n factor levels to "Other". I included a picture from R at the end of this post to show. At the start there are 7 different Seafood types. The function changes them to Freshwater, Pelagic, Demersal, Other, Other, Other, Other. Since it is only 7 I could do this manually relatively easily, but I am sure I will run into a case where there are too many to do manually. 

 

Is there a succinct way to do this in SAS? 

 

* Get data;
filename test1234 url "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-10-12/seafood-and-fish-production-thousand-tonnes.csv";

data production ;
  infile test1234 dsd truncover firstobs=2 ;
  input entity :$40. code :$8. year
        Pelagic Crustaceans Cephalopods Demersal Freshwater Molluscs Other_Marine;
run;

*clean up and filter;
proc sql;
	create table production3 as
	select *
	from production 
	where ENTITY not in ('Entity', 'World') and not missing(Code) 
	having year=max(year);
quit;

* pivot_longer;
proc transpose data=production3 out=long_production (rename = (_name_ = Seafood col1=Production));
	by Entity Year Code;
	var Crustaceans--Other_Marine;
run;

* Remove non zero;
proc sql;
	create table production_case2 as
	select *
	from long_production
	where Production > 0;
quit;

 

Top table is before the function (note the different Seafood levels), bottom table is after (note all the "other")

Indescribled_0-1634314054931.png

Showing totals to help understanding of the function I am trying to emulate. Freshwater, Pelagic, Demersal are the top 3 when looking at total production. All others should be changed to "Other"

Indescribled_1-1634314844739.png

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
A custom format driven by a data set can accomplish this, so not as succinct as the R code but doable.

View solution in original post

2 REPLIES 2
Reeza
Super User
A custom format driven by a data set can accomplish this, so not as succinct as the R code but doable.
ballardw
Super User

I think I figured out the random digit underlining, no commas or separator for thousands, millions and such. 🤔

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 546 views
  • 3 likes
  • 3 in conversation