BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Indescribled
Obsidian | Level 7

I am trying to emulate R functions in SAS so I know how to manipulate data in both. Below is the code to get to the data before applying the function I want to emulate.

 

In R here is what it does: Takes the factors, in this case Seafood types, checks the total value of all the data, in this case Production, and changes all except the highest n factor levels to "Other". I included a picture from R at the end of this post to show. At the start there are 7 different Seafood types. The function changes them to Freshwater, Pelagic, Demersal, Other, Other, Other, Other. Since it is only 7 I could do this manually relatively easily, but I am sure I will run into a case where there are too many to do manually. 

 

Is there a succinct way to do this in SAS? 

 

* Get data;
filename test1234 url "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-10-12/seafood-and-fish-production-thousand-tonnes.csv";

data production ;
  infile test1234 dsd truncover firstobs=2 ;
  input entity :$40. code :$8. year
        Pelagic Crustaceans Cephalopods Demersal Freshwater Molluscs Other_Marine;
run;

*clean up and filter;
proc sql;
	create table production3 as
	select *
	from production 
	where ENTITY not in ('Entity', 'World') and not missing(Code) 
	having year=max(year);
quit;

* pivot_longer;
proc transpose data=production3 out=long_production (rename = (_name_ = Seafood col1=Production));
	by Entity Year Code;
	var Crustaceans--Other_Marine;
run;

* Remove non zero;
proc sql;
	create table production_case2 as
	select *
	from long_production
	where Production > 0;
quit;

 

Top table is before the function (note the different Seafood levels), bottom table is after (note all the "other")

Indescribled_0-1634314054931.png

Showing totals to help understanding of the function I am trying to emulate. Freshwater, Pelagic, Demersal are the top 3 when looking at total production. All others should be changed to "Other"

Indescribled_1-1634314844739.png

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
A custom format driven by a data set can accomplish this, so not as succinct as the R code but doable.

View solution in original post

2 REPLIES 2
Reeza
Super User
A custom format driven by a data set can accomplish this, so not as succinct as the R code but doable.
ballardw
Super User

I think I figured out the random digit underlining, no commas or separator for thousands, millions and such. 🤔

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 431 views
  • 3 likes
  • 3 in conversation