BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Indescribled
Obsidian | Level 7

I am trying to emulate R functions in SAS so I know how to manipulate data in both. Below is the code to get to the data before applying the function I want to emulate.

 

In R here is what it does: Takes the factors, in this case Seafood types, checks the total value of all the data, in this case Production, and changes all except the highest n factor levels to "Other". I included a picture from R at the end of this post to show. At the start there are 7 different Seafood types. The function changes them to Freshwater, Pelagic, Demersal, Other, Other, Other, Other. Since it is only 7 I could do this manually relatively easily, but I am sure I will run into a case where there are too many to do manually. 

 

Is there a succinct way to do this in SAS? 

 

* Get data;
filename test1234 url "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-10-12/seafood-and-fish-production-thousand-tonnes.csv";

data production ;
  infile test1234 dsd truncover firstobs=2 ;
  input entity :$40. code :$8. year
        Pelagic Crustaceans Cephalopods Demersal Freshwater Molluscs Other_Marine;
run;

*clean up and filter;
proc sql;
	create table production3 as
	select *
	from production 
	where ENTITY not in ('Entity', 'World') and not missing(Code) 
	having year=max(year);
quit;

* pivot_longer;
proc transpose data=production3 out=long_production (rename = (_name_ = Seafood col1=Production));
	by Entity Year Code;
	var Crustaceans--Other_Marine;
run;

* Remove non zero;
proc sql;
	create table production_case2 as
	select *
	from long_production
	where Production > 0;
quit;

 

Top table is before the function (note the different Seafood levels), bottom table is after (note all the "other")

Indescribled_0-1634314054931.png

Showing totals to help understanding of the function I am trying to emulate. Freshwater, Pelagic, Demersal are the top 3 when looking at total production. All others should be changed to "Other"

Indescribled_1-1634314844739.png

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
A custom format driven by a data set can accomplish this, so not as succinct as the R code but doable.

View solution in original post

2 REPLIES 2
Reeza
Super User
A custom format driven by a data set can accomplish this, so not as succinct as the R code but doable.
ballardw
Super User

I think I figured out the random digit underlining, no commas or separator for thousands, millions and such. 🤔

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1222 views
  • 3 likes
  • 3 in conversation