10-16-2021
Indescribled
Obsidian | Level 7
Member since
08-08-2021
- 16 Posts
- 13 Likes Given
- 1 Solutions
- 0 Likes Received
-
Latest posts by Indescribled
Subject Views Posted 730 10-15-2021 12:22 PM 976 10-14-2021 06:05 PM 1062 10-14-2021 02:10 PM 1052 10-07-2021 06:01 PM 1071 10-07-2021 12:24 PM 1141 10-07-2021 03:00 AM 1025 10-06-2021 04:42 PM 1025 10-06-2021 04:41 PM 1115 10-06-2021 12:15 AM 1955 09-30-2021 12:06 AM -
Activity Feed for Indescribled
- Liked Re: Emulate fct_lump from R / Change all except top n factors to "Other" based on total va for ballardw. 10-16-2021 07:24 PM
- Posted Emulate fct_lump from R / Change all except top n factors to "Other" based on total value on SAS Programming. 10-15-2021 12:22 PM
- Liked Re: Trouble with proc transpose / emulate pivot_longer from R for Tom. 10-14-2021 06:32 PM
- Liked Re: Trouble with proc transpose / emulate pivot_longer from R for Reeza. 10-14-2021 06:12 PM
- Liked Re: Trouble with proc transpose / emulate pivot_longer from R for Reeza. 10-14-2021 06:12 PM
- Posted Re: Trouble with proc transpose / emulate pivot_longer from R on SAS Programming. 10-14-2021 06:05 PM
- Posted Trouble with proc transpose / emulate pivot_longer from R on SAS Programming. 10-14-2021 02:10 PM
- Posted Re: Choropleth map - FIPS data in plotting not recognized on SAS Programming. 10-07-2021 06:01 PM
- Posted Re: Choropleth map - FIPS data in plotting not recognized on SAS Programming. 10-07-2021 12:24 PM
- Posted Choropleth map - FIPS data in plotting not recognized on SAS Programming. 10-07-2021 03:00 AM
- Posted Re: Error while making choropleth map: ERROR: File MAPS.US.DATA does not exist. on SAS Programming. 10-06-2021 04:42 PM
- Posted Re: Error while making choropleth map: ERROR: File MAPS.US.DATA does not exist. on SAS Programming. 10-06-2021 04:41 PM
- Posted Error while making choropleth map: ERROR: File MAPS.US.DATA does not exist. on SAS Programming. 10-06-2021 12:15 AM
- Liked Re: How to make an area plot in SAS? for ChrisHemedinger. 09-30-2021 10:36 AM
- Liked Re: How to make an area plot in SAS? for acordes. 09-30-2021 10:36 AM
- Liked Re: How to make an area plot in SAS? for acordes. 09-30-2021 10:36 AM
- Liked Re: How to make an area plot in SAS? for ChrisHemedinger. 09-30-2021 10:35 AM
- Posted How to make an area plot in SAS? on SAS Programming. 09-30-2021 12:06 AM
- Posted SGplot x axis ticks without lumping or losing data on SAS Programming. 09-23-2021 01:43 AM
- Liked Re: Frequency table with multiple variables - emulate R code for Astounding. 09-22-2021 08:15 PM
-
Posts I Liked
Subject Likes Author Latest Post 3 1 1 1 1
10-15-2021
12:22 PM
I am trying to emulate R functions in SAS so I know how to manipulate data in both. Below is the code to get to the data before applying the function I want to emulate.
In R here is what it does: Takes the factors, in this case Seafood types, checks the total value of all the data, in this case Production, and changes all except the highest n factor levels to "Other". I included a picture from R at the end of this post to show. At the start there are 7 different Seafood types. The function changes them to Freshwater, Pelagic, Demersal, Other, Other, Other, Other. Since it is only 7 I could do this manually relatively easily, but I am sure I will run into a case where there are too many to do manually.
Is there a succinct way to do this in SAS?
* Get data;
filename test1234 url "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-10-12/seafood-and-fish-production-thousand-tonnes.csv";
data production ;
infile test1234 dsd truncover firstobs=2 ;
input entity :$40. code :$8. year
Pelagic Crustaceans Cephalopods Demersal Freshwater Molluscs Other_Marine;
run;
*clean up and filter;
proc sql;
create table production3 as
select *
from production
where ENTITY not in ('Entity', 'World') and not missing(Code)
having year=max(year);
quit;
* pivot_longer;
proc transpose data=production3 out=long_production (rename = (_name_ = Seafood col1=Production));
by Entity Year Code;
var Crustaceans--Other_Marine;
run;
* Remove non zero;
proc sql;
create table production_case2 as
select *
from long_production
where Production > 0;
quit;
Top table is before the function (note the different Seafood levels), bottom table is after (note all the "other")
Showing totals to help understanding of the function I am trying to emulate. Freshwater, Pelagic, Demersal are the top 3 when looking at total production. All others should be changed to "Other"
... View more
10-14-2021
06:05 PM
That is a default feature of the Tidyverse R package. The underlining is to help identify big numbers I think, it ends where a comma would.
... View more
10-14-2021
02:10 PM
I am working on the Tidy Tuesday data for this week about seafood. Below is reproducible code. It works fine up until PROC TRANSPOSE, I am having trouble getting the syntax correct. Could anyone help me figure out the correct way to do this? I included a picture example from R at the end for what I am trying to do. * Get data 1;
filename test1234 url "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-10-12/seafood-and-fish-production-thousand-tonnes.csv";
proc import out=production datafile=test1234 dbms=csv replace;
guessingrows = max;
getnames=no;
run;
* Delete row 1 because getnames=no put the column names in the first row;
proc sql;
delete from production
where VAR1 = "Entity";
quit;
* Rename columns;
data production;
set production (rename=(Var1=Entity
Var2=Code
Var3=Year
Var4=Pelagic
Var5=Crustaceans
Var6=Cephalopods
Var7=Demersal
Var8=Freshwater
Var9=Molluscs
Var10=Other_Marine));
run;
* Delete all rows that have blank Code. Could not figure out how to filter it in the data step with where;
proc sql;
delete from production
where Code = "";
quit;
* Filter Entity to remove World and set year to max. Would prefer to do something like Year = max(Year);
data production;
set production;
where Entity ^= 'World' AND Year = '2013';
run;
* ISSUES START BELOW HERE;
* Now the data is ready to pivot. Seems like proc transpose does this;
proc transpose data=production out=long_production;
by Entity-Year;
var Crustaceans-Other_Marine;
run;
* ERROR: CRUSTACEANS does not have a numeric suffix.;
proc transpose data=production out=long_production;
by Year;
var Crustaceans-Other_Marine;
run;
* ERROR: CRUSTACEANS does not have a numeric suffix.;
proc transpose data=production out=long_production;
by Code;
var Crustaceans-Other_Marine;
run; Example of what I am trying to do, but this is from R.
... View more
10-07-2021
06:01 PM
After a lot of trial and error I managed to get it to work. Here is the full working code and explanation if anyone else runs across this in the future: The issue I seemed to be having was that choromap mapid=??? NEEDS to be "state". Even if I call it fips and the contents of the column are the state fips ID, the column MUST be named state. I named it fips and it did not work or even recognize that it existed. In the end I merged my data with the data from the plot example page because I knew for sure the plot example data worked, so all I had to do was get my state level data merged into it. An issue arose where they both had "State" as a column name, so I had to rename the nurses data state, because as mentioned before, the fips ID must be in a column named "state". This was frustrating to figure out, but now that I know how it works it should be easier in the future. * Get data 1;
filename test1234 url "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-10-05/nurses.csv";
proc import out=nurses datafile=test1234 dbms=csv replace;
guessingrows = max;
run;
data nurses;
set nurses;
where Year = 2020;
run;
* Renaming state column to something else to try to merge later with another dataset that has a STATE column too;
data nurses2;
set nurses (rename=(State=StateNameFull));
run; PROC SQL;
create table work.nurse_fips3 as
select t1.StateNameFull,
t1.'Total Employed RN'n,
t2.STATENAME,
T2.STATE,
T2.STATECODE
from nurses2 t1
left join sashelp.us_data t2 on (t1.StateNameFull = t2.statename);
quit; data states;
set maps.states;
if state ^in(2,15,72);
x = -x * 45/atan(1);
y = y * 45/atan(1);
run;
data plot_data;
set maps.uscenter;
if state ^in(2,15,72) and ocean^='Y';
long = -long;
statename = fipstate(state);
run;
title 'Total Employed RN by State, 2020';
proc sgmap mapdata=states
maprespdata=work.nurse_fips3
plotdata=plot_data;
esrimap
url='http://services.arcgisonline.com/arcgis/rest/services/
Canvas/World_Light_Gray_Base';
choromap 'Total Employed RN'n / mapid=state density=2
name='choro';
text x=long y=lat text=statename /
textattrs=(size=6pt);
gradlegend 'choro' / title='Total Employed RN, 2020'
extractscale;
run;
quit;
... View more
10-07-2021
12:24 PM
Long response, but I tried to answer each of your questions! Here is the output of proc contents
data = maps.states;
run; When I ran PROC PRINT to look for the fips variable, I first checked the dataset in the example I linked to see how it was done there. The "state" variable is the FIPS id of the state, but the column title is state instead of fips. I assumed that meant that SAS required the FIPS number to plot it. Below is the output of proc print
data=sashelp.us_data;
run; I ran the code below to confirm that my newly created table had the same format as the data used in the example. proc print
data=work.nurse_fips;
run; Then, in the plot procedure, I replaced maprespdata= to my new data instead of the population data set choromap to my data instead of the population data, instead of the population data in the example mapid= my new fips variable, which was previously called "state" in the example The data I am working with is only state level, there is no county data. Here is: proc contents
data = maps.uscenter;
run; Here are the exact errors/logs
... View more
10-07-2021
03:00 AM
I am trying to take the Tidy Tuesday data for this week and turn it into a choropleth map of the US. I think everything works correctly up until PROC SGMAP at which time I get this error: ERROR: Variable FIPS not found I checked via PROC PRINT to make sure fips is there, and it seems to be, so I am unsure of what to do next. * Get data 1;
filename test1234 url "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-10-05/nurses.csv";
proc import out=nurses datafile=test1234 dbms=csv replace;
guessingrows = max;
run;
* Filter to 2020 only;
data nurses;
set nurses;
where Year = 2020;
run;
* Solution here to get FIPS data with state name
https://communities.sas.com/t5/SAS-Programming/Convert-Full-State-Name-to-Abbreviation/td-p/738723;
data fips;
fmtname='$FIPS';
length fips 8 label $2 start $20 ;
do fips=1 to 95;
start=fipnamel(fips);
if start ne 'Invalid Code' then do;
label=fipstate(fips);
output;
end;
end;
run;
proc format cntlin=fips ; run;
data test;
set fips;
statecode=put(start,$fips.);
run;
* Merge nurses data with fips data;
proc sql;
create table work.nurse_fips as
select t1.State,
t1.'Total Employed RN'n,
t2.fips,
t2.start
from nurses t1
left join work.fips t2 on (t1.State = t2.start);
quit;
* Confirming output to make sure fips column exists due to error;
proc print
data=work.nurse_fips;
run;
* Plot template
* https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/grmapref/n01dh0rm86m35un12p8j7dz9ugog.htm;
data states;
set maps.states;
if state ^in(2,15,72);
x = -x * 45/atan(1);
y = y * 45/atan(1);
run;
data plot_data;
set maps.uscenter;
if state ^in(2,15,72) and ocean^='Y';
long = -long;
statename = fipstate(state);
run;
* New error is: "ERROR: Variable FIPS not found";
proc sgmap mapdata=states
maprespdata=work.nurse_fips
plotdata=plot_data;
esrimap
url='http://services.arcgisonline.com/arcgis/rest/services/
Canvas/World_Light_Gray_Base';
choromap 'Total Employed RN'n / mapid=fips density=2
name='choro';
text x=long y=lat text=statename /
textattrs=(size=6pt);
gradlegend 'choro' / title='Total Employed RN, 2020'
extractscale;
run;
quit;
... View more
10-06-2021
04:42 PM
I am using SAS studio, so there is no install. Can I get a MAPS library while using SAS studio?
... View more
10-06-2021
04:41 PM
I am using SAS studio so I have nothing installed.
... View more
10-06-2021
12:15 AM
I am trying to make a choropleth map with the Tidy Tuesday data for this week, but I am getting an error: ERROR: File MAPS.US.DATA does not exist. Below is my code. What do I need to change to make MAPS.US.DATA exist? I am using SAS studio if it is relevant. * Get data 1;
filename test1234 url "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-10-05/nurses.csv";
proc import out=nurses datafile=test1234 dbms=csv replace;
guessingrows = max;
run;
* Map attempt 1
Error "ERROR: File MAPS.US.DATA does not exist."
https://v8doc.sas.com/sashtml/gref/z9n04-ex.htm;
libname reflib 'SAS-data-library';
libname maps 'SAS-data-library';
proc gmap map=maps.us data=nurses;
where Year = 2020;
id State;
choro 'Total Employed RN'n / coutline=gray;
run;
quit; This is a reference of what I am attempting to make a simple version of:
... View more
09-30-2021
12:06 AM
I am trying to make an area plot in SAS. At the bottom is an image of the plot in question from R. The data I am using comes from Tidy Tuesday for this week - https://github.com/rfordatascience/tidytuesday/blob/master/data/2021/2021-09-28/readme.md Here is my reproducible code to get the data to the point of plotting, but the plot is not at all what I had intended and I am having trouble finding resources about SAS area plots so I am hoping someone here can assist. I would prefer proc sgplot, but any way to accomplish it is fine as long as it is not too complicated. * Get data 1;
filename test1234 url "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-28/papers.csv";
proc import out=papers datafile=test1234 dbms=csv replace;
guessingrows = max;
run;
* Get data 2;
filename test1234 url "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-28/programs.csv";
proc import out=programs datafile=test1234 dbms=csv replace;
guessingrows = max;
run;
* Get data 3;
filename test1234 url "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-28/paper_authors.csv";
proc import out=paper_authors datafile=test1234 dbms=csv replace;
guessingrows = max;
run;
* Get data 4;
filename test1234 url "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-28/paper_programs.csv";
proc import out=paper_programs datafile=test1234 dbms=csv replace;
guessingrows = max;
run;
* Sort before merging;
proc sort data=papers;
by paper;
run;
proc sort data = paper_authors;
by paper;
run;
proc sort data = paper_programs;
by paper;
run;
* Merge;
data joined_df;
merge papers paper_authors paper_programs;
by paper;
run;
* Sort again;
proc sort data=joined_df;
by program;
run;
proc sort data = programs;
by program;
run;
* Merge 2;
data joined_df_2;
merge joined_df programs;
by program;
run;
* Count;
proc freq data=joined_df_2;
tables year * program_category / out=summary;
run;
* Checking how the filter works and data looks;
proc print data=summary;
where percent > 0 AND program_category ^= 'NA';
run; Below is my attempt to plot it, but it is not at all what I intended. * Plot attempt;
proc gplot data=summary;
where percent > 0 AND program_category ^= 'NA';
plot count*year / overlay
areas = 3;
run;
quit; Here is an example of the plot I want to make. This is from R.
... View more
09-23-2021
01:43 AM
I am trying to change my x axis ticks, but every time I do it either lumps the data into the new ticks or drops all data on years with no tick instead of just changing the labels. What is the correct way to change axis ticks without combining or losing my data? This is what I have: filename test1234 url "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-21/nominees.csv";
proc import out=nominees datafile=test1234 dbms=csv replace;
guessingrows = max;
run;
proc sgplot data=nominees;
where year > 1000;
vbar year / group = type stat=freq groupdisplay=stack;
xaxis label = "Year" values=(1960 to 2020 by 5);
run; This is what I want the x axis and plot layout to look like (image from R, but I cannot do it in SAS to show is the issue):
... View more
09-22-2021
06:25 PM
I am having trouble making a frequency table with multiple variables. My goal is to have a basic frequency table to make a stacked bar chart of nominees and winners by year. Here is the R code I want to emulate in SAS. library(tidyverse)
nominees <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-21/nominees.csv')
nominees %>%
count(year, type) This is what the output looks like. Winner and nominee counts by year. # A tibble: 122 x 3
year type n
<dbl> <chr> <int>
1 1957 Nominee 16
2 1957 Winner 4
3 1958 Nominee 32
4 1958 Winner 8
5 1959 Nominee 16
6 1959 Winner 4
7 1961 Nominee 7
8 1961 Winner 3
9 1962 Nominee 8
10 1962 Winner 2 Here is what I have in SAS so far: * Get data;
filename test1234 temp;
proc http
url="https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-21/nominees.csv"
method="GET"
out=test1234;
run;
proc import out=nominees datafile=test1234 dbms=csv replace;
guessingrows = max;
run;
* Count;
proc freq data=nominees;
tables year * type;
run; This mostly accomplishes what I want, but there is way too much in the output. Here is a picture of the output with areas I have no need for crossed out. How can I get the SAS output to be much more minimal than it currently is? Or is SAS able to make a stacked bar chart by year with nominee and winner from this frequency table?
... View more
09-08-2021
06:19 PM
Thanks for the explanation, that helps me understand how missing values work. I tried running the code without any filtering for the missing values, but I still get an error. Specifically: ERROR: The VBOX variable must be numeric. I assumed this error was from the missing values. Perhaps there is some other error that is causing it? EDIT: I found the solution after the responder edited his post. This line is the correct solution "Second, SAS will not let you change the type of a variable from character to numeric or vice versa. To get a numeric value, assuming duration is character, you should create a new variable as you can't use the old one." As someone coming from R you can change one back and forth, so this was not intuitive for me in SAS. Here is the corrected code that works if anyone stumbles on this in the future: * Get data 1;
filename test1234 temp;
proc http
url="https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-07/pit_stops.csv"
method="GET"
out=test1234;
run;
proc import out=pit_stops datafile=test1234 dbms=csv replace;
guessingrows = max;
run;
* Get data 2;
filename racecsv temp;
proc http
url="https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-07/races.csv"
method="GET"
out=racecsv;
run;
proc import out=races datafile=racecsv dbms=csv replace;
guessingrows = max;
run;
* Merge/join the data;
* Sorting is required in SAS to merge correctly;
proc sort data=pit_stops;
by raceId;
run;
proc sort data=races;
by raceId;
run;
data pit_stopdf;
merge pit_stops races;
by raceId;
run;
* Drop all(?) blank rows;
data pit_stopdf;
set pit_stopdf;
where year > 2010;
run;
* Change duration to numeric;
data pit_stopdf;
set pit_stopdf;
duration_numeric = input(duration, comma8.);
run;
* Plot the data;
proc sgplot data=pit_stopdf;
vbox duration_numeric / category=year;
title 'Formula1 pit stop duration';
xaxis label = "Year";
yaxis label = "Duration";
run;
title;
... View more
09-08-2021
05:16 PM
I am working through Tidy Tuesday data to learn how to do stuff I know in R in SAS. I am having trouble converting values from characters to numeric. After doing the conversion, there are still periods in the data. I tried filtering to values greater than 0 or filtering where the values are not ".", but neither worked. How can I filter these issue values out so I can make a boxplot? Also, if you have any efficiency tips to make my code smoother/better I would also welcome those. Below code is should be reproducible. * Get data 1;
filename test1234 temp;
proc http
url="https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-07/pit_stops.csv"
method="GET"
out=test1234;
run;
proc import out=pit_stops datafile=test1234 dbms=csv replace;
guessingrows = max;
run;
* Get data 2;
filename racecsv temp;
proc http
url="https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-07/races.csv"
method="GET"
out=racecsv;
run;
proc import out=races datafile=racecsv dbms=csv replace;
guessingrows = max;
run;
* Merge/join the data;
* Sorting is required in SAS to merge correctly;
proc sort data=pit_stops;
by raceId;
run;
proc sort data=races;
by raceId;
run;
data pit_stopdf;
merge pit_stops races;
by raceId;
run;
* Drop all(?) blank rows;
data pit_stopdf;
set pit_stopdf;
where year > 2010;
run;
* Change duration to numeric;
* Something is still wrong, there are values with just . in the duration data;
data pit_stopdf;
set pit_stopdf;
duration = input(duration, comma5.3);
run;
* Attempt 1 to remove "." from duration;
* This did not work;
data pit_stopsdf;
set pit_stopsdf;
where duration > 0;
run;
* Attempt 2 to remove ".";
* Also did not work;
data pit_stopsdf;
set pit_stopsdf;
where duration ^= ".";
run;
* Plot the data, currently errors;
proc sgplot data=pit_stopdf;
vbox duration / category=year;
title 'Formula1 pit stop duration';
xaxis label = "Year";
yaxis label = "Duration";
run;
title;
... View more
09-01-2021
11:04 AM
I usually mess with this Tidy Tuesday data in R, but I want to try it in SAS this week as practice. I am having some trouble importing it into SAS Studio. Data page - https://github.com/rfordatascience/tidytuesday/blob/master/data/2021/2021-08-31/readme.md Direct link to csv - https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-08-31/bird_baths.csv I have a few questions. Below is my code. 1 - Why am I getting an error trying to import the data directly from my PC after downloading the file? 2 - I uploaded the file directly to SAS Studio and was able to interact with it. 3 - I tried to import the data directly from the web CSV and I do not get an error, but when I try to print it to see if it exists I do get an error. How do I know if it correctly downloaded and what do I need to change in the code to be able to print the data? * 1 ERROR: Physical file does not exist, /pbr/biconfig/940/Lev1/SASApp/C:/Users/Jason/Documents/R/Tidy Tuesday/BirdBaths/birdbaths.csv.
ERROR: Import unsuccessful. See SAS Log for details.;
proc import file="C:/Users/Jason/Documents/R/Tidy Tuesday/BirdBaths/birdbaths.csv"
out=birdbathsPC
dbms=csv;
run;
* 2
* THIS WORKS
* Need to figure out how to do it without uploading to SAS;
proc import file="/home/u59127222/SP4R/birdbaths.csv"
out=birdbaths
dbms=csv;
run;
proc print data=birdbaths (obs=6);
run;
* 3
* Testing downloading directly from the web
* https://github.com/rfordatascience/tidytuesday/blob/master/data/2021/2021-08-31/readme.md
* https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-08-31/bird_baths.csv;
* https://blogs.sas.com/content/sasdummy/2017/05/07/download-convert-csv-files/;
filename probly temp;
proc http
url="https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-08-31/bird_baths.csv"
method="GET"
out=probly;
run;
* ERROR: File WORK.PROBLY.DATA does not exist;
proc print data=WORK.probly (obs=6);
run; * ERROR: File WORK.PROBLY.DATA does not exist; proc print data=probly (obs=6); run;
... View more