BookmarkSubscribeRSS Feed
sfletc15
Calcite | Level 5

Hey guys. New here and I have no clue if I'm asking this in the right place but I need help. Bear with me as I try to explain this.

 

So I'm working on a research project to analyze and rank the most contaminated areas in the state of Georgia. I managed to make a list of all the counties in Georgia that contain 1 or more toxic sites by compiling the data found in Georgia's EPA Hazardous Site Inventory. From this, I made a Top 5 list with which I will need help. So my goal is to analyze and rank which pollutants are most commonly found in the Top 5 county list. I would like to make a Top 10 list of pollutants from this analysis. So to do that, I would have to go through each individual site for all 5 counties and make a list of pollutants. For example, Site A for Cobb County might have 12 pollutants, while Site B has only 5. So Cobb will have 17 total and then I'd move on to the next county. However each pollutant in each site is further categorized by soil or groundwater. So it can be found in soil, groundwater, or both. 

 

Link to Hazardous Site Inventory: http://epd.georgia.gov/sites/epd.georgia.gov/files/2017HSI/10145.pdf 

 

My professor introduced me to someone in the Stats Department who was more knowledgeable in SAS. He was able to draw out a quick sketch of what the table should probably look like. Then he further went on to say that SAS would be the perfect program to crank out this kind of table and using proc freq is the way to go. He said that SAS has the ability to create both of these tables in the sketch. I'm just confused on what the code should be for both of them. I know proc.freq is used for the second one but I think it would need some tweaking that I'm not familiar with. For the code to make the first table to include all the variables, I have no clue on what the code would be.

20180331_171658.jpg

 

Edit: Thought it would be more clear to see, but Pol = Pollutant. S and W means Soil and Water (Groundwater), respectively. For S and W, he used 0 and 1 to indicate whether the pollutant was found in soil or groundwater, just to be clear. Kinda like how 0 and 1 is used to indicate Gender.

 

So I'm assuming I'll have to make an excel spreadsheet containing the sites, counties, pollutants, and then upload this into the SAS library to create the tables, correct? 

 

Sorry if this sounds all over the place. I'm just really confused on how to get started on this. Any help would be REALLY appreciated.

2 REPLIES 2
Miracle
Barite | Level 11

Hi there.

Seems like you have been advised to do a 4-way table with Proc freq.

This The FREQ Procedure will be useful to you.

You can use import wizards from the File menu in SAS if you are not confident to write code depending on how your input data file is like  (.xls  or .csv or .txt or stata or spss or other formats) to read your data into SAS. However, csv  is preferred because it will generate the code for you which you can reuse after loading the data into SAS.

Good luck.

 

 

 

PGStats
Opal | Level 21

Sounds like your top table could be compiled with something like this:

 

proc sql;
create table polCounts as
select
    County,
    Site,
    Pollutant,
    sum(matrix="Soil") as nbSoil,
    sum(matrix="Groundwater") as nbWater
from myData
group by County, Site, Pollutant;
quit; 

where myTable would be your input data table with columns County, Site, Pollutant, and Matrix ("Soil" or "Groundwater")

 

 

replace sum(matrix="Soil") as nbSoil with sum(matrix="Soil") > 0 as nbSoil if you only want zeros and ones.

PG

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 831 views
  • 0 likes
  • 3 in conversation