turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- ODS and Base Reporting
- /
- Generating percentages or ratios from two categori...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

04-25-2017 02:07 PM

I need to output percentages in SAS v9.4 that fall above a certain threshold from two categorical variables in a dataset - state and year. I first produced the following table that has all of the data:

proc freq data=example;

tables statefip * yr/ nopercent nocol norow missing out=test;

run;

Now I would like to calculate a percentage from year to year by state. There will be 10 years worth of data. For example, ratio1 = (Count of statefip variable from 2013)/(Count of statefip variable from 2012). If ratio1 is greater than a threshold (e.g., 80%), I would keep the data. If not, it would be coded as "missing".

Is that possible to do in SAS?

In addition, is there a way to do this in a macro?

Any suggestions is greatly appreciated!

Accepted Solutions

Solution

04-27-2017
07:18 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to tipscode

04-26-2017 03:37 PM

Not sure if this solves your problem, but here's some code that dynamically divides all present years based on your previously stated criteria.

```
data have;
infile cards missover;
length state $ 100 _2012 _2013 _2014 _2015 8.;
input state _2012 _2013 _2014 _2015;
cards;
Alabama 9653 10264 9557 11152
Alaska 6273 6476 6423 6576
Arizona 24199 21635 22835 20865
;
run;
PROC SQL;
CREATE TABLE ALLYEARS AS
SELECT MONOTONIC() AS ROWID, * FROM DICTIONARY.COLUMNS WHERE MEMNAME = 'HAVE' AND NAME LIKE '%20%';
QUIT;
PROC SQL;
SELECT TRIM(ALLYEARS.NAME) INTO: ALLNAMES SEPARATED BY "," FROM ALLYEARS ;
QUIT;
PROC SQL;
SELECT TRIM(ALLYEARS.NAME)||"/"||TRIM(ALLYEARS2.NAME)||" AS "||TRIM(ALLYEARS.NAME)||"_"||TRIM(ALLYEARS2.NAME)||" "
INTO: ALLDIVS SEPARATED BY "," FROM ALLYEARS
, ALLYEARS AS ALLYEARS2
WHERE ALLYEARS.ROWID > ALLYEARS2.ROWID;
QUIT;
%PUT &ALLDIVS;
PROC SQL;
CREATE TABLE PROPORTION AS
SELECT state, &ALLNAMES, &ALLDIVS
FROM HAVE;
QUIT;
```

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to tipscode

04-25-2017 02:10 PM

Depending on how your data were structured:

```
PROC SQL;
CREATE TABLE PROPORTION AS
SELECT STATENAME, SUM(DISTINCT CASE WHEN STATEFLIP2013 = 1 THEN 1 ELSE 0 END)/SUM(DISTINCT CASE WHEN STATEFLIP2012 = 1 THEN 1 ELSE 0 END)
FROM MYTABLE
GROUP BY STATENAME;
QUIT;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to thomp7050

04-25-2017 03:23 PM

Apologies - I should have clarified with an illustrative table.

I currently am using proc tabulate to generate the following:

STATEFIP | 2012 | 2013 | 2014 | 2015 |

Alabama | 9,653 | 10,264 | 9,557 | 11,152 |

Alaska | 6,273 | 6,476 | 6,423 | 6,576 |

Arizona | 24,199 | 21,635 | 22,835 | 20,865 |

I would like to generate the following table:

STATEFIP | 2012 | 2013 | 2014 | 2015 | 2013/2012 | 2014/2013 | 2015/2014 |

Alabama | 9,653 | 10,264 | 9,557 | 11,152 | 106.33% | 93.11% | 116.69% |

Alaska | 6,273 | 6,476 | 6,423 | 6,576 | 103.24% | 99.18% | 102.38% |

Arizona | 24,199 | 21,635 | 22,835 | 20,865 | 89.40% | 105.55% | 91.37% |

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to tipscode

04-25-2017 03:30 PM

Thanks for the specification!

My solution, for your review:

```
data have;
infile cards missover;
length state $ 100 _2012 _2013 _2014 _2015 8.;
input state _2012 _2013 _2014 _2015;
cards;
Alabama 9653 10264 9557 11152
Alaska 6273 6476 6423 6576
Arizona 24199 21635 22835 20865
;
run;
PROC SQL;
CREATE TABLE PROPORTION AS
SELECT state, _2012, _2013, _2014, _2015
, _2013/_2012 "2013/2012" FORMAT 8.4
, _2014/_2013 "2014/2013" FORMAT 8.4
, _2015/_2014 "2015/2014" FORMAT 8.4
FROM HAVE;
QUIT;
```

Patrick

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to tipscode

04-25-2017 04:27 PM

The other option, if you want this dynamic would be to use PROC REPORT with a compute column.

PS. I'm moving this thread to reporting forum as it more pertains to reporting structure.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to tipscode

04-26-2017 09:48 AM

Thank you both for your timely suggestions!

I would like to do this dynamically for future reports. Here is the code that I have thus far for your comments and suggestions:

1) Creating a macro variable across years:

%LET CYR1 = %EVAL(2015); */For later iterations, one only have to update this year

%LET CYR2 = %EVAL(&CYR1-1);

%LET CYR3 = %EVAL(&CYR1-2);

%LET CYR4 = %EVAL(&CYR1-3);

2) I needed to concatenate and then subset two large datasets that will have new variables by the years:

DATA test;

SET dataset1 dataset2;

state=stnamel(statefip);

IF yr IN (&CYR1 &CYR2 &CYR3 &CYR4);

YR1 = yr IN (&CYR1); */This creates counts per year, but I also see that it creates a binary variable where YR1 can be 0 or 1, I need to figure out how to take out the values where YR1=0

YR2 = yr IN (&CYR2);

YR3 = yr IN (&CYR3);

YR4 = yr IN (&CYR4);

RUN;

3) Right now I am thinking about using Proc Tabulate to calculate the proportions one column at a time. The code below is in dev because I need to figure out how to take out values where YR1=0, YR2=0, YR3=0, YR4=0. Can I use PCTN in the proc tabulate below? I don't think I can use PCTNSUM.

PROC TABULATE DATA = test;

CLASS state;

VAR YR3 YR4;

TABLE state, YR3 YR4 YR3*PCTN<YR4>='Ratio 2013/2012'*f=8.2;

Where YR3='1' and YR4='1'; */Here is the logic but probably won't be a line of code

RUN;

I would like to successfully execute 3), then set a conditional statement that if any ratio calculated, (e.g., ratio 2013/2012) is lt 70%, then drop the value or convert it to missing "."

Thank you again.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to tipscode

04-26-2017 01:38 PM

I'm confused with the direction you're taking here.

Why are you using macro variables? You should be designing this to not need the years in the data at all.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Reeza

04-26-2017 03:09 PM

My main challenge has been to create counts of each year for the states, then generating new variables that has the proportion of the counts for states. So far I have not found a way to do this in a data step instead of a proc means/freq/tabulate.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to tipscode

04-26-2017 03:26 PM

What does your original data look like to start off with? So far you've only shown the aggregate data.

Solution

04-27-2017
07:18 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to tipscode

04-26-2017 03:37 PM

Not sure if this solves your problem, but here's some code that dynamically divides all present years based on your previously stated criteria.

```
data have;
infile cards missover;
length state $ 100 _2012 _2013 _2014 _2015 8.;
input state _2012 _2013 _2014 _2015;
cards;
Alabama 9653 10264 9557 11152
Alaska 6273 6476 6423 6576
Arizona 24199 21635 22835 20865
;
run;
PROC SQL;
CREATE TABLE ALLYEARS AS
SELECT MONOTONIC() AS ROWID, * FROM DICTIONARY.COLUMNS WHERE MEMNAME = 'HAVE' AND NAME LIKE '%20%';
QUIT;
PROC SQL;
SELECT TRIM(ALLYEARS.NAME) INTO: ALLNAMES SEPARATED BY "," FROM ALLYEARS ;
QUIT;
PROC SQL;
SELECT TRIM(ALLYEARS.NAME)||"/"||TRIM(ALLYEARS2.NAME)||" AS "||TRIM(ALLYEARS.NAME)||"_"||TRIM(ALLYEARS2.NAME)||" "
INTO: ALLDIVS SEPARATED BY "," FROM ALLYEARS
, ALLYEARS AS ALLYEARS2
WHERE ALLYEARS.ROWID > ALLYEARS2.ROWID;
QUIT;
%PUT &ALLDIVS;
PROC SQL;
CREATE TABLE PROPORTION AS
SELECT state, &ALLNAMES, &ALLDIVS
FROM HAVE;
QUIT;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to thomp7050

04-27-2017 07:23 AM - edited 04-27-2017 07:58 AM

Thank you so much! This works.

The original datasets have two variables, statefip (the full name of the US state that will be converted into abbreviated two letter states) as well as yr as a 8 digit numeric variable, with values including 2012, 2013, 2014, 2015.

Below is the data that can be derived from these original variable via proc freq:

Can proc sql follow an output dataset generated by proc tabulate to calculate the proportions without using cards? Is this not possible because state is a character variable?

Probably not ideal, but I can also export this into Excel, calculate the proportions, and then import the Excel as a SAS dataset.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to tipscode

04-27-2017 08:03 AM

Cards and the first step is ONLY to generate a sample data set to work with. Replace HAVE with the name of your dataset.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to tipscode

04-27-2017 09:30 AM

Right, what Reeza said. Regardless of how you process your initial dataset (e.g. proc freq, with/without ODS output saved, or calculating them using proc sql, import via excel, etc.) then you can run this procedure on your resulting dataset.