BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tipscode
Obsidian | Level 7

I need to output percentages in SAS v9.4 that fall above a certain threshold from two categorical variables in a dataset - state and year. I first produced the following table that has all of the data:

 

    proc freq data=example;
    tables statefip * yr/ nopercent nocol norow missing out=test;
    run;

 

Now I would like to calculate a percentage from year to year by state. There will be 10 years worth of data. For example, ratio1 = (Count of statefip variable from 2013)/(Count of statefip variable from 2012). If ratio1 is greater than a threshold (e.g., 80%), I would keep the data. If not, it would be coded as "missing".


Is that possible to do in SAS?

 

In addition, is there a way to do this in a macro?

 

Any suggestions is greatly appreciated!

1 ACCEPTED SOLUTION

Accepted Solutions
thomp7050
Pyrite | Level 9

Not sure if this solves your problem, but here's some code that dynamically divides all present years based on your previously stated criteria.

 

data have;
infile cards missover;
length state $ 100 _2012 _2013 _2014 _2015 8.; 
input state _2012 _2013 _2014 _2015;
cards; 
Alabama 9653 10264 9557 11152
Alaska 6273 6476 6423 6576
Arizona 24199 21635 22835 20865
;
run;

PROC SQL;
CREATE TABLE ALLYEARS AS
SELECT MONOTONIC() AS ROWID, * FROM DICTIONARY.COLUMNS WHERE MEMNAME = 'HAVE' AND NAME LIKE '%20%';
QUIT;

PROC SQL;
SELECT TRIM(ALLYEARS.NAME) INTO: ALLNAMES SEPARATED BY "," FROM ALLYEARS ;
QUIT;

PROC SQL;
SELECT TRIM(ALLYEARS.NAME)||"/"||TRIM(ALLYEARS2.NAME)||" AS  "||TRIM(ALLYEARS.NAME)||"_"||TRIM(ALLYEARS2.NAME)||"  " 
INTO: ALLDIVS SEPARATED BY "," FROM ALLYEARS
, ALLYEARS AS ALLYEARS2
WHERE ALLYEARS.ROWID > ALLYEARS2.ROWID;
QUIT;

%PUT &ALLDIVS;

PROC SQL;
CREATE TABLE PROPORTION AS
SELECT state, &ALLNAMES, &ALLDIVS
FROM HAVE;
QUIT;

View solution in original post

12 REPLIES 12
thomp7050
Pyrite | Level 9

Depending on how your data were structured:

 

PROC SQL;
CREATE TABLE PROPORTION AS
SELECT STATENAME, SUM(DISTINCT CASE WHEN STATEFLIP2013 = 1 THEN 1 ELSE 0 END)/SUM(DISTINCT CASE WHEN STATEFLIP2012 = 1 THEN 1 ELSE 0 END) 
FROM MYTABLE
GROUP BY STATENAME;
QUIT;
tipscode
Obsidian | Level 7

Apologies - I should have clarified with an illustrative table.

 

I currently am using proc tabulate to generate the following:

 

STATEFIP2012201320142015
Alabama 9,653 10,264 9,557 11,152
Alaska 6,273 6,476 6,423 6,576
Arizona 24,199 21,635 22,835 20,865

 

I would like to generate the following table:

STATEFIP20122013201420152013/20122014/20132015/2014
Alabama 9,653 10,264 9,557 11,152106.33%93.11%116.69%
Alaska 6,273 6,476 6,423 6,576103.24%99.18%102.38%
Arizona 24,199 21,635 22,835 20,86589.40%105.55%91.37%
thomp7050
Pyrite | Level 9

Thanks for the specification!

 

My solution, for your review:

 

data have;
infile cards missover;
length state $ 100 _2012 _2013 _2014 _2015 8.; 
input state _2012 _2013 _2014 _2015;
cards; 
Alabama 9653 10264 9557 11152
Alaska 6273 6476 6423 6576
Arizona 24199 21635 22835 20865
;
run;

PROC SQL;
CREATE TABLE PROPORTION AS
SELECT state, _2012, _2013, _2014, _2015
, _2013/_2012  "2013/2012" FORMAT 8.4 
, _2014/_2013  "2014/2013" FORMAT 8.4 
, _2015/_2014  "2015/2014" FORMAT 8.4 
FROM HAVE;
QUIT;

Patrick

Reeza
Super User

The other option, if you want this dynamic would be to use PROC REPORT with a compute column.

 

PS. I'm moving this thread to reporting forum as it more pertains to reporting structure.

tipscode
Obsidian | Level 7

Thank you both for your timely suggestions!

 

I would like to do this dynamically for future reports. Here is the code that I have thus far for your comments and suggestions:


1) Creating a macro variable across years:

    %LET CYR1 = %EVAL(2015); */For later iterations, one only have to update this year
    %LET CYR2 = %EVAL(&CYR1-1);
    %LET CYR3 = %EVAL(&CYR1-2);
    %LET CYR4 = %EVAL(&CYR1-3); 

 

2) I needed to concatenate and then subset two large datasets that will have new variables by the years:

     DATA test;
     SET dataset1 dataset2;
     state=stnamel(statefip);
     IF yr IN (&CYR1 &CYR2 &CYR3 &CYR4);
     YR1 = yr IN (&CYR1); */This creates counts per year, but I also see that it creates a binary variable where YR1 can be 0 or 1, I      need to figure out how to take out the values where YR1=0
     YR2 = yr IN (&CYR2);
     YR3 = yr IN (&CYR3);
     YR4 = yr IN (&CYR4);

     RUN;

 

3) Right now I am thinking about using Proc Tabulate to calculate the proportions one column at a time. The code below is in dev because I need to figure out how to take out values where YR1=0, YR2=0, YR3=0, YR4=0. Can I use PCTN in the proc tabulate below? I don't think I can use PCTNSUM.

 

    PROC TABULATE DATA = test;
    CLASS state;
    VAR YR3 YR4;
    TABLE state, YR3 YR4 YR3*PCTN<YR4>='Ratio 2013/2012'*f=8.2;
    Where YR3='1' and YR4='1'; */Here is the logic but probably won't be a line of code
    RUN;

 

I would like to successfully execute 3), then set a conditional statement that if any ratio calculated, (e.g., ratio 2013/2012) is lt 70%, then drop the value or convert it to missing "."

 

Thank you again.

Reeza
Super User

I'm confused with the direction you're taking here. 

Why are you using macro variables? You should be designing this to not need the years in the data at all. 

 

 

tipscode
Obsidian | Level 7

My main challenge has been to create counts of each year for the states, then generating new variables that has the proportion of the counts for states. So far I have not found a way to do this in a data step instead of a proc means/freq/tabulate.

Reeza
Super User

What does your original data look like to start off with? So far you've only shown the aggregate data.

thomp7050
Pyrite | Level 9

Not sure if this solves your problem, but here's some code that dynamically divides all present years based on your previously stated criteria.

 

data have;
infile cards missover;
length state $ 100 _2012 _2013 _2014 _2015 8.; 
input state _2012 _2013 _2014 _2015;
cards; 
Alabama 9653 10264 9557 11152
Alaska 6273 6476 6423 6576
Arizona 24199 21635 22835 20865
;
run;

PROC SQL;
CREATE TABLE ALLYEARS AS
SELECT MONOTONIC() AS ROWID, * FROM DICTIONARY.COLUMNS WHERE MEMNAME = 'HAVE' AND NAME LIKE '%20%';
QUIT;

PROC SQL;
SELECT TRIM(ALLYEARS.NAME) INTO: ALLNAMES SEPARATED BY "," FROM ALLYEARS ;
QUIT;

PROC SQL;
SELECT TRIM(ALLYEARS.NAME)||"/"||TRIM(ALLYEARS2.NAME)||" AS  "||TRIM(ALLYEARS.NAME)||"_"||TRIM(ALLYEARS2.NAME)||"  " 
INTO: ALLDIVS SEPARATED BY "," FROM ALLYEARS
, ALLYEARS AS ALLYEARS2
WHERE ALLYEARS.ROWID > ALLYEARS2.ROWID;
QUIT;

%PUT &ALLDIVS;

PROC SQL;
CREATE TABLE PROPORTION AS
SELECT state, &ALLNAMES, &ALLDIVS
FROM HAVE;
QUIT;
tipscode
Obsidian | Level 7

Thank you so much! This works.

 

The original datasets have two variables, statefip (the full name of the US state that will be converted into abbreviated two letter states) as well as yr as a 8 digit numeric variable, with values including 2012, 2013, 2014, 2015. 

 

Below is the data that can be derived from these original variable via proc freq:

Capture.PNG 

Can proc sql follow an output dataset generated by proc tabulate to calculate the proportions without using cards?  Is this not possible because state is a character variable?

 

Probably not ideal, but I can also export this into Excel, calculate the proportions, and then import the Excel as a SAS dataset.

Reeza
Super User

Cards and the first step is ONLY to generate a sample data set to work with. Replace HAVE with the name of your dataset. 

thomp7050
Pyrite | Level 9

Right, what Reeza said.  Regardless of how you process your initial dataset (e.g. proc freq, with/without ODS output saved, or calculating them using proc sql, import via excel, etc.) then you can run this procedure on your resulting dataset. 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 3170 views
  • 3 likes
  • 3 in conversation