<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Creating a prevalence rate variable in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Creating-a-prevalence-rate-variable/m-p/680240#M205540</link>
    <description>&lt;P&gt;Hello all! I am a new user to SAS and was trying to manipulate a dataset to produce tables with prevalence rates.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have several variables like disease status (1,0), age (binned into categories), location (various sites), and year (binned by calender year). I was trying to create a single variable that would represent the prevalence rate so I could easily produce 2x2 tables with prevalence rate by various other vars (year, age, location etc).&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In my data step I was trying to do this the following way:&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;data prevalence;&lt;BR /&gt;set work.import;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;length Hivstatus $ 10;&lt;BR /&gt;if hivpos =1 then Hivstatus = "Positive";&lt;BR /&gt;if hivpos =0 then Hivstatus = "Negtive";&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;if hivpos =1 then Hivpositive = "HIV Positive";&lt;BR /&gt;if hivpos =0 then Hivnegative = "HIV Negative";&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;if hivpos in (0,1) then HIVtotal = "All Cases";&lt;BR /&gt;if hivpos = 1 then HIVcases = "HIV Positive Cases";&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;HIVprevelence = HIVcases/HIVtotal;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It ends up producing a variable "HIVprevelence" where all the data are listed as "missing". Im not sure if there the code is incorrect or if I am thinking about doing this the wrong way. It very difficult to get prevalence rates with multiple variables using proc freq.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any help appreciated!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data prevalence;
set work.import;

length Hivstatus $ 15;
if hivpos =1 then Hivstatus = "Positive";
if hivpos =0 then Hivstatus = "Negtive";

if hivpos =1 then Hivpositive = "HIV Positive";
if hivpos =0 then Hivnegative = "HIV Negative";

if hivpos in (0,1) then HIVtotal = "All Cases";
if hivpos = 1 then HIVcases = "HIV Positive Cases";

HIVprevelence = HIVcases/HIVtotal; 

run; &lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Sat, 29 Aug 2020 19:08:22 GMT</pubDate>
    <dc:creator>u49592411</dc:creator>
    <dc:date>2020-08-29T19:08:22Z</dc:date>
    <item>
      <title>Creating a prevalence rate variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-a-prevalence-rate-variable/m-p/680240#M205540</link>
      <description>&lt;P&gt;Hello all! I am a new user to SAS and was trying to manipulate a dataset to produce tables with prevalence rates.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have several variables like disease status (1,0), age (binned into categories), location (various sites), and year (binned by calender year). I was trying to create a single variable that would represent the prevalence rate so I could easily produce 2x2 tables with prevalence rate by various other vars (year, age, location etc).&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In my data step I was trying to do this the following way:&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;data prevalence;&lt;BR /&gt;set work.import;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;length Hivstatus $ 10;&lt;BR /&gt;if hivpos =1 then Hivstatus = "Positive";&lt;BR /&gt;if hivpos =0 then Hivstatus = "Negtive";&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;if hivpos =1 then Hivpositive = "HIV Positive";&lt;BR /&gt;if hivpos =0 then Hivnegative = "HIV Negative";&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;if hivpos in (0,1) then HIVtotal = "All Cases";&lt;BR /&gt;if hivpos = 1 then HIVcases = "HIV Positive Cases";&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;HIVprevelence = HIVcases/HIVtotal;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It ends up producing a variable "HIVprevelence" where all the data are listed as "missing". Im not sure if there the code is incorrect or if I am thinking about doing this the wrong way. It very difficult to get prevalence rates with multiple variables using proc freq.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any help appreciated!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data prevalence;
set work.import;

length Hivstatus $ 15;
if hivpos =1 then Hivstatus = "Positive";
if hivpos =0 then Hivstatus = "Negtive";

if hivpos =1 then Hivpositive = "HIV Positive";
if hivpos =0 then Hivnegative = "HIV Negative";

if hivpos in (0,1) then HIVtotal = "All Cases";
if hivpos = 1 then HIVcases = "HIV Positive Cases";

HIVprevelence = HIVcases/HIVtotal; 

run; &lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 29 Aug 2020 19:08:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-a-prevalence-rate-variable/m-p/680240#M205540</guid>
      <dc:creator>u49592411</dc:creator>
      <dc:date>2020-08-29T19:08:22Z</dc:date>
    </item>
    <item>
      <title>Re: Creating a prevalence rate variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-a-prevalence-rate-variable/m-p/680267#M205547</link>
      <description>&lt;P&gt;You try to do a numeric operation with character values. Maxim 2: Read Your Log. You will find the NOTE's about conversion and invalid data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I guess(!) you want to sum all cases vs. all positive cases:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
select
  sum(hivpos) as hivcases,
  count(hivpos) as hivtotal,
  (calculated hivcases / calculated hivtotal) as hivprevalence
from work.import;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;(assuming that hivpos is either zero, one or missing)&lt;/P&gt;
&lt;P&gt;or, in a data step&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
set have end=done;
retain hivcases hivtotal 0;
hivtotal + 1;
if hivpos = 1 then hivcases + 1;
if done;
hivprevalence = hivcases / hivtotal;
keep hivcases hivtotal hivprevalence;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sun, 30 Aug 2020 06:54:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-a-prevalence-rate-variable/m-p/680267#M205547</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-08-30T06:54:58Z</dc:date>
    </item>
    <item>
      <title>Re: Creating a prevalence rate variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-a-prevalence-rate-variable/m-p/680332#M205575</link>
      <description>&lt;P&gt;Hey&amp;nbsp;@&lt;A href="https://communities.sas.com/t5/user/viewprofilepage/user-id/11562" target="_blank" rel="noopener"&gt;KurtBremser&lt;/A&gt;&amp;nbsp;thanks so much! That was super helpful - I was able to get things working with some minor adjustments to my code.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am now have three new vars:&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="sas"&gt;hivcases hivtotal hivprevalence&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;I removed the keep statement so that they were added to my existing dataset.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am having trouble getting the output I intended now, initially I was thinking something like&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;Proc freq; 
tables hivprevelence*timeyear; 
run; &lt;/PRE&gt;&lt;P&gt;which would give me a table with the total prevalence for the disease by year. However, obviously now hivprevelence is not a 1/0 variable and so I used the proc univariate step instead.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;proc univariate data=work.import; 
class timeyear;
var hivprevalence;
run; &lt;/PRE&gt;&lt;P&gt;This gives me the mean for hivprevelence (which is the prevalence) but it is buried in a mountain of other stuff that is not very useful. I was looking to build an output that is more flexible and table oriented (something more like pivot tables in excel). I tried adding the freq option to proc univariate but that didn't get me too far.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is a basic example of the types of tables I am thinking of generating:&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Screen Shot 2020-08-30 at 5.20.05 PM.png" style="width: 999px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/48813iE4AE064A2EEAF510/image-size/large?v=v2&amp;amp;px=999" role="button" title="Screen Shot 2020-08-30 at 5.20.05 PM.png" alt="Screen Shot 2020-08-30 at 5.20.05 PM.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you for helping further my learning! Its much appreciated!&lt;/P&gt;</description>
      <pubDate>Sun, 30 Aug 2020 21:27:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-a-prevalence-rate-variable/m-p/680332#M205575</guid>
      <dc:creator>u49592411</dc:creator>
      <dc:date>2020-08-30T21:27:58Z</dc:date>
    </item>
    <item>
      <title>Re: Creating a prevalence rate variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-a-prevalence-rate-variable/m-p/680364#M205595</link>
      <description>&lt;P&gt;Please post a sample of your import dataset in a data step with datalines, so we can readily use it, and what you expect to get out of that data.&lt;/P&gt;</description>
      <pubDate>Mon, 31 Aug 2020 05:25:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-a-prevalence-rate-variable/m-p/680364#M205595</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-08-31T05:25:41Z</dc:date>
    </item>
    <item>
      <title>Re: Creating a prevalence rate variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-a-prevalence-rate-variable/m-p/680823#M205851</link>
      <description>&lt;P&gt;Hey sure - I just made up some code to reflect the variables of interest related to figuring out the prevalence:&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data Prevalence;
   input date age hivpos location $ ;
   datalines;
09/27/19 56 1 A 
03/12/15 92 0 B
12/05/16 39 1 D
;

I have categorized some of the variables as follows: 
if year(date) = 2014 then Year = "2014";
else if year(date) = 2015 then Year = "2015";
else if year(date) = 2016 then Year = "2016";
else if year(date) = 2017 then Year = "2017";
else if year(date) = 2018 then Year = "2018";
else if year(date) = 2019 then Year= "2019";

length matage_category $15;
   if      . &amp;lt;  age &amp;lt;  20 then age_category = "&amp;lt;20";
   else if 20 &amp;lt;= age &amp;lt;=  40 then age_category = "20 to 40";
   else if 40   &amp;lt; age &amp;lt;=  60 then age_category = "41 to 60";
   else if age &amp;gt; 60 then age_category = "&amp;gt;61";
   else if        age = . then age_category = "Unknown";&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Proc Freq will give me tables with frequencies by age_category / time year / location but divides the tables into HIV=1 and HIV=0 so in principle I could just divide the the cells in each table to get the prevalence however I thought there must be an easy way to internalize this type of calculation. So using the code from the previous post I created a new variable "hivprevalence" which I want to display in tables by the stratified variables above so the output looks something like:&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Screen Shot 2020-09-01 at 5.11.21 PM.png" style="width: 762px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/48852i378FA9457585349D/image-size/large?v=v2&amp;amp;px=999" role="button" title="Screen Shot 2020-09-01 at 5.11.21 PM.png" alt="Screen Shot 2020-09-01 at 5.11.21 PM.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;I tried using:&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;proc means data=prevelence n mean;
Class year age_category;
var hivprevalence;
run; &lt;/PRE&gt;&lt;P&gt;This seemed to get things close - but the hivprevelence calculation is not correct. It dosent seem to be taking the number of hivpos=1 cases and dividing by the total hivpos=0 cases for each cell in the table. Any thoughts / help would be greatly appreciated!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 01 Sep 2020 21:15:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-a-prevalence-rate-variable/m-p/680823#M205851</guid>
      <dc:creator>u49592411</dc:creator>
      <dc:date>2020-09-01T21:15:58Z</dc:date>
    </item>
    <item>
      <title>Re: Creating a prevalence rate variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-a-prevalence-rate-variable/m-p/683956#M207205</link>
      <description>&lt;P&gt;Anyone have any suggestions or thoughts on this?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Much appreciated!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 15 Sep 2020 14:06:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-a-prevalence-rate-variable/m-p/683956#M207205</guid>
      <dc:creator>u49592411</dc:creator>
      <dc:date>2020-09-15T14:06:43Z</dc:date>
    </item>
    <item>
      <title>Re: Creating a prevalence rate variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-a-prevalence-rate-variable/m-p/684518#M207441</link>
      <description>I've been away for some time from my Mac, which I need for testing. Will come back to you soon.</description>
      <pubDate>Thu, 17 Sep 2020 08:39:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-a-prevalence-rate-variable/m-p/684518#M207441</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-09-17T08:39:45Z</dc:date>
    </item>
    <item>
      <title>Re: Creating a prevalence rate variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-a-prevalence-rate-variable/m-p/689117#M209464</link>
      <description>&lt;P&gt;First, use formats on your variables for grouping:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc format;
value agebin
  . = 'Unknown'
  0 -&amp;lt; 20 = "&amp;lt;20"
  20 - 40 = "20 to 40"
  40 &amp;lt;- 60 = "41 to 60"
  60 &amp;lt;- high = "&amp;gt;61"
;
run;

data Prevalence;
input date age hivpos location $ ;
format date year4. age agebin.;
datalines;
09/27/19 56 1 A 
03/12/15 92 0 B
12/05/16 39 1 D
;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;These groups will take effect if the variables are used as CLASS in procedures.&lt;/P&gt;
&lt;P&gt;Or you use formatted values as GROUP BY in a SQL SELECT.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you provide a larger dataset where sums and counts per group make sense, and the expected output, I can test the code against that.&lt;/P&gt;</description>
      <pubDate>Tue, 06 Oct 2020 08:37:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-a-prevalence-rate-variable/m-p/689117#M209464</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-10-06T08:37:01Z</dc:date>
    </item>
  </channel>
</rss>

