i have a data like this
data client;
input pid county$ disease$;
cards;
123 ABC Chlam
124 DEF Chlam
134 EFG Syph
135 HIJ Gon
209 KLM Chlam
103 ABC Chlam
120 DEF Chlam
131 EFG Syph
133 HIJ Gon
201 KLM Chlam
123 ABC Chlam
101 ABC Chlam
129 ABC Chlam
;
proc freq data=client;
table county;
where disease='Chlam';
run;
i got this after proc freq;
county | Frequency | Percent | Cumulative Frequency |
Cumulative Percent |
---|---|---|---|---|
ABC | 5 | 55.56 | 5 | 55.56 |
DEF | 2 | 22.22 | 7 | 77.78 |
KLM | 2 | 22.22 | 9 | 100.00 |
But i want like this:
County | #of Case range |
ABC | 3-5 |
DEF | 1-2 |
KLM | 1-2 |
Please please get back to me. I am waiting.
@Dhana18 wrote:
i have a data like this
data client;
input pid county$ disease$;
cards;
123 ABC Chlam
124 DEF Chlam
134 EFG Syph
135 HIJ Gon
209 KLM Chlam
103 ABC Chlam
120 DEF Chlam
131 EFG Syph
133 HIJ Gon
201 KLM Chlam
123 ABC Chlam
101 ABC Chlam
129 ABC Chlam
;
proc freq data=client;
table county;
where disease='Chlam';
run;i got this after proc freq;
county Frequency Percent Cumulative
FrequencyCumulative
PercentABC 5 55.56 5 55.56 DEF 2 22.22 7 77.78 KLM 2 22.22 9 100.00 But i want like this:
County #of Case range ABC 3-5 DEF 1-2 KLM 1-2 Please please get back to me. I am waiting.
The first thing is to supply a description of all expected possible ranges. For example you only show up to 5. What if the Count is actual 10, 15, 500 or larger in a real case. You should also describe what you might want for no values.
To display different text for a numeric value SAS uses custom formats that you write. Then you apply the format to an appropriate variable as needed.
Here a brief example of creating a format and then using a procedure that gives a bit more control of displayed output than Proc Freq where you can specify the format for the statistic.
roc format; value mynrange 1 - 2 = ' 1-2' 3 - 5 = ' 3-5' 6 -10 = ' 6-10' 11 - high ='11+' ; proc tabulate data=client; class county; table county=' ', n="# case range"*f=mynrange. /box=County ; run;
In the Value statement of the Proc Format code the - left of the = is a range indicator, not the hyphen of the displayed values. There are options that let you indicate whether the end points are included or not in the range plus some other keywords allowed for the value list.
Proc tabulate is another reporting procedure that has more control over displayed tables but a different choice of statistics than Proc Freq. The statistic, such as n, followed by ="something" is the label that the statistic will use for the procedure output, the *f=<some format> allows you to specify a format for the specific use of the statistic in that use.
OK, so here's some code. The results are below the code. Is this what you need? In the results I have two tables. The first is not categorized into ranges. I just put the first table there so you can see the difference that the Proc Format makes when I assign the ranges. The TITLE function that I used should be replaced by a TITLE (without a %) that is suitable for your SAS environment.
Jim
TITLE;
FOOTNOTE;
OPTIONS VALIDVARNAME = ANY;
DATA Client;
INPUT pid County $ Disease $;
DATALINES;
123 ABC Chlam
124 DEF Chlam
134 EFG Syph
135 HIJ Gon
209 KLM Chlam
103 ABC Chlam
120 DEF Chlam
131 EFG Syph
133 HIJ Gon
201 KLM Chlam
123 ABC Chlam
101 ABC Chlam
129 ABC Chlam
;
RUN;
%TITLE(Data - Client);
PROC FREQ DATA=Client NOPRINT;
TABLE County / Out=Client_Freqs;
WHERE Disease='Chlam';
RUN;
%TITLE(Data - Client_Freqs);
PROC PRINT DATA=Client_Freqs NOOBS;
VAR County Count;
RUN;
PROC FORMAT;
VALUE Chlam_Range
0 = '0'
1 - 3 = '1-3'
4 - 19 = '4-19'
20 - 49 = '20-49'
50 - 300 = '50-300'
301 - HIGH = '>300'
. = 'Missing'
OTHER = 'Invalid'
;
RUN;
%TITLE(Data - Client_Freqs);
OPTIONS LABEL;
PROC PRINT DATA=Client_Freqs (RENAME=COUNT='# of Case Range'n) NOOBS;
FORMAT '# of Case Range'n Chlam_Range6.;
VAR County '# of Case Range'n;
RUN;
Results:
@Dhana18 wrote:
It is by disease specific range the range I gave just an example.
If you want an actual solution you have to provide the actual rules. I hope my example is enough to get started.
However, SAS formats apply to a variable. So the range using formats would apply regardless of "disease". If the display range has to vary by disease then you must show the rules because that is going to require data step coding and creation of character variable(s) to hold the result.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.