BookmarkSubscribeRSS Feed
pallis
Fluorite | Level 6

I have data like

no   Code date

1        a      jun13

2        b      jun13

10     x        aug13

looking for output "

                                 Jun13        Jul13   Aug13

Total no                        3                  2        5

Bounced                       7                 1          2

{Code}

proc tabulate data=bjun_jul_aug;

class  cutdate  CODE;

format date monyy7.;

tables (n='Total  Accounts '),(date='');

tables BOUNCE_REASON_CODE='',cutdate=''*n='';

run;

My output for second table is going like this

                                  Jun13      Jul13    AUg13

Code    

               a                     1              1              2

                b                      2            3               4

       -----------------------------------------------------------

        -----------------------------------------------------------

Can any one give me hint to to resolve this issue

12 REPLIES 12
AncaTilea
Pyrite | Level 9

HI,

right now in the little sample data you provided, there is no "bounced" variable.

So, a bit more detailed sample data/sample output may help.

Smiley Happy

pallis
Fluorite | Level 6

I have data set which consists

No   Code  date

For these 3 columns I have data like

no   Code date

a1        a      jun13

a2        b      jun13

a10     x        aug13

Im trying to figure out using tabulate procedure

I looking for month wise  count of code.(Simply saying if I make group on month wise how many records are falling under each month)


MODIFIED {Code}

proc tabulate data=bjun_jul_aug;

class  cutdate  CODE;

format date monyy7.;

tables (n='Total  Accounts '),(date='');

tables CODE='',cutdate=''*n='';

run;

My output for second table is going like this

                                  Jun13      Jul13    AUg13

Code  

               a                     1              1              2

                b                      2            3               4

Vince28_Statcan
Quartz | Level 8

looks to me like you are trying for something like

proc tabulate data=bjun_jul_aug;

class date code;

format date monyy7.;

tables (code)*(n=''), (date);

run;

You would need to provide a full (small) sample data with a datalines statement or something similar to further test if it produces your desired output.

ballardw
Super User

If you do not want all of the levels of your classification code to appear then create a custom format to assign all values to single displayed value and associate the new format with the variable in proc.

Maybe something like:

proc format;

value $codes

'a','b','c',....,'x' = 'Code';

run;

Then add Format code $codes.;

But I'm guessing as your example data hasn't matched your displayed desired output.

pallis
Fluorite | Level 6

Here is my sample data set

No  Date  Status Code

1   30-Jun-13 CLS  2

2  30-Jun-13  Rcs  1

3  31-Jul-13   CLS  1

4  31-Aug-13   CLS  3

Im looking for out put lke

How many total members in code based on monthwise

like

                                   Jun   Jul     Aug

# of code                   2        1            1

Im getting output like

                                   Jun   Jul     Aug

Code      1                   1       1          .

                2                   1       .           .

                3                    .            .        1

Can you me giving me some kind of hint.

Vince28_Statcan
Quartz | Level 8

If you have no missing codes that you wish not to count, you can do

data have;
input @1 code @3 date date9.;
datalines;
2 30JUN2013
1 30JUN2013
1 31JUL2013
3 31AUG2013
;
run;

proc tabulate;
class code date;
format date monyy7.;
tables (n='# of codes'*f=f6.0), (date);
run;

if you have missing codes, using a custom format as mentionned by Ballard is probably the best approach.

This really ignores code completely and counts the number of records per month, hence my comment above missing codes that you may want to account for.

pallis
Fluorite | Level 6

One more important point i forgot to inform that some of my codes are characters and some r numbers.

Vince28_Statcan
Quartz | Level 8

That really doesn't matter. It simply means that the variable is stored as character and at this point, to SAS, there is no further difference between "A" and "B" as there is between, say, "A" and "1". It only matters if you wish to count only for a subset of codes.

Again, in the case that you would want to sum only for a subset of codes, the format approach suggested by Ballard is likely the best way to go about it. That or use a where statement although where statement is not as reuseable as proc format if you ought to do the same for different datasets or periodically.

pallis
Fluorite | Level 6

I need to write more than 1000+ codes. God its hell for me.

Vince28_Statcan
Quartz | Level 8

Again, that's only if you need to somehow subset your codes...

It's extremely difficult to provide appropriate help or alternate solutions without more details on how exactly you wish to subset your codes...characters have an order just like numbers based on their ascii/hex/bin representation. The example you've provided above only has 4 data points and you were counting for all of the codes there. In this case, you can ignore code all together and use my solution in post #6.

If you wanted to retain only a count for, say, codes 1-9, you could simply use where condition like

where input(trim(code), best32.) between 1 and 9;

Saving you from doing a proc format for 1000 codes. Additionnal knowledge about your codes could also allow you to just read all distinct codes in macro variables and generate a proc format with the thousand codes typing only 10-30 lines of SAS code. I don't exactly expect you to provide a list of all of your codes but some additionnal details on their representation and what range of codes you wish to count for would be helpful.

pallis
Fluorite | Level 6

,

    Thank you very much for the reply. My  code starts with 0 and ends with Character.  There are couple of characters (SME, AME CLS CLR ECS TAR) and Numeric code ranges from 0 to 180. My data set totally consists of 1211456 observations (This is nothing but the total Members  I used as No here in above examples).Based on payment type on month basis how many total members are under bad rate. I need to generate report based on month wise how many members are bounced. We are basically considering the last 3 months (Jun,Jul,Aug).  Im not supposed to post data here. Thats the reason I created sample data and posted here.

Vince28_Statcan
Quartz | Level 8

I understand data confidentiality. I'm working for a NSO In a division for Census Operations. It always increases the challenge to both provide and receive help for complex issues.

I'm still not entirely sure what you want of an output with regards to each code/groups of codes/all codes as a single unit. But I'll try to throw out a few different alternatives using where conditioning and by grouping.

proc tabulate;

class code date;

format date monyy7.;

where substr(code, -1, 3) = "SME" and

            month(date) in (6, 7, 😎 and

            year(date) = 2013;

tables (n='# of codes'*f=f6.0), (date);

run;

Would give you a count, for the last 3 month, by month, of all SME codes. If the digits of your code each have a representation (e.g. most of our IDs comprise a 2 digit code for province here so I could subset on input(substr(code, 6, 2), best32.) between 25 and 30

The idea with this is to use the where statement to shrink the data set that is actually used to calculate the statistics (here only N) desired through tabulate. Alternatively, if you wanted one small table per code instead of the giant table with each different code as row like in post#3, you could've simply used

proc tabulate;

class date;

format date monyy7.;

by code;

tables (n='# of codes'*f=f6.0), (date);

run;

I don't know if I'm stating the obvious here or giving at least some insights but anyway. For so long as you don't need to achieve calculations over rows or columns produced by the tables statement that cannot be achieved through the proc's available Statistics, there's definitely a way to get the output you desire. Since you seem to be looking merely for the total count of a subset of data for 3 months, using where statement to simply condition the data used to calculate statistics and produce the table seems to be the way to go. You could further make it a macro taking an input of the reference month and output the results for 3 month span ending on the inputed month for example.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 1227 views
  • 0 likes
  • 4 in conversation