Solved: Re: Creating a summary table or dataset

brophymj · Posted 07-07-2014 11:15 AM

Hi

I'm trying to find an efficient of doing the following:

I'm carrying out a reconciliation on a data set with 2000 observations and on of the fields in the data set is a comment on the reconciliation and it can have the following comments

Premium matches

Mismatch due to..reason 1

Mismatch due to..reason 2

and so on

I want to create a table or dataset which I can export that has the summary information i.e. something that looks like this:

Rec Summary	Frequency	% of Total
Match	1952	97.5%
Mismatch - Reason 1	22	1.1%
Mismatch - Reason 2	28	1.4%
Total	2000	100%
Start Date	20130101
End Date	20131231

The start and end dates are taken as the lowest and highest values of one of the fields in the dataset with the transaction dates for each observation.

RW9 · Posted 07-08-2014 04:13 AM

Yep, I assume dnuermic for the min and forgot the put,:

set REC_SUMMARY="Start Date", /* Insert the min start date */

FREQ=(select put(min(POLDTR),yymmdd10.) from OUT_BOTH)

View solution in original post

ballardw · Posted 07-07-2014 11:22 AM

With that second requirement it is time to post a description of your data.

The comment "some variable" for the max min dates can be interpretted several ways; a specific variable in which case name it, or multiple variables that you want to select from.

brophymj · Posted 07-07-2014 11:51 AM

Thanks ballardw. What I mean is that on of the fields in my data set is the transaction date for each transaction in my dataset. The fields looks like this:

POLDTR

20130302

20130211

20130807

20130202

20130117

...

The start date is the min of all these dates and the end date is the max.

Also, is there a way of including a 'total' row for the table as shown above?

RW9 · Posted 07-07-2014 12:10 PM

Well, I have just finished for the day, but thought I would throw out a couple of ideas (note this is is in no way complete, just gives a flavor. Would need to see some test data and what reason 1/2 is and what constitutes a match):

data out_both out_a out_b;

merge data1 (in=a) data2 (in=b);

by by_vars;

if a and b then out_both;

else if a then out_a;

else out_b;

run;

proc sql;

create table RESULT

(

REC_SUMMARY char(200),

FREQ char(200),

PCENT char(200)

);

insert into RESULT

set REC_SUMMARY="Match",

FREQ=(select put(count(*),best.) from OUT_BOTH),

PCENT=(select put( (count(*) / &TOTAL.) *100,5.2)

from OUT_BOTH)

set REC_SUMMARY="Mismatch reason 1",

FREQ=(select put(count(*),best.) from OUT_A)...

set REC_SUMMARY="Start Date",

FREQ=(select min(DATE) from OUT_BOTH group by ..)...

quit;

brophymj · Posted 07-07-2014 12:58 PM

Hi RW9

Here is my dataset, just two colunms. I don't fully follow your code, I'm getting errors.

RW9 · Posted 07-07-2014 01:50 PM

Ok, its even easier then:

proc sql;

create table RESULT /* Create an empty results table which we will populate by the statements further below */

(

REC_SUMMARY char(200),

FREQ char(200),

PCENT char(200)

);

insert into RESULT

set REC_SUMMARY="Match", /* Here we start inserting the first row of the results which is those that match */

FREQ=(select put(count(*),best.) from HAVE where COMMENT="MATCH"), /* Count all records with MATCH */

PCENT=put( (select count(*) from HAVE where COMMENT="MATCH") / (select count(*) from HAVE) *100,5.2) /* Num rows with Match / all rows * 100 = percent */

set REC_SUMMARY="Mismatch reason 1", /* Do same as above but using condition doe reason 1 - which I can't identify from what you have given */

FREQ=(select put(count(*),best.) from OUT_A)...

set REC_SUMMARY="Start Date", /* Insert the min start date */

FREQ=(select min(POLDTR) from OUT_BOTH)

set REC_SUMMARY="End Date", /* Insert the max start date */

FREQ=(select max(POLDTR) from OUT_BOTH);

quit;

brophymj · Posted 07-08-2014 04:02 AM

Hi RW9

Thanks for your help. The code is working except for the piece that creates teh Start Date and End Date. I get an error message that reads "It is invalid to assign a numeric expression to a character value using the SET clause".

I have attached the dataset above.

Rregards

Matthew

RW9 · Posted 07-08-2014 04:13 AM

Yep, I assume dnuermic for the min and forgot the put,:

set REC_SUMMARY="Start Date", /* Insert the min start date */

FREQ=(select put(min(POLDTR),yymmdd10.) from OUT_BOTH)

Reeza · Posted 07-07-2014 12:14 PM

You need to post an example of what your full data looks like.

stat_sas · Posted 07-07-2014 12:25 PM

Try this.

proc tabulate data=have out=want;
class comments;
table comments='Rec Summary' all='Total',(n='Frequency' pctn='% of Total');
run;

proc print data=want;
run;

SAS Innovate 2025: Save the Date

SAS Training: Just a Click Away