Solved: Re: PROC SQL different values within a BY GROUP and Calculate percenta...

Nikos · Posted 05-11-2012 12:48 PM

Hi all,

I have a data set with ID_NO , TRANSACTION_DATE , PRODUCT, NAME and STATE the By GROUP variables are ID_NO & TRANS_DATE

Although I would expect NAME and STATE to be the same across each BY GROUP, this is not the case.

ID_NO	TRANS_DATE	PRODUCTS	NAME	STATE
255	3/2/2012	BED	JOHN	NY
255	3/2/2012	PILLOW	JOHN	NY
255	3/2/2012	MATTRES	JOHN	NY
468	9/12/2012	BED	MIKE	California
468	9/12/2012	PILLOW	MIKE	MARYLAND
468	9/12/2012	MATTRES	MIKE	California
137	6/25/2012	BED	JIM	Michigan
137	6/25/2012	PILLOW	JIM	MARYLAND
137	6/25/2012	MATTRES	TOM	Michigan
137	6/25/2012	MATTRES	JIM	Michigan

I would like to run a PROC SQL that would identify these BY GROUP that deviate and at the same time to provide percentages of how many of these BY GROUP are diffferent

a) in NAME

b) in STATE

Any suggestions would be much appreciated.

Kind regBY GROUP rards

Nikos

Ksharp · Posted 05-13-2012 10:56 PM

Since you didn't post what output you need.

So it is hard to make sure what your intention is .

data have;
input ID_NO TRANS_DATE :mmddyy10. PRODUCTS $ NAME $ STATE $;
datalines;
255 3/2/2012 BED JOHN NY
255 3/2/2012 PILLOW JOHN NY
255 3/2/2012 MATTRES JOHN NY
468 9/12/2012 BED MIKE California
468 9/12/2012 PILLOW MIKE MARYLAND
468 9/12/2012 MATTRES MIKE California
137 6/25/2012 BED JIM Michigan
137 6/25/2012 PILLOW JIM MARYLAND
137 6/25/2012 MATTRES TOM Michigan
137 6/25/2012 MATTRES JIM Michigan
;
run;
proc sql;
title "Pct deviate name";
select b.*, n_name/n as pctName format=percent8.2 
 from (select id_no,trans_date,count(*) as n from have group by id_no,trans_date) as a,
      (select id_no,trans_date,name,count(*) as n_name from have group by id_no, trans_date,name) as b
  where a.id_no=b.id_no and a.trans_date=b.trans_date ;


title "Pct deviate state";
select b.*, n_STATE/n as pctSTATE format=percent8.2 
 from (select id_no,trans_date,count(*) as n from have group by id_no,trans_date) as a,
      (select id_no,trans_date,STATE,count(*) as n_STATE from have group by id_no, trans_date,STATE) as b
  where a.id_no=b.id_no and a.trans_date=b.trans_date ;
quit;

Ksharp

View solution in original post

PGStats · Posted 05-11-2012 01:37 PM

I suggest :

data have;
input ID_NO TRANS_DATE :mmddyy10. PRODUCTS $ NAME $ STATE $;
datalines;
255 3/2/2012 BED JOHN NY
255 3/2/2012 PILLOW JOHN NY
255 3/2/2012 MATTRES JOHN NY
468 9/12/2012 BED MIKE California
468 9/12/2012 PILLOW MIKE MARYLAND
468 9/12/2012 MATTRES MIKE California
137 6/25/2012 BED JIM Michigan
137 6/25/2012 PILLOW JIM MARYLAND
137 6/25/2012 MATTRES TOM Michigan
137 6/25/2012 MATTRES JIM Michigan
;

proc sql;
title "Pct deviate name";
select sum(n>1)/count(n) as pctName format=percent8.2 from
(select count(distinct name) as n from have group by id_no, trans_date);
title "Pct deviate state";
select sum(n>1)/count(n) as pctState format=percent8.2 from
(select count(distinct state) as n from have group by id_no, trans_date);
quit;

Add UPCASE() function to name or state if you do not care about differences in case.

PG

FriedEgg · Posted 05-11-2012 02:09 PM

proc sql;

select a.id_no,

a.trans_date,

case

when count(distinct a.name) > 1 then 1

else 0

end as name_deviate,

1 - max(b.name_count) / count(*) as name_deviate_percent format=percent.,

case

when count(distinct a.state) > 1 then 1

else 0

end as state_deviate,

1 - max(c.state_count) / count(*) as state_deviate_percent format=percent.

from foo a

left outer join ( select id_no,trans_date,name,count(*) as name_count from foo group by id_no,trans_date,name ) b on a.id_no=b.id_no and a.trans_date=b.trans_date and a.name=b.name

left outer join ( select id_no,trans_date,state,count(*) as state_count from foo group by id_no,trans_date,state ) c on a.id_no=c.id_no and a.trans_date=c.trans_date and a.state=c.state

group by a.id_no,a.trans_date;

quit;

id_no	trans_date	name_deviate	name_deviate_percent	state_deviate	state_deviate_percent
137	25JUN2012	1	25%	1	25%
255	02MAR2012	0	0%	0	0%
468	12SEP2012	0	0%	1	33%

Ksharp · Posted 05-13-2012 10:56 PM

Since you didn't post what output you need.

So it is hard to make sure what your intention is .

data have;
input ID_NO TRANS_DATE :mmddyy10. PRODUCTS $ NAME $ STATE $;
datalines;
255 3/2/2012 BED JOHN NY
255 3/2/2012 PILLOW JOHN NY
255 3/2/2012 MATTRES JOHN NY
468 9/12/2012 BED MIKE California
468 9/12/2012 PILLOW MIKE MARYLAND
468 9/12/2012 MATTRES MIKE California
137 6/25/2012 BED JIM Michigan
137 6/25/2012 PILLOW JIM MARYLAND
137 6/25/2012 MATTRES TOM Michigan
137 6/25/2012 MATTRES JIM Michigan
;
run;
proc sql;
title "Pct deviate name";
select b.*, n_name/n as pctName format=percent8.2 
 from (select id_no,trans_date,count(*) as n from have group by id_no,trans_date) as a,
      (select id_no,trans_date,name,count(*) as n_name from have group by id_no, trans_date,name) as b
  where a.id_no=b.id_no and a.trans_date=b.trans_date ;


title "Pct deviate state";
select b.*, n_STATE/n as pctSTATE format=percent8.2 
 from (select id_no,trans_date,count(*) as n from have group by id_no,trans_date) as a,
      (select id_no,trans_date,STATE,count(*) as n_STATE from have group by id_no, trans_date,STATE) as b
  where a.id_no=b.id_no and a.trans_date=b.trans_date ;
quit;

Ksharp

PROC SQL different values within a BY GROUP and Calculate percentages of the different records

Re: PROC SQL different values within a BY GROUP and Calculate percentages of the different records

Re: PROC SQL different values within a BY GROUP and Calculate percentages of the different records

Re: PROC SQL different values within a BY GROUP and Calculate percentages of the different records

Re: PROC SQL different values within a BY GROUP and Calculate percentages of the different records

Catch up on SAS Innovate 2026

PROC SQL different values within a BY GROUP and Calculate percentages of the different records

Re: PROC SQL different values within a BY GROUP and Calculate percentages of the different records

Re: PROC SQL different values within a BY GROUP and Calculate percentages of the different records

Re: PROC SQL different values within a BY GROUP and Calculate percentages of the different records

Re: PROC SQL different values within a BY GROUP and Calculate percentages of the different records

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away