Solved: RE: Counting using 2 datasets

twildone · Posted 02-16-2017 08:31 PM

Hi...I have a data set A that has a unique product id numbers and the corresponding benchmark weight for that product id number. A second data set B contains the raw data of which I want to count for each client and product id number, the number of records (sales) below the benchmark weight, the number of records (sales) having the benchmark weight (equal to) and also the number of records (sales) above the benchmark weight. The problem I am having is how to bring those benchmark weights in so I can compare the actual weights in data set B to the corresponding product weights. Any suggestions. Thanks,

nehalsanghvi · Posted 02-17-2017 09:14 AM

You can combine the four SQL statements into one and then do your transpose:

PROC SQL;

CREATE TABLE Summary AS

SELECT B.Client, B.Product_ID, A.Weight as BenchMarkWt,

case when a.Weight-b.Weight>0 then 'Greater than'

when a.Weight-b.Weight<0 then 'Less than'

when a.Weight-b.Weight=0 then 'Same'

else '' end as Condition, count(*) as COUNT_of_Condition

FROM WORK.trialB b LEFT JOIN WORK.trialA a

ON (B.Product_ID = A.Product_ID)

group by B.Client, B.Product_ID, A.Weight, calculated Condition;

QUIT;

PROC TRANSPOSE DATA=Summary OUT=WORK.SummaryTrans(DROP=_NAME_);

BY Client Product_ID;

ID Condition;

VAR COUNT_of_Condition;

RUN;

Not sure if that's what you needed, though.

View solution in original post

art297 · Posted 02-16-2017 08:33 PM

It would help seeing example datasets for each of your datasets (in datastep form), the result you'd like to get from the process, and the code you've tried thus far.

Art, CEO, AnalystFinder.com

twildone · Posted 02-17-2017 08:45 AM

Hi....this is what I have so far. Although the output in table Summary5 is what I would like to end up with, is there a more simplier and efficient way to end up with the same results as the actual dataset for trialb is quite large.

DATA trialA;

LENGTH Product_ID $ 6 Weight 8;

INFORMAT Product_ID $CHAR6. Weight BEST12.;

INPUT Product_ID : $CHAR6. Weight : BEST32.;

DATALINES;

007075 100

076060 120

131630 40

342993 56

408007 91

487707 35

591680 40

605309 22

632145 56

785850 22

845866 105

;;;;

DATA trialb;

LENGTH Client $ 9 Product_ID $ 6 Weight 8;

INFORMAT Client $CHAR9. Product_ID $CHAR6. Weight BEST12.;

INPUT Client : $CHAR9. Product_ID : $CHAR6. Weight : BEST32.;

DATALINES;

100087540 007075 85

100087540 007075 110

100087540 007075 100

100087540 007075 95

100087540 007075 105

100087540 408007 25

100087540 408007 30

100087540 408007 40

100087540 408007 35

100092380 605309 20

100092380 605309 25

100092380 605309 22

100092380 605309 23

100092380 785850 18

100092380 785850 23

100092380 785850 24

100092380 785850 19

114811568 342993 55

114811568 845866 110

114811568 845866 105

;;;;

PROC SQL NOPRINT;

CREATE TABLE Summary1 AS

SELECT

trialB.Client,

trialB.Product_ID,

trialB.Weight,

trialA.Product_ID AS Product_ID1,

trialA.Weight AS Weight1

FROM WORK.trialB LEFT JOIN WORK.trialA ON (trialB.Product_ID = trialA.Product_ID);

QUIT;

PROC SQL NOPRINT;

CREATE TABLE Summary2 AS

SELECT

Summary1.Client,

Summary1.Product_ID,

Summary1.Weight,

Summary1.Product_ID1,

Summary1.Weight1,

(Summary1.Weight1-Summary1.Weight) AS Difference

FROM WORK.Summary1;

QUIT;

PROC SQL NOPRINT;

CREATE TABLE Summary3 AS

SELECT

Summary2.Client,

Summary2.Product_ID,

Summary2.Weight,

Summary2.Product_ID1,

Summary2.Weight1,

Summary2.Difference,

(case when Summary2.Difference<0

then 'Less than'

when Summary2.Difference=0

then 'Same'

when Summary2.Difference>0

then 'Greater than'

else ' '

end) AS Condition

FROM WORK.Summary2;

QUIT;

PROC SQL NOPRINT;

CREATE TABLE Summary4 AS

SELECT DISTINCT

Summary3.Client,

Summary3.Product_ID,

Summary3.Condition,

(COUNT(Summary3.Condition)) AS COUNT_of_Condition

FROM WORK.Summary3

GROUP BY Summary3.Client, Summary3.Product_ID, Summary3.Condition

ORDER BY Summary3.Client, Summary3.Product_ID;

QUIT;

PROC TRANSPOSE DATA=Summary4 OUT=WORK.Summary5(DROP=_NAME_);

BY Client Product_ID;

ID Condition;

VAR COUNT_of_Condition;

RUN;

QUIT;

nehalsanghvi · Posted 02-17-2017 09:14 AM

You can combine the four SQL statements into one and then do your transpose:

PROC SQL;

CREATE TABLE Summary AS

SELECT B.Client, B.Product_ID, A.Weight as BenchMarkWt,

case when a.Weight-b.Weight>0 then 'Greater than'

when a.Weight-b.Weight<0 then 'Less than'

when a.Weight-b.Weight=0 then 'Same'

else '' end as Condition, count(*) as COUNT_of_Condition

FROM WORK.trialB b LEFT JOIN WORK.trialA a

ON (B.Product_ID = A.Product_ID)

group by B.Client, B.Product_ID, A.Weight, calculated Condition;

QUIT;

PROC TRANSPOSE DATA=Summary OUT=WORK.SummaryTrans(DROP=_NAME_);

BY Client Product_ID;

ID Condition;

VAR COUNT_of_Condition;

RUN;

Not sure if that's what you needed, though.

art297 · Posted 02-17-2017 11:40 AM

I'd do all of the heavy lifting in your first sql step, then use proc freq, then transpose. e.g. (which produces your summary5 table). However, while I won't delete my post, I like @nehalsanghvi's solution better (i.e., simply including the count variable in your one proc sql step):

PROC SQL NOPRINT;
     CREATE TABLE Summary AS
           SELECT
                trialB.Client,
         trialB.Product_ID,
         case 
           when (trialB.Weight-trialA.Weight) > 0 then 'Less_than'
           when (trialB.Weight-trialA.Weight) < 0 then 'Greater_than'
           when (trialB.Weight-trialA.Weight) = 0 then 'Same'
           else ''
         end as condition
          FROM WORK.trialB LEFT JOIN WORK.trialA ON (trialB.Product_ID = trialA.Product_ID);
QUIT;

proc freq data=Summary;
  tables client*product_ID*condition/out=need (drop=percent);
run;

proc transpose data=need out=want (DROP=_:);
     BY Client Product_ID;
     ID Condition;
     VAR COUNT;
RUN;

HTH,

Art, CEO, AnalystFinder.com

RE: Counting using 2 datasets

Re: RE: Counting using 2 datasets

Re: RE: Counting using 2 datasets

Re: RE: Counting using 2 datasets

Re: RE: Counting using 2 datasets

Re: RE: Counting using 2 datasets

RE: Counting using 2 datasets

Re: RE: Counting using 2 datasets

Re: RE: Counting using 2 datasets

Re: RE: Counting using 2 datasets

Re: RE: Counting using 2 datasets

Re: RE: Counting using 2 datasets

Click image to register for webinar

Classroom Training Available!