BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Malathi13
Obsidian | Level 7

Hi,

I have a table like:

 

distterrrelidmktproductTRNR
1011210XI_UXIA1000
1012211XI_UXIA060
1011212XI_UXIA11070
1011213XI_UXIA11280
1012214XI_UXIA11591

 

I want to have the roll-ups of all the same terr and remove duplicates: For e.g. for dist 101 and terr 1 I should have only one row with TR=322 and NR=150. How do I remove duplicates and roll up the values as per the terr for the dist. In this case there should be only two rows (dist 101 and Terr1 and Dist 101 and terr2).

 

I tried to remove the duplicates using nodupkey in proc sort but that removes some of the products (not shown here).

 

Any help is greatly appreciated.

 

Thank you

M

1 ACCEPTED SOLUTION

Accepted Solutions
thomp7050
Pyrite | Level 9

One thing: if you roll up like this then you will always have one row per distinct response.  If you wish to include relid in with the rest of the responses then you will get one row per terr/relid, which may not be what you're looking for?  If you just wanted one row per terr then you could not include relid, as such:

 

data have;
input dist terr relid mkt $ product $ TR NR;
datalines;
101 1 210 XI_U XIA 100 0 
101 2 211 XI_U XIA 0 60 
101 1 212 XI_U XIA 110 70 
101 1 213 XI_U XIA 112 80 
101 2 214 XI_U XIA 115 91 
;

PROC SQL;
CREATE TABLE WANT AS
SELECT dist, TERR,  MKT, SUM(TR) AS TOTAL_TR, SUM(NR) AS TOTAL_NR
FROM HAVE
GROUP BY dist, TERR, MKT ;
QUIT;


Alternatively, you could count the distinct relids per terr if you wanted.

View solution in original post

11 REPLIES 11
PeterClemmensen
Tourmaline | Level 20

like this?

 

data have;
input dist terr relid mkt $ product $ TR NR;
datalines;
101 1 210 XI_U XIA 100 0 
101 2 211 XI_U XIA 0 60 
101 1 212 XI_U XIA 110 70 
101 1 213 XI_U XIA 112 80 
101 2 214 XI_U XIA 115 91 
;

proc sort data = have;
	by dist terr;
run;

data want;
	set have;
	by dist terr;
	if first.dist and first.terr then do; 
		sum_TR=0;
		sum_NR=0;
	end;
	
	sum_TR = sum_TR + TR;
	sum_NR = sum_NR + NR;

	if last.terr then output;

	retain sum_TR sum_NR 0;
	drop TR NR;

run;

ballardw
Super User

It appears that you are SUMMING variables TR and NR. What values, if any of RELID, MKT or Product should be with the output?

 

Are you looking for a REPORT or a Dataset?

thomp7050
Pyrite | Level 9

Assuming what @ballardw is saying is true in that your purpose is to sum, and you are grouping by terr, then here is a relatively simple solution in proc sql (there are many ways do this but this is one):

 

data have;
input dist terr relid mkt $ product $ TR NR;
datalines;
101 1 210 XI_U XIA 100 0 
101 2 211 XI_U XIA 0 60 
101 1 212 XI_U XIA 110 70 
101 1 213 XI_U XIA 112 80 
101 2 214 XI_U XIA 115 91 
;

PROC SQL;
CREATE TABLE WANT AS
SELECT TERR, SUM(TR) AS TOTAL_TR, SUM(NR) AS TOTAL_NR
FROM HAVE
GROUP BY TERR;
QUIT;
Malathi13
Obsidian | Level 7

I'm trying to generate a report.

Malathi13
Obsidian | Level 7

My output should have dist, mkt, product, TR and NR. Don't need relid and Terr but the duplicates should be taken off by relid.

 

Thank you

M

mkeintz
PROC Star

By "rollup" you apparently meain you want to sum TR and NR for each combination of DIST and TERR.  But product and MKT appear to be constants that you might want to keep in the output. , so I include them in the CLASS statement.

 

That is what PROC SUMMARY is for.

 

proc summary data=have nway;

   class dist terr mkt product;

   var tr nr;

   output out=want   sum(tr nr)=tr_sum nr_sum;

run;

 

 

You can also use proc summary to get other stats (N min max mean std ...... ).

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Malathi13
Obsidian | Level 7

By doing summing, there are still duplicates. I think it is not summing up, it is rolling up. For e.g.

 

dist       terr          relid    mkt   product   TR     NR

101      1             210       XI_U    XIA     100     0

101       2            212       XI_U   XIA      0        60

101       2            212       XI_U   XIA      0        60

101       2            213       XI_U   XIA     110     70

 

I need to have only one row per terr per product. So in the above example, it is having duplicate for Terr 2, so I should remove that duplicate and sum Relid 212 and 213.

 

Sorry for the confusion and thanks a lot for all of your comments.

 

Malathi

thomp7050
Pyrite | Level 9

One thing: if you roll up like this then you will always have one row per distinct response.  If you wish to include relid in with the rest of the responses then you will get one row per terr/relid, which may not be what you're looking for?  If you just wanted one row per terr then you could not include relid, as such:

 

data have;
input dist terr relid mkt $ product $ TR NR;
datalines;
101 1 210 XI_U XIA 100 0 
101 2 211 XI_U XIA 0 60 
101 1 212 XI_U XIA 110 70 
101 1 213 XI_U XIA 112 80 
101 2 214 XI_U XIA 115 91 
;

PROC SQL;
CREATE TABLE WANT AS
SELECT dist, TERR,  MKT, SUM(TR) AS TOTAL_TR, SUM(NR) AS TOTAL_NR
FROM HAVE
GROUP BY dist, TERR, MKT ;
QUIT;


Alternatively, you could count the distinct relids per terr if you wanted.

Malathi13
Obsidian | Level 7

You are right,

I'm not going to include relid and terr as well in the final output.

thomp7050
Pyrite | Level 9

In that case, does the solution that I have above work for you?

Malathi13
Obsidian | Level 7

So, product should not be included in teh select statement?

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 3036 views
  • 1 like
  • 5 in conversation