BookmarkSubscribeRSS Feed
mjabed600
Fluorite | Level 6

 Hello,

I've written a query below that gets the customer which generated the maximum revenue or spent the most money. But I would like to get the customer(s) generated the most money in each individual year. The set contains data from 2007 to 2015 and date is in mmddyy10. How would I go about achieving this with proc sql? thanks a lot for the help

proc sql;
select Customer_ID, sum
	from(
		select Customer_ID, sum(total_retail_Price) as Sum label= "Total Revenue" format= dollar10.2
		from testdata
		group by customer_ID
	    )
	having sum= max(sum);
	quit;
5 REPLIES 5
quickbluefish
Obsidian | Level 7

I think with "real" SQL this would be simple because you could use the RANK OVER / PARTITION BY functions that PROC SQL lacks (I think?) - otherwise might be awkward.  Here's a data step solution that assigns an integer to each customer ID (which I guessed was a character variable, length $20 - change as appropriate) for the purpose of gathering up data, then back-converts to customer ID at the end.  

 

proc sql noprint;
select count(distinct customer_id) into :ncust from testdata;
select min(year(dt)), max(year(dt)) into :firstyr, :lastyr from testdata;

create table CN as select "cust2num" as fmtname, "C" as type, 
customer_id as start, customer_id as end, monotonic() as label 
from 
	(select distinct customer_id from testdata);

create table NC as select "num2cust" as fmtname, "N" as type,
monotonic() as start, monotonic() as end, customer_id as label
from
	(select distinct customer_id from testdata);
quit;

proc format cntlin=CN; run;
proc format cntlin=NC; run;

data top_customer_by_year (keep=year customer total);
set testdata END=LAST;
length year 3 customer $20 total 8;
array T {&ncust,&firstyr:&lastyr} _temporary_;
T[put(customer_id,$cust2num.)*1,year(dt)]+total_retail_price;
if last then do;
	do year=lbound(T,2) to hbound(T,2);
		customer = '';
		total = .;
		do c=1 to dim(T,1);
			if T[c,year]>total then do;
				customer=put(c,num2cust.);
				total=T[c,year];
			end;
		end;
		output;
	end;
end;
run;
novinosrin
Tourmaline | Level 20

can you please provide a sample to work with and your expected output from the sample?

PGStats
Opal | Level 21

Try this:

 

proc sql;
select 
    Customer_ID, 
    year format=year4., 
    sum label= "Total Revenue" format= dollar10.2
from(
    select 
        Customer_ID, 
        intnx("year", date, 0) as year,
        sum(total_retail_Price) as Sum 
    from testdata
    group by customer_ID, calculated year
)
group by year
having sum= max(sum);
quit;

this assumes that your dates are SAS dates. If they are SAS datetimes, replace date with datepart(date)

 

In the result set, year is the SAS date of the first day of the year, formatted as an integer. If you prefer a simple number variable, use year(year) as year format=4.0.

PG
Tom
Super User Tom
Super User

You just need to use GROUP BY.  You might also need to transform your date variable to have just the YEAR value.

Example:

proc sql ;
create table want as
  select
    year(date) as year
   ,max(volume) as max_volume
   ,mean(volume) as mean_volume
  from sashelp.stocks
  group by 1
;
quit;
PaigeMiller
Diamond | Level 26

This is pretty simple using two PROC SUMMARY calls. The first PROC SUMMARY computes the sum of the sales by customer_id and year; the second takes the results of the first and finds the maximum values per year and the customer_id associated with the maximum value.

 

 

--
Paige Miller

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 4595 views
  • 0 likes
  • 6 in conversation