I've the following code which creates WANT dataset. Now I want to update the WANT dataset by doing sum(sales) and group by of other fields. I can do this in proc sql with create table statement but I don't want to create any new dataset and instead I want to update the existing WANT dataset with the aggregation as told above. Any help?
data work.want; set work.have; upn = material; country = propcase(country_name); mon_intrvl = put(date, yymmd7.); sales_date = intnx("month",date,0,"b"); sales = delivery_qty; run;
Artificial restriction like "I don't want to create any new dataset " don't lend themselves to clean easy to understand code.
Plus, without an example data set of what this is supposed to look like after the "update" I am not sure exactly what you are expecting as a result.
"Group by" isn't really a data step concept either. So an example gets more important.
Typically I would look at using something like Proc Summary to create the summaries and then merge back to this data.
If you can do it in SQL then why not?
In SAS terms any time you add a variable you are not actually updating a data set, you are creating a new set that may just happen to have the same name. That is what your example data step code does as well.
@David_Billa wrote:
I don't know how to do it in sql to update and aggregate the WANT dataset
in one step.
It would be nice if you can point me the document where I can refer to
update and aggregate the table in proc summary and data step
Using SQL:
proc sql;
create table class as
select *, sum(weight) as weightsum
from sashelp.class
group by sex
;
quit;
PROC SUMMARY:
proc summary data=sashelp.class nway;
class sex;
var weight;
output out=class_sum sum(weight)=weightsum;
run;
data class;
set sashelp.class;
if _n_ = 1
then do;
length weightsum 8;
declare hash s (dataset:"class_sum");
s.definekey("sex");
s.defineedata("weightsum");
s.definedone();
weightsum = .;
end;
rc = s.find();
drop rc;
run;
proc delete data=class_sum;
run;
While the SQL will re-order the data, the second method will keep the original order.
Since all grouping implies some kind of sorting, you either need a preceding sorting step, or your dataset has to fit into available memory, so you can use a hash object. Be aware that the OUTPUT() method will write the new dataset in order of the key variable(s).
Either @DonH or @hashman might be able to show us a way how to use a hash object to create a sum by one variable, but order by another.
I have to respectfully disagree with
Since all grouping implies some kind of sorting, you either need a preceding sorting step, or your dataset has to fit into available memory
The hash object does not require the source data to fit into memory. It can (but need not) require the summary table to fit into memory. The book and articles that were noted by @yabwon earlier provide details on memory management issues.
Sorting is not required for the hash object solutions described in @yabwon's comments.
I assume on your point about grouping by one variable but ordering by another, you are talking about the case where the order variable is not a grouping variable. If you can clarify I can perhaps provide an example. Regardless, unless the summary table is large, I don't see why a sort after the aggregation is a big deal
For generating aggregation within the Data Step, I would recommend looking into
- The DOW Loop
Additional papers can be found here
- Using Data Aggregation Using the SAS® Hash Object
Hope this helps
No such thing as updating an existing data set and at the same time creating a new column.
As @ballardw said, you can create a brand new data set that has the changes you want and has the same name as the previous data set (highly not recommended, it is a poor programming practice, but yes you can do that).
proc sql;
create table want as
select *, material as upn, propcase(country_name) as country,
put(date, yymmd7.) as mon_intrvl, intnx('month', date, 0, 'b') as sales_date,
delivery_qty as sales,
sum(delivery_qty) as total
from have
group by col1, col2, col3;
quit;
Is there a reason something like above wouldn't work?
@David_Billa wrote:
There were some attrib statements and format statements between data and
set lines from the initial post which I have not shown. In that case how to
tweak your proc sql?
Same way you would specify the attributes in any SQL, via column modifiers.
select put(....) as new_variable format = $8. length = $8.
If I want to handle this in data step I wish to know how to do it given
that we have to 'group by' few fields after sum.
It depends on the specifics. You have a few ways of doing this but all essentially recreate the table behind the scenes anyway. An exception to this may be if you're using a DB to store the data and writing directly to it. The code doesn't show the re-creation but it happens.
The SQL supports the group by without summarizing the data as long as you have some fields selected that are unique. It will not summarize the data.
Some options:
Whether your proc sql works if 'from' and 'create table' table name is same?
Yes, but it warns you this isn't a good idea and can lead to problems, which is why it's not recommended by anyone here. Just because you can, doesn't mean you should type deal.
data class;
set sashelp.class;
run;
proc sql;
create table class as
select *, put(age, 8.) as age_char format=$8.,
mean(weight) as avg_weight_by_gender
from class
group by sex, age;
quit;
Your step does create a new dataset. During execution, the file want.sas7bdat is read, and the results are written to want.sas7bdat.lck. When the step finishes successfully, want.sas7bdat is deleted, and want.sas7bdat.lck renamed to want.sas7bdat.
Anytime a new variable is added, or observations are inserted (not appended at the "end"), the whole dataset must be rewritten.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.