BookmarkSubscribeRSS Feed
Aar684
Calcite | Level 5
Hi all,

I have a csv dataset that includes an account number, a date, and 24 hours of data per day. Each row contains one day of a month of the year. All account numbers have 365 days.

So the sample data set is as follows:

account# date hour1 hour2 hour3 ...............hour24

There are around 50 account numbers in the data set totaling somewhere in the range of 4000 or so rows. I need to find the maximum value by MONTH and return that value for each account number for each month. So, in the end I will have 12 max values for each account number. HELP! PLEASE. The code, minus the import that I have so far is as follows :

data usagesum;
set kva end = eof;
retain maxuse;
month = month(date);
if (day(date) = 1) and not (_n_ = 1) then do;
maxuse=0;
end;
maxuse= max(of hour1 - hour24);
if last.account or month ne month(date+1) then output;
run;

I am getting the max but not of the entire month. Message was edited by: Aar684
8 REPLIES 8
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
You are using what's called "BY GROUP PROCESSING" -- the FIRST. and/or LAST. must also include a BY statement. So, you will need to derive all variables, sort the data, then have a DATA step with your max-value derivation logic.

Scott Barry
SBBWorks, Inc.

Recommended Google advanced search argument, this topic/post:

by group processing site:sas.com
Aar684
Calcite | Level 5
Very confusing. Is my max-value derivation logic even remotely correct?
Aar684
Calcite | Level 5
Scott,

I am sorry. My knowledge is very basic. I was getting help from someone here who isn't available to me at the moment. I am lost, but would like to learn. Unfortunately I have a tight time constraint. Am I any where near where I need to be?
Flip
Fluorite | Level 6
You are missing the BY ACCOUNT MONTH;
statement after the SET statement.
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Yes, you are close, but your MONTH derivation code must be in a separate DATA step. You cannot have a SET with a BY where the variable is not yet derived -- that's why you need multiple DATA steps -- suggest adding a SORT as mentioned.

Scott Barry
SBBWorks, Inc.
Doc_Duke
Rhodochrosite | Level 12
Try changing the third from the last line from
maxuse= max(of hour1 - hour24);
to
maxuse= max(of maxuse hour1 - hour24);
. That should include the max from the previous records in the month.

Doc Muhlbaier
Duke [This is in addition to Scott's comments.]


Message was edited by: Doc@Duke
Doc_Duke
Rhodochrosite | Level 12
OK. You were close. I think you can actually do this without BY-group processing; my code is not tested. :

PROC SORT DATA=kva; BY account date; RUN;
* assumes dates are SAS dates, not text strings;

data usagesum;
set kva; * end = eof; * end not needed.;
retain maxuse;
month = month(date);
if (day(date) = 1) /* and not (_n_ = 1) */ then do; * RETAINed variables initialized to 0 for _N_=1 already.;
maxuse=0;
end;
maxuse= max(of MAXUSE hour1 - hour24); * added maxuse to accumulate through the month.;
PUT account date maxuse; * debugging code to see if the accumulatior is working;
if /* last.account or */ month ne month(date+1) then output; * "last" not needed.;
run;

Give this a try and see where you get.

Doc

Message was edited by: Doc@Duke Message was edited by: Doc@Duke
Aar684
Calcite | Level 5
Ok guys. Trying your suggestions and will report back. Didn't have a chance to look at this again as I was out sick. I am going to plugging away today. Thanks all.
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2185 views
  • 0 likes
  • 4 in conversation