Hi....I am trying to create the output that I was able to produce in the output table Want. I was only able to do so by creating a macro and arbitrary selecting "12" in the "%do I=1 %to 12;" statement. This seems to work as long as the number of iterations needed for each ID groups is no more than 12. My problem is that I don't know in advance how many iterations are actually needed and would like it to iterated as long as BegAmount1 is not missing for each ID group. Thanks.
data have;
format ID Counts Counts1 ShowAmount NumberDays NumberDaysMonth NumberDaysBeg NumberDaysEnd BegAmount best12.;
infile datalines4 dlm='7F'x missover dsd;
input ID Counts Counts1 ShowAmount NumberDays NumberDaysMonth NumberDaysBeg NumberDaysEnd BegAmount : best32.;
datalines4;
152 1 1 . 36 28 0 28 3650
152 1 2 . 36 8 0 8 3650
152 2 3 . 85 14 0 14 4100
152 2 4 . 85 31 0 31 4100
152 2 5 700 85 30 10 20 4100
152 2 6 . 85 10 0 10 4100
152 3 7 . 71 19 0 19 4650
152 3 8 450 71 29 28 1 4650
152 3 9 . 71 23 0 23 4650
152 4 10 . 85 25 0 25 4650
152 4 11 . 85 36 0 31 4650
152 4 12 . 85 29 0 29 4650
154 1 1 . 36 28 0 28 3650
154 1 2 . 36 8 0 8 3650
154 2 3 . 85 14 0 14 4100
154 2 4 . 85 31 0 31 4100
154 2 5 700 85 30 10 20 4100
155 2 1 . 85 10 0 10 4100
155 3 2 . 71 19 0 19 4650
155 3 3 450 71 29 28 1 4650
155 3 4 . 71 23 0 23 4650
155 4 5 . 85 25 0 25 4650
155 4 6 . 85 36 0 31 4650
155 4 7 . 85 29 0 29 4650
;;;;
data have;
set have;
if ShowAmount = .
then ShowAmount = 0;
run;
data have(drop=NumberDaysMonth2 NumberDays2);
set have;
by ID Counts Counts1 notsorted;
NumberDaysMonth2 = lag(NumberDaysMonth);
NumberDays2 = lag(NumberDays1);
NumberDays1 = NumberDays2 - NumberDaysMonth2;
if Counts ^=lag(Counts) then
NumberDays1 = NumberDays;
run;
data want2;
format BegAmount BegAmount1 EndAmount MonthlyAmount EndAmount1 ShowAmount bestx12.2;
set have;
by ID Counts Counts1 notsorted;
if first.ID and first.Counts and first.Counts1 then do;
BegAmount1 = BegAmount;
EndAmount = (BegAmount + ShowAmount);
NumberDays1 = NumberDays1;
MonthlyAmount = (((NumberDaysBeg/NumberDays1)*BegAmount1) + ((NumberDaysEnd/NumberDays1)*EndAmount));
EndAmount1 = (EndAmount - MonthlyAmount);
end;
run;
data want1;
format BegAmount EndAmount2 BegAmount2 ShowAmount bestx12.2;
set want2;
by ID notsorted;
if not first.Counts1 then
EndAmount2 = lag(EndAmount1);
else EndAmount2 = EndAmount2;
if not first.Counts1 then
BegAmount2 = lag(BegAmount);
else BegAmount2 = BegAmount2;
if not first.Counts1 then
NumberDays2 = lag(NumberDays1);
else NumberDays2 = NumberDays2;
if not first.Counts1 then
NumberDaysMonth2 = lag(NumberDaysMonth);
else NumberDaysMonth2 = NumberDaysMonth2;
if not first.Counts1 then
Counts2 = lag(Counts);
else Counts2 = Counts;
if missing(BegAmount1) then
BegAmount1 = BegAmount - BegAmount2 + EndAmount2;
else BegAmount1 = BegAmount1;
if missing(EndAmount) then
EndAmount = BegAmount - BegAmount2 + ShowAmount + EndAmount2;
else EndAmount = EndAmount;
if missing(NumberDays1) and Counts = Counts2 then
NumberDays1 = NumberDays2 - NumberDaysMonth2;
else NumberDays1 = NumberDays1;
if missing( MonthlyAmount) then
MonthlyAmount = (((NumberDaysBeg/NumberDays1)*BegAmount1) + ((NumberDaysEnd/NumberDays1)*EndAmount));
else MonthlyAmount = MonthlyAmount;
if missing(EndAmount1) then
EndAmount1 = (EndAmount - MonthlyAmount);
else EndAmount1 = EndAmount1;
output;
run;
%macro test();
%do i=1 %to 12;
data want1;
format BegAmount BegAmount1 EndAmount MonthlyAmount EndAmount1 EndAmount2 BegAmount2 ShowAmount bestx12.2;
set want1;
by ID notsorted;
if not first.Counts1 then
EndAmount2 = lag(EndAmount1);
else EndAmount2 = EndAmount2;
if not first.Counts1 then
BegAmount2 = lag(BegAmount);
else BegAmount2 = BegAmount2;
if not first.Counts1 then
NumberDays2 = lag(NumberDays1);
else NumberDays2 = NumberDays2;
if not first.Counts1 then
NumberDaysMonth2 = lag(NumberDaysMonth);
else NumberDaysMonth2 = NumberDaysMonth2;
if not first.Counts1 then
Counts2 = lag(Counts);
else Counts2 = Counts2;
if missing(BegAmount1) then
BegAmount1 = BegAmount - BegAmount2 + EndAmount2;
else BegAmount1 = BegAmount1;
if missing(EndAmount) then
EndAmount = BegAmount - BegAmount2 + ShowAmount + EndAmount2;
else EndAmount = EndAmount;
if missing(NumberDays1) and Counts = Counts2 then
NumberDays1 = NumberDays2 - NumberDaysMonth2;
else NumberDays1 = NumberDays1;
if missing( MonthlyAmount) then
MonthlyAmount = (((NumberDaysBeg/NumberDays1)*BegAmount1) + ((NumberDaysEnd/NumberDays1)*EndAmount));
else MonthlyAmount = MonthlyAmount;
if missing(EndAmount1) then
EndAmount1 = (EndAmount - MonthlyAmount);
else EndAmount1 = EndAmount1;
output;
run;
%end;
%mend test;
%test
data want(drop=BegAmount2 EndAmount2 NumberDays2 NumberDaysMonth2 Counts2 NumberDays1 EndAmount1 BegAmount);
retain ID Counts Counts1 NumberDays NumberDaysMonth NumberDaysBeg NumberDaysEnd ShowAmount BegAmount1 MonthlyAmount EndAmount;
set want1;
run;
data want;
set want;
rename BegAmount1 = BegAmount;
run;
Did you use 12 because it is the max number of observations of IDs ?
If positive, you can count it and assign it in a macro variable:
In first step:
data have;
format ID Counts Counts1 ShowAmount NumberDays NumberDaysMonth
NumberDaysBeg NumberDaysEnd BegAmount best12.;
retain count_iter max_iter saved_id 0;
drop count_iter max_iter saved_id ;
infile datalines4 dlm='7F'x missover dsd end=eof;
input ID Counts Counts1 ShowAmount NumberDays NumberDaysMonth
NumberDaysBeg NumberDaysEnd BegAmount : best32.;
if ID ne saved_ID then do;
max_iter = max(max_iter, count_iter);
count_iter=0;
end;
count_iter +1;
if eof then call symput('max_iter' , strip(max_iter));
datalines4;
....
then use the macro variable instead the constant 12:
%macro test();
%do i=1 %to &max_iter;
....
beyond, I'm not sure that your code works as you expect, because of using LAG function under conditions.
I don't know what you expect. Since your Want1 uses many instances of "If not first.counts1" and you do not have Counts1 on the by statement the First is always undefined (missing) and and so 'not first.counts1' is always true. So any result is very suspect as to the output matching you desire.
Consider:
proc sort data=sashelp.class out=work.class; by sex; run; data example; set work.class; by sex; if not first.age then newname=lag(name); run; /* is exactly the same as*/ data example2; set work.class; newname=lag(name); run;
I have to guess since you chose 12 loops that you may be doing something based on months of a year. If so say so. I have a suspicion that you really need something like a start date, an end date and the rules of what you are doing. Intnx and Intck likely get you any of the intermediate periods you may need. But again, an actual rule or two would help.
And what are you trying to do. 'match an example' is fraught with peril as without an idea of what is supposed to happen it is hard to generalize to a rule (i.e. code)
Hi Ballardw...thank you for your response and the suggestion.....I will make the corrections to deal with the "if not first.counts1" .
The "12" was chosen since that was the number of iterations necessary for all the variables/fields to populate since many of them depends on the previous record values.
Did you use 12 because it is the max number of observations of IDs ?
If positive, you can count it and assign it in a macro variable:
In first step:
data have;
format ID Counts Counts1 ShowAmount NumberDays NumberDaysMonth
NumberDaysBeg NumberDaysEnd BegAmount best12.;
retain count_iter max_iter saved_id 0;
drop count_iter max_iter saved_id ;
infile datalines4 dlm='7F'x missover dsd end=eof;
input ID Counts Counts1 ShowAmount NumberDays NumberDaysMonth
NumberDaysBeg NumberDaysEnd BegAmount : best32.;
if ID ne saved_ID then do;
max_iter = max(max_iter, count_iter);
count_iter=0;
end;
count_iter +1;
if eof then call symput('max_iter' , strip(max_iter));
datalines4;
....
then use the macro variable instead the constant 12:
%macro test();
%do i=1 %to &max_iter;
....
beyond, I'm not sure that your code works as you expect, because of using LAG function under conditions.
Hi Shmuel...thanks for your response. Yes you are correct. I selected 12 because it was the maximum number of records when the records are grouped by ID's and it would result in all the required fields to be populated after the completion of the number of iterations. I guess I can always select a large enough number to ensure the number of iterations necessary are performed but not too sure how efficient that would be.
And still haven't described what this is actually supposed to accomplish.
Often seeing that many variables named "numberdays" or similar tells me that likely DATES should be involved somewhere, at least in a description. And the functions INTCK which will return values like "number of days between two dates" maybe should be in the running for consideration.
Hi....what is actually supposed to be accomplished is to obtain the MonthlyAmount which is the amount of revenue received for a project that is allocated to each month. The allocation is based on the revenue that has been received to date and the work must that has be done, that is revenue is only recognized if the revenue has been received and work that has been completed. Since projects can take several months and revenue can be received before the start of the project or during the completion of the project, I need to pay attention to the number of days in each month that the project was being worked on as this is used to calculate the proportion of the revenue that has been received for that particular month. The Beginning Amount and End Amount for each month is crucial in the calculation of the Monthly Amount. Since revenue can be received during the completion of the project, the iterate loop is why I thought it is necessary to add the new revenue received in the month it was received and adjust the Month's Begining Amount and End Amount. I hope this is helpful....thanks.
Hi Shmuel....thanks for your help and suggestion.....I wasn't able to get the suggestion of creating max_iter in the data have but ended of creating a dataset of the number of rows for each ID and then taking the maximum number from all ID's and then assigned it to a macro variable. I also ended up defining all of the lag variable first before calling them. Nevertheless, it seems to work.....thanks once again.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.