I will try to explain this the best I can, but I don't even really know what I'm asking for... I have account numbers, and sums of paid amounts split by quarter:
ACCOUNT_ID | Paid | Quarter | Invoice_Amount |
123456 | Y | 2019 Q1 | 1000 |
123456 | Y | 2019 Q3 | 1250 |
123456 | Y | 2019 Q4 | 900 |
Notice in this hypothetical, there was no invoice amount for 2019 Q2, so there is no observation... I would like to still have a row for 2019 Q2, with Invoice_amount as summed as 0, even though there aren't any Q2 claims in the dataset
Any ideas?
edit: I should also mention that the Quarter date is based on the data date range pulled, so I don't want to manually enter the missing date each time
Is quarter a character variable?
Yes, it is created with this:
,CASE WHEN DATEPART(C.SERVICE_DT) LE &PRIOR. THEN CAT(YEAR(&PRIOR.),' Q',QTR(&PRIOR.),' AND PRIOR') ELSE
cat(YEAR(DATEPART(C.SERVICE_DT)),' Q',QTR(DATEPART(C.SERVICE_DT))) END AS QUARTER
There are several ways to do this.. Here is one. I added another ACCOUNT_ID for multiple group demonstration.
data have;
input ACCOUNT_ID $ 1-6 Paid $ 8 Quarter $ 10-16 Invoice_Amount;
datalines;
123456 Y 2019 Q1 1000
123456 Y 2019 Q3 1250
123456 Y 2019 Q4 900
654321 Y 2019 Q2 1000
654321 Y 2019 Q4 1250
;
data want(drop=rc);
if _N_=1 then do;
declare hash h (dataset : 'have');
h.definekey ('ACCOUNT_ID', 'Quarter');
h.definedata (all : 'Y');
h.definedone();
end;
set have;
by account_id;
if first.account_id then do Quarter = '2019 Q1', '2019 Q2', '2019 Q3', '2019 Q4';
Invoice_Amount = 0;
rc = h.find();
output;
end;
run;
Result:
ACCOUNT_ID Paid Quarter Invoice_Amount 123456 Y 2019 Q1 1000 123456 Y 2019 Q2 0 123456 Y 2019 Q3 1250 123456 Y 2019 Q4 900 654321 Y 2019 Q1 0 654321 Y 2019 Q2 1000 654321 Y 2019 Q3 0 654321 Y 2019 Q4 1250
This is a classic complete the square problem.
There is an on going discussion of a similar issue on this thread. Just use one of the techniques there.
How is the date range available when the code is run?
Hi,
Try this:
data have;
infile cards dlm="\";
input ACCOUNT_ID Paid : $ 1. Quarter : $ 7. Invoice_Amount;
cards;
123456\Y\2019 Q1\1000
123456\Y\2019 Q3\1250
123456\Y\2019 Q4\900
123457\Y\2019 Q1\3000
123457\Y\2019 Q3\3250
;
run;
proc sort data = have;
by ACCOUNT_ID Quarter;
run;
proc sql;
create table _tmp_ as
select distinct ACCOUNT_ID
from have
;
run;
data _tmp_;
set _tmp_;
start = input("2018Q04", yyq7.);
length Quarter $ 7;
do i = 0 to 5;
Quarter = tranwrd(put(intnx("quarter", start, i), yyqp6.), ".", " Q");
output;
end;
keep ACCOUNT_ID Quarter;
run;
data want;
merge _tmp_ have;
by ACCOUNT_ID Quarter;
Invoice_Amount = coalesce(Invoice_Amount,0);
run;
All the best
Bart
Hmm, these solutions are all way above my head, to the point where I cannot apply them to my unique situation and get the output
@vanpeltm1785 What makes your actual problem more unique than the problem you posted?
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.