Solved: Help with rolling data into one row

corji · Posted 01-09-2020 06:03 PM

Hi, I'm having a little difficulty with the data below:

datalines;
01OCT2017 XX-1 XX 0
01JAN2018 XX-1 XX 7320
01APR2018 XX-1 XX 30240
01JUL2018 XX-1 XX 40600
01OCT2018 XX-1 XX 45360
01JAN2019 XX-1 XX 29302
01APR2019 XX-1 XX 0
01OCT2017 XX-2 XX 0
01JAN2018 XX-2 XX 7320
01APR2018 XX-2 XX 30240

In the data above, each individual (identified by the -(number) suffix) is part of the overall group XX. Each individual appears to have been given a certain amount of money, shown in the final column; however, those numbers are misleading, because they are the total for the group, not for the individual (for example, for the period 1 Jan 2018, 7320 was issued to XX in total, not just to XX-1).

I want to find a way to report the money issued per time period for XX without duplicating the values in the money column. Ideally it should look something like this:

Time Period Group Money Issued

01OCT2017 XX 0
01JAN2018 XX 7320
01APR2018 XX 30240
01JUL2018 XX 40600

and so on.

Could someone help with thinking this through?

PaigeMiller · Posted 01-09-2020 06:12 PM

I don't understand the problem. From your output, it seems like you just want to remove XX-1 or XX-2 from each record. Is the logic really that simple, or are there other steps? What happened to the other lines in the input that are not represented in the output?

--
Paige Miller

corji · Posted 01-09-2020 06:28 PM

It's a bit more complicated than that (at least in my head). I basically want to keep only one individual (say XX-1) for all time periods, using that as the basis for XX.

But there are multiple groups too, so I'd want to keep one of XX, one of AA, etc. Does that make sense?

unison · Posted 01-09-2020 11:05 PM

I think you're looking for the unique entries by time period. nodupkey as it's used below will give you the desired output.

data have;
	input time_period :date9. member $ group $ money_issued;
	format time_period date9.;
	datalines;
01OCT2017 XX-1 XX 0
01JAN2018 XX-1 XX 7320
01APR2018 XX-1 XX 30240
01JUL2018 XX-1 XX 40600
01OCT2018 XX-1 XX 45360
01JAN2019 XX-1 XX 29302
01APR2019 XX-1 XX 0
01OCT2017 XX-2 XX 0
01JAN2018 XX-2 XX 7320
01APR2018 XX-2 XX 30240
;
run;

data desired_output;
	input time_period :date9. group $ money_issued;
	format time_period date9.;
	datalines;
01OCT2017 XX 0
01JAN2018 XX 7320
01APR2018 XX 30240
01JUL2018 XX 40600
01OCT2018 XX 45360
01JAN2019 XX 29302
01APR2019 XX 0
;
run;

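/* Drop the member column, then keep one row per time period. */
proc sort data=have(drop=member) nodupkey out=want;
	by time_period;
run;

/* Check that the de-duplicated result matches the desired output. */
proc compare base=desired_output compare=want;
run;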
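
Because nodupkey keeps a single row per key and discards the rest regardless of their amounts, it may be worth confirming first that every member of a group really carries the same amount for a given period. The following is only a minimal sketch of such a check, not part of the original answer; the data set names distinct_amounts and conflicts are made up for illustration, and the input is the same data as in the thread.

```sas
/* Same data as in the thread. */
data have;
	input time_period :date9. member $ group $ money_issued;
	format time_period date9.;
	datalines;
01OCT2017 XX-1 XX 0
01JAN2018 XX-1 XX 7320
01APR2018 XX-1 XX 30240
01JUL2018 XX-1 XX 40600
01OCT2018 XX-1 XX 45360
01JAN2019 XX-1 XX 29302
01APR2019 XX-1 XX 0
01OCT2017 XX-2 XX 0
01JAN2018 XX-2 XX 7320
01APR2018 XX-2 XX 30240
;
run;

/* One row per distinct (time_period, group, money_issued) combination. */
/* distinct_amounts is a hypothetical name used only in this sketch.    */
proc sort data=have(keep=time_period group money_issued)
          nodupkey out=distinct_amounts;
	by time_period group money_issued;
run;

/* A (time_period, group) key that still has more than one row here   */
/* carries conflicting amounts. Rows that are both first and last in  */
/* their BY group are the only row for that key and are filtered out. */
data conflicts;
	set distinct_amounts;
	by time_period group;
	if not (first.group and last.group);
run;

/* Prints nothing when every key has a single amount. */
proc print data=conflicts noobs;
run;
```

With the data posted in the thread, conflicts should come out empty, since XX-1 and XX-2 repeat the same group totals for each period.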

-unison

corji · Posted 01-10-2020 12:21 PM

I think the issue with this solution is that it de-dupes on the time period alone, which I need to keep as is: the periods need to remain because there are other groups (think YY and ZZ alongside XX) that need that time period as an identifier. XX isn't the only group.

unison · Posted 01-10-2020 02:21 PM

If I understand what you're saying, try:

/* Adding group to the BY statement keeps one row per time period within each group. */
proc sort data=have(drop=member) nodupkey out=want;
	by time_period group;
run;
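
To see the effect with more than one group, here is a minimal sketch on a small test set. The XX rows are taken from the thread; the YY rows, their amounts, and the data set names have_multi and want_multi are assumptions made up purely for illustration.

```sas
/* XX rows come from the thread; the YY rows are hypothetical. */
data have_multi;
	input time_period :date9. member $ group $ money_issued;
	format time_period date9.;
	datalines;
01OCT2017 XX-1 XX 0
01JAN2018 XX-1 XX 7320
01OCT2017 XX-2 XX 0
01JAN2018 XX-2 XX 7320
01OCT2017 YY-1 YY 500
01JAN2018 YY-1 YY 900
01OCT2017 YY-2 YY 500
01JAN2018 YY-2 YY 900
;
run;

/* One row per (time_period, group): rows repeated across members */
/* collapse, but XX and YY each keep a row for every period.      */
proc sort data=have_multi(drop=member) nodupkey out=want_multi;
	by time_period group;
run;

proc print data=want_multi noobs;
run;
```

This should leave four rows, one for each combination of the two periods and the two groups. Sorting by time_period alone would instead keep only one row per period, so one of the two groups would be lost from each period.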

-unison

corji · Posted 01-13-2020 02:19 PM

This was the one that worked! Thank you.
