BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
corji
Calcite | Level 5

Hi, I'm having a little difficulty with the data below:


datalines;
01OCT2017 XX-1 XX  0
01JAN2018 XX-1 XX  7320
01APR2018 XX-1 XX 30240
01JUL2018 XX-1 XX  40600
01OCT2018 XX-1 XX  45360
01JAN2019 XX-1 XX  29302
01APR2019 XX-1 XX  0
01OCT2017 XX-2 XX  0
01JAN2018 XX-2 XX  7320
01APR2018 XX-2 XX  30240

 

In the above data each individual, signified with the -(number) is part of the overall group XX. Each individual was given a certain amount of money, denoted by the final column; however, the numbers shown are incorrect, as they signify the total for the group, not the individual (as in, for period 1 Jan 2018, 7320 was issued for XX in total, not just XX-1).

 

I want to find a way to report out the money given out per time period for XX without the duplicating of values in the money column. Ideally it should look something like this:

 

Time Period   Group   Money Issued

01OCT2017   XX        0
01JAN2018    XX        7320
01APR2018   XX        30240
01JUL2018    XX        40600

 

and so on.

 

Could someone help with thinking this through?

1 ACCEPTED SOLUTION

Accepted Solutions
unison
Lapis Lazuli | Level 10

If I understand what you're saying, try:

proc sort data=have(drop=member) nodupkey out=want;
	by time_period group;
run;
-unison

View solution in original post

6 REPLIES 6
PaigeMiller
Diamond | Level 26

I don't understand the problem. From your output, it seems like you just want to remove XX-1 or XX-2 from each record. Is the logic really that simple, or are there other steps? What happened to the other lines in the input that are not represented in the output?

--
Paige Miller
corji
Calcite | Level 5

It's a bit more complicated than that (at least in my head). I basically want to keep only one individual (say XX-1) for all time periods, using that as the basis for XX.

 

But there are multiple groups too, so I'd want to keep 1 of XX, one of AA, etc. Does that make sense?

unison
Lapis Lazuli | Level 10

I think you're looking for the unique entries by time period. nodupkey as it's used below will give you the desired output.

data have;
	input time_period :date9. member $ group $ money_issued;
	format time_period date9.;
	datalines;
01OCT2017 XX-1 XX 0
01JAN2018 XX-1 XX 7320
01APR2018 XX-1 XX 30240
01JUL2018 XX-1 XX 40600
01OCT2018 XX-1 XX 45360
01JAN2019 XX-1 XX 29302
01APR2019 XX-1 XX 0
01OCT2017 XX-2 XX 0
01JAN2018 XX-2 XX 7320
01APR2018 XX-2 XX 30240
;
run;

data desired_output;
	input time_period :date9. group $ money_issued;
	format time_period date9.;
	datalines;
01OCT2017 XX 0
01JAN2018 XX 7320
01APR2018 XX 30240
01JUL2018 XX 40600
01OCT2018 XX 45360
01JAN2019 XX 29302
01APR2019 XX 0
;
run;

proc sort data=have(drop=member) nodupkey out=want;
	by time_period;
run;

proc compare base=desired_output compare=want;
run;
-unison
corji
Calcite | Level 5

I think the issue with this solution is that it de-dupes the time periods as well, which I want to keep as is: the periods need to remain as there are other unique groups (think YY to XX, ZZ etc.) that need to have that time period as an identifier. XX isn't the only group.

unison
Lapis Lazuli | Level 10

If I understand what you're saying, try:

proc sort data=have(drop=member) nodupkey out=want;
	by time_period group;
run;
-unison
corji
Calcite | Level 5

This was the one that worked! Thank you.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 961 views
  • 1 like
  • 3 in conversation