BookmarkSubscribeRSS Feed
Adam_
Calcite | Level 5

Hello ,

 

I am trying to calculate the total amount that we have received from a customer at different points of time. But,  I am not sure if there is an easy way to do it.

Let me share my data ;

 

IDPeriodAmount
123Jan-1710
123Feb-1720
123Mar-1730
123Apr-1740
123May-1750
123Jun-1760
123Jul-1770
123Aug-1780

 

For example I want to calculate how much the customer (ID=123)  has paid since Feb-17 until Aug-17, so that will gives me this

CheckpointIDPeriodAmount
Feb-17123Feb-1720
Feb-17123Mar-1730
Feb-17123Apr-1740
Feb-17123May-1750
Feb-17123Jun-1760
Feb-17123Jul-1770
Feb-17123Aug-1780

 

And I would like to repeat that for all these months ; Hence my final data set will look like that

CheckPointIDPeriodAmount
Feb-17123Feb-1720
Feb-17123Mar-1730
Feb-17123Apr-1740
Feb-17123May-1750
Feb-17123Jun-1760
Feb-17123Jul-1770
Feb-17123Aug-1780
Mar-17123Mar-1730
Mar-17123Apr-1740
Mar-17123May-1750
Mar-17123Jun-1760
Mar-17123Jul-1770
Mar-17123Aug-1780
Apr-17123Apr-1740
Apr-17123May-1750
Apr-17123Jun-1760
Apr-17123Jul-1770
Apr-17123Aug-1780
May-17123May-1750
May-17123Jun-1760
May-17123Jul-1770
May-17123Aug-1780
Jun-17123Jun-1760
Jun-17123Jul-1770
Jun-17123Aug-1780
Jul-17123Jul-1770
Jul-17123Aug-1780
Aug-17123Aug-1780

 

 

The way I have done it is ;

 

Data feb17;

set have(keep= id period);

where period >=Feb-17;

checkpoint=Feb-17;

run;

 

Data t1;

merge feb17( in=a)

have (in=b);

by id period;

if a;

run;

 

But this code takes ages to run and creates a huge final data set.

 

I was wondering if there is another way to do it ?

 

 

Thank you in advance

 

 

 

10 REPLIES 10
PeterClemmensen
Tourmaline | Level 20

@Adam_  welcome to the SAS community 🙂 Is Period a numeric or character variable?

 

Adam_
Calcite | Level 5

period is numeric. It is actually a date formatted as yymmn6.

 

But if that is a problem I can easily convert it to a character.

Shmuel
Garnet | Level 18

You can use proc summary as in:

proc summary data=have(where=(period > '01FEB2017'd );
   class id period;
format period yymmn6.; /* or yymms7. */ var amount; output out=want sum=; run;
Adam_
Calcite | Level 5

Hello  ,

 

Thanks for your reply.

 

That won't work for my case. Because I won't be able to see how much the customer has paid each month.

And I will I have to repeat it , several times and then to merge all the outputs

PaigeMiller
Diamond | Level 26

But this code takes ages to run and creates a huge final data set.

 

Well, that's not surprising if the real data set (not the example you show) is large. 

 

I'm really wondering why you need the final output in the form you show it in, what do you plan to do with it in that final output form? I'mg guessing that there might be another way to get what you want without creating such a huge data set.

--
Paige Miller
Adam_
Calcite | Level 5

Hello Paige,

 

Indeed , my data set contains more than 7m of records.

 

I don't need my final output to have this output.

 

I have this data set

AccountPeriodpayment
123Jan-1710
123Feb-1720
123Mar-1730
123Apr-1740
123May-1750
123Jun-1760
123Jul-1770
123Aug-1780

 

and  what I really want to do is

 

Checkpoint     Customer   Month1  Month2 Month3

Feb-17          123            20             30      40

Mar-17          123            30             40      50

 

Where month1 -  monthN  is the difference between the checkpoint and the period

 

I wasn;t sure how to do it ; So I have created that huge data set and then I have used the proc means (Class Statement has the Checkpoint)

 

 

 

Adam_
Calcite | Level 5

I was wondering if I can use arrays to calculate this?

any ideas?

PaigeMiller
Diamond | Level 26

@Adam_ wrote:

Hello Paige,

 

Indeed , my data set contains more than 7m of records.

 

I don't need my final output to have this output.

 

I have this data set

Account Period payment
123 Jan-17 10
123 Feb-17 20
123 Mar-17 30
123 Apr-17 40
123 May-17 50
123 Jun-17 60
123 Jul-17 70
123 Aug-17 80

 

and  what I really want to do is

 

Checkpoint     Customer   Month1  Month2 Month3

Feb-17          123            20             30      40

Mar-17          123            30             40      50

 

Where month1 -  monthN  is the difference between the checkpoint and the period

 

I wasn;t sure how to do it ; So I have created that huge data set and then I have used the proc means (Class Statement has the Checkpoint)

 

 

 


This still doesn't answer WHY you need the data in a different form, and it doesn't answer what analysis you intend to do with it in this form that you can't do in its original form. I ask because there may be other ways to get to the desired analysis without doing this transformation of the data.

 

I don't think ARRAYs work given your original data set.

--
Paige Miller
Adam_
Calcite | Level 5
Hi,

I have a data set with the monthly outstanding balances from all my customers, from 2000 until now. This isn't at customer level.

The data set that I have mentioned, contains all the payments that I have received from my customers from 1990 until today. And it is obviously at customer level.

What I 'm trying to do is;

For any given month, let's say Jan - 00, to track all the monthly payments (by using the data set that i have shown to you) that I have received from that point until today. And I have to repeat that for all months from Jan-00 until today.

That's why I am trying to put my data set in that format.

Given the size of the data sets, I think that it would be better to reference somehow to these payments, instead of transforming my data
But I don't know if that is feasible.
PaigeMiller
Diamond | Level 26

@Adam_ wrote:



For any given month, let's say Jan - 00, to track all the monthly payments (by using the data set that i have shown to you) that I have received from that point until today. And I have to repeat that for all months from Jan-00 until today.




Well, either I'm lost, or your explanation is lacking some key detail, and maybe it's best to stop here, but I'll try one more time.


What benefit do you gain by performing this rearrangement? What additional analyses are going to happen once you have something to work with?

 

And why can't you track the monthly payments with the data set in its original form? What do you even mean by "track" (because my understanding of "track" is that I can "track" the payments with the data in its original form)?

--
Paige Miller

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 637 views
  • 0 likes
  • 4 in conversation