BookmarkSubscribeRSS Feed
acrosb
Calcite | Level 5

Hi, 

 

I have two data sets:

Data Set 1 Variables: Description (unique), Organization_ID, and Submit_Date

Data Set 2 Variables: Organization_ID, Commit_Date, and Amount 

Data looks like this:

DATASET1

Description                                                                   Organization_ID           Submit_Date (MMDDYY6.)

Customer was unhappy with product.                            A12                                 01052015

They completed the training and called for help            B22                                 02012015 

The customer asked for assistance with the product.    B22                                 02202015

No long want to do business                                          F02                                 03102015

 

DATASET2:

Organization_ID            Commit_Date (MMDDYY6.)            Amount 

C50                                  05012015                                           258 

D15                                  01122015                                          1589 

A12                                  01102015                                          100

B22                                  02152015                                          200

B22                                  02162015                                          50

B22                                  10152015                                          879 

 

 

For each Description, I want to find the total Amount for that Organization_ID, when the Submit_Date is within 6 months of the Commit_Date. In other words, for each Description, I want to see if/how much $ the submitting organization spent in the 6 months since submitting the description. 

 

Output I want:

DATASET1 with this column appended at the end:

 

TotalAmount

100

250

250

0

 

My very messy way of doing it was a left join by Organization_ID (which blows up the data set size since an org can have many records in the second data set - this would become an issue if I increased my data universe), then filtered to only records where dates within 6 months. Then I summed the Amount by Organization_ID.

 

Is there a more concise way of doing this? I'd like to have a program that goes through each Description, then looks through the records in the second table to find those with the same org and a date within 6 months, then return with either a 0 or a summation of the Amounts for the matching records. 

 

Any suggestions for certain procedures or methods I should look into?

 

Amanda

 

 

 

5 REPLIES 5
kiranv_
Rhodochrosite | Level 12

please show a sample of input and output data, someone will able to help you

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Please post test data in the form of a datastep to illustrate what you have, and show what you want out.

One method maybe to generate the code from the first, something like:

data _null_;
  set ds1;
  call execute(cat('proc means data=ds2; 
                      where organization_id="',organization_id,'" 
                        and ',commit_date,' <= submit_date <= ',intnx('month',commit_date,6),';
                      output out=tmp sum=sum;
                    run;'));
if _n_=1 then call execute('data want; set tmp; run;');
else call execute('data want; set want tmp; run;'); run;

This will sum each block and add it to the want dataset.

acrosb
Calcite | Level 5

Thank you for your help. This was how I figured the 6 month window:

 

 

Time=INTCK('month',submit_date, commit_date);

 

 

Then created a new data set....

 

where 0<Time<6;

 

 

The submit_date has to occur before the commit_date. Is this same logic being used here?

 

and ',commit_date,' <= submit_date <= ',intnx('month',commit_date,6),';

Is it looking for commit dates that occur within 6 months of the submit dates, or vice versa?

ballardw
Super User

Are you date variables actual SAS date values? Not character values that look like dates, not numeric values like 20180515 that mimic some form of date but a numeric value with hopefully a SAS date format.

 

 

acrosb
Calcite | Level 5

The dates are actual dates. (I'm using SAS EG) The icon for the columns are calendars, and hoovering over the field name the diaglog box says "Type: Date" 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 2066 views
  • 0 likes
  • 4 in conversation