BookmarkSubscribeRSS Feed
acrosb
Calcite | Level 5

Hi, 

 

I have two data sets:

Data Set 1 Variables: Description (unique), Organization_ID, and Submit_Date

Data Set 2 Variables: Organization_ID, Commit_Date, and Amount 

Data looks like this:

DATASET1

Description                                                                   Organization_ID           Submit_Date (MMDDYY6.)

Customer was unhappy with product.                            A12                                 01052015

They completed the training and called for help            B22                                 02012015 

The customer asked for assistance with the product.    B22                                 02202015

No long want to do business                                          F02                                 03102015

 

DATASET2:

Organization_ID            Commit_Date (MMDDYY6.)            Amount 

C50                                  05012015                                           258 

D15                                  01122015                                          1589 

A12                                  01102015                                          100

B22                                  02152015                                          200

B22                                  02162015                                          50

B22                                  10152015                                          879 

 

 

For each Description, I want to find the total Amount for that Organization_ID, when the Submit_Date is within 6 months of the Commit_Date. In other words, for each Description, I want to see if/how much $ the submitting organization spent in the 6 months since submitting the description. 

 

Output I want:

DATASET1 with this column appended at the end:

 

TotalAmount

100

250

250

0

 

My very messy way of doing it was a left join by Organization_ID (which blows up the data set size since an org can have many records in the second data set - this would become an issue if I increased my data universe), then filtered to only records where dates within 6 months. Then I summed the Amount by Organization_ID.

 

Is there a more concise way of doing this? I'd like to have a program that goes through each Description, then looks through the records in the second table to find those with the same org and a date within 6 months, then return with either a 0 or a summation of the Amounts for the matching records. 

 

Any suggestions for certain procedures or methods I should look into?

 

Amanda

 

 

 

5 REPLIES 5
kiranv_
Rhodochrosite | Level 12

please show a sample of input and output data, someone will able to help you

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Please post test data in the form of a datastep to illustrate what you have, and show what you want out.

One method maybe to generate the code from the first, something like:

data _null_;
  set ds1;
  call execute(cat('proc means data=ds2; 
                      where organization_id="',organization_id,'" 
                        and ',commit_date,' <= submit_date <= ',intnx('month',commit_date,6),';
                      output out=tmp sum=sum;
                    run;'));
if _n_=1 then call execute('data want; set tmp; run;');
else call execute('data want; set want tmp; run;'); run;

This will sum each block and add it to the want dataset.

acrosb
Calcite | Level 5

Thank you for your help. This was how I figured the 6 month window:

 

 

Time=INTCK('month',submit_date, commit_date);

 

 

Then created a new data set....

 

where 0<Time<6;

 

 

The submit_date has to occur before the commit_date. Is this same logic being used here?

 

and ',commit_date,' <= submit_date <= ',intnx('month',commit_date,6),';

Is it looking for commit dates that occur within 6 months of the submit dates, or vice versa?

ballardw
Super User

Are you date variables actual SAS date values? Not character values that look like dates, not numeric values like 20180515 that mimic some form of date but a numeric value with hopefully a SAS date format.

 

 

acrosb
Calcite | Level 5

The dates are actual dates. (I'm using SAS EG) The icon for the columns are calendars, and hoovering over the field name the diaglog box says "Type: Date" 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1437 views
  • 0 likes
  • 4 in conversation