BookmarkSubscribeRSS Feed
KJazem
Obsidian | Level 7

I am trying to implement some difficult logic in SAS, and since I don't have much experience in macros, which I assume I have to use, can someone please give me a bit of direction of how you might do this? 

 

The high-level logic goes something like this. We have a base of 10million, and we choose batches from that. Batches are randomly selected. If someone opens an application at any stage, they are completely excluded from all batches for 1 month.

 

 

Day 1: 10k batch -> CATEGORY A
Day 2:
New 10k batch excluding day 1 batch -> CATEGORY A
From day 1 customers (in CATEGORY A), if they open application -> STOP
                                      else they go to CATEGORY B
Day 3:
New 10k batch excluding day 1 and day 2 batch -> CATEGORY A
From day 2 customers (in CATEGORY A), if they open application -> STOP
                                      else they go to CATEGORY B
From day 2 customers (in CATEGORY B), if they open application -> STOP
                                      else they go to CATEGORY C

 

 

And this goes on until Day 5 (and to CATEGORY E) and resets after - so back to Day 1 etc. Some categories can have a capacity - so if out of the 10k batch, 5k go to CATEGORY D, which has a capacity of 2k, only 2k out of 5k can go. The others are ignored. These are chosen based on a value.  

 

 

7 REPLIES 7
Astounding
PROC Star

While this doesn't sound like a macro application, there are design issues to consider before programming begins.  For example ....

 

Do you intend to pull all the categories at once, or will you pull a new category as needed?  Or a new category every month?

 

It would be a good idea to define "month" ... calendar month, 30-day time period ... what exactly does "month" mean?

 

If a person is already assigned to a category, but then opens an application, what happens to that person?  How does your master data set change?

 

If enough time passes, can a person be re-assigned to a second category?

 

Might A, B, C be a poor choice for categories?  You might need more than 26 categories.  If you are not assigning all categories at the same time, it might pay to use "date of assignment to category" instead of A, B, C.

KJazem
Obsidian | Level 7
  • Categories are assigned per day basis. Day 1 only has CATEGORY A, day 2 has A and B, etc. 5 days pass, and the flow resets.
  • Month is 30 days; when a customer is assigned a category, they're excluded from all other categories for 30 days. After that, he can again be assigned to any category.
  • If a person is already assigned a category means he opened an application. What happens after that is irrelevant for 30 days. If they open another application during the 30 day period, we just ignore.
  • If enough time passes, a customer can be assigned a different category. Even if their recommended product (see below explanation for opening application) is changed 5 days after being assigned to a category, we ignore. 30 days pass, and we can include them in a new batch

 

To clarify on what opening application is; every customer gets a recommended product. If we recommend product A, and that customer starts an application for it the next day, he is assigned to a category. Categories are basically baskets; each customer has to satisfy a few requirements before entering that basket.

 

You must open an application to get assigned a category. If you don't qualify for that category (or categories) on a particular day, you're removed from that day and pushed to next day (maybe you'll qualify for the next available category). They could well not qualify for any basket, and these customers are completely ignored.


It's still not 100% clear even me but hopefully I clarified at least half of it. If not macros, how do you think this can be approached? 

Astounding
PROC Star

I'm not sure I can devote enough time to this today, but I can (I hope) get you moving in the right direction.

 

To write a macro, you need to first have working non-macro code.  So picture what the data looks like now, and what it would look like after extracting category A.  If you are not able to write the program to do this, just describe what would need to happen to change the data from its original form when you assign category A for the first time.  While you probably will not need macros, there is an outside chance that you may.  Either way, you will need to start with the non-macro code.

Reeza
Super User
Can you provide examples, small ones, of what the input data would look like? Format and data types are helpful.
For example you have your starting data and the data that shows who opens applications.
This isn't that complicated but you basically have to update every day and the data sets will change every day...
KJazem
Obsidian | Level 7

Everyday we generate, for X million customers, a recommended product. The table looks like this. The value column is used when creating daily batches - sort descending by product and by value and take highest 10k batch etc.; but at the moment, that might not be applicable as we want to take a completely random batch regardless of value. The flags will be used as filters to assign categories; ex: if for customer 1, flag1 = 1 and flag2 = 1, they can be assigned CATEGORY A, and so on. We save a 60 day history of this table.

id product value flag1 flag2 flag3 run_date
1 A 22.52 1 0 1 18.10.2022
2 B 13.53 1 1 0 18.10.2022
3 A 55.21 1 1 1 18.10.2022

 

Based on our recommendations, a customer can choose to open an application for that product. To deliver or communicate the product to the customer, we assign customers into categories. These categories ensure the product is delivered to the customer in different ways; ex, over mobile or through email, etc. 

 

Also, everyday a new table is generated for open applications. From this table, we can know if a customer 'reacted' to the product we recommended and was assigned to a particular category, by opening an application. We save a 5 year history of this table.

id product application_date
1 A 26.10.2022
3 B 27.10.2022

 

A couple of notes:

  • We have to generate a new 10k batch everyday, and those 10k are always assigned CATEGORY A unless they are disqualified (from the flags). If they are disqualified, they are pushed into the next day, and can be assigned to CATEGORY B if they qualify. 
  • If you open an application the next day (after being assigned a category the previous day), it's considered a 'success' and you're out. You can't be assigned a category for 30 days, even if we recommend you a different product during that time.
  • CATEGORY D and CATEGORY E have daily capacities. If we have a potential of 25k customers to be assigned to either category but have a daily capacity of 5k for each, we take the 5k customers with the highest values and assign those. The rest of 20k are pushed to the next day's batch.
KJazem
Obsidian | Level 7

@Reeza @Astounding 

 

I drew a couple of quick visuals of the workflow:

 

KJazem_0-1666975127050.png

KJazem_1-1666975157585.png

Clarifications:

 

  • If they did NOT open application on day 2 (after being assigned to category A) AND they are not eligible to category B, they skip a day and are moved to category C on day 3.
  • Category D and E have 'capacities'; ex: if 25k on day 3 did not open an application (after being assigned to category C) AND are eligible for category D, only 5k can be assigned to category D. They are dropped for day 4 and are either, 1) moved to next week's batch or 2) moved to the next day batch. I would like to implement both as we have no decision on this yet. Similarly for category E.
  • 10K on day 2 are customer the next day's batch who are eligible for category A. There are fresh batches everyday.
  • 9k in category B are customers who did NOT open an application (after being assigned to category A) AND are eligible for category B.
  • Customers who open an application at any point are excluded from this flow for 30 days.
  • If after being assigned category E (last category) and customer has not opened an application, they are excluded from this flow for 30 days.
KJazem
Obsidian | Level 7
A couple of more notes. To be eligible for:

category A, flag1 = 1 or flag2 = 1
category B, flag2 = 1
category C, flag1 = 1 and flag3 = 1
category D, flag2 = 1
category E, flag3 = 1

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1315 views
  • 7 likes
  • 3 in conversation