About edmb

PGStats · 10-06-2020

OK, the example only involves 3 card types, so it doesn't show any clustering at all. Now, imagine there are three card types with 10 cards of each type. I expanded the example with such cards (identified by variable cardId):

    /* Fake random data */
    data test;
        call streaminit(879767);
        do acc = 1 to 30000;
            cardType = rand("table", 0.4, 0.4);
            cardId = ((cardType - 1) * 10) + rand("integer", 10);
            app = rand("table", 0.4);
            tenure = rand("poisson", 10);
            calls = rand("poisson", 3);
            if cardType = 1 then do;
                transac = rand("poisson", 10);
                spend = rand("lognormal", log(100));
            end;
            else if cardType = 2 then do;
                transac = rand("poisson", 20);
                spend = rand("lognormal", log(1000));
            end;
            else do;
                transac = rand("poisson", 50);
                spend = rand("lognormal", log(2000));
            end;
            output;
        end;
    run;

    /* Use formats to define categories */
    proc format;
        value tenure
            0-12 = "new card"
            13-36 = "mid time card"
            37-high = "long time card";
        value calls
            0-2 = "few calls"
            3-5 = "mid calls"
            6-high = "lots of calls";
        value transac
            0-9 = "few transact"
            10-49 = "mid transact"
            50-high = "many transact";
        value spend
            0-100 = "low spend"
            100-1000 = "mid spend"
            1000-high = "high spend";
        value app
            1 = "App"
            2 = "No App";
        value cardType
            1 = "major"
            2 = "store"
            3 = "prepaid";
        value cardId
            1 = "Major 01" 2 = "Major 02" 3 = "Major 03" 4 = "Major 04" 5 = "Major 05"
            6 = "Major 06" 7 = "Major 07" 8 = "Major 08" 9 = "Major 09" 10 = "Major 10"
            11 = "Store 11" 12 = "Store 12" 13 = "Store 13" 14 = "Store 14" 15 = "Store 15"
            16 = "Store 16" 17 = "Store 17" 18 = "Store 18" 19 = "Store 19" 20 = "Store 20"
            21 = "Prepaid 21" 22 = "Prepaid 22" 23 = "Prepaid 23" 24 = "Prepaid 24" 25 = "Prepaid 25"
            26 = "Prepaid 26" 27 = "Prepaid 27" 28 = "Prepaid 28" 29 = "Prepaid 29" 30 = "Prepaid 30";
    run;

    /* Perform simple correspondence analysis */
    proc corresp data=test;
        format cardId cardId. app app. tenure tenure. calls calls.
               transac transac. spend spend.;
        tables cardId, app tenure calls transac spend;
    run;

Now, I guess you can see how the clustering of cards by card type is represented on the graph and how the angular (from the origin) proximity of explanatory categories shows their relationship with clusters.

Correspondence analysis is not yet very popular in the USA, but it has been in France, Japan, and elsewhere in the world under different names for a long time, especially in marketing research.

https://documentation.sas.com/?docsetId=statug&docsetTarget=statug_corresp_overview01.htm&docsetVersion=15.2&locale=en
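To read the angular proximity off numbers rather than off the plot, one option is to save the coordinates with the OUTC= option of PROC CORRESP and compute each point's angle from the origin. This is only a rough sketch, not part of the original post: the data set names coords and angles are made up for illustration, and it assumes the default two dimensions (Dim1, Dim2) and the formats defined in the example.

```sas
/* Sketch: rerun the analysis, saving row/column coordinates.       */
/* "coords" is a hypothetical output data set name.                 */
proc corresp data=test outc=coords;
    format cardId cardId. app app. tenure tenure. calls calls.
           transac transac. spend spend.;
    tables cardId, app tenure calls transac spend;
run;

/* Keep the row (OBS) and column (VAR) points and compute the       */
/* angle of each point from the origin, in degrees (-180 to 180).   */
data angles;
    set coords;
    where _type_ in ("OBS", "VAR");
    angle = atan2(Dim2, Dim1) * 180 / constant("pi");
    keep _type_ _name_ Dim1 Dim2 angle;
run;

/* Points with similar angles end up next to each other */
proc sort data=angles;
    by angle;
run;

proc print data=angles noobs;
run;
```

Cards and categories that land close together in the sorted listing point in roughly the same direction from the origin. Two caveats: angles wrap around at plus or minus 180 degrees, so the first and last rows can be neighbours, and the angle of a point lying very near the origin is not very meaningful.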

HB · 03-30-2018

Can you assume the first foreign transaction is the start date and go from there?

edmb · 07-25-2017

No worries, thank you both for your assistance.

edmb · 10-18-2016

Cheers to you both, though Chris's method in RW9's post worked the best for me.

edmb · 03-04-2016

Thank you SASKiwi, my company's SAS admin team is now going to look into it, at last! I shall pass your comments on to them. Many thanks 🙂

Online Status: Offline
Date Last Visited: 10-07-2020 12:22 PM

Re: Clustering Problem

Clustering Problem

Estimating Survival Times with Left-Truncated & Right-Censored Data

Re: Using macro inside PROC NLP results in error

Using macro inside PROC NLP results in error

Re: Copying and renaming a XLSX file using SAS 9.3

Copying and renaming a XLSX file using SAS 9.3

Re: Connecting outside Virtual Environment

Re: Estimating Survival Times with Left-Truncated & Right-Censored Dat...