qkaiwei
Calcite | Level 5

I'm using path analysis to analyze web-log data, but the data quality is poor, and there is no time to rebuild the data-collection mechanism.

For example:

1. No session, only uid + date.

2. No sequence: each step (referrer, target) is recorded, but there is no URL request time (sequence), so we don't know exactly which step comes first and which follows.

There are many difficulties, but the analysis must be done. The possible solution I have in mind is:

1. Treat uid + date as a fake session, because 80% of visitors visit the website once a day.

2. Set the sequence variable in each fake session to a constant, such as 1, which means no sequence.
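To make the two workaround steps concrete, here is a minimal sketch in Python (not SAS) of how the fake sessions could be built. The row layout, the sample values, and the function name `build_fake_sessions` are assumptions for illustration only; the real log columns may differ.

```python
from collections import defaultdict

# Hypothetical log rows: (uid, date, referrer, target). No timestamps exist.
rows = [
    ("u1", "2013-05-01", None, "a"),
    ("u1", "2013-05-01", "a", "b"),
    ("u1", "2013-05-02", None, "a"),
    ("u2", "2013-05-01", None, "c"),
]

def build_fake_sessions(rows):
    """Group rows by (uid, date) and attach a constant sequence value of 1."""
    sessions = defaultdict(list)
    for uid, date, referrer, target in rows:
        sessions[(uid, date)].append(
            {"referrer": referrer, "target": target, "seq": 1}
        )
    return dict(sessions)

sessions = build_fake_sessions(rows)
print(len(sessions))  # number of distinct (uid, date) fake sessions
```

Because every row carries the same sequence value, nothing in this structure orders the steps inside a session; any ordering has to come from the referrer/target pairs themselves.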

My question is: Is the 2nd point of my solution feasible or not?

Thank you!

William

11 REPLIES
qkaiwei
Calcite | Level 5

Any question I submit in the other professional communities, such as Forecasting, EG, or Base, is answered within 3 days, but not in the EM community.

Is a SAS question like this one a question for the SAS company?

Reeza
Super User

This is a user forum, so users answer. Unfortunately there aren't a lot of EM users out there, so you're less likely to get responses. You also post outside standard US office hours, so the questions kind of get buried.

You can always contact tech support (SAS Company); they're pretty quick and knowledgeable.

qkaiwei
Calcite | Level 5

I know there are a lot of warm-hearted SAS product managers in the forum who reply quickly, such as Udo.

jwexler
SAS Employee

Hi, apologies that you did not receive a prompt response. It's an interesting question; I will get back to you with an appropriate response...

Thanks,

Jonathan

Reeza
Super User

Are you saying that if a person has only 1 occurrence then that isn't a forward/follow? To me that's a direct hit, e.g. typing in a web address, so just a different part of the analysis.

If I even vaguely hit on what you're trying to say :)

jwexler
SAS Employee

Thanks go to Tao Wang, one of our EM Developers at SAS:

Yes:

If the sequence variable value in each fake session is a constant, such as 1, that means no sequence and the “analysis type” of PROC PATH is “all frequent paths”.

Then neither session nor sequence matters.

PROC PATH then just outputs all frequent paths using the “support-confidence” framework to generate rules like X ==> Y.

X and Y are URLs.

The “support-confidence” framework is described in Wikipedia:

http://en.wikipedia.org/wiki/Association_rule_learning
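As a rough illustration of the support-confidence framework mentioned above, here is a minimal Python sketch. It is not how PROC PATH is implemented; the sample sessions, the `min_support` threshold, and the function names are all assumptions made for the example. Each fake session is treated as an unordered set of URLs.

```python
from itertools import permutations

# Hypothetical fake sessions: each is the set of URLs seen for one uid + date.
sessions = [
    {"a", "b"},
    {"a", "b", "c"},
    {"a"},
    {"b", "c"},
]

def support(itemset, sessions):
    """Fraction of sessions that contain every URL in itemset."""
    hits = sum(1 for s in sessions if itemset <= s)
    return hits / len(sessions)

def rules(sessions, min_support=0.5):
    """Return (X, Y, support, confidence) for single-URL rules X ==> Y."""
    urls = sorted(set().union(*sessions))
    out = []
    for x, y in permutations(urls, 2):
        sup_xy = support({x, y}, sessions)
        if sup_xy >= min_support:
            # confidence(X ==> Y) = support(X and Y) / support(X)
            conf = sup_xy / support({x}, sessions)
            out.append((x, y, sup_xy, conf))
    return out

for x, y, sup, conf in rules(sessions):
    print(f"{x} ==> {y}  support={sup:.2f}  confidence={conf:.2f}")
```

Note that support is symmetric in X and Y, so whenever X ==> Y passes the support threshold, Y ==> X does too; only the confidence differs. Direction in this framework says nothing about which page was visited first.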

qkaiwei
Calcite | Level 5

Thank you for your reply; the information is useful to me.

Try the code below. If I use T1, I can't get any rules, but if I use T2, rules are generated. The only difference between T1 and T2 is the order of the observations.

Can anyone explain it?

data t1;
  input id prev_page:$8. curr_page:$8. seq_id;
cards;
1 a b 1
1 . a 1
;

data t2;
  input id prev_page:$8. curr_page:$8. seq_id;
cards;
1 . a 1
1 a b 1
;

%let id=w1;

proc path data=t1
  out=&id._RULES freqout=&id._FREQ
  support = 1
  items=10
  ;
  id id ;
  target curr_page;
  sequence seq_id / min=0 max=1000000;
  REFERRER prev_page / BACKUP_LIMIT=5 MAXIMAL ;
  Score out=&id._SCORERULES numrules=1000000 sortby SUPPORT include all_id ;
  Transition out=&id._TRANSITION;
  Funnel out=&id._FUNNELCOUNTS ;
run; quit;

jwexler
SAS Employee

Hi, try removing the REFERRER statement; that should do the trick. There are some nuances of the REFERRER statement that we are working through.

qkaiwei
Calcite | Level 5

Hi,

I think it may not be a good idea to remove the REFERRER statement, because REFERRER at least contains sequence information.

As I said, I have no sequence_id within a session. If REFERRER is removed, a path such as b->a, although the visitor never passed through it, would still be generated by pure association theory.
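The concern can be sketched in Python using the same two rows as T1 and T2. This only illustrates the idea that referrer/target pairs carry direction regardless of row order; it makes no claim about what PROC PATH does internally, and all function names are hypothetical.

```python
# Rows for one fake session: (prev_page, curr_page); None marks an entry page.
t1 = [("a", "b"), (None, "a")]
t2 = [(None, "a"), ("a", "b")]

def observed_transitions(rows):
    """Directed edges the visitor actually followed, independent of row order."""
    return {(prev, curr) for prev, curr in rows if prev is not None}

def cooccurrence_rules(rows):
    """Every ordered pair of distinct pages in the session (pure association)."""
    pages = {curr for _, curr in rows}
    pages |= {prev for prev, _ in rows if prev is not None}
    return {(x, y) for x in pages for y in pages if x != y}

def rebuild_path(rows):
    """Follow referrer links from the entry page.

    Assumes each page has at most one successor within the session.
    """
    nxt = {prev: curr for prev, curr in rows}
    path, page = [], nxt.get(None)
    while page is not None and page not in path:  # stop on dead end or cycle
        path.append(page)
        page = nxt.get(page)
    return path

print(observed_transitions(t1) == observed_transitions(t2))
print(rebuild_path(t1), rebuild_path(t2))
print(sorted(cooccurrence_rules(t1)))
```

Under these assumptions T1 and T2 give the same directed edge a->b and the same rebuilt path, while the co-occurrence view also admits b->a, which the visitor never followed.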

qkaiwei
Calcite | Level 5

Need help!

jwexler
SAS Employee

Hi, at this point my recommendation is to work with Tech Support at support.sas.com. There are some coding changes we are going through for this statement, so we want to make sure you get the right guidance and support.

They will be able to guide you through usage.  That way we can log your usage (no pun intended!) and work with you and your specific needs.
