SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

create new variable when group-statement met

Accepted Solution Solved
Reply
Contributor
Posts: 58
Accepted Solution

create new variable when group-statement met

[ Edited ]

Hi

 

I have huge datasets (7 mil) and I need to find out how many times a client was placed. 

The client _id in this case is REFER_ID: 

Placement times being with BEG_RMVL_DT and end with END_RMVL_DT (these are datetime variables / numbers)

Every time there is a new Placement there is a new Placement number: RMVL_ID

 

So I could either work woth the placement beginning and end times, or simply with the PLacement number, bound or grouped by REFER_ID.

What I want to know is 

How many times was a client (REFER_ID ) placed, and that could have happened several times per client, once or not at all. 

 

My thinking was to create a new variable to say there was a new placement, more so as this data will later on be used for machine learning purposes. Yes, I could use RMVL_ID to cout the incidents with standard proceedures, but I really want a new column indicating when a new incident happenened, grouped by refer_id.

How would I do that? And / Or are there better ideas. 

Please find a test dataset attached. 

 

In a different, but similiar  case, I only have the datetimes and need to find out how often a new incident takes place, again grouped by Refer_ID or Cl_ID or even a specific Role.  How does it work there? 

 

Many thanks

 

 

REFER_IDRMVL_IDBEG_RMVL_DTEND_RMVL_DT
1800217   
1800217   
1800217   
1800217   
1800217   
1800217   
1800217   
1800219   
1800219   
1800219   
1800219   
1800219   
1800219   
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
18002215449715Nov1989 0:00:0013Jan1990 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
18002215449831Jul1991 0:00:0031Jul1991 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800221155976625Mar2007 0:00:0013Sep2007 0:00:00
1800222   
1800222   
1800222   
1800222   
1800222   
1800223   
1800223   
1800223   
1800223   
1800223   
1800223   
1800223   
1800223   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800226   
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
1800227154842403Feb2006 0:00:0009Feb2006 0:00:00
 

Accepted Solutions
Solution
‎07-08-2017 10:56 PM
Super User
Posts: 17,868

Re: create new variable when group-statement met

I think this post should answer your questions:

https://stats.idre.ucla.edu/sas/faq/how-can-i-create-an-enumeration-variable-by-groups/

 

For BY group processing, in general, a data step is much easier to code than SQL. 

The SAS documentation has some good diagrams that illustrate BY group processing.

 

http://support.sas.com/documentation/cdl/en/lrcon/69852/HTML/default/viewer.htm#n01a08zkzy5igbn173zj...

 


AnnaNZ wrote:

I think PROC SQL could be the way to go 

http://www2.sas.com/proceedings/sugi27/p191-27.pdf

 

CAn somebody help me though how to group by REFER_ID , create a variable called INCIDENT and create a condition, whereby every time that a new (BEG_RMVL_DT and END_RMVL_DT)   pair occurs within the same REFER_ID , there is a 1 (opposed to a 0)  in the  INCIDENT variable

 

That would be marvellous, thanks, 

 


 

View solution in original post


All Replies
Super User
Posts: 5,257

Re: create new variable when group-statement met

Why do you need to add something to the detail data?
There are tons of ways in SAS to do counts: SQL, PROC MEANS/SUMMARY, PROC FREQ.. .
Data never sleeps
Contributor
Posts: 58

Re: create new variable when group-statement met

I know that,

but it has to be grouped by Refer_ID.

IN a diferent case and I wanted to add this to my question above, I only have datetimes and need to find out how often a new incident takes place.

So how can I get tehse two case working?

 

Contributor
Posts: 58

Re: create new variable when group-statement met

I think PROC SQL could be the way to go 

http://www2.sas.com/proceedings/sugi27/p191-27.pdf

 

CAn somebody help me though how to group by REFER_ID , create a variable called INCIDENT and create a condition, whereby every time that a new (BEG_RMVL_DT and END_RMVL_DT)   pair occurs within the same REFER_ID , there is a 1 (opposed to a 0)  in the  INCIDENT variable

 

That would be marvellous, thanks, 

 

Super User
Posts: 5,085

Re: create new variable when group-statement met

You're asking about fairly standard programming techniques.  For example:

 

proc sort data=have;

by refer_id beg_rmvl_dt;

run;

 

data want;

set have;

by refer_id beg_rmvl_dt;

incident = first.beg_rmvl_dt;

run;

 

It's up to you how you want to treat missing values for BEG_RMVL_DT.

 

If all you want is the counts of INCIDENT, then SQL might come into play.  But if you want to keep all the original observations, SQL becomes more complex.

Solution
‎07-08-2017 10:56 PM
Super User
Posts: 17,868

Re: create new variable when group-statement met

I think this post should answer your questions:

https://stats.idre.ucla.edu/sas/faq/how-can-i-create-an-enumeration-variable-by-groups/

 

For BY group processing, in general, a data step is much easier to code than SQL. 

The SAS documentation has some good diagrams that illustrate BY group processing.

 

http://support.sas.com/documentation/cdl/en/lrcon/69852/HTML/default/viewer.htm#n01a08zkzy5igbn173zj...

 


AnnaNZ wrote:

I think PROC SQL could be the way to go 

http://www2.sas.com/proceedings/sugi27/p191-27.pdf

 

CAn somebody help me though how to group by REFER_ID , create a variable called INCIDENT and create a condition, whereby every time that a new (BEG_RMVL_DT and END_RMVL_DT)   pair occurs within the same REFER_ID , there is a 1 (opposed to a 0)  in the  INCIDENT variable

 

That would be marvellous, thanks, 

 


 

Contributor
Posts: 58

Re: create new variable when group-statement met

THat´s it thank you

 

  Proc sort data=   Kea.REFER_DT; by  RMVL_ID ; run; 
  Data Kea.REFERRAL; 
Set  Kea.REFER_DT; 
 count + 1;
by RMVL_ID;
if first.RMVL_ID then count = 1;
If RMVL_ID = . then count = 0;
run; 
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 222 views
  • 1 like
  • 4 in conversation