BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
freda
Fluorite | Level 6

Good evening,

 

This is my first time to use SAS and also my first time to ask a question in this forum. Thank you all first for your patient and help.

 

I have the following data (sample):

obs    id     listing_1  listing_2  selling_1 selling_2       status             date_sign                      la1_no_la1_5     la1_no_la2_5

1        1           A              B             C            D              sold               2012-01-23 00:00:12            1                        2

2        2           C             .               D             .               sold              2007-05-01  00:01:15            0                        0

3        3           E              F             A             C              sold              2013-03-05  08:21:05            1                        0

4        4           B              A              .              .              expired         2011-01-15 05:21:00              0                       0

5        5            A             C             B             F               sold              2008-05-11 00:00:00             0                       0

6        6           E               .             D             .                sold               2011-07-15 00:23:15            0                        0

7        7           C              A            A             .                 sold               2012-11-26 00:15:00             1                       1

8        8           C               .             C              F             sold               2007-12-15  00:17:25            1                        0

9        9           F              B             B              C              sold               2014-12-01 00:00:00            0                       1

10      10          D             A              .               .              expired         2011-12-14 00:15:25              0                        0

 

The last two columns are the results or new variables I want to create.

 

I have 10 observations, each observation is a transaction. For each transaction, at most two listing agents (listing_1 and listing_2) and two selling agents (selling_1 and selling_2) involved. Each id, no matter get sold or go expired, has a date_sign, which is the date the contract signature date between the listing agents and the client. I want to create two new variables: la1_no_la1_5 and la1_no_la2_5 to record the first listing agent's past activities. la1_no_la1_5 represents the total number of transactions (both sold and expired) the first listing agent involved as a first listing agent in the past five years of the signature date.  For example, for obs1, la1_no_la1_5= 1 means the first listing agent of id=1 transaction involved in 1 transaction as a first listing agent in the past 5 years of the signature date (2012-01-23 00:00:12 ).

 

The process behind I want to create a loop that for each id, the system finds its first listing agent, A in the above example, then look up all the id's (from obs1 to obs10) first listing agents and find the first listing agent = A, then compare the date_sign. if date_sign is within five years, +1 to la1_no_la1_5.

 

I used stata to successfully calculate the numbers, but because my full data is too large, almost 1.5 million transactions, stata takes too long to process the code.

Following are my stata code:

gen la1_no_la1_5 = .
local N = _N
forvalues i = 1(1)`N' {
count if listing_1[`i']==listing_1 & (date_sign[`i'] - date_sign)/365.25 <= 5 &(date_sign[`i'] - date_sign)/365.25 > 0
replace la1_no_la1_5 = r(N) in `i'
}

 

Can anyone help me to translate the stata code to useful sas code or help me to write a new sas code to process the above calculation? I really appreciate.

 

Thank you again

 

Freda

 

1 ACCEPTED SOLUTION

Accepted Solutions
13 REPLIES 13
PeterClemmensen
Tourmaline | Level 20

When you say "within five years", does that mean looking back five years?

freda
Fluorite | Level 6
Yes, within last 5 years of the date_sign.
ChrisNZ
Tourmaline | Level 20

Like this?

 


proc sql; 
select unique have.*, count(past1.date_sign) as la1_no_la1_5
from have
    left join
    have past1
    on  have.listing_1=past1.listing_1 
    and 0 < have.date_sign-past1.date_sign < 5*365.25
    group by 1  
    order by 1  ;
obs id listing_1 listing_2 selling_1 selling_2 status date_sign la1_no_la1_5
1 1 A B C D sold 23-Jan-12 1
2 2 C   D   sold 1-May-07 0
3 3 E F A C sold 5-Mar-13 1
4 4 B A     expired 15-Jan-11 0
5 5 A C B F sold 11-May-08 0
6 6 E   D   sold 15-Jul-11 0
7 7 C A A   sold 26-Nov-12 1
8 8 C   C F sold 15-Dec-07 1
9 9 F B B C sold 1-Dec-14 0
10 10 D A     expired 14-Dec-11 0

 

 

freda
Fluorite | Level 6
Thank you so much. I copied your code, it generates the desired result in the output. However, it does not create new file by telling me the following error message:
NOTE: The query requires remerging summary statistics back with the original data.
Do you know how to solve this issue?
Thank you again.

Freda
ChrisNZ
Tourmaline | Level 20

It does not create new file 

Add create table at the start of the query

freda
Fluorite | Level 6
Hello,

May I ask if I want to add the more complicated condition on for the count if, how should I modify the code? I tried the following but failed.
proc sql;
create table table1 as
select unique a.*, count(past1.date_sign) as la1_no_hand_la1
from lal
left join
lal past1
on (lal.listing_1=past1.listing_1 | lal.listing_1=past1_listing_2 | lal.listing_1=past1_listing_3 | lal.listing_1=past1_listing_4)
& ((lal.date_sign< past1.date_sign < lal.date_exp & past1.status=$EX) | (lal.date_sign< past1.date_sign < lal.date_sold & past1.status=$VE)|(lal.date_sign< past1.date_exp < lal.date_exp & past1.status=$EX & lal.status=$EX)|(lal.date_sign< past1.date_exp < lal.date_sold & past1.status=$EX & lal.status=$VE) | (lal.date_sign< past1.date_sold < lal.date_exp & past1.status=$VE & lal.status=$EX)|(lal.date_sign< past1.date_sold < lal.date_sold & past1.status=$VE & lal.status=$VE) )
group by 1
order by 1 ;

Could you give me some suggestion?

Thank you so much

Freda
ChrisNZ
Tourmaline | Level 20

We have no clue what "more complicated" conditions you want.

However this syntax is invalid.

past1.status=$EX

Do you mean 

past1.status='$EX'

?

 

freda
Fluorite | Level 6

Yes, I want to count the number of transactions at hand. EX stands for expired and VE stands for sold.

 

Thank you 

 

Freda

freda
Fluorite | Level 6
Hello,

I modified the code as follows:

proc sql;
create table table1 as
select unique la1.*, count(?) as la1_no_hand_la1
from la1
left join
la1 past1
on (la1.listing_1=past1.listing_1 and la1.date_sign< past1.date_sign < la1.date_exp and la1.status="$EX" )
or (la1.listing_1=past1.listing_1 and la1.date_sign< past1.date_sign < la1.date_sold and la1.status="$VE")
or (la1.listing_1=past1.listing_1 and la1.date_sign< past1.date_exp < la1.date_exp and past1.status="$EX" and la1.status="$EX")
or (la1.listing_1=past1.listing_1 and la1.date_sign< past1.date_exp < la1.date_sold and past1.status="$EX" and la1.status="$VE")
or (la1.listing_1=past1.listing_1 and la1.date_sign< past1.date_sold < la1.date_exp and past1.status="$VE" and la1.status="$EX")
or (la1.listing_1=past1.listing_1 and la1.date_sign< past1.date_sold < la1.date_sold and past1.status="$VE" and la1.status="$VE")
group by 1
order by 1 ;

However, I don't know what I should put instead of the ? in the count(?) because I don't understand what does past.1 stands for. I tried the original code provided by you as count(past1.date_sign), but it does not work this time.
The result I want to have is the listing at hand during the contract period. For example, for id=1, the listing_1 A has its own contract period between date_sign and date_exp or date_sold if the listing has been sold. I want to count all the listings the first listing agent A involved.

Thank you

Freda
ChrisNZ
Tourmaline | Level 20

I don't understand what does past.1 stands for.

past1 is the name of a table.

There is only one table, but you merge it *with itself* on ID, and past1 only matches on past dates as per the match condition.

freda
Fluorite | Level 6
The result supposed to be:
id result
1 1
2 5
3 0
4 1
5 0
6 2
7 0
8 0
9 0
10 1
11 1
12 0
13 2
14 0
15 0
16 0
17 0
18 1
19 0
20 1
21 0
22 0
23 0
24 1
25 1
26 0
27 0
28 1
29 1
30 0
freda
Fluorite | Level 6

Here is the full data for the testing

 

Thank you 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 1298 views
  • 1 like
  • 3 in conversation