for each applicant, collect applications appllied by applicants over p...

Alexxxxxxx · Posted 08-05-2021 02:52 AM

For each application_id, I expect to 1) identify the applicant which applied the application_id and then 2) identify application_id which applied by applicants over the past 5 years.

By using the following data as an example,

data have;
       infile cards dsd  dlm=",";
       input
       Obs :8.
       Applicant_name :$8.
       application_date: YYMMDD10.
       application_id :8.
       ;
       format
       application_date :YYMMDD10.
       ;
       cards;
       1,A,2021-07-28,101
       2,B,2021-07-28,101
       3,A,2017-07-28,102
       4,A,2016-07-27,103
       5,B,2018-07-28,104
       ;;;;

run;

I expect to have the below result.

Obs,Applicant_name,application_date,application_id,applied_id

1,A, 2021-07-28, 101, 102

2,B, 2021-07-28, 101, 104

3,A, 2017-07-28, 102, 103

4,A, 2016-07-27, 103

It is because,

For application_id 101, it is applied by two applicants, which are A and B. and then I expect to include application_id 102, because it is applied by applicant A on 2017-07-28, which is within 5 years before 2021-07-28. I include application_id 104 for the same reason (because 1) B applied for application 101 in 2021-07-28, 2) B also applied application 104 in 2018-07-28, 3) 2018-07-28 within the 5-years before 2021-07-28). I do not include application_id 103, because although it is applied by applicant A, it is not applied within 5 years before 2021-07-28.

I am using the code,

%MACRO Explo(year=);
	PROC SQL;
		CREATE TABLE want_p5y_&year. AS
		SELECT DISTINCT
		citing1.Applicant_name,
		citing1.application_date,
		citing1.application_id,
		citing2.application_id as applied_id
		FROM
		have as citing1
		left JOIN have AS citing2 ON citing1.Applicant_name = citing2.Applicant_name
		where citing1.application_id ^= citing2.application_id
		group by citing1.application_date
		having citing2.application_date between intnx('year', min(citing1.application_date), -5, 'S') and citing1.application_date 
		;
	QUIT;
%MEND Explo;
%Explo(year=2021)

and I got the result as

Applicant_name	application_date	application_id	applied_id
A	2017/7/28	102	103
A	2021/7/28	101	102
B	2021/7/28	101	104

The problem is

1) I only get 3 rows rather than 4 rows.

2) My dataset has 17883455 rows and 3 columns (from 1985 to 2020). It is too big for me. So I split the dataset by year. Even so, when I run the dataset with the above code, I still get the following results

ERROR: Insufficient space in file WORK.'SASTMP-000000010'n.UTILITY.
ERROR: File WORK.'SASTMP-000000010'n.UTILITY is damaged. I/O processing did not complete.
NOTE: Error was encountered during utility-file processing. You may be able to execute the SQL
      statement successfully if you allocate more space to the WORK library.

NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 56:41.45
cpu time 2:14.93

Could you please give me some advice to get the expected result? I guess there should be some smart methods to get the result easily and quickly. but I just do not know.

Many thanks in advance for any advice.

andreas_lds · Posted 08-05-2021 03:41 AM

Seems to be the same problem as in https://communities.sas.com/t5/SAS-Programming/for-each-applicaiont-collect-applications-appllied-by.... Right? I tend to merge both posts ...

Alexxxxxxx · Posted 08-05-2021 05:11 AM

yes, they are the same. Many thanks for your advice. I do not know how to merge these two posts. This post summarizes the question of the previous post.

for each applicant, collect applications appllied by applicants over past 5 years

Re: for each applicant, collect applications appllied by applicants over past 5 years

Re: for each applicant, collect applications appllied by applicants over past 5 years

for each applicant, collect applications appllied by applicants over past 5 years

Re: for each applicant, collect applications appllied by applicants over past 5 years

Re: for each applicant, collect applications appllied by applicants over past 5 years

Registration is open

SAS Training: Just a Click Away