Dear all,
When I run the code below,
PROC SQL;
CREATE TABLE Step4.M_3cit AS
SELECT
t_base.HRM_L2_cited,
t_base.earliest_filing_month,
t_base.earliest_publn_date,
t_do.cited_docdb_family_id,
t_do.docdb_family_id,
t_ap2.appln_id,
t_ap2.pat_publn_id,
t_ap2.publn_date,
t_ap2.HRM_L2_citing
FROM
Pat_ori.Docdbfamiliescitations AS t_do
JOIN Step3.M_2cit AS t_base ON t_do.cited_docdb_family_id=t_base.docdb_family_id_base
JOIN Step1.appln_publn AS t_ap2 ON t_do.docdb_family_id=t_ap2.docdb_family_id
WHERE t_ap2.publn_date <= intnx('year', t_base.earliest_publn_date,3) /*a 3-year moving window starting at the earliest publication date of a patent family */
AND t_ap2.publn_date >= intnx('year', t_base.earliest_publn_date,0)
GROUP BY HRM_L2_cited,cited_docdb_family_id,t_do.Docdb_family_id
Having sum(HRM_L2_cited = HRM_L2_citing)>0 /* exclude self_citation */
ORDER BY t_base.HRM_L2_cited
;
QUIT;
I get the
320 PROC SQL;
321 CREATE TABLE Step4.M_3cit AS
322 SELECT
323 t_base.HRM_L2_cited,
324 t_base.earliest_filing_month,
325 t_base.earliest_publn_date,
326 t_do.cited_docdb_family_id,
327 t_do.docdb_family_id,
328 t_ap2.appln_id,
329 t_ap2.pat_publn_id,
330 t_ap2.publn_date,
331 t_ap2.HRM_L2_citing
332 FROM
333 Con_ori.Docdbfamiliescitations AS t_do
334 JOIN Step3.M_2cit AS t_base ON t_do.cited_docdb_family_id=t_base.docdb_family_id_base
335 JOIN Step1.appln_publn AS t_ap2 ON t_do.docdb_family_id=t_ap2.docdb_family_id
336 WHERE t_ap2.publn_date <= intnx('year', t_base.earliest_publn_date,3) /*a 3-year moving
336! window starting at the earliest publication date of a patent family */
337 AND t_ap2.publn_date >= intnx('year', t_base.earliest_publn_date,0)
338 GROUP BY HRM_L2_cited,cited_docdb_family_id,t_do.Docdb_family_id
339 Having sum(HRM_L2_cited = HRM_L2_citing)>0 /* exclude self_citation */
340 ORDER BY t_base.HRM_L2_cited
341 ;
NOTE: The query requires remerging summary statistics back with the original data.
ERROR: Sort execution failure.
342 QUIT;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 2:10:07.98
cpu time 19:42.26
How can I overcome this problem? Could you please give me some suggestion about this ?
thanks in advance
Bottom line: not enough resources on the computer.
Use the _METHOD option to know more about the joins.
Workarounds:
- Presort the tables
- Break the query into 2 queries.
- Increase available RAM or disk space
Daer Reeza,
Could you please tell me how to change WHERE to ON conditions?
Besides, I simplify the code and run the following codes,
PROC SQL;
CREATE TABLE Step4.M_3cit1 AS
SELECT
t_do.HRM_L2_cited,
t_do.earliest_filing_month,
t_do.earliest_publn_date,
t_do.cited_docdb_family_id,
t_do.docdb_family_id,
t_ap2.appln_id,
t_ap2.pat_publn_id,
t_ap2.publn_date,
t_ap2.HRM_L2_citing
FROM
step5.M_3cit_step1 AS t_do
JOIN Step1.appln_publn AS t_ap2 ON t_do.docdb_family_id=t_ap2.docdb_family_id
WHERE t_ap2.publn_date <= intnx('year', t_do.earliest_publn_date,3) /*a 3-year moving window starting at the earliest publication date of a patent family */
AND t_ap2.publn_date >= intnx('year', t_do.earliest_publn_date,0)
GROUP BY HRM_L2_cited,t_do.cited_docdb_family_id,t_do.Docdb_family_id
Having sum(HRM_L2_cited = HRM_L2_citing)>0 /* exclude self_citation */
;
QUIT;
and the get following results,
238 PROC SQL;
239 CREATE TABLE Step4.M_3cit1 AS
240 SELECT
241 t_do.HRM_L2_cited,
242 t_do.earliest_filing_month,
243 t_do.earliest_publn_date,
244 t_do.cited_docdb_family_id,
245 t_do.docdb_family_id,
246 t_ap2.appln_id,
247 t_ap2.pat_publn_id,
248 t_ap2.publn_date,
249 t_ap2.HRM_L2_citing
250 FROM
251 step5.M_3cit_step1 AS t_do
252 JOIN Step1.appln_publn AS t_ap2 ON t_do.docdb_family_id=t_ap2.docdb_family_id
253 WHERE t_ap2.publn_date <= intnx('year', t_do.earliest_publn_date,3) /*a 3-year moving
253! window starting at the earliest publication date of a patent family */
254 AND t_ap2.publn_date >= intnx('year', t_do.earliest_publn_date,0)
255 GROUP BY HRM_L2_cited,t_do.cited_docdb_family_id,t_do.Docdb_family_id
256 Having sum(HRM_L2_cited = HRM_L2_citing)>0 /* exclude self_citation */
257 ;
NOTE: The query requires remerging summary statistics back with the original data.
NOTE: Invalid argument 2 to function INTNX. Missing values may be generated.
NOTE: Invalid argument 2 to function INTNX. Missing values may be generated.
ERROR: User asked for termination.
258 QUIT;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 18.27 seconds
cpu time 0.78 seconds
closing clipboard failed
why there are
'NOTE: Invalid argument 2 to function INTNX. Missing values may be generated.
NOTE: Invalid argument 2 to function INTNX. Missing values may be generated.'
Could you please give me some suggestion about this?
thanks in advance.
Things like that were the reason for writing Maxim 10.
Break your task down into several steps, and use data step merging wherever you do not need cartesian joins.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.