hello all,
I would like to count the number of distinct pat_bubln_id of earliest_filing_id by following codes,
proc sql;
create table step1.citatins_of_earliest_filing_id as
SELECT
t6.psn_name,
t1.earliest_filing_id,
count(distinct t3.pat_publn_id) AS cites_in_2006
from Sasdata.Applicationsnew as t1
join Step1.Earlist_grant_date_earlist_id2 as t2 on t1.appln_id=t2.appln_id
join Sasdata.Publicationsnew as t2b on t2.appln_id=t2b.appln_id
join Sasdata.Citations as t3 on t2b.pat_publn_id=t3.cited_pat_publn_id
join Sasdata.Publicationsnew as t4 on t3.pat_publn_id=t4.pat_publn_id
join Sasdata.Personapplication as t5 on t1.appln_id = t5.appln_id
join Sasdata.Companies as t6 on t5.person_id = t6.person_id
where
year(t2.granted_date) <> 9999
and t2.granted_date >= '2000-01-01'
and t2.granted_date <= '2000-12-31'
and t2b.publn_first_grant=1
and t4.publn_date <= DATE_ADD(t2.Earliest_grant_date_final,INTERVAL 3 YEAR)
group by co.psn_name, ap.earliest_filing_id
order by co.psn_name, ap.earliest_filing_id
;
quit;
but the result shows,
292 proc sql;
293
294 create table step1.citatins_of_earliest_filing_id as
295 SELECT
296 t6.psn_name,
297 t1.earliest_filing_id,
298 count(distinct t3.pat_publn_id) AS cites_in_2006
299 from Sasdata.Applicationsnew as t1
300 join Step1.Earlist_grant_date_earlist_id2 as t2 on t1.appln_id=t2.appln_id
301 join Sasdata.Publicationsnew as t2b on t2.appln_id=t2b.appln_id
302 join Sasdata.Citations as t3 on t2b.pat_publn_id=t3.cited_pat_publn_id
303 join Sasdata.Publicationsnew as t4 on t3.pat_publn_id=t4.pat_publn_id
304 join Sasdata.Personapplication as t5 on t1.appln_id = t5.appln_id
305 join Sasdata.Companies as t6 on t5.person_id = t6.person_id
306 where
307 year(t2.granted_date) <> 9999
308 and t2.granted_date >= '2000-01-01'
309 and t2.granted_date <= '2000-12-31'
310 and t2b.publn_first_grant=1
311 and t4.publn_date <= DATE_ADD(t2.Earliest_grant_date_final,INTERVAL 3 YEAR)
-
22
200
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, (, ), *, **, +, ',', -, '.',
/, <, <=, <>, =, >, >=, ?, AND, BETWEEN, CONTAINS, EQ, EQT, GE, GET, GT, GTT, IN, IS,
LE, LET, LIKE, LT, LTT, NE, NET, NOT, NOTIN, OR, ^, ^=, |, ||, ~, ~=.
ERROR 200-322: The symbol is not recognized and will be ignored.
312 group by co.psn_name, ap.earliest_filing_id
313 order by co.psn_name, ap.earliest_filing_id
314 ;
NOTE: The "<>" operator is interpreted as "not equals".
315
316 quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.01 seconds
cpu time 0.03 seconds
could you please give me some suggestions to improve the codes?
thanks in advance.
You need to do some debugging yourself. The log is telling you what the issue is:
ERROR: INPUT function requires a character argument.
Therefore your variable is numeric, so treat it as numeric (remember I cannot see your data, I can only suggest):
year(t2.Earliest_grant_date_final) <> 9999 and t2.Earliest_grant_date_final >= '01Jan2000'd and t2.Earliest_grant_date_final <= '31Dec2000'd and t2b.publn_first_grant=1 and t4.publn_date <= intnx('year',t2.earliest_grant_date_final,3)
This:
DATE_ADD(t2.Earliest_grant_date_final,INTERVAL 3 YEAR)
Is not an ANSI SQL piece of code. I suspect you have got this code from a database which has this "date_add" function. You will need to replace it with either ANSI SQL compliant code, or use SAS functions, or you could pass through to the original database and execute it there. As for SAS syntax:
intnx('year',t2.earliest_grant_date_final,3)
Should be nearly there.
thanks @RW9 .
but I have a new problem for this code now. By using following codes,
SELECT
t6.psn_name,
t1.earliest_filing_id,
count(distinct t3.pat_publn_id) AS cites_in_2006
from Sasdata.Applicationsnew as t1
join Step1.Earlist_grant_date_earlist_id2 as t2 on t1.appln_id=t2.appln_id
join Sasdata.Publicationsnew as t2b on t2.appln_id=t2b.appln_id
join Sasdata.Citations as t3 on t2b.pat_publn_id=t3.cited_pat_publn_id
join Sasdata.Publicationsnew as t4 on t3.pat_publn_id=t4.pat_publn_id
join Sasdata.Personapplication as t5 on t1.appln_id = t5.appln_id
join Sasdata.Companies as t6 on t5.person_id = t6.person_id
where
year(t2.Earliest_grant_date_final) <> 9999
and t2.Earliest_grant_date_final >= '2006-01-01'
and t2.Earliest_grant_date_final <= '2006-12-31'
and t2b.publn_first_grant=1
and t4.publn_date <= intnx('year',t2.earliest_grant_date_final,3)
group by t6.psn_name, t1.earliest_filing_id
order by t6.psn_name, t1.earliest_filing_id
;
quit;
the result shows,
367 proc sql;
368
369 create table step1.citatins_of_earliest_filing_id as
370 SELECT
371 t6.psn_name,
372 t1.earliest_filing_id,
373 count(distinct t3.pat_publn_id) AS cites_in_2006
374 from Sasdata.Applicationsnew as t1
375 join Step1.Earlist_grant_date_earlist_id2 as t2 on t1.appln_id=t2.appln_id
376 join Sasdata.Publicationsnew as t2b on t2.appln_id=t2b.appln_id
377 join Sasdata.Citations as t3 on t2b.pat_publn_id=t3.cited_pat_publn_id
378 join Sasdata.Publicationsnew as t4 on t3.pat_publn_id=t4.pat_publn_id
379 join Sasdata.Personapplication as t5 on t1.appln_id = t5.appln_id
380 join Sasdata.Companies as t6 on t5.person_id = t6.person_id
381 where
382 year(t2.Earliest_grant_date_final) <> 9999
383 and t2.Earliest_grant_date_final >= '2000-01-01'
384 and t2.Earliest_grant_date_final <= '2000-12-31'
385 and t2b.publn_first_grant=1
386 and t4.publn_date <= intnx('year',t2.earliest_grant_date_final,3)
387 group by t6.psn_name, t1.earliest_filing_id
388 order by t6.psn_name, t1.earliest_filing_id
389 ;
NOTE: The "<>" operator is interpreted as "not equals".
NOTE: Data file SASDATA.PUBLICATIONSNEW.DATA is in a format that is native to another host, or the
file encoding does not match the session encoding. Cross Environment Data Access will be
used, which might require additional CPU resources and might reduce performance.
NOTE: Data file SASDATA.PUBLICATIONSNEW.DATA is in a format that is native to another host, or the
file encoding does not match the session encoding. Cross Environment Data Access will be
used, which might require additional CPU resources and might reduce performance.
ERROR: Expression using IN has components that are of different data types.
NOTE: The IN referred to may have been transformed from an OR to an IN at some point during PROC
SQL WHERE clause optimization.
390
391 quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.55 seconds
cpu time 0.09 seconds
But there is no IN in the codes, could you please show me how to improve it ?
thanks a lot.
You are mixing up types somewhere:
year(t2.Earliest_grant_date_final) <> 9999 383 and t2.Earliest_grant_date_final >= '2000-01-01' 384 and t2.Earliest_grant_date_final <= '2000-12-31' 385 and t2b.publn_first_grant=1 386 and t4.publn_date <= intnx('year',t2.earliest_grant_date_final,3)
Is earliest_grant_date_final a numeric date value - as indicated by the first line, if so line 2 and 3 are invalid as you can't compare a text string with a number. If its a character varaible then line 1 and 5 are incorrect as they assume numbers.
I will assume earliest_grant_date_final is a character:
year(input(t2.Earliest_grant_date_final,yymmdd10.)) <> 9999 and input(t2.Earliest_grant_date_final,yymmdd10.) >= '01Jan2000'd and input(t2.Earliest_grant_date_final,yymmdd10.) <= '31Dec2000'd and t2b.publn_first_grant=1 and t4.publn_date <= intnx('year',input(t2.earliest_grant_date_final,yymmdd10.),3)
Something like that. I would really read up on how dates are stored/used in SAS as they are different from a database.
thanks but the result like this,
443 proc sql;
444
445 create table step1.citatins_of_earliest_filing_id as
446 SELECT
447 t6.psn_name,
448 t1.earliest_filing_id,
449 count(distinct t3.pat_publn_id) AS cites_2006
450 from Sasdata.Applicationsnew as t1
451 join Step1.Earlist_grant_date_earlist_id2 as t2 on t1.appln_id=t2.appln_id
452 join Sasdata.Publicationsnew as t2b on t2.appln_id=t2b.appln_id
453 join Sasdata.Citations as t3 on t2b.pat_publn_id=t3.cited_pat_publn_id
454 join Sasdata.Publicationsnew as t4 on t3.pat_publn_id=t4.pat_publn_id
455 join Sasdata.Personapplication as t5 on t1.appln_id = t5.appln_id
456 join Sasdata.Companies as t6 on t5.person_id = t6.person_id
457 where
458 year(input(t2.Earliest_grant_date_final,yymmdd10.)) <> 9999
459 and input(t2.Earliest_grant_date_final,yymmdd10.) >= '01Jan2000'd
460 and input(t2.Earliest_grant_date_final,yymmdd10.) <= '31Dec2000'd
461 and t2b.publn_first_grant=1
462 and t4.publn_date <= intnx('year',input(t2.earliest_grant_date_final,yymmdd10.),3)
463 and t5.applt_seq_nr > 0
464 group by t6.psn_name, t1.earliest_filing_id
465 order by t6.psn_name, t1.earliest_filing_id
466 ;
NOTE: The "<>" operator is interpreted as "not equals".
NOTE: Data file SASDATA.PUBLICATIONSNEW.DATA is in a format that is native to another host, or the
file encoding does not match the session encoding. Cross Environment Data Access will be
used, which might require additional CPU resources and might reduce performance.
NOTE: Data file SASDATA.PUBLICATIONSNEW.DATA is in a format that is native to another host, or the
file encoding does not match the session encoding. Cross Environment Data Access will be
used, which might require additional CPU resources and might reduce performance.
ERROR: INPUT function requires a character argument.
ERROR: Function YEAR requires a numeric expression as argument 1.
ERROR: Expression using not equals (^=) has components that are of different data types.
ERROR: INPUT function requires a character argument.
ERROR: Expression using greater than or equal (>=) has components that are of different data types.
ERROR: INPUT function requires a character argument.
ERROR: Expression using less than or equal (<=) has components that are of different data types.
ERROR: INPUT function requires a character argument.
ERROR: Function INTNX requires a numeric expression as argument 2.
ERROR: Expression using less than or equal (<=) has components that are of different data types.
ERROR: INPUT function requires a character argument.
ERROR: INPUT function requires a character argument.
ERROR: INPUT function requires a character argument.
ERROR: INPUT function requires a character argument.
467
468 quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.63 seconds
cpu time 0.06 seconds
You need to do some debugging yourself. The log is telling you what the issue is:
ERROR: INPUT function requires a character argument.
Therefore your variable is numeric, so treat it as numeric (remember I cannot see your data, I can only suggest):
year(t2.Earliest_grant_date_final) <> 9999 and t2.Earliest_grant_date_final >= '01Jan2000'd and t2.Earliest_grant_date_final <= '31Dec2000'd and t2b.publn_first_grant=1 and t4.publn_date <= intnx('year',t2.earliest_grant_date_final,3)
Maxim 3: Know your data.
Do a proc contents on table Step1.Earlist_grant_date_earlist_id2 and look for the type and display format of earliest_grant_date_final.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.