Dear all,
how can I create the date variable like No_of_cit200001,No_of_cit200002...No_of_cit200010,No_of_cit200011 through macro variable .
I use the following codes,
%macro base_of_cit3;
%local y;
%do y=2000 %to 2010;
%local m;
%do m=1 %to 12;
PROC SQL;
CREATE TABLE Step4.M_3cit&y.&m AS
SELECT
t_base.psn_name,
COUNT(DISTINCT t_do.docdb_family_id) AS No_of_cit&y.&m
FROM
Pat_ori.Docdbfamiliescitations AS t_do
JOIN Step3.M_2cit&y AS t_base ON t_do.cited_docdb_family_id=t_base.docdb_family_id_base
JOIN Step1.appln_publn AS t_ap2 ON t_do.docdb_family_id=t_ap2.docdb_family_id
WHERE month = &m
AND t_ap2.publn_date <= intnx('year', t_base.earliest_publn_date,3) /*a 3-year moving window starting at the earliest publication date of a patent family */
GROUP BY t_base.psn_name
ORDER BY t_base.psn_name
;
QUIT;
%end;
%end;
%mend base_of_cit3;
%base_of_cit3;
/*Step3: merge and delete missing value at psn_name variable*/
DATA Step5.Number_of_cit_end;
MERGE Step4.M_3cit20: ;
BY psn_name;
IF psn_name='' THEN delete;
RUN;
/*Step4: replace ‘.’ with zero */
DATA RESULT.M_number_of_cit;
SET Step5.Number_of_cit_end;
ARRAY n{*} _numeric_; /* gets all numeric variables into array */
DO i=1 TO dim(n); /* Do for all elements */
IF n{i}=. then n{i}=0;
END;
DROP i;
RUN;
proc transpose data= RESULT.M_number_of_cit out=Step6.M_No_cit_t;
by psn_name;
var No_of_cit: ;
run;
and get the result like
I want to get
No_of_cit200001,
No_of_cit200002,
No_of_cit200003,
...
No_of_cit200010,
No_of_cit200011
for each psn_name.
Could you please give me some suggestions?
thanks in advance
@France wrote:
Dear RW9,
Could you please provide suggestions about how to create variables like No_of_cit200001, No_of_cit200002..., No_of_cit200012. I can only get the variable like No_of_cit200001, No_of_cit200002...,No_of_cit200012 by using following codes.
Allow me to give the same advice as @RW9.
This is generally a bad idea to split up the data by month and year. It should all go into a single data set, and from there the analyses can be performed using the proper BY statement in whatever PROC you plan to use.
Now that you have been given the same advice twice, please don't ignore it.
Does not make sense. Please follow the guidance, post test data in the form of a datastep and then show what you want out from that test data. From what I can make out of your post you simply want to count a variable by a by group:
proc means data=have; class psn_name; var ...; output out=want n=n; run;
No need for splitting data up, and macro etc.
Dear RW9,
thanks for your reply.
Yes I'd like to count a variable by the group. I have counted the number of patents based on ballardw's suggestion. And now I need to count to count the number of patent citations, and I want to merge these two tables. However, while the date of the number of patents is 200001, the date of the number of patent citations is 20001. So I cannot merge them.
could you please give me some suggestions?
Just convert one of them? Its quite hard to code anything not knowing input test data or required output.
Dear RW9,
Could you please provide suggestions about how to create variables like No_of_cit200001, No_of_cit200002..., No_of_cit200012. I can only get the variable like No_of_cit200001, No_of_cit200002...,No_of_cit200012 by using following codes.
%macro base_of_cit3;
%local y;
%do y=2000 %to 2010;
%local m;
%do m=1 %to 12;
PROC SQL;
CREATE TABLE Step4.M_3cit&y.&m AS
SELECT
t_base.psn_name,
COUNT(DISTINCT t_do.docdb_family_id) AS No_of_cit&y.&m
FROM
Pat_ori.Docdbfamiliescitations AS t_do
JOIN Step3.M_2cit&y AS t_base ON t_do.cited_docdb_family_id=t_base.docdb_family_id_base
JOIN Step1.appln_publn AS t_ap2 ON t_do.docdb_family_id=t_ap2.docdb_family_id
WHERE month = &m
AND t_ap2.publn_date <= intnx('year', t_base.earliest_publn_date,3) /*a 3-year moving window starting at the earliest publication date of a patent family */
GROUP BY t_base.psn_name
ORDER BY t_base.psn_name
;
QUIT;
%end;
%end;
%mend base_of_cit3;
%base_of_cit3;
You would need to put m to z2, something like:
step4.m_3cit&y.%sysfunc(put(&m.,z2.)) as
However, I can only stress once more that splitting data up in lots of same chunks and then writing lots of macro to cope with that will give you a headache and break 90% of the time.
Note that %SYSFUNC does not support the PUT function. You would have to use PUTN instead.
@France wrote:
Dear RW9,
Could you please provide suggestions about how to create variables like No_of_cit200001, No_of_cit200002..., No_of_cit200012. I can only get the variable like No_of_cit200001, No_of_cit200002...,No_of_cit200012 by using following codes.
Allow me to give the same advice as @RW9.
This is generally a bad idea to split up the data by month and year. It should all go into a single data set, and from there the analyses can be performed using the proper BY statement in whatever PROC you plan to use.
Now that you have been given the same advice twice, please don't ignore it.
Dear PaigeMiller,
thanks for both of your advice. now I am trying to not split up the data by month, but I really do not know how to get the result if I do not split the data by year.
now I am using the following codes,
%macro base_of_cit3;
%local y;
%do y=2000 %to 2010;
PROC SQL;
CREATE TABLE Step4.M_3cit&y AS
SELECT
t_base.psn_name,
COUNT(DISTINCT(case WHEN month = 1 THEN t_do.docdb_family_id END) AS No_of_cit&y.01,
COUNT(DISTINCT(case WHEN month = 2 THEN t_do.docdb_family_id END) AS No_of_cit&y.02,
COUNT(DISTINCT(case WHEN month = 3 THEN t_do.docdb_family_id END) AS No_of_cit&y.03,
COUNT(DISTINCT(case WHEN month = 4 THEN t_do.docdb_family_id END) AS No_of_cit&y.04,
COUNT(DISTINCT(case WHEN month = 5 THEN t_do.docdb_family_id END) AS No_of_cit&y.05,
COUNT(DISTINCT(case WHEN month = 6 THEN t_do.docdb_family_id END) AS No_of_cit&y.06,
COUNT(DISTINCT(case WHEN month = 7 THEN t_do.docdb_family_id END) AS No_of_cit&y.07,
COUNT(DISTINCT(case WHEN month = 8 THEN t_do.docdb_family_id END) AS No_of_cit&y.08,
COUNT(DISTINCT(case WHEN month = 9 THEN t_do.docdb_family_id END) AS No_of_cit&y.09,
COUNT(DISTINCT(case WHEN month = 10 THEN t_do.docdb_family_id END) AS No_of_cit&y.10,
COUNT(DISTINCT(case WHEN month = 11 THEN t_do.docdb_family_id END) AS No_of_cit&y.11,
COUNT(DISTINCT(case WHEN month = 12 THEN t_do.docdb_family_id END) AS No_of_cit&y.12
FROM
Pat_ori.Docdbfamiliescitations AS t_do
JOIN Step3.M_2cit&y AS t_base ON t_do.cited_docdb_family_id=t_base.docdb_family_id_base
JOIN Step1.appln_publn AS t_ap2 ON t_do.docdb_family_id=t_ap2.docdb_family_id
WHERE t_ap2.publn_date <= intnx('year', t_base.earliest_publn_date,3) /*a 3-year moving window starting at the earliest publication date of a patent family */
GROUP BY t_base.psn_name
ORDER BY t_base.psn_name
;
QUIT;
%end;
%mend base_of_cit3;
%base_of_cit3;
but the result shows that,
NOTE: Line generated by the invoked macro "BASE_OF_CIT3".
105 WHEN month = 8 THEN t_do.docdb_family_id END) AS citation&y.08, COUNT(DISTINCT(case WHEN
105! month = 9 THEN t_do.docdb_family_id END) AS citation&y.09, COUNT(DISTINCT(case WHEN month
105! = 10 THEN t_do.docdb_family_id END) AS citation&y.10,
--
79
ERROR 79-322: Expecting a ).
NOTE: Line generated by the invoked macro "BASE_OF_CIT3".
106 COUNT(DISTINCT(case WHEN month = 11 THEN t_do.docdb_family_id END) AS citation&y.11,
--
79
106! COUNT(DISTINCT(case WHEN month = 12 THEN t_do.docdb_family_id END) AS citation&y.12 FROM
106! Pat_ori.Docdbfamiliescitations AS t_do JOIN
NOTE: Line generated by the invoked macro "BASE_OF_CIT3".
106 COUNT(DISTINCT(case WHEN month = 11 THEN t_do.docdb_family_id END) AS citation&y.11,
106! COUNT(DISTINCT(case WHEN month = 12 THEN t_do.docdb_family_id END) AS citation&y.12 FROM
--
79
106! Pat_ori.Docdbfamiliescitations AS t_do JOIN
ERROR 79-322: Expecting a ).
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.02 seconds
cpu time 0.03 seconds
Could you please give me some suggestions?
All of your COUNT calculations have two open parentheses but only one close parenthesis.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.