BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
nazmul
Quartz | Level 8

Dear Everyone,

 

I have the US patent data from 1926 to 2010. In each year, firms apply for patents in different technology class. I want to count the number of patents a firm applied for in each year from known and unknown technology class. Say, Firm 10001 (Permno- a unique firm identifier) applied for a patent 1000 (patnum- an unique patent identifier) in technology class 80 in year 1990 and then applied for another patent 1500 in the same technology class 80 in year 2000. Then I would assume that the firm filed for a patent in the known technology class in 2000. So the dummy variable identifying known technology class would take value 1 for patent 1500 of firm 10001 in year 2000.

 

Now assume that Firm 10001 applied for another patent 1600 in the technology class 90 in year 2000 for the first time in its history. So the dummy variable identifying known technology class would take value 0 for patent 1600 of firm 10001 in year 2000.

 

Could you please help in putting the above stated arguments in a proper SAS code? I would really appreciate your help. I have around 2 million observations in my original file from year 1926 to 2010. A firm may have multiple patent applications in each year and may not have a single application in a year.

 

All the variables- Year, Patnum, Permno, Class are numeric.

Sample Data:

 

yearPatnumPermnoClass
199010001000180
19959001000170
200015001000180
200016001000190
192613001000270
200014001000280
201012001000290

 

Expected Outcome:

 

yearPatnumPermnoClassKnown_Dummy
1990100010001800
199590010001700
2000150010001801
2000160010001900
1926130010002700
2000140010002800
2010120010002900
1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

One approach that may work for you:

proc sort data=have;
   by permno class year;
run;

data want;
   set have;
   by permno class;
   if first.class then Known_dummy=0;
   else Known_dummy = 1;
run;

Where have is the name of your data set.

 

 

When data is used with BY variables then SAS creates special values to indicate whether an observation is the first or the last of its group. You can reference then by use of First.variablename or Last.varaiblename. Notice that the dot here is NOT related to a library just in case that came to mind. The values returned by First. and Last. are 1 for true and 0 for false so they can be used with IF and any other calculations.

For added fun you might want to figure out why:  Known_dummy = not(first.class); returns the same values as the IF/THEN/ELSE.

 

If you want to test this with a subset of your data you could use

proc sort data=have (obs=1000) out=temp;
   by permno class year;
run;

data temp;
   set have;
   by permno class;
   if first.class then Known_dummy=0;
   else Known_dummy = 1;
run;

replace the 1000 with any number 1 or greater up to the number of records in the input set to select that many for testing.

 

View solution in original post

4 REPLIES 4
ballardw
Super User

Why do you not show values for "known_dummy" for Permno=10001, Class=80 and year=1990? I would expect that to be 0.

similar question for Permno=10002 class=70 and year=1926.

nazmul
Quartz | Level 8

Dear Ballardw,

 

Yes. You are right. That should also be 0.

I am updating the tables.

 

I appreciate your feedback

ballardw
Super User

One approach that may work for you:

proc sort data=have;
   by permno class year;
run;

data want;
   set have;
   by permno class;
   if first.class then Known_dummy=0;
   else Known_dummy = 1;
run;

Where have is the name of your data set.

 

 

When data is used with BY variables then SAS creates special values to indicate whether an observation is the first or the last of its group. You can reference then by use of First.variablename or Last.varaiblename. Notice that the dot here is NOT related to a library just in case that came to mind. The values returned by First. and Last. are 1 for true and 0 for false so they can be used with IF and any other calculations.

For added fun you might want to figure out why:  Known_dummy = not(first.class); returns the same values as the IF/THEN/ELSE.

 

If you want to test this with a subset of your data you could use

proc sort data=have (obs=1000) out=temp;
   by permno class year;
run;

data temp;
   set have;
   by permno class;
   if first.class then Known_dummy=0;
   else Known_dummy = 1;
run;

replace the 1000 with any number 1 or greater up to the number of records in the input set to select that many for testing.

 

nazmul
Quartz | Level 8

Thank you so much. I really appreciate your help

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1679 views
  • 1 like
  • 2 in conversation