BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
karanfil
Fluorite | Level 6

Hello All,
I need to fill in missing rows by groups( ticker, year) for the same person. Here how the data looks like and how I would like to have the data. I really appreciate your help. I have huge data set and can not do manually. Any code will be appreciate it. I use SAS 9.4

CURRENT

TickeryearFnameLnameaudexeccomnom
AB2012PG    
AB2012PGX XX
AB2012PG X  
DD2012GL    
DD2012GL C  
EG2013WZ    
EG2013WZ  C

C

DESIRED

tickeryearfnamelnameaudexccomnom
AB2012PGXXXX
AB2012PGXXXX
AB2012PGXXXX
DD2012GL C  
DD2012GL C  
EG2013WZ  CC
EG2013WZ  CC

 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

data have;
input (Ticker	year	Fname	Lname	aud	exec	com	nom) ($);
cards;
AB	2012	P	G	.	.	.	. 	 	 	 
AB	2012	P	G	X	. 	X	X
AB	2012	P	G	. 	X	. 	. 
DD	2012	G	L	.   .   .   . 	 	 	 
DD	2012	G	L	. 	C	. 	. 
EG	2013	W	Z	. 	. 	. 	. 
EG	2013	W	Z	. 	. 	C	C
;

proc sql;
create table want as
select Ticker,year,fname, lname,max(aud) as max,max(exec) as exec, max(com) as com, max(nom) as nom
from have
group by ticker, year;
quit;

Notes: Take advantage of autoremerge by not including fname,lname in group by

View solution in original post

4 REPLIES 4
novinosrin
Tourmaline | Level 20

data have;
input (Ticker	year	Fname	Lname	aud	exec	com	nom) ($);
cards;
AB	2012	P	G	.	.	.	. 	 	 	 
AB	2012	P	G	X	. 	X	X
AB	2012	P	G	. 	X	. 	. 
DD	2012	G	L	.   .   .   . 	 	 	 
DD	2012	G	L	. 	C	. 	. 
EG	2013	W	Z	. 	. 	. 	. 
EG	2013	W	Z	. 	. 	C	C
;

proc sql;
create table want as
select Ticker,year,fname, lname,max(aud) as max,max(exec) as exec, max(com) as com, max(nom) as nom
from have
group by ticker, year;
quit;

Notes: Take advantage of autoremerge by not including fname,lname in group by

karanfil
Fluorite | Level 6

Thanks a lot. It is great.

karanfil
Fluorite | Level 6

Thanks a lot.  It works.

mkeintz
PROC Star

This is one of those problem where you can take advantage of the LOCF (last observation carried forward) task supported by the UPDATE statement accompanied by the OBS=0 parameter and a BY statement.    Here LOCF means to bring forward the most recent non-missing value in the variables of interest, so by the end of the BY group (last record for a  given ticker/year) you have the variables of interest.  Then just re-read all the other variables for the same ticker/group and output:

 

data have;
input (Ticker	year	Fname	Lname	aud	exec	com	nom) ($);
cards;
AB	2012	P	G	.	.	.	. 	 	 	 
AB	2012	P	G	X	. 	X	X
AB	2012	P	G	. 	X	. 	. 
DD	2012	G	L	.   .   .   . 	 	 	 
DD	2012	G	L	. 	C	. 	. 
EG	2013	W	Z	. 	. 	. 	. 
EG	2013	W	Z	. 	. 	C	C
;

data want;
  /* LOCF all variables not in the BY statement */
  update have (obs=0) have;
  by ticker year;

  /* At the end of the ticker/year, reread all the variables 
     original values, except those of LOCF interest*/
  if last.year then do until (last.year);
    set have (drop=aud exec com nom);
	by ticker year;
	output;
  end;
run;
  

Note this assumes your data are sorted by ticker/year, and that no variable of interest has more than 1 non-missing value for a given ticker/year.

 

This works because the 2nd reading of the dataset (i.e. in the loop starting with "if last.year then do until (last.year);") does NOT re-read the variables of interest (aud exec com nom) due to the DROP= parameter.  So they are preserved LOCF results produced by the first reading (in the UPDATE by-group).

 

Now fname and lname in your data are constant, so keeping the last valid value for them wouldn't matter,  But if the fname or lname were to have changes, the code above preserves those changes (because fname and lname are not in the DROP= parameter).  

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1093 views
  • 1 like
  • 3 in conversation