caveman529
Calcite | Level 5

I'm going to borrow the example from this Stack Overflow question: http://stackoverflow.com/questions/14837253/filling-in-gaps-in-data-with-a-merge-in-sas

Have:

id t x
1 1 3.7
1 3 1.2
1 4 2.4
2 2 6.0
2 4 6.1
2 5 6.2

WANT

id t x
1 1 3.7
1 2 .
1 3 1.2
1 4 2.4
1 5 .
2 1 .
2 2 6.0
2 3 .
2 4 6.1
2 5 6.2

There are two proposed solutions. I tried both on my data and they give different results:

Solution 1:

data have;
input id t x;
cards;
1 1 3.7
1 3 1.2
1 4 2.4
2 2 6.0
2 4 6.1
2 5 6.2
;
run;

proc summary data=have nway completetypes;
class id t;
var x;
output out=want (drop=_:) max=;
run;

Solution 2:

proc expand data=have out=want from=daily method=none extrapolate;
by id;
id t;
run;

Could someone here tell me what is causing the difference? Is it possible that the second solution fills the gaps within each id's own time series, while the first solution fills every gap in the rectangular panel (using the longest time series as the base)? Thank you!

1 ACCEPTED SOLUTION

Ksharp
Super User

Solution 3 :

data have;
input id t x;
cards;
1 1 3.7 
1 3 1.2 
1 4 2.4 
2 2 6.0 
2 4 6.1 
2 5 6.2 
;
run;
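proc sql;
 /* Cross join every distinct id with every distinct t,
    then left join the observed x values onto that grid. */
create table want as
 select a.*,b.x
  from (select * from (select distinct id from have),
                      (select distinct t from have)) as a
       natural left join have as b;
quit;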

Xia Keshan

Haikuo
Onyx | Level 15

You popped my eyes again, Ksharp!

Well, at the risk of being obnoxious (not that I care :)), I need to point out that your solution requires that the t values in the data, taken across all ids, cover the whole range with no gaps. For instance, it won't work well on the following data:

data have;
input id t x;
cards;
1 1 3.7
1 3 1.2
1 4 2.4
2 3 6.0
2 4 6.1
2 5 6.2
;
run;

But great work nonetheless,

Haikuo

Ksharp
Super User

Yeah. But the OP didn't mention what output he would like for this scenario. Maybe mine is also right. :)

caveman529
Calcite | Level 5

I basically want a continuous time series for each id, from the min date for that id to the max date for that id. Will Ksharp's solution do the job? I think Hai.kuo pointed out the situation where there is a gap in the time values. I basically want to repair the time series for each id. Thanks, guys ~

mohamed_zaki
Barite | Level 11

"Could someone here tell me what is causing the difference?  Is it possible that the second solution fill the gap for each time series for each id, while the first solution just fill any gap in the rectangular panel data (using the longest time series as the base)?"

If I understood your question correctly, you are asking for the difference between the two solutions. Here it is:

The first solution (PROC SUMMARY with COMPLETETYPES):

     For each id, it adds only those t values that are missing for that id but appear under at least one other id. A t value that no id has is never created.

The second solution (PROC EXPAND):

     It fills the gaps within each id separately, from that id's first nonmissing observation to its last, without regard to the t values of other ids. PROC EXPAND is a SAS/ETS procedure, and the SAS/ETS documentation says: "SAS/ETS procedures ignore missing values at the beginning or end of a series. That is, the series is considered to begin with the first nonmissing value and end with the last nonmissing value."

I hope this answers your question.

Try running both solutions on the following data set and you will see the difference easily.

data have;
input id t x;
cards;
1 1 3.7
1 3 1.2
1 4 2.4
2 4 6.1
2 5 6.2
2 10 3
;
run;

mohamed zaki
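
For convenience, here is a small sketch that runs the two solutions from the question side by side on the data set above. The output names want_summary and want_expand are made up for this illustration so that neither result overwrites the other, and the PROC EXPAND step assumes SAS/ETS is licensed.

```sas
/* Solution 1: every combination of the observed id and t values. */
proc summary data=have nway completetypes;
  class id t;
  var x;
  output out=want_summary (drop=_:) max=;   /* want_summary is a hypothetical name */
run;

/* Solution 2: each id handled separately (requires SAS/ETS). */
proc expand data=have out=want_expand from=daily method=none extrapolate;
  by id;
  id t;
run;                                        /* want_expand is a hypothetical name */

/* Print both results so they can be compared by eye. */
title "Solution 1: PROC SUMMARY with COMPLETETYPES";
proc print data=want_summary;
run;

title "Solution 2: PROC EXPAND by id";
proc print data=want_expand;
run;
title;
```

If the explanation above holds, the first listing should contain only t values that occur somewhere in the input, while the second should run through each id's own range.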

Ksharp
Super User

OK. You just need to add some salt and sauce.

 
data have;
input id t x;
cards;
1 1 3.7 
1 3 1.2 
1 4 2.4 
2 2 6.0 
2 4 6.1 
2 5 6.2 
;
run;
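/* Find the overall minimum and maximum of t. */
proc summary data=have;
var t;
output out=temp min=min max=max;
run;
/* Build one row for every integer t between them. */
data t(keep=t);
 set temp;
 do t=min to max;
  output;
 end;
run;

/* Cross join each id with the full t range, then attach the observed x. */
proc sql;
create table want as
 select a.*,b.x
  from (select * from (select distinct id from have),(select t from t)) as a natural left join have as b;
quit;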

Xia Keshan
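
The code above uses one overall min-to-max range of t for every id. For the per-id range the original poster described (from each id's own min to its own max), one possible variant is sketched below. It is only a sketch: it assumes t takes integer values, and the names ranges, grid, tmin, tmax and want_per_id are invented for this illustration.

```sas
/* Per-id minimum and maximum of t (ranges, tmin, tmax are hypothetical names). */
proc sql;
  create table ranges as
  select id, min(t) as tmin, max(t) as tmax
  from have
  group by id;
quit;

/* One row for every integer t inside each id's own range. */
data grid(keep=id t);
  set ranges;
  do t = tmin to tmax;
    output;
  end;
run;

/* Attach the observed x; t values with no observation get a missing x. */
proc sql;
  create table want_per_id as
  select g.id, g.t, h.x
  from grid as g
       left join have as h
       on g.id = h.id and g.t = h.t
  order by g.id, g.t;
quit;
```

With the six-row example data, id 1 would cover t = 1 to 4 and id 2 would cover t = 2 to 5, so rows such as id 1 at t = 5 are not created.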
