BookmarkSubscribeRSS Feed
RandyStan
Fluorite | Level 6
Dear All:
These are my two data sets

ID    Time          Value
A     9:30:01       16
A     9:30:02       17
A     9:30:04       18
A     9:30:06       18
B     9:30:01        5
B     9:30:03        6
B     9:30:06        7


I have a time table

Time
9:30:01
9:30:02
9:30:03
9:30:04
9:30:05
9:30:06

and so forth

What I want is

ID    Time          Value
A     9:30:01       16
A     9:30:02       17
A     9:30:03       17
A     9:30:04       18
A     9:30:05       18
A     9:30:06       18
B     9:30:01        5
B     9:30:02        5
B     9:30:03        6
B     9:30:04        6
B     9:30:05        6
B     9:30:06        7

Can you please help me. 

 

 

7 REPLIES 7
anoopmohandas7
Quartz | Level 8
Just been curious. In real world what are you trying to achieve with this ?
mkeintz
PROC Star

I assume

  1. Your time table is intended to represent every exact second between 09:30:01 and 09:30:06. It has no holes
  2. You want last observation carried forwad for missing time points
  3. Every ID starts with a non-missing record at time=09:30:01

 

data one;
  input ID :$1.   Time :time8.0         Value;
  format time time8.0;
datalines;
A     9:30:01       16
A     9:30:02       17
A     9:30:04       18
A     9:30:06       18
B     9:30:01        5
B     9:30:03        6
B     9:30:06        7
run;
data time_table;
  input time time8.0;
  format time time8.0;
datalines;
9:30:01
9:30:02
9:30:03
9:30:04
9:30:05
9:30:06
run;

data want (keep=id time value);
  if _n_=1 then set time_table (rename=(time=tim_end)) nobs=ntimes point=ntimes;

  set one;
  by id;
  merge one one (firstobs=2 keep=time rename=(time=nxt_tim));

  if last.id then nxt_tim=tim_end+1;
  do time=time to nxt_tim-1; output; end;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
RandyStan
Fluorite | Level 6

Thank you so much. 

Ksharp
Super User
data one;
  input ID :$1.   Time :time8.0         Value;
  format time time8.0;
datalines;
A     9:30:01       16
A     9:30:02       17
A     9:30:04       18
A     9:30:06       18
B     9:30:01        5
B     9:30:03        6
B     9:30:06        7
run;
data time_table;
  input time time8.0;
  format time time8.0;
datalines;
9:30:01
9:30:02
9:30:03
9:30:04
9:30:05
9:30:06
run;

proc sql;
create table temp as
select a.*,b.value
from (
select * from
(select distinct id from one),(select distinct time from time_table) 
) as a
left join one as b 
on a.id=b.id and a.time=b.time;
quit;

data want;
 set temp;
 by id;
 retain v;
 if first.id then v=.;
 if not missing(value) then v=value;
 drop value;
run;
gamotte
Rhodochrosite | Level 12

Note that if you want to fill each second between consecutive times, you don't need

a time table. Using @mkeintz input dataset "one", this should give the wanted result :

 

/* We sort by descending time because there is no "lead" operator in SAS */
proc sort data=one out=eno; by ID descending time; run; data eno_filled; set eno(rename=(time=oldTime)); by ID; format time time8.0; drop oldTime; time=lag(oldTime)-1; if first.ID then do; time=oldTime; output; end; else do; do while (time ge oldTime); output; time=time-1; end; end; run; proc sort data=eno_filled out=want; by ID time; run;
mkeintz
PROC Star

True, SAS provides no lead function.  After all it can't be a simple "lookahead" like in Excel.  But then  lag is not a "lookback", it is a queue manager, which is a big difference when dealing with by groups, or interleaved series.

 

But the absence of a lead function is usually not enough of a reason to sort by descending time, use the lag function, and then re-sort to ascending order - a big penaly for large data sets.  It's usually far more efficient to take advantage of the "firstobs=" parameter for data set names as objects of the SET or MERGE statements.  That's why I submitted the code with merge.

 

Implementing a lead function could not be analogous to the lag function.  It' would not be too burdomsome for "normal" leads (constant lead length and no filtering of upcoming observations), but how would you be able to deal with the following?

 

  data want;

    set sashelp.class;

    if sex='F' then last_age=lag(age);

    else if sex='M' then last_age=lag(age);

  run;

 

Because lag is a queue manager, there are two queues being maintained above, one for each sex.  Note that there is no extra disk input processing required.  Now imagine there were such a thing as a lead function, i.e.

     next_age=lead(age)

You still would not have a way to maintain two separate series of leads, because the lead function would provide sas with no guidance on how to filter the upcoming observations. 

 

<promotion>

If you are coming to SGF next month, these are some of the issues I'll be presenting (Leads and Lags: Static and Dynamic Queues in the SAS® DATA STEP) Tuesday, April 4 5pm-6pm.

</promotion>

 

 

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
gamotte
Rhodochrosite | Level 12

Thanks @mkeintz for your interesting comments.

My goal was to show that the time table was not necessary because it merely increments a counter.

Using a merge as you suggest, the program becomes :

 

data want;
	set one;
	by ID;
	merge one one (firstobs=2 keep=time rename=(time=nextTime));
	drop nextTime;

	if last.ID then do;
		output;
	end;
	else do;
		do while (time lt nextTime);			
		    output;
		    time=time+1;
		end;
	end;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1102 views
  • 3 likes
  • 5 in conversation