BookmarkSubscribeRSS Feed
nans_p
Calcite | Level 5
Hi,
I can’t get my head around to fix this piece of code, my data looks somewhat like this with 4 variable
Data have;
Subjid collapse count seq
101 1
101 1 1 2
101 1 1 3
101 4
101 5
101 1 1 6
101 7

I want to retain the variable collapse and count at missing seq 4 &5.
For e.g I want my dataset to look like:

Data have;
Subjid collapse count seq
101 1
101 1 1 2
101 1 1 3
101 1 1 4
101 1 1 5
101 1 1 6
101 7

Please let me know how I can fix this, I cannot use the seq variable to code because the situation can be different for different subjids.

Thanks in advance
7 REPLIES 7
Patrick
Opal | Level 21

Please explain the required logic a bit more in detail - especially why seq remains missing for the first and last obs in your sample data.

nans_p
Calcite | Level 5

@Patrick @mkeintz 

 

Sorry for not being so specific - below is a clear explanation. Data Have is how my dataset looks like and Data want is how I want my data too look like.

Because I have collapsed a record highlighted in green, I want to flag all the associated date records which fall from the 02-Dec-17 to 16-Jan-18. My question is how do I retain the collapsed and the count variable at observation 5&6 of the seq variable. 

 

NanitaPatel_1-1617523406475.png

 

 

Kurt_Bremser
Super User

So you simply want to carry forward the last non-missing observation?

data want;
set have;
by subjid;
retain _coll _cnt;
if first.subjid
then do;
  _coll = .;
  _cnt = .;
end;
if collapsed ne .
then _coll = collapsed;
else collapsed = _coll;
if count ne .
then _cnt = count;
else count = _cnt;
drop _:;
run;

Untested, for lack of usable example data.

mkeintz
PROC Star

So you have now given us some more information - namely about the position of SEQ.

 

You have NOT stated why collapsed and count are propagated from obs 4 through 5 and 6, but obs 7 is not propagated. Is it because ex='Y' in obs 4, but not in obs 7? 

 

When does propagation stop?  In the above case, obs 7 has data that stops propagation of data from obs 4, but what if there were no such data after obs 4?  When would propagation stop, if at all?

 

Also you show only only one subjid in your data.  You didn't say whether data should be propagated across subjects (probably not, but it would be good to have it stated).

 

In short, help us help you - by providing a sufficient description of the problem such that you won't have to answer lots of follow-up questions.  That's probably a more valuable skill to develop than sas coding, even in this forum.

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
nans_p
Calcite | Level 5

@Patrick @Kurt_Bremser 

Hi,

 

Any idea how I can get the missing observation where highlighted in seq variable 5&6, but it should not retain the values in seq 8 and 9 of my orignal post.

I tried to use retain statement but it does not provide the output I want.

 

I am sharing another example below  -  I only want collap, count, collapse where highlighted in yellow. The reason behind is because I have collapsed dates where collapex =Y . All the dates associated with the collapsing should have collap = 1, count =1 or count =2 if a second collapsing of date has happened within same subject and collapse = Y.

nans_p_0-1618137004343.png

Hope this is clear. Apart from retain what is the best method I can use?

 

 

Kurt_Bremser
Super User

Please post your data in usable form, which means a data step with datalines, which we can copy/paste to our environment and submit.

Test your data step code before posting to make sure it recreates a sufficient example of your data, and use the "little running man" button to open a code box into which you paste the code.

mkeintz
PROC Star

I think you need to make a working data step to show the initial condition of your data.  It appears that SEQ is missing for obs 1 and 7.  Instead you have collapse=1 for the first obs and collapse=7 for the last.

 

But it that is not your intention, then you should show missing values for collapse and count  (two .'s  between subjid and seq).

 

If I am correct (this is untested until in absence of actual startup data in a data step), then this code should work.

 

data want (drop=_:);
  set have;
  retain _last_collapse _last_count;

  if collapse^=. then _last_collapse=collapse;
  if count^=.    then _last_count=count;

  if collapse=.  then collapse=_last_collapse;
  if count=.     then count=_last_count;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1279 views
  • 1 like
  • 4 in conversation