<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Adding missing values to long data set in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Adding-missing-values-to-long-data-set/m-p/604735#M175349</link>
    <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/138205"&gt;@novinosrin&lt;/a&gt;&amp;nbsp;, thanks for all your help. I can't tell you how much I appreciate the time you've taken to help me in this. I ran the code, and yes, this is what I was looking for. Thank you so much!&lt;/P&gt;</description>
    <pubDate>Sat, 16 Nov 2019 19:29:15 GMT</pubDate>
    <dc:creator>smithb</dc:creator>
    <dc:date>2019-11-16T19:29:15Z</dc:date>
    <item>
      <title>Adding missing values to long data set</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Adding-missing-values-to-long-data-set/m-p/604715#M175336</link>
      <description>&lt;P&gt;Greetings,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a data set of patients with scores corresponding to a specific visit (from 1 to 3). Here is a snippet of the data:&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture.PNG" style="width: 426px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/34023i43B344641A0577EE/image-size/large?v=v2&amp;amp;px=999" role="button" title="Capture.PNG" alt="Capture.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;As you can see, id 1 has three separate visits. However, id 3 has only 1 visit recorded. What I would like to do is keep the original format of the data set (long) but insert "visit = 2, score = ." or&amp;nbsp;"visit = 2, score = ."&amp;nbsp; and&amp;nbsp;"visit = 3, score = ." for subjects who have either one or two subsequent missing visits, respectively. I can do this by converting the data set to long format, with three separate scores (score1, score2, score3), but I would like to keep the original format. Also, is there a way to do this in a single data step?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here's my code (converting from long to wide currently):&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data center_a;
	input id gender age visit score @@;
	datalines;
1 1 49 1 1.33333
1 1 49 2 4.55556
1 1 49 3 0.55556
2 1 51 1 2.66667
2 1 51 2 4.88889
2 1 51 3 4.22222
3 0 52 1 3.44444
4 0 70 1 3.44444
4 0 70 2 0.44444
4 0 70 3 4.88889
5 0 49 1 3.33333
;
	CENTER = 1;
run;
proc sort data = center_a;
	by id visit;
run;
data center_a_revised;
	set center_a;
	by id visit;
	array subject{3} score1 score2 score3;
	retain score1 score2 score3;
	if (first.id) then do i = 1 to 3;
		subject{i} = .;
	end;
	subject{visit} = score;
	if (lsat.id) then output;
	keep id gender age visit score CENTER;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 16 Nov 2019 17:18:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Adding-missing-values-to-long-data-set/m-p/604715#M175336</guid>
      <dc:creator>smithb</dc:creator>
      <dc:date>2019-11-16T17:18:07Z</dc:date>
    </item>
    <item>
      <title>Re: Adding missing values to long data set</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Adding-missing-values-to-long-data-set/m-p/604716#M175337</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/299930"&gt;@smithb&lt;/a&gt;&amp;nbsp; &amp;nbsp;By any chance are you asking for this?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
 data center_a;
	input id gender age visit score ;
	datalines;
1 1 49 1 1.33333
1 1 49 2 4.55556
1 1 49 3 0.55556
2 1 51 1 2.66667
2 1 51 2 4.88889
2 1 51 3 4.22222
3 0 52 1 3.44444
4 0 70 1 3.44444
4 0 70 2 0.44444
4 0 70 3 4.88889
5 0 49 1 3.33333
;
%let n=3;/*number of visits*/
data want;
 do until(last.id);
  set center_a;
  by id;
  output;
 end;
 score=.;
 do visit=visit+1 to &amp;amp;n;
  output;
 end;
run;
 &lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 16 Nov 2019 17:25:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Adding-missing-values-to-long-data-set/m-p/604716#M175337</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2019-11-16T17:25:53Z</dc:date>
    </item>
    <item>
      <title>Re: Adding missing values to long data set</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Adding-missing-values-to-long-data-set/m-p/604717#M175338</link>
      <description>&lt;P&gt;Something like this. I'm sure you can do it in one step, but this is how I think about it using proc transpose.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data center_a;
	input id gender age visit score @@;
	datalines;
1 1 49 1 1.33333
1 1 49 2 4.55556
1 1 49 3 0.55556
2 1 51 1 2.66667
2 1 51 2 4.88889
2 1 51 3 4.22222
3 0 52 1 3.44444
4 0 70 1 3.44444
4 0 70 2 0.44444
4 0 70 3 4.88889
5 0 49 1 3.33333
;
run;
proc sort data=center_a; by id visit; run;

proc transpose data=center_a out=center_trx(drop=_name_) prefix=vst_;
by id gender age;
id visit;
var score;
run;

proc transpose data=center_trx out=center_trx_back(rename=col1=score);
by id gender age;
run; 

data want(drop=_name_);
set center_trx_back;
visit = 1*scan(_name_,2,"_");
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 16 Nov 2019 17:34:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Adding-missing-values-to-long-data-set/m-p/604717#M175338</guid>
      <dc:creator>unison</dc:creator>
      <dc:date>2019-11-16T17:34:48Z</dc:date>
    </item>
    <item>
      <title>Re: Adding missing values to long data set</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Adding-missing-values-to-long-data-set/m-p/604718#M175339</link>
      <description>&lt;P&gt;Are the missing visits always at the end?&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data center_a_revised;
  set center_a;
  by id visit;
  output;
  if last.id then do visit=visit+1 to 3 ;
    score=.;
    output;
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Or is it possible that some visits are missing in the middle.&amp;nbsp; So someone has visits 1 and 3 and is missing visit 2?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data center_a_all_visits;
  set center_a(keep=id visit);
  by id;
  if first.id then do visit=1 to 3;
    output;
  end;
run;

data center_a_revised;
  merge center_a center_a_all_visits;
  by id visit;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 16 Nov 2019 17:36:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Adding-missing-values-to-long-data-set/m-p/604718#M175339</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2019-11-16T17:36:07Z</dc:date>
    </item>
    <item>
      <title>Re: Adding missing values to long data set</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Adding-missing-values-to-long-data-set/m-p/604735#M175349</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/138205"&gt;@novinosrin&lt;/a&gt;&amp;nbsp;, thanks for all your help. I can't tell you how much I appreciate the time you've taken to help me in this. I ran the code, and yes, this is what I was looking for. Thank you so much!&lt;/P&gt;</description>
      <pubDate>Sat, 16 Nov 2019 19:29:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Adding-missing-values-to-long-data-set/m-p/604735#M175349</guid>
      <dc:creator>smithb</dc:creator>
      <dc:date>2019-11-16T19:29:15Z</dc:date>
    </item>
    <item>
      <title>Re: Adding missing values to long data set</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Adding-missing-values-to-long-data-set/m-p/604736#M175350</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159"&gt;@Tom&lt;/a&gt;&amp;nbsp;, the data includes subjects whom have an initial first visit, but some of them did not appear for any further subsequent visits. So no, none of the observations are missing visit two but have visits 1 and 3 accounted for. Thank you for this interesting take on the data. And thank you for taking the time to work on this code, I really appreciate it. I ran your top code, and that is the output I am looking for. Thanks again&lt;/P&gt;</description>
      <pubDate>Sat, 16 Nov 2019 19:33:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Adding-missing-values-to-long-data-set/m-p/604736#M175350</guid>
      <dc:creator>smithb</dc:creator>
      <dc:date>2019-11-16T19:33:25Z</dc:date>
    </item>
    <item>
      <title>Re: Adding missing values to long data set</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Adding-missing-values-to-long-data-set/m-p/604764#M175362</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt; data center_a;
	input id gender age visit score ;
	datalines;
1 1 49 1 1.33333
1 1 49 2 4.55556
1 1 49 3 0.55556
2 1 51 1 2.66667
2 1 51 2 4.88889
2 1 51 3 4.22222
3 0 52 1 3.44444
4 0 70 1 3.44444
4 0 70 2 0.44444
4 0 70 3 4.88889
5 0 49 1 3.33333
;
run;

data want;
	set center_a;
	
	by id;
	
	output;
	
	if last.id then do;
	do while(visit &amp;lt; 3);
		visit=visit +1;
		score=.;
		output;
	end;
	end;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sun, 17 Nov 2019 00:34:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Adding-missing-values-to-long-data-set/m-p/604764#M175362</guid>
      <dc:creator>r_behata</dc:creator>
      <dc:date>2019-11-17T00:34:26Z</dc:date>
    </item>
    <item>
      <title>Re: Adding missing values to long data set</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Adding-missing-values-to-long-data-set/m-p/604774#M175371</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/299930"&gt;@smithb&lt;/a&gt;:&lt;/P&gt;
&lt;P&gt;I strongly dislike hard-coding, so instead of setting the max number of visits to 3, I'd rather first determine from the data itself what that number really is and then use it downstream in the step:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have ;                                                                                                                             
  input id gender age visit score ;                                                                                                     
  cards ;                                                                                                                               
1 1 49 1 1.33333                                                                                                                        
1 1 49 2 4.55556                                                                                                                        
1 1 49 3 0.55556                                                                                                                        
2 1 51 1 2.66667                                                                                                                        
2 1 51 2 4.88889                                                                                                                        
2 1 51 3 4.22222                                                                                                                        
3 0 52 1 3.44444                                                                                                                        
4 0 70 1 3.44444                                                                                                                        
4 0 70 2 0.44444                                                                                                                        
4 0 70 3 4.88889                                                                                                                        
5 0 49 1 3.33333                                                                                                                        
;                                                                                                                                       
run ;                                                                                                                                   
                                                                                                                                        
data want ;                                                                                                                             
  if _n_ = 1 then do until (z) ;                                                                                                        
    set have (keep = visit) end = z ;                                                                                                   
    last.visit = max (last.visit, visit) ;                                                                                              
  end ;                                                                                                                                 
  set have ;                                                                                                                            
  by id ;                                                                                                                               
  output ;                                                                                                                              
  call missing (score) ;                                                                                                                
  if last.id and visit &amp;lt; last.visit then do visit = visit + 1 to last.visit ;                                                           
    output ;                                                                                                                            
  end ;                                                                                                                                 
run ;                  
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;An interesting question is whether or not the same can be done in a single step if the input data set is completely disordered. The answer is that with a modicum of memorizing stuff in a hash table, it is affirmative:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have ;                                                                                                                             
  input id gender age visit score ;                                                                                                     
  cards ;                                                                                                                               
4 0 70 3 4.88889                                                                                                                        
1 1 49 3 0.55556                                                                                                                        
2 1 51 1 2.66667                                                                                                                        
1 1 49 2 4.55556                                                                                                                        
2 1 51 3 4.22222                                                                                                                        
4 0 70 1 3.44444                                                                                                                        
2 1 51 2 4.88889                                                                                                                        
3 0 52 1 3.44444                                                                                                                        
4 0 70 2 0.44444                                                                                                                        
5 0 49 1 3.33333                                                                                                                        
1 1 49 1 1.33333                                                                                                                        
;                                                                                                                                       
run ;                                                                                                                                   
                                                                                                                                        
data want ;                                                                                                                             
  if _n_ = 1 then do ;                                                                                                                  
    dcl hash h (ordered:"a") ;                                                                                                          
    h.definekey ("id") ;                                                                                                                
    h.definedata ("id", "_iorc_", "_n_") ;                                                                                              
    h.definedone () ;                                                                                                                   
    do until (z) ;                                                                                                                      
      set have (keep = id visit) curobs = q end = z ;                                                                                   
      last.visit = max (last.visit, visit) ;                                                                                            
      if h.find() ne 0 then call missing (_iorc_, _n_) ;                                                                                
      if _iorc_ &amp;lt; visit then do ;                                                                                                       
        _iorc_ = visit ;                                                                                                                
        _n_ = q ;                                                                                                                       
      end ;                                                                                                                             
      h.replace() ;                                                                                                                     
    end ;                                                                                                                               
  end ;                                                                                                                                 
  set have curobs = q ;                                                                                                                 
  output ;                                                                                                                              
  h.find() ;                                                                                                                            
  call missing (score) ;                                                                                                                
  if _n_ = q then do visit = _iorc_ + 1 to last.visit ;                                                                                 
    output ;                                                                                                                            
  end ;                                                                                                                                 
run ;                                                
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Though it might look like a codeful, the scheme is actually pretty simple:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Read the input file, find the highest visit across all records, and store it in the variable LAST.VISIT.&lt;/LI&gt;
&lt;LI&gt;For each ID, memorize its highest visit and the record number on which it occurs.&lt;/LI&gt;
&lt;LI&gt;Read the file again and output each record. If for the current record's ID, the stored highest visit is less than LAST.VISIT, augment the output with visit numbers from the stored value up by 1 to last.visit.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Note that the variable LAST.VISIT has nothing to do with BY processing. It's just an arbitrary temporary variable, whose convenience lies in the fact that it is set to 0 at compile, auto-retained, and auto-dropped. &amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Kind regards&lt;/P&gt;
&lt;P&gt;Paul D.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 17 Nov 2019 04:58:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Adding-missing-values-to-long-data-set/m-p/604774#M175371</guid>
      <dc:creator>hashman</dc:creator>
      <dc:date>2019-11-17T04:58:00Z</dc:date>
    </item>
    <item>
      <title>Re: Adding missing values to long data set</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Adding-missing-values-to-long-data-set/m-p/604802#M175382</link>
      <description>&lt;P&gt;Nice one Guru&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/21262"&gt;@hashman&lt;/a&gt;&amp;nbsp; after paying attention to your post, my leaning was little bit towards utilizing SPARSE&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;tables id*visit/out=temp(keep=id visit) sparse;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;in proc freq to get all max combinations in an ordered form and perform a direct dynamic look up.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;However when i tested on some large data, the performance of SPARSE was worse than awful.&amp;nbsp; Yours was lightning fast. Of course as expected.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Okay, my post isn't about just words of praise that you anyway are used to receiving from SAS world, but many procedure options are not really performing at least at citizens SAS installation. Is it only at citizens resources or just expected normal behavior that you've seen from your enormous experiences? I guess you or Tom are likely to be best judge of it.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;PS&lt;/STRONG&gt; Thank you for the text. SAS certification seems far easier than passing road test. lol. We'll see i i make it in achieving my drivers. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 17 Nov 2019 15:56:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Adding-missing-values-to-long-data-set/m-p/604802#M175382</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2019-11-17T15:56:08Z</dc:date>
    </item>
  </channel>
</rss>

