BookmarkSubscribeRSS Feed
smeeker
Calcite | Level 5

I've posted what I have and what I want in a table below. Here's the code I'm currently using. Wondering if there is a way to only print the vaccine_x variable on the first line of the id. Then I plan to only keep first.id. I can create new variables if needed. Thanks in advance.

Current code: 

data want;
set have;
format vaccine_1 vaccine_2 vaccine_3 vaccine_4 vaccine_5 mmddyy10.;
if vaccine_sequence=1 then vaccine_1 = vaccine_admin_date;
else vaccine_1=.;
if vaccine_sequence=2 then vaccine_2 = vaccine_admin_date;
else vaccine_2=.;
if vaccine_sequence=3 then vaccine_3 = vaccine_admin_date;
else vaccine_3=.;
if vaccine_sequence=4 then vaccine_4 = vaccine_admin_date;
else vaccine_4=.;
if vaccine_sequence=5 then vaccine_5 = vaccine_admin_date;
else vaccine_5=.; proc print;

HAVE      
idvaccine_admin_datevaccine_sequencevaccine_1vaccine_2vaccine_3vaccine_4
17/1/202417/1/2024   
18/1/20242 8/1/2024  
19/1/20243  9/1/2024 
110/1/20244   10/1/2024
21/2/202411/2/2024   
22/2/20242 2/2/2024  
23/2/20243  3/2/2024 
24/2/20244   4/2/2024
       
WANT      
idvaccine_admin_datevaccine_sequencevaccine_1vaccine_2vaccine_3vaccine_4
17/1/202417/1/20248/1/20249/1/202410/1/2024
18/1/20242    
19/1/20243    
110/1/20244    
21/2/202411/2/20242/2/20243/2/20244/2/2024
22/2/20242    
23/2/20243    
24/2/20244    
7 REPLIES 7
quickbluefish
Barite | Level 11
proc sql noprint;
select max(nvax) into :maxvax trimmed from (select id, count(*) as nvax from HAVE group by id);
quit;

proc sort data=want; by id vax_admin_date; run;

data want;
set have;
by id;
length vax1-vax&maxvax 4 n 3;
format vax1-vax&maxvax date9.;
array V {*} vax1-vax&maxvax;
retain V n;
if first.id then do;
    call missing(of V[*]);
    n=0;
end;
n+1;
V[n]=vaccine_admin_date;
if last.id then output;
keep id vax1-vax&maxvax;
run;
quickbluefish
Barite | Level 11
Sorry, the PROC SORT should be on HAVE, not WANT. And I accidentally wrote 'vax_admin_date' instead of 'vaccine_admin_date'
Quentin
Super User

It sounds like you HAVE a lovely vertical dataset.  That is what I would normally like to have:

 

data have ;
  input id vaccine_admin_date	mmddyy10. vaccine_sequence ;
  format vaccine_admin_date	mmddyy10. ;
  cards ;
1	7/1/2024	1
1	8/1/2024	2
1	9/1/2024	3
1	10/1/2024	4
2	1/2/2024	1
2	2/2/2024	2
2	3/2/2024	3
2	4/2/2024	4
;

If you want to transpose it into a wide dataset with one record per ID, you could use PROC TRANPOSE, e.g.:

 

proc transpose data=have out=want prefix=Vaccine_;
  var vaccine_admin_date ;
  by id ;
run ;

proc print ;
run ;

which will output:

Obs    id          _NAME_           Vaccine_1     Vaccine_2     Vaccine_3     Vaccine_4

 1      1    vaccine_admin_date    07/01/2024    08/01/2024    09/01/2024    10/01/2024
 2      2    vaccine_admin_date    01/02/2024    02/02/2024    03/02/2024    04/02/2024

Usually it's easiest to work with the vertical dataset.  Sometimes it's helpful to transpose into the wide dataset.  But he WANT dataset you showed, with extra rows with blank records for Vaccine_1-Vaccine_4, does not look very useful to me.  Are you sure you really want that?

Patrick
Opal | Level 21

As a data structure (a table) your Want is sub-optimal and will be hard to work with. It's most of the time much better to organise the data in a long and narrow structure. 

Patrick_0-1729546830261.png

Such data structures can then get used with report procs like Proc Report:Patrick_1-1729546909404.png

And here the code for above:

data have;
  input id $ vaccine_admin_date :ddmmyy10. vaccine_sequence;
  format vaccine_admin_date ddmmyy10.;
  datalines;
1 07/01/2024 1
1 08/01/2024 2
1 09/01/2024 3
1 10/01/2024 4
2 01/02/2024 1
2 02/02/2024 2
2 03/02/2024 3
2 04/02/2024 4
;
run;

proc report data=have;
  columns id vaccine_sequence, vaccine_admin_date;
  define id / group;
  define vaccine_sequence / group across;
run;

 



Tom
Super User Tom
Super User

The values in your vaccine_1 - vaccine_5 variables all have the exact same values are are in the vaccine_admin_date variable.  

 

Is that always the case?  If not what would it mean when the two dates did no agree?

 

How did they get that way?

Why do you want them that way?

 

Tom
Super User Tom
Super User

Seem like a silly thing to do.  But here goes.

First let's re-create your strange dataset. (I suspect something like this was the cause of the strange layout).

data have;
  input id vaccine_admin_date :mmddyy. vaccine_sequence;
  array vaccine_ [4];
  format vaccine_1-vaccine_4 vaccine_admin_date yymmdd10.;
  vaccine_[vaccine_sequence]=vaccine_admin_date;
cards;
1  7/1/2024 1 
1  8/1/2024 2 
1  9/1/2024 3 
1 10/1/2024 4
2  1/2/2024 1 
2  2/2/2024 2 
2  3/2/2024 3 
2  4/2/2024 4 
;

Now we can use the UPDATE statement to collapse the VACCINE_1 to VACCINE_4 variables into one observation.

data wide;
  update have(obs=0 keep=id vaccine_1-vaccine_4) 
         have(keep=id vaccine_1-vaccine_4)
  ;
  by id;
run;

And then merge it back with the original dataset.

data want;
  merge have wide;
  by id;
  if not first.id then call missing(of vaccine_1-vaccine_4); 
run;

Tom_0-1729548627093.pngTom_1-1729548646236.png

Tom_0-1729548818790.png

 

 

Astounding
PROC Star

Here's a variation on Tom's program:

data wide;
  update have(obs=0 keep=id) 
         have(keep=id vaccine_1-vaccine_4 vaccine_sequence)
  ;
  by id;
run;

It gives you a wide form to the data, with all the useful information.  Its not the form that you asked for, but it does contain everything you might need.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1335 views
  • 3 likes
  • 6 in conversation