Re: Print variable only on first line

smeeker · Posted 10-21-2024 12:51 PM

I've posted what I have and what I want in a table below. Here's the code I'm currently using. Wondering if there is a way to only print the vaccine_x variable on the first line of the id. Then I plan to only keep first.id. I can create new variables if needed. Thanks in advance.

Current code:

data want;
set have;
format vaccine_1 vaccine_2 vaccine_3 vaccine_4 vaccine_5 mmddyy10.;
if vaccine_sequence=1 then vaccine_1 = vaccine_admin_date;
else vaccine_1=.;
if vaccine_sequence=2 then vaccine_2 = vaccine_admin_date;
else vaccine_2=.;
if vaccine_sequence=3 then vaccine_3 = vaccine_admin_date;
else vaccine_3=.;
if vaccine_sequence=4 then vaccine_4 = vaccine_admin_date;
else vaccine_4=.;
if vaccine_sequence=5 then vaccine_5 = vaccine_admin_date;
else vaccine_5=.; proc print;

HAVE
id	vaccine_admin_date	vaccine_sequence	vaccine_1	vaccine_2	vaccine_3	vaccine_4
1	7/1/2024	1	7/1/2024
1	8/1/2024	2		8/1/2024
1	9/1/2024	3			9/1/2024
1	10/1/2024	4				10/1/2024
2	1/2/2024	1	1/2/2024
2	2/2/2024	2		2/2/2024
2	3/2/2024	3			3/2/2024
2	4/2/2024	4				4/2/2024

WANT
id	vaccine_admin_date	vaccine_sequence	vaccine_1	vaccine_2	vaccine_3	vaccine_4
1	7/1/2024	1	7/1/2024	8/1/2024	9/1/2024	10/1/2024
1	8/1/2024	2
1	9/1/2024	3
1	10/1/2024	4
2	1/2/2024	1	1/2/2024	2/2/2024	3/2/2024	4/2/2024
2	2/2/2024	2
2	3/2/2024	3
2	4/2/2024	4

quickbluefish · Posted 10-21-2024 04:10 PM

proc sql noprint;
select max(nvax) into :maxvax trimmed from (select id, count(*) as nvax from HAVE group by id);
quit;

proc sort data=want; by id vax_admin_date; run;

data want;
set have;
by id;
length vax1-vax&maxvax 4 n 3;
format vax1-vax&maxvax date9.;
array V {*} vax1-vax&maxvax;
retain V n;
if first.id then do;
    call missing(of V[*]);
    n=0;
end;
n+1;
V[n]=vaccine_admin_date;
if last.id then output;
keep id vax1-vax&maxvax;
run;

quickbluefish · Posted 10-21-2024 04:13 PM

Sorry, the PROC SORT should be on HAVE, not WANT. And I accidentally wrote 'vax_admin_date' instead of 'vaccine_admin_date'

Quentin · Posted 10-21-2024 04:42 PM

It sounds like you HAVE a lovely vertical dataset. That is what I would normally like to have:

data have ;
  input id vaccine_admin_date	mmddyy10. vaccine_sequence ;
  format vaccine_admin_date	mmddyy10. ;
  cards ;
1	7/1/2024	1
1	8/1/2024	2
1	9/1/2024	3
1	10/1/2024	4
2	1/2/2024	1
2	2/2/2024	2
2	3/2/2024	3
2	4/2/2024	4
;

If you want to transpose it into a wide dataset with one record per ID, you could use PROC TRANPOSE, e.g.:

proc transpose data=have out=want prefix=Vaccine_;
  var vaccine_admin_date ;
  by id ;
run ;

proc print ;
run ;

which will output:

Obs    id          _NAME_           Vaccine_1     Vaccine_2     Vaccine_3     Vaccine_4

 1      1    vaccine_admin_date    07/01/2024    08/01/2024    09/01/2024    10/01/2024
 2      2    vaccine_admin_date    01/02/2024    02/02/2024    03/02/2024    04/02/2024

Usually it's easiest to work with the vertical dataset. Sometimes it's helpful to transpose into the wide dataset. But he WANT dataset you showed, with extra rows with blank records for Vaccine_1-Vaccine_4, does not look very useful to me. Are you sure you really want that?

Patrick · Posted 10-21-2024 05:42 PM

As a data structure (a table) your Want is sub-optimal and will be hard to work with. It's most of the time much better to organise the data in a long and narrow structure.

Such data structures can then get used with report procs like Proc Report:

And here the code for above:

data have;
  input id $ vaccine_admin_date :ddmmyy10. vaccine_sequence;
  format vaccine_admin_date ddmmyy10.;
  datalines;
1 07/01/2024 1
1 08/01/2024 2
1 09/01/2024 3
1 10/01/2024 4
2 01/02/2024 1
2 02/02/2024 2
2 03/02/2024 3
2 04/02/2024 4
;
run;

proc report data=have;
  columns id vaccine_sequence, vaccine_admin_date;
  define id / group;
  define vaccine_sequence / group across;
run;

Tom · Posted 10-21-2024 05:57 PM

The values in your vaccine_1 - vaccine_5 variables all have the exact same values are are in the vaccine_admin_date variable.

Is that always the case? If not what would it mean when the two dates did no agree?

How did they get that way?

Why do you want them that way?

Tom · Posted 10-21-2024 06:11 PM

Seem like a silly thing to do. But here goes.

First let's re-create your strange dataset. (I suspect something like this was the cause of the strange layout).

data have;
  input id vaccine_admin_date :mmddyy. vaccine_sequence;
  array vaccine_ [4];
  format vaccine_1-vaccine_4 vaccine_admin_date yymmdd10.;
  vaccine_[vaccine_sequence]=vaccine_admin_date;
cards;
1  7/1/2024 1 
1  8/1/2024 2 
1  9/1/2024 3 
1 10/1/2024 4
2  1/2/2024 1 
2  2/2/2024 2 
2  3/2/2024 3 
2  4/2/2024 4 
;

Now we can use the UPDATE statement to collapse the VACCINE_1 to VACCINE_4 variables into one observation.

data wide;
  update have(obs=0 keep=id vaccine_1-vaccine_4) 
         have(keep=id vaccine_1-vaccine_4)
  ;
  by id;
run;

And then merge it back with the original dataset.

data want;
  merge have wide;
  by id;
  if not first.id then call missing(of vaccine_1-vaccine_4); 
run;

Astounding · Posted 10-21-2024 11:08 PM

Here's a variation on Tom's program:

data wide;
  update have(obs=0 keep=id) 
         have(keep=id vaccine_1-vaccine_4 vaccine_sequence)
  ;
  by id;
run;

It gives you a wide form to the data, with all the useful information. Its not the form that you asked for, but it does contain everything you might need.

Registration is open

SAS Training: Just a Click Away