BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tburus
Obsidian | Level 7

I have a dataset that I am using to create multiple time-series like graphs (the x-variable is academic years, Year2 in the data). My data doesn't contain a complete sequence of years so I am trying to find a way to fill them in. This data set is part of a program being used to automate report building, so I need it to be as generic as possible, as the results may change from run to run. I have attached a picture of my current data table. I need to complete the sequence of Year from 1 to 15 for each distinct combination of Sport, Injcomb and Event_Type (I can convert to academic years and fill in missing ratios with 0 later).

 

data.png

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

How about

 

data have;
	length sport $30 injcomb $30;
	infile datalines delimiter=' ';
   	input sport $ injcomb $ event_type year ratio;
   	datalines;
boys_swimming Muscle_Strain 1 5 0.12
boys_swimming Muscle_Strain 1 8 0.05
boys_swimming Muscle_Strain 1 9 0.05
boys_swimming Muscle_Strain 1 11 0.06
boys_swimming Muscle_Strain 1 15 0.28
boys_swimming Concussion 1 6 0.28
boys_swimming Concussion 1 10 0.28
boys_swimming Concussion 1 12 0.28
boys_swimming Concussion 1 14 0.28
boys_swimming Tendonitis 2 4 0.28
boys_swimming Tendonitis 2 5 0.28
boys_swimming Tendonitis 2 6 0.28
boys_swimming Tendonitis 2 7 0.28
boys_swimming Tendonitis 2 9 0.28
boys_swimming Tendonitis 2 13 0.28
boys_swimming Tendonitis 2 14 0.28
boys_swimming Tendonitis 2 15 0.28
boys_swimming Concussion 2 7 0.28
boys_swimming Concussion 2 8 0.28
boys_swimming Concussion 2 10 0.28
boys_swimming Concussion 2 11 0.28
boys_swimming Concussion 2 12 0.28
boys_swimming Concussion 2 13 0.28
boys_swimming Concussion 2 14 0.28
boys_swimming Concussion 2 15 0.28
;

data want;
   if _N_ = 1 then do;
      dcl hash h(dataset : "have");
      h.definekey("sport", "injcomb", "year");
      h.definedata("ratio");
      h.definedone();
   end;

   set have;
   by sport injcomb notsorted;

   if first.injcomb then do year = 1 to 15;
      if h.find() ne 0 then ratio = .;
      output;
   end;
run;

 

Result:

 

Spoiler
sport          injcomb        event_type  year  ratio 
boys_swimming  Muscle_Strain  1           1     . 
boys_swimming  Muscle_Strain  1           2     . 
boys_swimming  Muscle_Strain  1           3     . 
boys_swimming  Muscle_Strain  1           4     . 
boys_swimming  Muscle_Strain  1           5     0.12 
boys_swimming  Muscle_Strain  1           6     . 
boys_swimming  Muscle_Strain  1           7     . 
boys_swimming  Muscle_Strain  1           8     0.05 
boys_swimming  Muscle_Strain  1           9     0.05 
boys_swimming  Muscle_Strain  1           10    . 
boys_swimming  Muscle_Strain  1           11    0.06 
boys_swimming  Muscle_Strain  1           12    . 
boys_swimming  Muscle_Strain  1           13    . 
boys_swimming  Muscle_Strain  1           14    . 
boys_swimming  Muscle_Strain  1           15    0.28 
boys_swimming  Concussion     1           1     . 
boys_swimming  Concussion     1           2     . 
boys_swimming  Concussion     1           3     . 
boys_swimming  Concussion     1           4     . 
boys_swimming  Concussion     1           5     . 
boys_swimming  Concussion     1           6     0.28 
boys_swimming  Concussion     1           7     0.28 
boys_swimming  Concussion     1           8     0.28 
boys_swimming  Concussion     1           9     . 
boys_swimming  Concussion     1           10    0.28 
boys_swimming  Concussion     1           11    0.28 
boys_swimming  Concussion     1           12    0.28 
boys_swimming  Concussion     1           13    0.28 
boys_swimming  Concussion     1           14    0.28 
boys_swimming  Concussion     1           15    0.28 
boys_swimming  Tendonitis     2           1     . 
boys_swimming  Tendonitis     2           2     . 
boys_swimming  Tendonitis     2           3     . 
boys_swimming  Tendonitis     2           4     0.28 
boys_swimming  Tendonitis     2           5     0.28 
boys_swimming  Tendonitis     2           6     0.28 
boys_swimming  Tendonitis     2           7     0.28 
boys_swimming  Tendonitis     2           8     . 
boys_swimming  Tendonitis     2           9     0.28 
boys_swimming  Tendonitis     2           10    . 
boys_swimming  Tendonitis     2           11    . 
boys_swimming  Tendonitis     2           12    . 
boys_swimming  Tendonitis     2           13    0.28 
boys_swimming  Tendonitis     2           14    0.28 
boys_swimming  Tendonitis     2           15    0.28 
boys_swimming  Concussion     2           1     . 
boys_swimming  Concussion     2           2     . 
boys_swimming  Concussion     2           3     . 
boys_swimming  Concussion     2           4     . 
boys_swimming  Concussion     2           5     . 
boys_swimming  Concussion     2           6     0.28 
boys_swimming  Concussion     2           7     0.28 
boys_swimming  Concussion     2           8     0.28 
boys_swimming  Concussion     2           9     . 
boys_swimming  Concussion     2           10    0.28 
boys_swimming  Concussion     2           11    0.28 
boys_swimming  Concussion     2           12    0.28 
boys_swimming  Concussion     2           13    0.28 
boys_swimming  Concussion     2           14    0.28 
boys_swimming  Concussion     2           15    0.28 

View solution in original post

5 REPLIES 5
andreas_lds
Jade | Level 19

Please post the data you have in usable form and show what you expect as result after missing values have been added.

tburus
Obsidian | Level 7

Here's a minimal example:

 

data have;
	length sport $30 injcomb $30;
	infile datalines delimiter=' ';
   	input sport $ injcomb $ event_type year ratio;
   	datalines;
boys_swimming Muscle_Strain 1 5 0.12
boys_swimming Muscle_Strain 1 8 0.05
boys_swimming Muscle_Strain 1 9 0.05
boys_swimming Muscle_Strain 1 11 0.06
boys_swimming Muscle_Strain 1 15 0.28
boys_swimming Concussion 1 6 0.28
boys_swimming Concussion 1 10 0.28
boys_swimming Concussion 1 12 0.28
boys_swimming Concussion 1 14 0.28
boys_swimming Tendonitis 2 4 0.28
boys_swimming Tendonitis 2 5 0.28
boys_swimming Tendonitis 2 6 0.28
boys_swimming Tendonitis 2 7 0.28
boys_swimming Tendonitis 2 9 0.28
boys_swimming Tendonitis 2 13 0.28
boys_swimming Tendonitis 2 14 0.28
boys_swimming Tendonitis 2 15 0.28
boys_swimming Concussion 2 7 0.28
boys_swimming Concussion 2 8 0.28
boys_swimming Concussion 2 10 0.28
boys_swimming Concussion 2 11 0.28
boys_swimming Concussion 2 12 0.28
boys_swimming Concussion 2 13 0.28
boys_swimming Concussion 2 14 0.28
boys_swimming Concussion 2 15 0.28
;

I want this with a complete sequence of years 1 to 15 for each sport, injcomb and event_type combo (i.e. boys_swimming muscle_strain 1). For added lines, ratio can be missing.

PeterClemmensen
Tourmaline | Level 20

How about

 

data have;
	length sport $30 injcomb $30;
	infile datalines delimiter=' ';
   	input sport $ injcomb $ event_type year ratio;
   	datalines;
boys_swimming Muscle_Strain 1 5 0.12
boys_swimming Muscle_Strain 1 8 0.05
boys_swimming Muscle_Strain 1 9 0.05
boys_swimming Muscle_Strain 1 11 0.06
boys_swimming Muscle_Strain 1 15 0.28
boys_swimming Concussion 1 6 0.28
boys_swimming Concussion 1 10 0.28
boys_swimming Concussion 1 12 0.28
boys_swimming Concussion 1 14 0.28
boys_swimming Tendonitis 2 4 0.28
boys_swimming Tendonitis 2 5 0.28
boys_swimming Tendonitis 2 6 0.28
boys_swimming Tendonitis 2 7 0.28
boys_swimming Tendonitis 2 9 0.28
boys_swimming Tendonitis 2 13 0.28
boys_swimming Tendonitis 2 14 0.28
boys_swimming Tendonitis 2 15 0.28
boys_swimming Concussion 2 7 0.28
boys_swimming Concussion 2 8 0.28
boys_swimming Concussion 2 10 0.28
boys_swimming Concussion 2 11 0.28
boys_swimming Concussion 2 12 0.28
boys_swimming Concussion 2 13 0.28
boys_swimming Concussion 2 14 0.28
boys_swimming Concussion 2 15 0.28
;

data want;
   if _N_ = 1 then do;
      dcl hash h(dataset : "have");
      h.definekey("sport", "injcomb", "year");
      h.definedata("ratio");
      h.definedone();
   end;

   set have;
   by sport injcomb notsorted;

   if first.injcomb then do year = 1 to 15;
      if h.find() ne 0 then ratio = .;
      output;
   end;
run;

 

Result:

 

Spoiler
sport          injcomb        event_type  year  ratio 
boys_swimming  Muscle_Strain  1           1     . 
boys_swimming  Muscle_Strain  1           2     . 
boys_swimming  Muscle_Strain  1           3     . 
boys_swimming  Muscle_Strain  1           4     . 
boys_swimming  Muscle_Strain  1           5     0.12 
boys_swimming  Muscle_Strain  1           6     . 
boys_swimming  Muscle_Strain  1           7     . 
boys_swimming  Muscle_Strain  1           8     0.05 
boys_swimming  Muscle_Strain  1           9     0.05 
boys_swimming  Muscle_Strain  1           10    . 
boys_swimming  Muscle_Strain  1           11    0.06 
boys_swimming  Muscle_Strain  1           12    . 
boys_swimming  Muscle_Strain  1           13    . 
boys_swimming  Muscle_Strain  1           14    . 
boys_swimming  Muscle_Strain  1           15    0.28 
boys_swimming  Concussion     1           1     . 
boys_swimming  Concussion     1           2     . 
boys_swimming  Concussion     1           3     . 
boys_swimming  Concussion     1           4     . 
boys_swimming  Concussion     1           5     . 
boys_swimming  Concussion     1           6     0.28 
boys_swimming  Concussion     1           7     0.28 
boys_swimming  Concussion     1           8     0.28 
boys_swimming  Concussion     1           9     . 
boys_swimming  Concussion     1           10    0.28 
boys_swimming  Concussion     1           11    0.28 
boys_swimming  Concussion     1           12    0.28 
boys_swimming  Concussion     1           13    0.28 
boys_swimming  Concussion     1           14    0.28 
boys_swimming  Concussion     1           15    0.28 
boys_swimming  Tendonitis     2           1     . 
boys_swimming  Tendonitis     2           2     . 
boys_swimming  Tendonitis     2           3     . 
boys_swimming  Tendonitis     2           4     0.28 
boys_swimming  Tendonitis     2           5     0.28 
boys_swimming  Tendonitis     2           6     0.28 
boys_swimming  Tendonitis     2           7     0.28 
boys_swimming  Tendonitis     2           8     . 
boys_swimming  Tendonitis     2           9     0.28 
boys_swimming  Tendonitis     2           10    . 
boys_swimming  Tendonitis     2           11    . 
boys_swimming  Tendonitis     2           12    . 
boys_swimming  Tendonitis     2           13    0.28 
boys_swimming  Tendonitis     2           14    0.28 
boys_swimming  Tendonitis     2           15    0.28 
boys_swimming  Concussion     2           1     . 
boys_swimming  Concussion     2           2     . 
boys_swimming  Concussion     2           3     . 
boys_swimming  Concussion     2           4     . 
boys_swimming  Concussion     2           5     . 
boys_swimming  Concussion     2           6     0.28 
boys_swimming  Concussion     2           7     0.28 
boys_swimming  Concussion     2           8     0.28 
boys_swimming  Concussion     2           9     . 
boys_swimming  Concussion     2           10    0.28 
boys_swimming  Concussion     2           11    0.28 
boys_swimming  Concussion     2           12    0.28 
boys_swimming  Concussion     2           13    0.28 
boys_swimming  Concussion     2           14    0.28 
boys_swimming  Concussion     2           15    0.28 
tburus
Obsidian | Level 7
Thanks. I made a slight adjustment to fit my actual data and it worked great.
PeterClemmensen
Tourmaline | Level 20

I'm glad to hear that 🙂

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1885 views
  • 0 likes
  • 3 in conversation