Solved: Repeating numbers in cell

hk2013 · Posted 07-26-2019 02:36 PM

Data: I have a dataset that has a lot of empty cells and i want those cell to repeat for the ID.

ID	o1	o2	o3
1	20	.	.
1	.	3	.
1	.	.	.
1	.	.	5
2	.	.	.
2	33	.	.
2	.	.	9
2	.	36	.
3	.	.	.
3	.	.	5
3	.	65	.
3	43	.	.

want output to look like this:

ID	o1	o2	o3
1	20	3	5
1	20	3	5
1	20	3	5
1	20	3	5
2	33	36	9
2	33	36	9
2	33	36	9
2	33	36	9
3	43	65	5
3	43	65	5
3	43	65	5
3	43	65	5

Tom · Posted 07-26-2019 03:25 PM

First let's turn your web table into a SAS data set.

data have;
  input ID 	o1 	o2	o3 ;
cards;
1	20	.	.
1	.	3	.
1	.	.	.
1	.	.	5
2	.	.	.
2	33	.	.
2	.	.	9
2	.	36	.
3	.	.	.
3	.	.	5
3	.	65	.
3	43	.	.
;

If you want to get the last non-missing value for a BY group then the UPDATE statement is a good method. You need a master and a transaction dataset, but you can just use OBS=0 dataset option to make your single dataset serve both roles.

data small ;
  update have(obs=0) have;
  by id;
run;

Result:

Obs    ID    o1    o2    o3

 1      1    20     3     5
 2      2    33    36     9
 3      3    43    65     5

Now if you really need to get all of the original observations output (perhaps there are other variables you didn't provide that vary from observation to observation for the same value of ID) then add a little more logic to merge back on the detailed records. Remember to not re-read the variables that you wanted to collapse.

data big;
  do _n_=1 by 1 until (last.id);
    update have(obs=0) have;
    by id;
  end;
  do _n_=1 to _n_;
    set have (drop=o1 o2 o3);
    output;
  end;
run;

View solution in original post

PaigeMiller · Posted 07-26-2019 02:52 PM

Is it always true that for a given ID, there will be only one number in each column? Is it ever possible that for a given ID, a column might have 2 (or more) numbers?

--
Paige Miller

novinosrin · Posted 07-26-2019 03:08 PM

Sir @PaigeMiller You sound fresh at 3pm , great question and I am concerned about that too.

@hk2013 If we can assume one non missing value for each column, the below solution is easy and straight forward

data have;
input ID 	o1 	o2	o3;
cards;
1	20	.	.
1	.	3	.
1	.	.	.
1	.	.	5
2	.	.	.
2	33	.	.
2	.	.	9
2	.	36	.
3	.	.	.
3	.	.	5
3	.	65	.
3	43	.	.
;

data want;
array temp(3) _temporary_;
call missing(of temp(*));
do _n_=1 by 1 until(last.id);
 set have;
 by id;
 array t (i) o1-o3;
 do over t;
 if not missing(t) then temp(i)=t;
 end;
end;
do _n_=1 to _n_;
 set have;
 by id;
 do over t;
 if missing(t) then t=temp(i);
 end;
 output;
end;
drop i;
run;

hk2013 · Posted 07-26-2019 03:09 PM

it will always be one number for the column

Tom · Posted 07-26-2019 03:16 PM

Why do you need multiple observations per group if they all end up being the same?

hk2013 · Posted 07-26-2019 03:19 PM

i dont really need them to repeat but i do need to pull out those individual number for the IDs but they are all in different rows

PaigeMiller · Posted 07-26-2019 03:34 PM

@hk2013 wrote:

i dont really need them to repeat but i do need to pull out those individual number for the IDs but they are all in different rows

Then you just need to run PROC SUMMARY in my earlier reply.

--
Paige Miller

PaigeMiller · Posted 07-26-2019 03:19 PM

A common theme: Whenever @novinosrin writes a data step solution, I write a PROC SUMMARY solution. That's why people call me Mr. PROC SUMMARY.

proc summary data=have nway;
    class id;
	var o1-o3;
	output out=max max=;
run;

data want;
    merge have(drop=o1-o3) max;
	by id;
run;

--
Paige Miller

Tom · Posted 07-26-2019 03:25 PM

First let's turn your web table into a SAS data set.

data have;
  input ID 	o1 	o2	o3 ;
cards;
1	20	.	.
1	.	3	.
1	.	.	.
1	.	.	5
2	.	.	.
2	33	.	.
2	.	.	9
2	.	36	.
3	.	.	.
3	.	.	5
3	.	65	.
3	43	.	.
;

If you want to get the last non-missing value for a BY group then the UPDATE statement is a good method. You need a master and a transaction dataset, but you can just use OBS=0 dataset option to make your single dataset serve both roles.

data small ;
  update have(obs=0) have;
  by id;
run;

Result:

Obs    ID    o1    o2    o3

 1      1    20     3     5
 2      2    33    36     9
 3      3    43    65     5

Now if you really need to get all of the original observations output (perhaps there are other variables you didn't provide that vary from observation to observation for the same value of ID) then add a little more logic to merge back on the detailed records. Remember to not re-read the variables that you wanted to collapse.

data big;
  do _n_=1 by 1 until (last.id);
    update have(obs=0) have;
    by id;
  end;
  do _n_=1 to _n_;
    set have (drop=o1 o2 o3);
    output;
  end;
run;

Repeating numbers in cell

Re: Repeating numbers in cell

Re: Repeating numbers in cell

Re: Repeating numbers in cell

Re: Repeating numbers in cell

Re: Repeating numbers in cell

Re: Repeating numbers in cell

Re: Repeating numbers in cell

Re: Repeating numbers in cell

Re: Repeating numbers in cell

ID	o1	o2	o3
1	20	.	.
1	.	3	.
1	.	.	.
1	.	.	5
2	.	.	.
2	33	.	.
2	.	.	9
2	.	36	.
3	.	.	.
3	.	.	5
3	.	65	.
3	43	.	.

ID	o1	o2	o3
1	20	3	5
1	20	3	5
1	20	3	5
1	20	3	5
2	33	36	9
2	33	36	9
2	33	36	9
2	33	36	9
3	43	65	5
3	43	65	5
3	43	65	5
3	43	65	5

ID	o1	o2	o3
1	20	.	.
1	.	3	.
1	.	.	.
1	.	.	5
2	.	.	.
2	33	.	.
2	.	.	9
2	.	36	.
3	.	.	.
3	.	.	5
3	.	65	.
3	43	.	.

ID	o1	o2	o3
1	20	3	5
1	20	3	5
1	20	3	5
1	20	3	5
2	33	36	9
2	33	36	9
2	33	36	9
2	33	36	9
3	43	65	5
3	43	65	5
3	43	65	5
3	43	65	5

SAS Innovate 2025: Call for Content

Classroom Training Available!

ID	o1	o2	o3
1	20	.	.
1	.	3	.
1	.	.	.
1	.	.	5
2	.	.	.
2	33	.	.
2	.	.	9
2	.	36	.
3	.	.	.
3	.	.	5
3	.	65	.
3	43	.	.

ID	o1	o2	o3
1	20	3	5
1	20	3	5
1	20	3	5
1	20	3	5
2	33	36	9
2	33	36	9
2	33	36	9
2	33	36	9
3	43	65	5
3	43	65	5
3	43	65	5
3	43	65	5