Solved: Filling up missing rows with previous rows value

jungjein · Posted 01-25-2019 08:27 AM

I have an excel file with data looking like this:

The data shows the company and it's operating locations.

How do I fill in those rows with missing Company and Profit with the previous values but if it hits a row with non-missing Company and Profit then it stops?

i.e. rows 3 to 8 should be Company='Microsoft' and Profit=1000

row 10 should be Company='Apple' and Profit=2000

PeterClemmensen · Posted 01-25-2019 08:54 AM

data have;
informat Company $20.Profit 8. Location $20.;
input Company $ Profit Location;
infile datalines dlm=',' dsd missover;
datalines;
Microsoft,1000,United States
,,India
,,Hong Kong
,,United Kingdom
,,Ireland
,,Korea
,,Japan
Apple,2000,United States
,,China
Alphabet,20,United States
,,Netherlands
,,Ireland
,,China
Samsung,3,Korea
,,Japan
,,China
;

data temp;
   set have;
   if not missing(Company) then group+1;
run;

data want(drop=group);
   update temp(obs=0) temp;
   by group;
   output;
run;

The DATA to DATA Step Macro
Blog: SASnrd

View solution in original post

Kurt_Bremser · Posted 01-25-2019 08:32 AM

Use retained variables:

data want;
set have;
retain _company _profit;
if company ne ''
then _company = company;
else company = _company;
if profit ne .
then _profit = profit;
else profit = _profit;
drop _company _profit;
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Ali_Ouda · Posted 08-21-2019 11:10 AM

Hello
Please Can you tell me what's the solution if the missing was before ?
If I want to fill the missing from the next row

---------------------------------------------------------------------

Kurt_Bremser · Posted 08-21-2019 12:20 PM

The simplest solution is to reverse sort the dataset, apply the logic, and then sort it back to the original order.

Edit: changed "log" to "logic".

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Ali_Ouda · Posted 08-21-2019 09:05 PM

Thank you very much.

PeterClemmensen · Posted 01-25-2019 08:54 AM

data have;
informat Company $20.Profit 8. Location $20.;
input Company $ Profit Location;
infile datalines dlm=',' dsd missover;
datalines;
Microsoft,1000,United States
,,India
,,Hong Kong
,,United Kingdom
,,Ireland
,,Korea
,,Japan
Apple,2000,United States
,,China
Alphabet,20,United States
,,Netherlands
,,Ireland
,,China
Samsung,3,Korea
,,Japan
,,China
;

data temp;
   set have;
   if not missing(Company) then group+1;
run;

data want(drop=group);
   update temp(obs=0) temp;
   by group;
   output;
run;

The DATA to DATA Step Macro
Blog: SASnrd

novinosrin · Posted 01-25-2019 09:10 AM

If your pattern holds true in all cases, the group creation and update will work shown by draycut and so will mine. But if your pattern differs, the best "controlled approach" by @Kurt_Bremser is probably the safest.

Just to start my morning with some hash fun, here you go->

data have;
informat Company $20.Profit 8. Location $20.;
input Company $ Profit Location;
infile datalines dlm=',' dsd missover;
datalines;
Microsoft,1000,United States
,,India
,,Hong Kong
,,United Kingdom
,,Ireland
,,Korea
,,Japan
Apple,2000,United States
,,China
Alphabet,20,United States
,,Netherlands
,,Ireland
,,China
Samsung,3,Korea
,,Japan
,,China
;

data want;
if _n_=1 then do;
   dcl hash H () ;
   h.definekey  ("_iorc_") ;
   h.definedata ("company", "profit") ;
   h.definedone () ;
   end;
set have;
if not missing(company) and not missing(profit) then do;
_iorc_+1;
h.add();
end;
else h.find();
run;

Thank you @PeterClemmensen for creating the data have sample.

mkeintz · Posted 01-25-2019 01:14 PM

If

You have lots of variables that could be missing, not just 2 or 3, making the RETAIN statement approach tedious.
and
Those variables are always missing simultaneously.

then you could take advantage of the SET with POINT= statement, and the DROP and RENAME parameters.

Here's an example using a dataset HAVE from SASHELP.CLASS in which each obs is completely written once, and then written 3 times with missing values SEX, WEIGHT, and HEIGHT. Those latter records need the data to be carried forward from the non-missing collection.

data have ;
  set sashelp.class;
  output;
  call missing(sex, weight,height);
  do copy=1 to 3; output; end;
run;

data want (drop=_sex);
  set have (drop=weight height  rename=(sex=_sex));
  if _sex^=' ' then set have (keep=sex height weight) point=_n_;
run;

The SET HAVE statement reads in all the variables except the variables in question. But it does read in and renames one of those variables (sex renamed to _sex). The provide a single data point to determine whether all the variables-of-interest are missing.
The "IF ... THEN SET" statement only occurs when the variables-of-interest are actually present. Because they are read by a SET statement they are automatically retained (and therefore output) until the next non-missing collection is encountered.

Note the DATA WANT step only takes 4 lines, no matter how many variables are "variables-of-interest".

But I repeat, this is for cases in which you know those variables are synchronized in there missing value occurances. Of course you could break this approach down if you had, say, 2 distinct collections of such variables.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Filling up missing rows with previous rows value

Re: Filling up missing rows with previous rows value

Re: Filling up missing rows with previous rows value

Re: Filling up missing rows with previous rows value

Re: Filling up missing rows with previous rows value

Re: Filling up missing rows with previous rows value

Re: Filling up missing rows with previous rows value

Re: Filling up missing rows with previous rows value

Re: Filling up missing rows with previous rows value

Registration is open

Registration is open

SAS Training: Just a Click Away