Solved: Duplicate rows in a data procedure

Demographer · Posted 10-09-2019 02:35 AM

Hi,

I'd like to duplicate rows in my dataset that have a weight over a certain threshold (1000 for instance), and then reduce the weight by 2.

For instance, this:

PERWT	SEX	agegr	edu	region
1100	0	65	e1	Andhra Prades
294.9	0	40	e3	Andhra Prades

Should be like this:

PERWT	SEX	agegr	edu	region
550	0	30	e1	Andhra Prades
550	0	30	e1	Andhra Prades
294.9	0	40	e3	Andhra Prades

I know I can do it with the output statement, but I don't really get how this works.

Thank you

PeterClemmensen · Posted 10-09-2019 02:55 AM

Ok. As I said, you can simply write a new data step and apply the logic there. For the two posted statements, they can easily be applied before the do loop as

data have;
input PERWT SEX agegr edu $ region :$20.;
datalines;
1100 0 65 e1 AndhraPrades
294.9 0 40 e3 AndhraPrades
294.9 9 40 e3 AndhraPrades
;

data want;
    set have;
    if sex ne 9;
    sex=sex-1;
    if PERWT ge 1000 then do;
        PERWT=PERWT/2;
        output;
    end;
    output;
run;

The DATA to DATA Step Macro
Blog: SASnrd

View solution in original post

PeterClemmensen · Posted 10-09-2019 02:39 AM

like this?

data have;
input PERWT SEX agegr edu $ region :$20.;
datalines;
1100 0 65 e1 AndhraPrades
294.9 0 40 e3 AndhraPrades
;

data want;
    set have;
    if PERWT ge 1000 then do;
        PERWT=PERWT/2;
        output;
    end;
    output;
run;

Result:

PERWT SEX agegr edu region
550   0   65    e1  AndhraPrades
550   0   65    e1  AndhraPrades
294.9 0   40    e3  AndhraPrades

The DATA to DATA Step Macro
Blog: SASnrd

Demographer · Posted 10-09-2019 02:43 AM

It works, but only when those lines are the last ones before runs; If there are other commands below, it doesn't work (values become missing).

PeterClemmensen · Posted 10-09-2019 02:45 AM

What other commands/statements? It's hard to write working code when I don't know what it should work for 🙂

In any case, you could simply write a new data step and apply your logic there if your data is not very large.

The DATA to DATA Step Macro
Blog: SASnrd

Demographer · Posted 10-09-2019 02:52 AM

Other lines are just recoding of variables, such as

if sex=9 then delete;
sex=sex-1;

PeterClemmensen · Posted 10-09-2019 02:55 AM

Ok. As I said, you can simply write a new data step and apply the logic there. For the two posted statements, they can easily be applied before the do loop as

data have;
input PERWT SEX agegr edu $ region :$20.;
datalines;
1100 0 65 e1 AndhraPrades
294.9 0 40 e3 AndhraPrades
294.9 9 40 e3 AndhraPrades
;

data want;
    set have;
    if sex ne 9;
    sex=sex-1;
    if PERWT ge 1000 then do;
        PERWT=PERWT/2;
        output;
    end;
    output;
run;

The DATA to DATA Step Macro
Blog: SASnrd

Demographer · Posted 10-09-2019 03:17 AM

If I want to create a loop that will make that a duplicate the row and divided by 2 the weight as long as the weight is >1000, what should I do?

I tried this:

data work.want;
set work.have;
do while (perwt>1000);
perwt=perwt/2;
output;
end;
output;

run;

However, the result is not what expected. If perwt is 10000, the value of perwt for duplicated lines will be:

5000

2500

775

What I'd want is 10 lines with a weight of 1000 for each of them.

Duplicate rows in a data procedure

Re: Duplicate rows in a data procedure

Re: Duplicate rows in a data procedure

Re: Duplicate rows in a data procedure

Re: Duplicate rows in a data procedure

Re: Duplicate rows in a data procedure

Re: Duplicate rows in a data procedure

Re: Duplicate rows in a data procedure

Registration is open