Hi,
I'd like to duplicate rows in my dataset that have a weight over a certain threshold (1000 for instance), and then reduce the weight by 2.
For instance, this:
PERWT | SEX | agegr | edu | region |
1100 | 0 | 65 | e1 | Andhra Prades |
294.9 | 0 | 40 | e3 | Andhra Prades |
Should be like this:
PERWT | SEX | agegr | edu | region |
550 | 0 | 30 | e1 | Andhra Prades |
550 | 0 | 30 | e1 | Andhra Prades |
294.9 | 0 | 40 | e3 | Andhra Prades |
I know I can do it with the output statement, but I don't really get how this works.
Thank you
Ok. As I said, you can simply write a new data step and apply the logic there. For the two posted statements, they can easily be applied before the do loop as
data have;
input PERWT SEX agegr edu $ region :$20.;
datalines;
1100 0 65 e1 AndhraPrades
294.9 0 40 e3 AndhraPrades
294.9 9 40 e3 AndhraPrades
;
data want;
set have;
if sex ne 9;
sex=sex-1;
if PERWT ge 1000 then do;
PERWT=PERWT/2;
output;
end;
output;
run;
like this?
data have;
input PERWT SEX agegr edu $ region :$20.;
datalines;
1100 0 65 e1 AndhraPrades
294.9 0 40 e3 AndhraPrades
;
data want;
set have;
if PERWT ge 1000 then do;
PERWT=PERWT/2;
output;
end;
output;
run;
Result:
PERWT SEX agegr edu region 550 0 65 e1 AndhraPrades 550 0 65 e1 AndhraPrades 294.9 0 40 e3 AndhraPrades
What other commands/statements? It's hard to write working code when I don't know what it should work for 🙂
In any case, you could simply write a new data step and apply your logic there if your data is not very large.
Other lines are just recoding of variables, such as
if sex=9 then delete;
sex=sex-1;
Ok. As I said, you can simply write a new data step and apply the logic there. For the two posted statements, they can easily be applied before the do loop as
data have;
input PERWT SEX agegr edu $ region :$20.;
datalines;
1100 0 65 e1 AndhraPrades
294.9 0 40 e3 AndhraPrades
294.9 9 40 e3 AndhraPrades
;
data want;
set have;
if sex ne 9;
sex=sex-1;
if PERWT ge 1000 then do;
PERWT=PERWT/2;
output;
end;
output;
run;
If I want to create a loop that will make that a duplicate the row and divided by 2 the weight as long as the weight is >1000, what should I do?
I tried this:
data work.want;
set work.have;
do while (perwt>1000);
perwt=perwt/2;
output;
end;
output;
run;
However, the result is not what expected. If perwt is 10000, the value of perwt for duplicated lines will be:
5000
2500
775
775
What I'd want is 10 lines with a weight of 1000 for each of them.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.