Hi,
I'd like to duplicate rows in my dataset that have a weight over a certain threshold (1000 for instance), and then reduce the weight by 2.
For instance, this:
PERWT | SEX | agegr | edu | region |
1100 | 0 | 65 | e1 | Andhra Prades |
294.9 | 0 | 40 | e3 | Andhra Prades |
Should be like this:
PERWT | SEX | agegr | edu | region |
550 | 0 | 30 | e1 | Andhra Prades |
550 | 0 | 30 | e1 | Andhra Prades |
294.9 | 0 | 40 | e3 | Andhra Prades |
I know I can do it with the output statement, but I don't really get how this works.
Thank you
Ok. As I said, you can simply write a new data step and apply the logic there. For the two posted statements, they can easily be applied before the do loop as
data have;
input PERWT SEX agegr edu $ region :$20.;
datalines;
1100 0 65 e1 AndhraPrades
294.9 0 40 e3 AndhraPrades
294.9 9 40 e3 AndhraPrades
;
data want;
set have;
if sex ne 9;
sex=sex-1;
if PERWT ge 1000 then do;
PERWT=PERWT/2;
output;
end;
output;
run;
like this?
data have;
input PERWT SEX agegr edu $ region :$20.;
datalines;
1100 0 65 e1 AndhraPrades
294.9 0 40 e3 AndhraPrades
;
data want;
set have;
if PERWT ge 1000 then do;
PERWT=PERWT/2;
output;
end;
output;
run;
Result:
PERWT SEX agegr edu region 550 0 65 e1 AndhraPrades 550 0 65 e1 AndhraPrades 294.9 0 40 e3 AndhraPrades
What other commands/statements? It's hard to write working code when I don't know what it should work for 🙂
In any case, you could simply write a new data step and apply your logic there if your data is not very large.
Other lines are just recoding of variables, such as
if sex=9 then delete;
sex=sex-1;
Ok. As I said, you can simply write a new data step and apply the logic there. For the two posted statements, they can easily be applied before the do loop as
data have;
input PERWT SEX agegr edu $ region :$20.;
datalines;
1100 0 65 e1 AndhraPrades
294.9 0 40 e3 AndhraPrades
294.9 9 40 e3 AndhraPrades
;
data want;
set have;
if sex ne 9;
sex=sex-1;
if PERWT ge 1000 then do;
PERWT=PERWT/2;
output;
end;
output;
run;
If I want to create a loop that will make that a duplicate the row and divided by 2 the weight as long as the weight is >1000, what should I do?
I tried this:
data work.want;
set work.have;
do while (perwt>1000);
perwt=perwt/2;
output;
end;
output;
run;
However, the result is not what expected. If perwt is 10000, the value of perwt for duplicated lines will be:
5000
2500
775
775
What I'd want is 10 lines with a weight of 1000 for each of them.
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.