Hello,
I am trying to expand the number of observations in a dataset based on certain conditions and am little stuck. I think it will be easier to explain through an example:
Have Need
Date | Denom | Numer | Date | Denom | Numer | |
1/1/2008 | 4 | 2 | 1/1/2008 | 1 | 1 | |
4/1/2008 | 3 | 1 | 1/1/2008 | 1 | 0 | |
7/1/2008 | 2 | 2 | 1/1/2008 | 1 | 1 | |
10/1/2008 | 4 | 1 | 1/1/2008 | 1 | 0 | |
1/1/2008 | 1 | 0 | ||||
4/1/2008 | 1 | 0 | ||||
4/1/2008 | 1 | 1 | ||||
4/1/2008 | 1 | 0 | ||||
. | ||||||
. | ||||||
. |
So, I am expanding the first observation based on the value of Denom, i.e. 4, and randomly distributing the value of Numer, i.e. 2, across the four observations. Any help to solve this is greatly appreciated.
Thanks
Try this:
data have;
informat date mmddyy10. ;
format date mmddyy10.;
input Date Denom Numer;
datalines;
1/1/2008 4 2
4/1/2008 3 1
7/1/2008 2 2
10/1/2008 4 1
;
run;
data want;
set have (rename=(denom=d numer=n));
do i = 1 to d;
denom = 1;
numer= (i le n);
output;
end;
drop d n i;
run;
Does your Need example happen to have an accidental extra record? Denom for the 1/1/2008 date is 4 but you show 5 records in the need.
I am somewhat interested in what you will do with the resulting data set. Are there other variables not mentioned involved in your actual problem? Otherwise I don't see why there would be any need for "random" distribution.
Good observation ballardw, the need table should have only 4 observations. The distribution of Numer does not need to be "random". In the example given, the Numer for first two observations can be populated as 1 and the remaining 2 will have a value of 0.
Try this:
data have;
informat date mmddyy10. ;
format date mmddyy10.;
input Date Denom Numer;
datalines;
1/1/2008 4 2
4/1/2008 3 1
7/1/2008 2 2
10/1/2008 4 1
;
run;
data want;
set have (rename=(denom=d numer=n));
do i = 1 to d;
denom = 1;
numer= (i le n);
output;
end;
drop d n i;
run;
Thanks for the solution ballardw
Hey @ballardw I'm curious to know exactly how this works. I understand the rename and you are essentially recreating the denom and numer variables but what does (i le n) mean? Does it mean set numer to i if it's less than or equal to n? I don't get how numer comes out as ones and then subsequent 0's? for the first date, on the second iteration of the do loop you would have 2 as i. And that would be less than or equal to 2(n) so why doesn't it output 2?
The code like:
Var = (a < b);
works with any logical comparison. SAS will return 1 for true and 0 for false.
Since the OP had wanted values of 1 and needed the count of ones to be equal to the Numer variable the comparison worked. Which is why I had asked about the "random order" which would have been much more complicated code.
In a similar way SAS treats any not-missing and non-zero value as "true" as revealed by this code:
Data _null_;
do i = .,-5, 0.3, 0, 1, 12;
if i then put "the value " i "is treated as True";
else put "the value " i "is treated as False";
end;
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.