- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi All,
I am trying to convert the aggregate-level data into individual-level data. I would really appreciate the help.
I have the following data:
data temp;
input clinic $ Agegrp$ TotalPatient TotalFemale FemaleMedReceived TotalMale MaleMedReceived;
datalines;
ABC 1-5 104 56 2 48 0
ABC 6-10 232 111 16 121 13
ABC 11-15 192 108 4 84 3;
run;
Table I have:
Clinic | Age Group | Total Patients | Total Female | Female Medication received | Total Male | Male Medication received |
ABC | 1-5 | 104 | 56 | 2 | 48 | 0 |
6-10 | 232 | 111 | 16 | 121 | 13 | |
11-15 | 192 | 108 | 4 | 84 | 3 |
I want to create individual-level data from the above numbers. For example, the table I want should have each row with clinic and age group (1-5), with 56 females - 2 of those should receive Medication (Yes), and 54 should not receive medication (No). The same is true for males 1-5 years old (48 males—0, medication (Yes) and 48 medication (No)).
The table I want has a total of 528 rows:
Clinic | Age Group | Gender | Medication Received |
ABC | 1-5 | F | Yes |
ABC | 1-5 | F | Yes |
....... | .... | .... | .... |
ABC | 1-5 | M | No |
ABC | 1-5 | M | No |
...... | .... | .... | .... |
ABC | 5-10 | F | Yes |
Thanks,
@sandyzman1
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Well, I can't say I know why you'd do such a thing, but you could do it like this:
data temp;
input clinic $ Agegrp$ TotalPatient TotalFemale FemaleMedReceived TotalMale MaleMedReceived;
datalines;
ABC 1-5 104 56 2 48 0
ABC 6-10 232 111 16 121 13
ABC 11-15 192 108 4 84 3;
run;
data want;
set have;
MedReceived='Y';
Sex='F';
do i=1 to TotalFemale;
if i>FemaleMedReceived then MedReceived='N';
output;
end;
MedReceived='Y';
Sex='M';
do i=1 to TotalMale;
if i>MaleMedReceived then MedReceived='N';
output;
end;
keep clinic ageGrp Sex MedReceived;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
So do you want to generate one record per "person" base on the values of TotalFemale and TotalMale columns?
I would probably use two nested do-loops (one for each gender and with/without MedReceived). In each loop an explicit OUTPUT statment.
do f = 1 to TotalFemale;
do fr = 1 to FemaleMedReceived;
MedReceived = 'Yes';
Output;
end;
do fnr = 1 to TotalFemale - FemaleMedReceived;
MedReceived = 'No';
Output;
end;
end;
/* repeat same logic for male below */
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Well, I can't say I know why you'd do such a thing, but you could do it like this:
data temp;
input clinic $ Agegrp$ TotalPatient TotalFemale FemaleMedReceived TotalMale MaleMedReceived;
datalines;
ABC 1-5 104 56 2 48 0
ABC 6-10 232 111 16 121 13
ABC 11-15 192 108 4 84 3;
run;
data want;
set have;
MedReceived='Y';
Sex='F';
do i=1 to TotalFemale;
if i>FemaleMedReceived then MedReceived='N';
output;
end;
MedReceived='Y';
Sex='M';
do i=1 to TotalMale;
if i>MaleMedReceived then MedReceived='N';
output;
end;
keep clinic ageGrp Sex MedReceived;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Are there any other variables?
I ask because the data set you create has the possibility of implying precision for other variables, or your variables combined with the one shown, that would very likely be false.
I would be interested in what the resulting data set is to be used for. I may be that you think you need to dis-aggregate the values you have when use of other options may allow use of the existing data set.
For what little it may be worth you may want to consider using numeric 1 and 0 values instead of text "yes" and "no". The sum of a 1/0 coded variable is the number of '1' values, the mean is a decimal percentage and some other tricks involving "any" "all" or "none" of the 1 and 0 values can be done with the functions max, min, and range that involve moderately ugly programming with character 'yes' and 'no'.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content