SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
sandyzman1
Obsidian | Level 7

Hi All,

 

I am trying to convert the aggregate-level data into individual-level data. I would really appreciate the help.

 

I have the following data:

 

data temp;

input clinic $ Agegrp$ TotalPatient TotalFemale FemaleMedReceived TotalMale MaleMedReceived;

datalines;

ABC 1-5 104 56 2 48 0

ABC 6-10 232 111 16 121 13

ABC 11-15 192 108 4 84 3;

run;

 

Table I have:

Clinic Age Group Total Patients Total Female Female Medication received Total Male Male Medication received
ABC 1-5 104 56 2 48 0
  6-10 232 111 16 121 13
  11-15 192 108 4 84 3

 

I want to create individual-level data from the above numbers. For example, the table I want should have each row with clinic and age group (1-5), with 56 females - 2 of those should receive Medication (Yes), and 54 should not receive medication (No). The same is true for males 1-5 years old (48 males—0, medication (Yes) and 48 medication (No)). 

 

The table I want has a total of 528 rows:

Clinic Age Group Gender Medication Received
ABC 1-5 Yes
ABC 1-5 F Yes
....... .... .... ....
ABC 1-5 M No
ABC 1-5 M No
...... .... .... ....
ABC 5-10 F Yes

 

 

Thanks,
@sandyzman1

1 ACCEPTED SOLUTION

Accepted Solutions
quickbluefish
Pyrite | Level 9

Well, I can't say I know why you'd do such a thing, but you could do it like this:

 

data temp;
input clinic $ Agegrp$ TotalPatient TotalFemale FemaleMedReceived TotalMale MaleMedReceived;
datalines;
ABC 1-5 104 56 2 48 0
ABC 6-10 232 111 16 121 13
ABC 11-15 192 108 4 84 3;
run;

data want;
set have;
MedReceived='Y';
Sex='F';
do i=1 to TotalFemale;
    if i>FemaleMedReceived then MedReceived='N';
    output;
end;
MedReceived='Y';
Sex='M';
do i=1 to TotalMale;
    if i>MaleMedReceived then MedReceived='N';
    output;
end;
keep clinic ageGrp Sex MedReceived;
run;

View solution in original post

5 REPLIES 5
LinusH
Tourmaline | Level 20

So do you want to generate one record per "person" base on the values of TotalFemale and TotalMale columns?

I would probably use two nested do-loops (one for each gender and with/without MedReceived). In each loop an explicit OUTPUT statment.

do f = 1 to TotalFemale;
   do fr = 1 to FemaleMedReceived;
      MedReceived = 'Yes';
      Output;
   end;
   do fnr = 1 to TotalFemale - FemaleMedReceived;
      MedReceived = 'No';
      Output;
   end;
end;
/* repeat same logic for male below */
Data never sleeps
sandyzman1
Obsidian | Level 7
Thanks !
quickbluefish
Pyrite | Level 9

Well, I can't say I know why you'd do such a thing, but you could do it like this:

 

data temp;
input clinic $ Agegrp$ TotalPatient TotalFemale FemaleMedReceived TotalMale MaleMedReceived;
datalines;
ABC 1-5 104 56 2 48 0
ABC 6-10 232 111 16 121 13
ABC 11-15 192 108 4 84 3;
run;

data want;
set have;
MedReceived='Y';
Sex='F';
do i=1 to TotalFemale;
    if i>FemaleMedReceived then MedReceived='N';
    output;
end;
MedReceived='Y';
Sex='M';
do i=1 to TotalMale;
    if i>MaleMedReceived then MedReceived='N';
    output;
end;
keep clinic ageGrp Sex MedReceived;
run;
ballardw
Super User

Are there any other variables?

I ask because the data set you create has the possibility of implying precision for other variables, or your variables combined with the one shown, that would very likely be false.

 

I would be interested in what the resulting data set is to be used for. I may be that you think you need to dis-aggregate the values you have when use of other options may allow use of the existing data set.

 

For what little it may be worth you may want to consider using numeric 1  and 0 values instead of text "yes" and "no". The sum of a 1/0 coded variable is the number of '1' values, the mean is a decimal percentage and some other tricks involving "any" "all" or "none" of the 1 and 0 values can be done with the functions max, min, and range that involve moderately ugly programming with character 'yes' and 'no'.

sandyzman1
Obsidian | Level 7
@ballardw Thanks for the reply. I get what you are trying to point out. I do have other variables. This is just a snippet of the dataset. However, I created a separate dataset using the above-suggested code and merged it by Patient ID. It seems to work fine. Also, I think considering using numeric values is a great choice. I did use it as a numeric instead of text. I planned to use the variables for the mixed model logistic regression. Thanks.

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 333 views
  • 1 like
  • 4 in conversation