Hi there,
I have an input data set, which is structured something like the sample data set below. Would someone have a suggestion for the best way to identify the food item with the least amount of each nutrient type at each school-meal-day combination If the data set is food-level, has up to 400 food items at a school throughout the week, and almost 2000 schools ?
Some details:
food item - food-level file where each different food item anew for each school where it's offered and each meal within that school it's offfered. So if a food item, an apple, is offered at 100 schools, for both breakfast and lunch, it'll be listed 200 times in the fole, one row for each school and meal combo where it's offered.
meal- breakfast, lunch
day - weekday
nutrient- the amount of a nutrient in that food item
Fooditem School Meal Day Nutrient1 Nutrient2
Apple School1 B Mon 5 cals 10 cals
Apple School1 M Mon 5 cals 10 cals
Apple School1 B Tues 5 cals 10 cals
Apple School1 M Tues 5 cals 10 cals
Apple School1 B Wed 5 cals 10 cals
Apple School1 M Wed 5 cals 10 cals
Apple School1 B Thurs 5 cals 10 cals
Apple School1 M Thurs 5 cals 10 cals
Apple School2 B Mon 5 cals 10 cals
Apple School2 M Mon 5 cals 10 cals
Apple School2 B Tues 5 cals 10 cals
Apple School2 M Tues 5 cals 10 cals
Apple School2 B Wed 5 cals 10 cals
Apple School2 M Wed 5 cals 10 cals
Apple School2 B Thurs 5 cals 10 cals
Apple School2 M Thurs 5 cals 10 cals
Milk School1 B Mon 5 cals 10 cals
Milk School1 M Mon 5 cals 10 cals
Milk School1 B Tues 5 cals 10 cals
Milk School1 M Tues 5 cals 10 cals
Milk School1 B Wed 5 cals 10 cals
Milk School1 M Wed 5 cals 10 cals
Milk School1 B Thurs 5 cals 10 cals
Milk School1 M Thurs 5 cals 10 cals
Milk School2 B Mon 5 cals 10 cals
Milk School2 M Mon 5 cals 10 cals
Milk School2 B Tues 5 cals 10 cals
Milk School2 M Tues 5 cals 10 cals
Milk School2 B Wed 5 cals 10 cals
Milk School2 M Wed 5 cals 10 cals
Milk School2 B Thurs 5 cals 10 cals
Milk School2 M Thurs 5 cals 10 cals
Woulld there be another way aside from
- assigning a macro and then callin a proc sort process for each school/day/meal/nutrient/food combo?
- and aside from sorting the dataset by a list of BY vars then creating a concatenated var for each school/day/meal/food nutrient combo then identifying each unique first combo as the one with the least of each nutrient level?
Thank you in advance
I use SAS 9.4
Note, just seen your reply, still the below applies, just need to change the sum to min, then transpose back up. I am leaving for the day now, so will see if I can get code later.
At a guess, it would help if you first normalised your data, i.e. have:
Fooditem School Meal Day Nutrient_Number Cals
Apple School1 B Mon 1 5
Apple School1 B Mon 2 10
...
Then do you sums, example:
data have; length fooditem school meal day nutrient1 nutrient2 $50; input fooditem $ school $ meal $ day $ nutrient1 $ nutrient2 $; datalines; Apple School1 B Mon 5cals 10cals Apple School1 M Mon 5cals 10cals Apple School1 B Tues 5cals 10cals Apple School1 M Tues 5cals 10cals Apple School1 B Wed 5cals 10cals ; run; data inter (drop=nutrient1-nutrient2); set have; nutrient_number=1; nutrient=input(compress(nutrient1,,"kd"),best.); output; nutrient_number=2; nutrient=input(compress(nutrient2,,"kd"),best.); output; run; proc sql; create table STATS as select FOODITEM, SCHOOL, MEAL, DAY, NUTRIENT_NUMBER, sum(NUTRIENT) as RES from INTER group by FOODITEM, SCHOOL, MEAL, DAY, NUTRIENT_NUMBER; quit;
Could you provide an example of the output you want?
HI ballardw,
The output dataset would be school-level, one record per school ID and it would look something like this:
School_ID Lunch_Day1_Lowestfat_fruit Lunch_Day2_Lowestfat_fruit Lunch_Day3_Lowestfat_fruit ...
SCHID1 apple pear apple
SCHID2 apple pear pear
SCHID3 apple apple apple
where the other variables would be:
Lunch_Day1_Lowestfat_fruit
Lunch_Day2_Lowestfat_fruit
Lunch_Day3_Lowestfat_fruit
Lunch_Day4_Lowestfat_fruit
Lunch_Day5_Lowestfat_fruit
Lunch_Day1_Lowestfat_grain
Lunch_Day2_Lowestfat_grain
Lunch_Day3_Lowestfat_grain
Lunch_Day4_Lowestfat_grain
Lunch_Day5_Lowestfat_grain
Lunch_Day1_Lowestfat_meat
Lunch_Day2_Lowestfat_meat
Lunch_Day3_Lowestfat_meat
Lunch_Day4_Lowestfat_meat
Lunch_Day5_Lowestfat_meat
Let me know if I can clarify and thank you!
To get column headings such as LowestFat_fruit the input data set will require something that identifies each food as to its category: fruite, vegetable, meat, dairy or what ever categories you may be using AND identification as to which nutrient is being considered. Since your example data mentioned nutrient1 and nutrient2 - which is Fat? or other category?
If there are ties, since ALL of your example data had the exact same values does it matter which gets reported as the lowest?
Note, just seen your reply, still the below applies, just need to change the sum to min, then transpose back up. I am leaving for the day now, so will see if I can get code later.
At a guess, it would help if you first normalised your data, i.e. have:
Fooditem School Meal Day Nutrient_Number Cals
Apple School1 B Mon 1 5
Apple School1 B Mon 2 10
...
Then do you sums, example:
data have; length fooditem school meal day nutrient1 nutrient2 $50; input fooditem $ school $ meal $ day $ nutrient1 $ nutrient2 $; datalines; Apple School1 B Mon 5cals 10cals Apple School1 M Mon 5cals 10cals Apple School1 B Tues 5cals 10cals Apple School1 M Tues 5cals 10cals Apple School1 B Wed 5cals 10cals ; run; data inter (drop=nutrient1-nutrient2); set have; nutrient_number=1; nutrient=input(compress(nutrient1,,"kd"),best.); output; nutrient_number=2; nutrient=input(compress(nutrient2,,"kd"),best.); output; run; proc sql; create table STATS as select FOODITEM, SCHOOL, MEAL, DAY, NUTRIENT_NUMBER, sum(NUTRIENT) as RES from INTER group by FOODITEM, SCHOOL, MEAL, DAY, NUTRIENT_NUMBER; quit;
How you define " the least amount of each nutrient type" ?
I think you need transpose your data like this:
School Meal Day AppleNutrient1 AppleNutrient2 MilkNutrient1 MilkNutrient2
School1 B Mon 5 cals 10 cals 5 cals 10 cals
School1 M Mon 5 cals 10 cals 5 cals 10 cals
and then GRAYCODE() may do that trick .
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.