Hi, I have a data with Produce as my BY Variable /unique variable
My goal is to give potential Amounts to Produce that does not have an amount.
I started off by creating another data set with Produce that have an amount and then later thought of merging with the main data, but my dilemma is I have 3 Chicken with 2 different Manufacturer and 2 brands giving 3 amounts, similarly for MILK.
Ideally I would like to show amounts of Chicken with the corresponding Manufacturer, Brand and Amount.
Produce | Manufacturer | Brand | Amount |
Beans | CLICxxx | Clicxxxx | $0 |
Kidneys | Pantry | Pantry Shelf | $0 |
Brussel Sprouts | Limson | Limson | $0 |
Cabbage | Bear | Bear Creek | $0 |
Chicken | Grannys | Grannys Poultry Cooperative | $0 |
Milk | Global | Global Direct | $0 |
Beans | Ital | Ital-Pasta | $4 |
Kidneys | Primo | Primo | $5 |
Brussel Sprouts | Alasko | Alasko | $6 |
Cabbage | Keybrand | Laurie's | $7 |
Chicken | Galco | Flamingo | $2 |
Chicken | Maple | $3 | |
Chicken | Maple | Kitchen Essential | $6 |
Milk | Agropur | Sealtest | $5 |
Milk | Saputo | Neilson | $6 |
Thank you for your response
Can you please explain what you mean with an example
Hi, I feel this subject may not be the appropriate title.sorry
I thought I would explain my concern once again. I have changed the data slightly. Produce is my unique variable/by variable
Produce | Manufacturer | Brand | Amount |
Beans | CLICxxx | Clicxxxx | $0 |
Kidneys | Pantry | Pantry Shelf | $0 |
Brussel Sprouts | Limson | Limson | $0 |
Cabbage | Bear | Bear Creek | $0 |
Chicken | Grannys | Grannys Poultry Cooperative | $0 |
Milk | Global | Global Direct | $0 |
Beans | Ital | Ital-Pasta | $4 |
Kidneys | Primo | Primo | $5 |
Brussel Sprouts | Alasko | Alasko | $6 |
Cabbage | Keybrand | Laurie's | $7 |
Chicken | Galco | Flamingo | $2 |
Chicken | Maple | Sofina | $3 |
Chicken | Maple | Kitchen Essential | $6 |
Milk | Agropur | Sealtest | $5 |
Milk | Saputo | Neilson | $6 |
What I would like is the table below:
Produce | Manufacturer | Brand | Amount | Manufacturer_1 | New_Brand_1 | Potential_Amount_1 | Manufacturer_2 | New_Brand_2 | Potential_Amount_2 | Manufacturer_3 | New_Brand_3 | Potential_Amount_3 |
Beans | CLICxxx | Clicxxxx | $0 | Ital | Ital-Pasta | $4 | ||||||
Kidneys | Pantry | Pantry Shelf | $0 | Primo | Primo | $5 | ||||||
Brussel Sprouts | Limson | Limson | $0 | Alasko | Alasko | $6 | ||||||
Cabbage | Bear | Bear Creek | $0 | Keybrand | Laurie's | $7 | ||||||
Chicken | Grannys | Grannys Poultry Cooperative | $0 | Galco | Flamingo | $2 | Maple | Sofina | $3 | Maple | Kitchen Essential | $6 |
Milk | Global | Global Direct | $0 | Agropur | Sealtest | $5 | Saputo | Neilson | $6 |
Basically for the same Produce variable that does not have an Amount, I would like to give Potential amounts with the corresponding Manufacturer and Brand and if there are multiple Manufactuers, Brands and Amounts for the Produce variable...in this case Chicken and Milk, I would like to see those potential Manufacturers, Brands, Amounts.
Thank you, is it possible for you to help me with a sample code. I think it will be easier for me to understand
Untested but this should get you started. You'll end up with a data set, step 3 that you can then decide how to impute the missing data.
Note that you can end up with duplicate matches so you'll need to decide how to filter/limit when you have multiple matches. You may want to add a key to your 'have' table to uniquely identify the rows.
DATA HAVE; INFILE CARDS DLM=',' TRUNCOVER; LENGTH PRODUCE MANUFACTURER BRAND $20; INPUT Produce $ Manufacturer $ Brand $ Amount; CARDS; Beans, CLICxxx, Clicxxxx, 0 Kidneys, Pantry, Pantry Shelf, 0 Brussel Sprouts, Limson, Limson, 0 Cabbage, Bear, Bear Creek, 0 Chicken, Grannys, Grannys Poultry Cooperative, 0 Milk, Global, Global Direct, 0 Beans, Ital, Ital-Pasta, 4 Kidneys, Primo, Primo, 5 Brussel Sprouts, Alasko, Alasko, 6 Cabbage, Keybrand, Laurie's, 7 Chicken, Galco, Flamingo, 2 Chicken, Maple, , 3 Chicken, Maple, Kitchen Essential, 6 Milk, Agropur, Sealtest, 5 Milk, Saputo, Neilson, 6 ;;;; *MERGE IN BY MANUFACTURER AND BRAND; proc sql; create table step1 as select t1.*, t2.amount as amount1 from have as t1 left join have as t2 on t1.produce = t2.produce and t1.manufacturer=t2.manufacturer and t1.brand=t2.brand; quit; *MERGE IN BY BRAND; proc sql; create table step2 as select t1.*, t2.amount as amount2 from step1 as t1 left join have as t2 on t1.produce = t2.produce and t1.brand=t2.brand; quit; *MERGE IN BY Manufacturer; proc sql; create table step3 as select t1.*, t2.amount as amount3 from step2 as t1 left join have as t2 on t1.produce = t2.produce and t1.manufacturer=t2.manufacturer; quit;
/*insert data step to fill in logic here*/
data want;
set step3;
if amount = 0 the coalesce(amount1, amount2, amount3);
run;
@Gladis6680 wrote:
Thank you, is it possible for you to help me with a sample code. I think it will be easier for me to understand
Thank you very much, I forgot to mention, that ..yes i am still figuring out which logic to use first(I may use the min AMOUNT, or max AMOUNT) for duplicates (like for Chicken,which Manufacture and which Brand I should use) .....but for the time being I would like the output to have all the the Chicken - Manufacture, Brand and Amount side by side like Manufacture_1 Brand _1 Manufacture_2 Brand_2.
Also I am so sorry is it also possible to give this to me in a DATA STEP as I not familiar with SQL.
Thank for your help
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.