Hi, I have a data with Produce as my BY Variable /unique variable
My goal is to give potential Amounts to Produce that does not have an amount.
I started off by creating another data set with Produce that have an amount and then later thought of merging with the main data, but my dilemma is I have 3 Chicken with 2 different Manufacturer and 2 brands giving 3 amounts, similarly for MILK.
Ideally I would like to show amounts of Chicken with the corresponding Manufacturer, Brand and Amount.
Produce | Manufacturer | Brand | Amount |
Beans | CLICxxx | Clicxxxx | $0 |
Kidneys | Pantry | Pantry Shelf | $0 |
Brussel Sprouts | Limson | Limson | $0 |
Cabbage | Bear | Bear Creek | $0 |
Chicken | Grannys | Grannys Poultry Cooperative | $0 |
Milk | Global | Global Direct | $0 |
Beans | Ital | Ital-Pasta | $4 |
Kidneys | Primo | Primo | $5 |
Brussel Sprouts | Alasko | Alasko | $6 |
Cabbage | Keybrand | Laurie's | $7 |
Chicken | Galco | Flamingo | $2 |
Chicken | Maple | $3 | |
Chicken | Maple | Kitchen Essential | $6 |
Milk | Agropur | Sealtest | $5 |
Milk | Saputo | Neilson | $6 |
Thank you for your response
Can you please explain what you mean with an example
Hi, I feel this subject may not be the appropriate title.sorry
I thought I would explain my concern once again. I have changed the data slightly. Produce is my unique variable/by variable
Produce | Manufacturer | Brand | Amount |
Beans | CLICxxx | Clicxxxx | $0 |
Kidneys | Pantry | Pantry Shelf | $0 |
Brussel Sprouts | Limson | Limson | $0 |
Cabbage | Bear | Bear Creek | $0 |
Chicken | Grannys | Grannys Poultry Cooperative | $0 |
Milk | Global | Global Direct | $0 |
Beans | Ital | Ital-Pasta | $4 |
Kidneys | Primo | Primo | $5 |
Brussel Sprouts | Alasko | Alasko | $6 |
Cabbage | Keybrand | Laurie's | $7 |
Chicken | Galco | Flamingo | $2 |
Chicken | Maple | Sofina | $3 |
Chicken | Maple | Kitchen Essential | $6 |
Milk | Agropur | Sealtest | $5 |
Milk | Saputo | Neilson | $6 |
What I would like is the table below:
Produce | Manufacturer | Brand | Amount | Manufacturer_1 | New_Brand_1 | Potential_Amount_1 | Manufacturer_2 | New_Brand_2 | Potential_Amount_2 | Manufacturer_3 | New_Brand_3 | Potential_Amount_3 |
Beans | CLICxxx | Clicxxxx | $0 | Ital | Ital-Pasta | $4 | ||||||
Kidneys | Pantry | Pantry Shelf | $0 | Primo | Primo | $5 | ||||||
Brussel Sprouts | Limson | Limson | $0 | Alasko | Alasko | $6 | ||||||
Cabbage | Bear | Bear Creek | $0 | Keybrand | Laurie's | $7 | ||||||
Chicken | Grannys | Grannys Poultry Cooperative | $0 | Galco | Flamingo | $2 | Maple | Sofina | $3 | Maple | Kitchen Essential | $6 |
Milk | Global | Global Direct | $0 | Agropur | Sealtest | $5 | Saputo | Neilson | $6 |
Basically for the same Produce variable that does not have an Amount, I would like to give Potential amounts with the corresponding Manufacturer and Brand and if there are multiple Manufactuers, Brands and Amounts for the Produce variable...in this case Chicken and Milk, I would like to see those potential Manufacturers, Brands, Amounts.
Thank you, is it possible for you to help me with a sample code. I think it will be easier for me to understand
Untested but this should get you started. You'll end up with a data set, step 3 that you can then decide how to impute the missing data.
Note that you can end up with duplicate matches so you'll need to decide how to filter/limit when you have multiple matches. You may want to add a key to your 'have' table to uniquely identify the rows.
DATA HAVE; INFILE CARDS DLM=',' TRUNCOVER; LENGTH PRODUCE MANUFACTURER BRAND $20; INPUT Produce $ Manufacturer $ Brand $ Amount; CARDS; Beans, CLICxxx, Clicxxxx, 0 Kidneys, Pantry, Pantry Shelf, 0 Brussel Sprouts, Limson, Limson, 0 Cabbage, Bear, Bear Creek, 0 Chicken, Grannys, Grannys Poultry Cooperative, 0 Milk, Global, Global Direct, 0 Beans, Ital, Ital-Pasta, 4 Kidneys, Primo, Primo, 5 Brussel Sprouts, Alasko, Alasko, 6 Cabbage, Keybrand, Laurie's, 7 Chicken, Galco, Flamingo, 2 Chicken, Maple, , 3 Chicken, Maple, Kitchen Essential, 6 Milk, Agropur, Sealtest, 5 Milk, Saputo, Neilson, 6 ;;;; *MERGE IN BY MANUFACTURER AND BRAND; proc sql; create table step1 as select t1.*, t2.amount as amount1 from have as t1 left join have as t2 on t1.produce = t2.produce and t1.manufacturer=t2.manufacturer and t1.brand=t2.brand; quit; *MERGE IN BY BRAND; proc sql; create table step2 as select t1.*, t2.amount as amount2 from step1 as t1 left join have as t2 on t1.produce = t2.produce and t1.brand=t2.brand; quit; *MERGE IN BY Manufacturer; proc sql; create table step3 as select t1.*, t2.amount as amount3 from step2 as t1 left join have as t2 on t1.produce = t2.produce and t1.manufacturer=t2.manufacturer; quit;
/*insert data step to fill in logic here*/
data want;
set step3;
if amount = 0 the coalesce(amount1, amount2, amount3);
run;
@Gladis6680 wrote:
Thank you, is it possible for you to help me with a sample code. I think it will be easier for me to understand
Thank you very much, I forgot to mention, that ..yes i am still figuring out which logic to use first(I may use the min AMOUNT, or max AMOUNT) for duplicates (like for Chicken,which Manufacture and which Brand I should use) .....but for the time being I would like the output to have all the the Chicken - Manufacture, Brand and Amount side by side like Manufacture_1 Brand _1 Manufacture_2 Brand_2.
Also I am so sorry is it also possible to give this to me in a DATA STEP as I not familiar with SQL.
Thank for your help
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.