data WORK.SURVEY;
infile datalines dsd truncover;
input NetStation:BEST12. Replicate:BEST12. HardPlastics:BEST12. SoftPlastics:BEST12. PlasticLines:BEST12. Styrofoam:BEST12. Pellets:BEST12. TotalPlastics:BEST12. DriftingWood:BEST12. Pumice:BEST12. Cs:BEST12.;
format NetStation BEST12. Replicate BEST12. HardPlastics BEST12. SoftPlastics BEST12. PlasticLines BEST12. Styrofoam BEST12. Pellets BEST12. TotalPlastics BEST12. DriftingWood BEST12. Pumice BEST12. Cs BEST12.;
datalines;
1 1 1 0 0 0 0 1 0 0 653.5089
1 2 1 0 0 0 0 1 0 0 691.9619
1 3 1 0 0 0 0 1 0 4 566.7916
2 1 1 0 0 0 0 1 1 0 950.4462
2 2 3 0 0 0 0 3 3 0 2198.9015
;;;;
The subject sounds a bit hazy so let me explain
I have a data set with 5 plastic categories eg hard plastic , soft plastic , pellet , plastic lines, styrofoam with counts of those categories under them. These are grouped under total plastics and a count. Then there is driftwood and count and Pumice and count
All these plastics have been collected at three different instances titled net tow under one net station.
My task is to create a category 1 , 2, 3 under material_captured with 1 - total plastics , 2 - driftwood and 3- pumice which i can then recode into another variable called substance . the recoding using case is simple .
What i need help with is getting total plastics , driftwood and pumice under "materials collected" with 1,2,3 assigned and a count displayed for each instance of replicate (net tow) at the net station .
Attached is a sample of the data set
Looking forward to hearing from the experts !
Thanks
Below implemented how I understand your requirement.
data WORK.SURVEY;
infile datalines dsd truncover dlm=' ';
input NetStation:BEST12. Replicate:BEST12. HardPlastics:BEST12. SoftPlastics:BEST12. PlasticLines:BEST12. Styrofoam:BEST12. Pellets:BEST12. TotalPlastics:BEST12. DriftingWood:BEST12. Pumice:BEST12. Cs:BEST12.;
format NetStation BEST12. Replicate BEST12. HardPlastics BEST12. SoftPlastics BEST12. PlasticLines BEST12. Styrofoam BEST12. Pellets BEST12. TotalPlastics BEST12. DriftingWood BEST12. Pumice BEST12. Cs BEST12.;
datalines;
1 1 1 0 0 0 0 1 0 0 653.5089
1 2 1 0 0 0 0 1 0 0 691.9619
1 3 1 0 0 0 0 1 0 4 566.7916
2 1 1 0 0 0 0 1 1 0 950.4462
2 2 3 0 0 0 0 3 3 0 2198.9015
;;;;
data want(keep=NetStation Replicate materials_collected count);
set SURVEY;
materials_collected='1';
count=sum(HardPlastics, SoftPlastics, PlasticLines, Styrofoam, Pellets);
output;
materials_collected='2';
count=DriftingWood;
output;
materials_collected='3';
count=Pumice;
output;
run;
It is very hard to code from pictures.
Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.
It appears to me that what you may want to do is use a data step to create a new variable with something like
TotalPlastics = sum(Hardplastics, softplastics, plasticlines, Styrofoam, pettets)
If that won't work you really need to provide example of your current SAS data set as "categories under them" doesn't make much sense without an actual example.
hey ballardw mate , i have edited my post
hope you can see the data now !
Kind Regards
So what is the output for that example data supposed to look like?
I want the respective counts of total plastics (which is the sum of hard , soft , pellet , line & styrofoam) , driftwood and pumice to come under one column ( materials collected ) mapped against each replicates ( 1,2 ,3 respectively) for their individual netstations.
please find attached snip from excel for how it should look like
my other question is how do i get replicates to repeat 3 times ( total 9 times ) each for each netstation ?
"My task is to create a category 1 , 2, 3 under material_captured with 1 - total plastics , 2 - driftwood and 3- pumice "
What value would this variable Material_Captured contain for row 3 in your sample data where you've got HardPlastics and Pumice?
please find attached what I want my output to look like
You need to explain a bit more how you get from the source data you've posted to the output.
Can you please use row 3 in your source, show how the output should look like AND explain the logic how you get from source to target state.
Hi Patrick
I want the respective counts of total plastics (which is the sum of hard , soft , pellet , line & styrofoam) , driftwood and pumice to come under one column ( materials collected ) mapped against each replicates ( 1,2 ,3 respectively) for their individual netstations.
hope that clarifies !
I have also attached an excel grab of what i want my output to look like in a previous post ! let me know if there is still some gap between getting from source to target state
Below implemented how I understand your requirement.
data WORK.SURVEY;
infile datalines dsd truncover dlm=' ';
input NetStation:BEST12. Replicate:BEST12. HardPlastics:BEST12. SoftPlastics:BEST12. PlasticLines:BEST12. Styrofoam:BEST12. Pellets:BEST12. TotalPlastics:BEST12. DriftingWood:BEST12. Pumice:BEST12. Cs:BEST12.;
format NetStation BEST12. Replicate BEST12. HardPlastics BEST12. SoftPlastics BEST12. PlasticLines BEST12. Styrofoam BEST12. Pellets BEST12. TotalPlastics BEST12. DriftingWood BEST12. Pumice BEST12. Cs BEST12.;
datalines;
1 1 1 0 0 0 0 1 0 0 653.5089
1 2 1 0 0 0 0 1 0 0 691.9619
1 3 1 0 0 0 0 1 0 4 566.7916
2 1 1 0 0 0 0 1 1 0 950.4462
2 2 3 0 0 0 0 3 3 0 2198.9015
;;;;
data want(keep=NetStation Replicate materials_collected count);
set SURVEY;
materials_collected='1';
count=sum(HardPlastics, SoftPlastics, PlasticLines, Styrofoam, Pellets);
output;
materials_collected='2';
count=DriftingWood;
output;
materials_collected='3';
count=Pumice;
output;
run;
Thank you Patrick ! that was amazing
I have one more question for you if you dont mind
I need to produce a graph that compares the freq of the new variable substance by nettype that shows the substance
wondering which sort of a graph would be most appropriate
i tried a 3 way proc freq table which didnt really look good.
Im thinking more in terms of a vbar for each nettype by substance that has count on the y axis
Please also find embedded the data set that im referring to
data WORK.A3;
infile datalines dsd truncover;
input Nettype:$23. Substance:$12. count:32.;
datalines;
Neuston/ 335/ 1.2 x 0.6 natural 0
Neuston/ 335/ 1.2 x 0.6 man-made 1
Neuston/ 335/ 1.2 x 0.6 mixed sample 0
Neuston/ 335/ 1.2 x 0.6 mixed sample 0
Neuston/ 335/ 1.2 x 0.6 natural 0
;;;;
Reporting is not really my home turf. I suggest you post this as a new questions - eventually under "ODS and Base Reporting".
Ideally each discussion addresses one question/topic. You can always post a link to a previous discussion if you feel questions are linked or previous discussions will help to clarify what you're after.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.