Hello all,
I am an occasional dabbler of SAS and I am currently stuck on trying to find a solution to my problem.
I have a rather large excel sheet with many variables which I managed to import into SAS. For convenience sake, I am going to say that I have a dataset with 640 observations and 2 variables. The two variables are plate and well. I want to create a new dataset where I am combining every 4 plates into one plate and I want to add 2 new variables row and column.
Part of my Dataset :
Plate | Well |
1 | A02 |
2 | A02 |
3 | A02 |
4 | A02 |
1 | A03 |
2 | A03 |
3 | A03 |
4 | A03 |
portion of Dataset i need:
Plate | Well | Row | Column | new plate no |
1 | A02 | A | 3 | 1 |
2 | A02 | A | 4 | 1 |
3 | A02 | B | 3 | 1 |
4 | A02 | B | 4 | 1 |
1 | A03 | A | 5 | 1 |
2 | A03 | A | 6 | 1 |
3 | A03 | B | 5 | 1 |
4 | A03 | B | 6 | 1 |
Every plate in the original dataset has 80 observations and I want to combine 4x80 into 1 plate with 320 values, combine the next 320 values into a second plate and so on.
My original plan was to create a second dataset with the additional 3 observations and do a one to one merge, save it, delete those observations from original dataset and repeat but I have too many observations to do it this way and my method is also very error prone. I am sure that there is a better way of doing this, I am just not able to figure it out.
I have attached an excel sheet which has 80 observations to better understand the pattern that I am trying to create.Would someone be able to help me out with this? Please let me know if I need to provide more information/make my question clearer. Thank you so much.
Can you explain the logic for the Column calculation? Not quite following that.
data want;
set have;
retain new_plate_number 0;
counter=_n_;
if plate in (1,2) then row='A';
else if plate in (3,4) then row='B';
if mod(_n_, 320)=1 then new_plate_number + 1;
run;
I'm sorry I wasn't very clear.
What I am trying to do with the column calculation is trying to arrange the first value of the 4 plates into a square. Hence A01 of plate 1 becomes A01 of newplate1, A01 of plate 2 becomes A02 of newplate1, A01 of plate3 becomes B01 of newplate 1, and A01 of plate4 becomes B02 of newplate1.
The new plate should look something like this. Every row is labeled A-P and every column on the new plate is labeled 1-24. Please let me know if this is helpful. Thanks!
1A1 | 2A1 | 1A2 | 2A2 | 1A3 | 2A3 | 1A4 | 2A4 | 1A5 | 2A5 | 1A6 | 2A6 | 1A7 | 2A7 | 1A8 | 2A8 | 1A9 | 2A9 | 1A10 | 2A10 | 1A11 | 2A11 | 1A12 | 2A12 |
3A1 | 4A1 | 3A2 | 4A2 | 3A3 | 4A3 | 3A4 | 4A4 | 3A5 | 4A5 | 3A6 | 4A6 | 3A7 | 4A7 | 3A8 | 4A8 | 3A9 | 4A9 | 3A10 | 4A10 | 3A11 | 4A11 | 3A12 | 4A12 |
1B1 | 2B1 | 1B2 | 2B2 | 1B3 | 2B3 | 1B4 | 2B4 | 1B5 | 2B5 | 1B6 | 2B6 | 1B7 | 2B7 | 1B8 | 2B8 | 1B9 | 2B9 | 1B10 | 2B10 | 1B11 | 2B11 | 1B12 | 2B12 |
3B1 | 4B1 | 3B2 | 4B2 | 3B3 | 4B3 | 3B4 | 4B4 | 3B5 | 4B5 | 3B6 | 4B6 | 3B7 | 4B7 | 3B8 | 4B8 | 3B9 | 4B9 | 3B10 | 4B10 | 3B11 | 4B11 | 3B12 | 4B12 |
1C1 | 2C1 | 1C2 | 2C2 | 1C3 | 2C3 | 1C4 | 2C4 | 1C5 | 2C5 | 1C6 | 2C6 | 1C7 | 2C7 | 1C8 | 2C8 | 1C9 | 2C9 | 1C10 | 2C10 | 1C11 | 2C11 | 1C12 | 2C12 |
3C1 | 4C1 | 3C2 | 4C2 | 3C3 | 4C3 | 3C4 | 4C4 | 3C5 | 4C5 | 3C6 | 4C6 | 3C7 | 4C7 | 3C8 | 4C8 | 3C9 | 4C9 | 3C10 | 4C10 | 3C11 | 4C11 | 3C12 | 4C12 |
1D1 | 2D1 | 1D2 | 2D2 | 1D3 | 2D3 | 1D4 | 2D4 | 1D5 | 2D5 | 1D6 | 2D6 | 1D7 | 2D7 | 1D8 | 2D8 | 1D9 | 2D9 | 1D10 | 2D10 | 1D11 | 2D11 | 1D12 | 2D12 |
3D1 | 4D1 | 3D2 | 4D2 | 3D3 | 4D3 | 3D4 | 4D4 | 3D5 | 4D5 | 3D6 | 4D6 | 3D7 | 4D7 | 3D8 | 4D8 | 3D9 | 4D9 | 3D10 | 4D10 | 3D11 | 4D11 | 3D12 | 4D12 |
1E1 | 2E1 | 1E2 | 2E2 | 1E3 | 2E3 | 1E4 | 2E4 | 1E5 | 2E5 | 1E6 | 2E6 | 1E7 | 2E7 | 1E8 | 2E8 | 1E9 | 2E9 | 1E10 | 2E10 | 1E11 | 2E11 | 1E12 | 2E12 |
3E1 | 4E1 | 3E2 | 4E2 | 3E3 | 4E3 | 3E4 | 4E4 | 3E5 | 4E5 | 3E6 | 4E6 | 3E7 | 4E7 | 3E8 | 4E8 | 3E9 | 4E9 | 3E10 | 4E10 | 3E11 | 4E11 | 3E12 | 4E12 |
1F1 | 2F1 | 1F2 | 2F2 | 1F3 | 2F3 | 1F4 | 2F4 | 1F5 | 2F5 | 1F6 | 2F6 | 1F7 | 2F7 | 1F8 | 2F8 | 1F9 | 2F9 | 1F10 | 2F10 | 1F11 | 2F11 | 1F12 | 2F12 |
3F1 | 4F1 | 3F2 | 4F2 | 3F3 | 4F3 | 3F4 | 4F4 | 3F5 | 4F5 | 3F6 | 4F6 | 3F7 | 4F7 | 3F8 | 4F8 | 3F9 | 4F9 | 3F10 | 4F10 | 3F11 | 4F11 | 3F12 | 4F12 |
1G1 | 2G1 | 1G2 | 2G2 | 1G3 | 2G3 | 1G4 | 2G4 | 1G5 | 2G5 | 1G6 | 2G6 | 1G7 | 2G7 | 1G8 | 2G8 | 1G9 | 2G9 | 1G10 | 2G10 | 1G11 | 2G11 | 1G12 | 2G12 |
3G1 | 4G1 | 3G2 | 4G2 | 3G3 | 4G3 | 3G4 | 4G4 | 3G5 | 4G5 | 3G6 | 4G6 | 3G7 | 4G7 | 3G8 | 4G8 | 3G9 | 4G9 | 3G10 | 4G10 | 3G11 | 4G11 | 3G12 | 4G12 |
1H1 | 2H1 | 1H2 | 2H2 | 1H3 | 2H3 | 1H4 | 2H4 | 1H5 | 2H5 | 1H6 | 2H6 | 1H7 | 2H7 | 1H8 | 2H8 | 1H9 | 2H9 | 1H10 | 2H10 | 1H11 | 2H11 | 1H12 | 2H12 |
3H1 | 4H1 | 3H2 | 4H2 | 3H3 | 4H3 | 3H4 | 4H4 | 3H5 | 4H5 | 3H6 | 4H6 | 3H7 | 4H7 | 3H8 | 4H8 | 3H9 | 4H9 | 3H10 | 4H10 | 3H11 | 4H11 | 3H12 | 4H12 |
Just looking at what you posted (it's dangerous to open posted spreadsheets), here's an approach:
data want;
set have;
if mod(_n_, 80) = 1 then do;
new_plate + 1;
well_count = 2;
end;
else if mod(_n_, 4) = 1 then well_count + 2;
if mod(_n_, 2) in (1, 3) then column = well_count + 1;
else column = well_count + 2;
retain row 'A';
back1 = lag(row);
back2 = lag2(row);
if back1 = back2 = 'A' then row='B';
else if back1=back2='B' then row='A';
drop well_count;
run;
It's untested code, and done without seeing the spreadsheet, so may need a little tweaking. It assumes there are always four observations for a well.
I haven't had a chance to test this code yet, I will try it and let you know if it works. But thank you so much for your reply.
Assuming col is just alternating numbers offset by 2 for some reason, this works on your sample data at least:
*create fake data;
data have;
input Plate Well $3.;
cards;
1 A02
2 A02
3 A02
4 A02
1 A03
2 A03
3 A03
4 A03
;
run;
*ensure sort is correct;
proc sort data=have;
by well plate;
run;
data want;
set have;
by well;
*set initial variables and keep them until explicitly changed;
retain new_plate_number 0 start_col 0;
*plate calculation;
if plate in (1,2) then row='A';
else if plate in (3,4) then row='B';
*plate number;
if mod(_n_, 320)=1 then new_plate_number + 1;
*column calculation;
if first.well then start_col+2;
if plate in (1,3) then col=start_col+1;
else if plate in (2,4) then col=start_col+2;
drop start_col;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.