What are 9 times , 8 times and so on? In your data set given below
331 repeats 9 times. Do you mean this repetition? Just create some sample data set similar to the one you have given. Pretend to give some imaginary values to hide real values. It is then possible to check the program to give you the answer.
data source;
INFILE DATALINES ;
input ID VAR & $19. CLASSIFICATION $ ;
DATALINES;
201 PepsiCo XYZ
201 111 main street XYZ
201 Charity XYZ
331 CocaCola XYZ
331 1823 unicorn street XYZ
331 Charity XYZ
331 Nike XYZ
331 123 brock avenue XYZ
331 Charity XYZ
331 Adidas XYZ
331 14 Market street XYZ
331 Charity XYZ
354 McDonalds XYZ
354 111 Obesity road XYZ
354 Charity XYZ
;
RUN;
You should have placed a SAMPLE Data Set that represent your real Data SET. I have given a solution based on your sample Data Set.
Now you say there will be varying number of records - 2 records instead of 3. I hope you don't come with newer issues with your original Data Set.
The solution is based on first creating triplets with adding "NONE" when "Charity" is missing. Next the same program without using LAST.ID. We can simply check for either "NONE" or "Charity" to output a record at a time.
data source;
INFILE DATALINES ;
input ID VAR & $19. CLASSIFICATION $ ;
DATALINES;
201 PepsiCo XYZ
201 111 main street XYZ
201 Charity XYZ
331 CocaCola XYZ
331 1823 unicorn street XYZ
331 Charity XYZ
331 Nike XYZ
331 123 brock avenue XYZ
331 Charity XYZ
331 Adidas XYZ
331 14 Market street XYZ
331 Charity XYZ
354 McDonalds XYZ
354 111 Obesity road XYZ
354 Charity XYZ
354 KFC XYZ
354 111 Fat Avenue XYZ
441 Burger King XYZ
441 112 Skinny ROAD XYZ
441 Adidas XYZ
441 Socks Blvd XYZ
441 Tiege Henley XYZ
441 Skincare Road XYZ
441 Charity XYZ
;
RUN;
proc print data = source;
run;
data have;
set source;
if Var = "Charity" then i = 0;
else i+1;
if i = 3 and Var ^= "Charity" then do;
temp = Var;
Var = "NONE";
output;
Var = temp;
output;
temp =' ';
i = 1;
end;
else output;
drop temp i;
run;
proc print data = have;
run;
The Data Set, HAVE, will have triplets either with "NONE" or "Charity".
The next the old program without LAST.ID checks.
data clean;
length ID 8 Name $10 Address $19 Classification $8 Var $19;
do i = 1 by 1 until(VAR in ("NONE","Charity"));
set have;
select(mod(i,3));
when(1) Name=Var;
when(2) Address = Var;
when(0) Classification = Var;
otherwise;
end;
end;
drop i Var;
run;
proc print data = clean;
run;
Thanks for this - apologies about the mix-up. The data I am working with is actually huge, it wasn't until I implemented the first solution that I realised this issue persisted in the data.
I'll give this a try; and at the same time educate myself about loops! They look a fair bit trickier in SAS than other programs I've worked with haha.
Cheers.
Hi,
Data have;
length class $7;
Set unclean;
X=mod(_n_, 3);
If x=1 then class="name";
Else if x =2 then class = "address";
Else if x=0 then class = "type";
Retain record;
If x=1 then record+1;
Run;
Proc transpose data=have out=want(drop=CLASSIFICATION _name_ record);
By Id record;
Var var;
Id class;
Run;
that was a good one, changing source data to have addental issue not see in the sample.
Please post data that represents the data for testing.
Is the next set of data going to be missing the row 2 address line or the row 1 name line and just have the row 3?
cleaning data is our job but have data that represents the data needing cleaning is a must.
this is a Q that management would ask because they would only tell me part of the story a little at a time. Maybe that is because they just like to reconnect.
Is the next set of data going to be missing the row 2 address line or the row 1 name line and just have the row 3?
I'm hoping this isn't the case! The data I'm working with isn't as predictable as I was expecting, unfortunately. Although, I think with Datasp's response I should be able to come up with some creative solution to handle any other unforeseen irregularities (fingers crossed).
Thanks again for all your help guys, I really appreciate it! I'll try out the latest solution.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.