What are 9 times , 8 times and so on? In your data set given below
331 repeats 9 times. Do you mean this repetition? Just create some sample data set similar to the one you have given. Pretend to give some imaginary values to hide real values. It is then possible to check the program to give you the answer.
data source;
INFILE DATALINES ;
 input ID VAR & $19. CLASSIFICATION $ ;
 DATALINES;
201  PepsiCo  XYZ  
201  111 main street  XYZ  
201  Charity  XYZ  
331  CocaCola  XYZ  
331  1823 unicorn street  XYZ  
331  Charity  XYZ  
331  Nike  XYZ  
331  123 brock avenue  XYZ  
331  Charity  XYZ  
331  Adidas  XYZ  
331  14 Market street  XYZ  
331  Charity  XYZ  
354  McDonalds  XYZ  
354  111 Obesity road  XYZ  
354  Charity  XYZ  
;
RUN;
You should have placed a SAMPLE Data Set that represent your real Data SET. I have given a solution based on your sample Data Set.
Now you say there will be varying number of records - 2 records instead of 3. I hope you don't come with newer issues with your original Data Set.
The solution is based on first creating triplets with adding "NONE" when "Charity" is missing. Next the same program without using LAST.ID. We can simply check for either "NONE" or "Charity" to output a record at a time.
data source;
INFILE DATALINES ;
 input ID VAR & $19. CLASSIFICATION $ ;
 DATALINES;
201  PepsiCo  XYZ  
201  111 main street  XYZ  
201  Charity  XYZ  
331  CocaCola  XYZ  
331  1823 unicorn street  XYZ  
331  Charity  XYZ  
331  Nike  XYZ  
331  123 brock avenue  XYZ  
331  Charity  XYZ  
331  Adidas  XYZ  
331  14 Market street  XYZ  
331  Charity  XYZ  
354  McDonalds  XYZ  
354  111 Obesity road  XYZ  
354  Charity  XYZ  
354  KFC  XYZ  
354 111 Fat Avenue  XYZ  
441  Burger King  XYZ  
441  112 Skinny ROAD  XYZ
441  Adidas  XYZ
441  Socks Blvd  XYZ
441  Tiege Henley  XYZ
441  Skincare Road  XYZ
441  Charity  XYZ
;
RUN;
proc print data = source;
run;
data have;
   set source;
   if Var = "Charity" then i = 0;
   else i+1;
   if i = 3 and Var ^= "Charity" then do; 
      temp = Var; 
      Var = "NONE";  
      output;
      Var = temp;
      output;
      temp =' ';
      i = 1;
   end;
   else output;
drop temp i;
run;
proc print data = have;
run;
The Data Set, HAVE, will have triplets either with "NONE" or "Charity".
The next the old program without LAST.ID checks.
data clean;
length ID 8 Name $10 Address $19 Classification $8 Var $19; 
   do i = 1 by 1 until(VAR in ("NONE","Charity"));
      set have;
      select(mod(i,3));
         when(1) Name=Var;
         when(2) Address = Var;
         when(0) Classification = Var;
         otherwise;
      end;
   end;
drop i Var;
run;
proc print data = clean;
run;
Thanks for this - apologies about the mix-up. The data I am working with is actually huge, it wasn't until I implemented the first solution that I realised this issue persisted in the data.
I'll give this a try; and at the same time educate myself about loops! They look a fair bit trickier in SAS than other programs I've worked with haha.
Cheers.
Hi,
Data have;
length class $7;
Set unclean;
X=mod(_n_, 3);
If x=1 then class="name";
Else if x =2 then class = "address";
Else if x=0 then class = "type";
Retain record;
If x=1 then record+1;
Run;
Proc transpose data=have out=want(drop=CLASSIFICATION _name_ record);
By Id record;
Var var;
Id class;
Run;
that was a good one, changing source data to have addental issue not see in the sample.
Please post data that represents the data for testing.
Is the next set of data going to be missing the row 2 address line or the row 1 name line and just have the row 3?
cleaning data is our job but have data that represents the data needing cleaning is a must.
this is a Q that management would ask because they would only tell me part of the story a little at a time. Maybe that is because they just like to reconnect.
Is the next set of data going to be missing the row 2 address line or the row 1 name line and just have the row 3?
I'm hoping this isn't the case! The data I'm working with isn't as predictable as I was expecting, unfortunately. Although, I think with Datasp's response I should be able to come up with some creative solution to handle any other unforeseen irregularities (fingers crossed).
Thanks again for all your help guys, I really appreciate it! I'll try out the latest solution.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.