So, what I'm asking is not exactly a transpose (from my knowledge) and I'm finding it very tricky! Any help will be immensely appreciated.
I have some sample data:
DATA Unclean ;
 
INFILE DATALINES ;
 input ID VAR & $19. CLASSIFICATION $ ;
 DATALINES;
201  PepsiCo  XYZ  
201  111 main street  XYZ  
201  Charity  XYZ  
331  CocaCola  XYZ  
331  1823 unicorn street  XYZ  
331  Charity  XYZ  
331  Nike  XYZ  
331  123 brock avenue  XYZ  
331  Charity  XYZ  
331  Adidas  XYZ  
331  14 Market street  XYZ  
331  Charity  XYZ  
354  McDonalds  XYZ  
354  111 Obesity road  XYZ  
354  Charity  XYZ  
;
RUN;And my goal would be to get an output like this:
DATA CLEAN;
INFILE DATALINES ;
 input ID Name & $10. Address & $19. Classification $ ;
 DATALINES;
201  PepsiCo  111 main street	Charity 
331  CocaCola  1823 unicorn street  Charity   
331  Nike  123 brock avenue  Charity  
331  Adidas  14 Market street  Charity  
354  McDonalds  111 Obesity road  Charity  
;
RUN;Specifically, there are a few challenges I am having.
1) The IDs have uneven companies attached to them
2) The value "XYZ" needs to be changed to reflect things like Name, Address, and Classification.
      Regarding 2) My fix has been that whenever we see an observation "Charity", the next observation is always a "Name".
Is there another way, more flexible way to approach this?
3) What would be the best function or command to achieve this?
Any help is very much appreciated!
You should have placed a SAMPLE Data Set that represent your real Data SET. I have given a solution based on your sample Data Set.
Now you say there will be varying number of records - 2 records instead of 3. I hope you don't come with newer issues with your original Data Set.
The solution is based on first creating triplets with adding "NONE" when "Charity" is missing. Next the same program without using LAST.ID. We can simply check for either "NONE" or "Charity" to output a record at a time.
data source;
INFILE DATALINES ;
 input ID VAR & $19. CLASSIFICATION $ ;
 DATALINES;
201  PepsiCo  XYZ  
201  111 main street  XYZ  
201  Charity  XYZ  
331  CocaCola  XYZ  
331  1823 unicorn street  XYZ  
331  Charity  XYZ  
331  Nike  XYZ  
331  123 brock avenue  XYZ  
331  Charity  XYZ  
331  Adidas  XYZ  
331  14 Market street  XYZ  
331  Charity  XYZ  
354  McDonalds  XYZ  
354  111 Obesity road  XYZ  
354  Charity  XYZ  
354  KFC  XYZ  
354 111 Fat Avenue  XYZ  
441  Burger King  XYZ  
441  112 Skinny ROAD  XYZ
441  Adidas  XYZ
441  Socks Blvd  XYZ
441  Tiege Henley  XYZ
441  Skincare Road  XYZ
441  Charity  XYZ
;
RUN;
proc print data = source;
run;
data have;
   set source;
   if Var = "Charity" then i = 0;
   else i+1;
   if i = 3 and Var ^= "Charity" then do; 
      temp = Var; 
      Var = "NONE";  
      output;
      Var = temp;
      output;
      temp =' ';
      i = 1;
   end;
   else output;
drop temp i;
run;
proc print data = have;
run;
The Data Set, HAVE, will have triplets either with "NONE" or "Charity".
The next the old program without LAST.ID checks.
data clean;
length ID 8 Name $10 Address $19 Classification $8 Var $19; 
   do i = 1 by 1 until(VAR in ("NONE","Charity"));
      set have;
      select(mod(i,3));
         when(1) Name=Var;
         when(2) Address = Var;
         when(0) Classification = Var;
         otherwise;
      end;
   end;
drop i Var;
run;
proc print data = clean;
run;
this link should assist you with understanding how to transpose data:
https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-long-to-wide-using-proc-transpose/
It looks like the x = row 1, y = row 2, z = row 3
output your record.
start you count over for x,y.z;
basically looks like 3 records per final record.
yes you may need to hold record 1 and record 2 and output after obtaining record 3
proc transpose data=have out=want(drop=_name_);
   by id;
   var var;
run;you will need to pass the var-names to the columns
For now the adding the correct var names are not too important, so I can address that later.
But using a PROC TRANSPOSE gives me something like this:
It fails to re-iterate values for ID 331, instead just lumping them into the same row.
you need to clean your data as @Reeza has noted above first. Then use the proc transpose.
with cleaned up data based on @Reeza datastep
data source;
INFILE DATALINES ;
 input ID VAR & $19. CLASSIFICATION $ ;
 DATALINES;
201  PepsiCo  XYZ  
201  111 main street  XYZ  
201  Charity  XYZ  
331  CocaCola  XYZ  
331  1823 unicorn street  XYZ  
331  Charity  XYZ  
331  Nike  XYZ  
331  123 brock avenue  XYZ  
331  Charity  XYZ  
331  Adidas  XYZ  
331  14 Market street  XYZ  
331  Charity  XYZ  
354  McDonalds  XYZ  
354  111 Obesity road  XYZ  
354  Charity  XYZ  
;
RUN;
Data have;
length class $7;
Set source;
X=mod(_n_, 3);
If x=1 then class="name";
Else if x =2 then class = "address";
Else if x=0 then class = "type";
Retain record;
If x=1 then record+1;
Run;
Proc transpose data=have(drop=record _name_) out=want;
By Id record;
Var var;
Id class;
Run;
I suppose you don't insist on PROC TRANSPOSE. In one Data Step you will get the required output. Assumed that the Data Set is sorted by ID.
The sequential records provide one of the three Variables where LASTID can be used(except for ID = 331). For ID = 331, the alternate check with "Charity" breaks them into several records. The use of MOD(i,3) picks the relevant Value for Variables.
data clean;
length ID 8 Name $10 Address $19 Classification $8 Var $19; 
   do i = 1 by 1 until(last.id | Var = 'Charity');
      set unclean;
      by id notsorted;
  
      select(mod(i,3));
         when(1) Name=Var;
         when(2) Address = Var;
         when(0) Classification = Var;
         otherwise;
      end;
   end;
drop i Var;
run;
					
				
			
			
				
			
			
			
			
			
			
			
		Hey datasp,
So one thing I just realised was that the code is not quite flexible enough; I was wondering if you had a fix.
Some of my data is organised in a way that the first 1/4 might have the recurring pattern for 9 rows, the 2nd quarter for 8 rows, the third quarter for 9 rows again and the final quarter for 7 rows.
I understand it is a little vague since I don't have a sample of the dataset here (which I can provide if it's needed), but do you have an alternate strategy for the case above?
Would it my best bet be to break the data into separate pieces and then merge it again later, or is there another way?
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.