BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
KachiM
Rhodochrosite | Level 12

 

What are 9 times , 8 times and so on?  In your data set given below

331 repeats 9 times. Do you mean this repetition? Just create some sample data set similar to the one you have given. Pretend to give some imaginary values to hide real values. It is then possible to check the program to give you the answer.

 

 

data source;
INFILE DATALINES ;
 input ID VAR & $19. CLASSIFICATION $ ;
 DATALINES;
201  PepsiCo  XYZ  
201  111 main street  XYZ  
201  Charity  XYZ  
331  CocaCola  XYZ  
331  1823 unicorn street  XYZ  
331  Charity  XYZ  
331  Nike  XYZ  
331  123 brock avenue  XYZ  
331  Charity  XYZ  
331  Adidas  XYZ  
331  14 Market street  XYZ  
331  Charity  XYZ  
354  McDonalds  XYZ  
354  111 Obesity road  XYZ  
354  Charity  XYZ  
;
RUN;

 

KachiM
Rhodochrosite | Level 12

You should have placed a SAMPLE Data Set that represent your real Data SET. I have given a solution based on your sample Data Set. 

 

Now you say there will be varying number of records - 2 records instead of 3. I hope you don't come with newer issues with your original Data Set. 

 

The solution is based on first creating triplets with adding "NONE" when "Charity" is missing.  Next the same program without using LAST.ID. We can simply check for either "NONE" or "Charity" to output a record at a time.

 

data source;
INFILE DATALINES ;
 input ID VAR & $19. CLASSIFICATION $ ;
 DATALINES;
201  PepsiCo  XYZ  
201  111 main street  XYZ  
201  Charity  XYZ  
331  CocaCola  XYZ  
331  1823 unicorn street  XYZ  
331  Charity  XYZ  
331  Nike  XYZ  
331  123 brock avenue  XYZ  
331  Charity  XYZ  
331  Adidas  XYZ  
331  14 Market street  XYZ  
331  Charity  XYZ  
354  McDonalds  XYZ  
354  111 Obesity road  XYZ  
354  Charity  XYZ  
354  KFC  XYZ  
354 111 Fat Avenue  XYZ  
441  Burger King  XYZ  
441  112 Skinny ROAD  XYZ
441  Adidas  XYZ
441  Socks Blvd  XYZ
441  Tiege Henley  XYZ
441  Skincare Road  XYZ
441  Charity  XYZ
;
RUN;
proc print data = source;
run;

data have;
   set source;
   if Var = "Charity" then i = 0;
   else i+1;
   if i = 3 and Var ^= "Charity" then do; 
      temp = Var; 
      Var = "NONE";  
      output;
      Var = temp;
      output;
      temp =' ';
      i = 1;
   end;
   else output;
drop temp i;
run;

proc print data = have;
run;

The Data Set, HAVE, will have triplets either with "NONE" or "Charity". 

 

The next the old program without LAST.ID checks.

 

data clean;
length ID 8 Name $10 Address $19 Classification $8 Var $19; 
   do i = 1 by 1 until(VAR in ("NONE","Charity"));
      set have;
      select(mod(i,3));
         when(1) Name=Var;
         when(2) Address = Var;
         when(0) Classification = Var;
         otherwise;
      end;
   end;
drop i Var;
run;
proc print data = clean;
run;

 

 

UniversitySas
Quartz | Level 8

Thanks for this - apologies about the mix-up. The data I am working with is actually huge, it wasn't until I implemented the first solution that I realised this issue persisted in the data.


I'll give this a try; and at the same time educate myself about loops! They look a fair bit trickier in SAS than other programs I've worked with haha.


Cheers.

srinath3111
Quartz | Level 8

Hi,

 

 

Data have;
length class $7;
Set unclean;
X=mod(_n_, 3);
If x=1 then class="name";
Else if x =2 then class = "address";
Else if x=0 then class = "type";

Retain record;
If x=1 then record+1;
Run;

Proc transpose data=have out=want(drop=CLASSIFICATION _name_ record);
By Id record;
Var var;
Id class;
Run;

VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

that was a good one, changing source data to have addental issue not see in the sample.

Please post data that represents the data for testing.

Is the next set of data going to be missing the row 2 address line or the row 1 name line and just have the row 3?

 

cleaning data is our job but have data that represents the data needing cleaning is a must.

this is a Q that management would ask because they would only tell me part of the story a little at a time.  Maybe that is because they just like to reconnect.

 

UniversitySas
Quartz | Level 8

Is the next set of data going to be missing the row 2 address line or the row 1 name line and just have the row 3?

I'm hoping this isn't the case! The data I'm working with isn't as predictable as I was expecting, unfortunately. Although, I think with Datasp's response I should be able to come up with some creative solution to handle any other unforeseen irregularities (fingers crossed).


Thanks again for all your help guys, I really appreciate it! I'll try out the latest solution.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 20 replies
  • 2171 views
  • 0 likes
  • 5 in conversation