BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
KachiM
Rhodochrosite | Level 12

 

What are 9 times , 8 times and so on?  In your data set given below

331 repeats 9 times. Do you mean this repetition? Just create some sample data set similar to the one you have given. Pretend to give some imaginary values to hide real values. It is then possible to check the program to give you the answer.

 

 

data source;
INFILE DATALINES ;
 input ID VAR & $19. CLASSIFICATION $ ;
 DATALINES;
201  PepsiCo  XYZ  
201  111 main street  XYZ  
201  Charity  XYZ  
331  CocaCola  XYZ  
331  1823 unicorn street  XYZ  
331  Charity  XYZ  
331  Nike  XYZ  
331  123 brock avenue  XYZ  
331  Charity  XYZ  
331  Adidas  XYZ  
331  14 Market street  XYZ  
331  Charity  XYZ  
354  McDonalds  XYZ  
354  111 Obesity road  XYZ  
354  Charity  XYZ  
;
RUN;

 

KachiM
Rhodochrosite | Level 12

You should have placed a SAMPLE Data Set that represent your real Data SET. I have given a solution based on your sample Data Set. 

 

Now you say there will be varying number of records - 2 records instead of 3. I hope you don't come with newer issues with your original Data Set. 

 

The solution is based on first creating triplets with adding "NONE" when "Charity" is missing.  Next the same program without using LAST.ID. We can simply check for either "NONE" or "Charity" to output a record at a time.

 

data source;
INFILE DATALINES ;
 input ID VAR & $19. CLASSIFICATION $ ;
 DATALINES;
201  PepsiCo  XYZ  
201  111 main street  XYZ  
201  Charity  XYZ  
331  CocaCola  XYZ  
331  1823 unicorn street  XYZ  
331  Charity  XYZ  
331  Nike  XYZ  
331  123 brock avenue  XYZ  
331  Charity  XYZ  
331  Adidas  XYZ  
331  14 Market street  XYZ  
331  Charity  XYZ  
354  McDonalds  XYZ  
354  111 Obesity road  XYZ  
354  Charity  XYZ  
354  KFC  XYZ  
354 111 Fat Avenue  XYZ  
441  Burger King  XYZ  
441  112 Skinny ROAD  XYZ
441  Adidas  XYZ
441  Socks Blvd  XYZ
441  Tiege Henley  XYZ
441  Skincare Road  XYZ
441  Charity  XYZ
;
RUN;
proc print data = source;
run;

data have;
   set source;
   if Var = "Charity" then i = 0;
   else i+1;
   if i = 3 and Var ^= "Charity" then do; 
      temp = Var; 
      Var = "NONE";  
      output;
      Var = temp;
      output;
      temp =' ';
      i = 1;
   end;
   else output;
drop temp i;
run;

proc print data = have;
run;

The Data Set, HAVE, will have triplets either with "NONE" or "Charity". 

 

The next the old program without LAST.ID checks.

 

data clean;
length ID 8 Name $10 Address $19 Classification $8 Var $19; 
   do i = 1 by 1 until(VAR in ("NONE","Charity"));
      set have;
      select(mod(i,3));
         when(1) Name=Var;
         when(2) Address = Var;
         when(0) Classification = Var;
         otherwise;
      end;
   end;
drop i Var;
run;
proc print data = clean;
run;

 

 

UniversitySas
Quartz | Level 8

Thanks for this - apologies about the mix-up. The data I am working with is actually huge, it wasn't until I implemented the first solution that I realised this issue persisted in the data.


I'll give this a try; and at the same time educate myself about loops! They look a fair bit trickier in SAS than other programs I've worked with haha.


Cheers.

srinath3111
Quartz | Level 8

Hi,

 

 

Data have;
length class $7;
Set unclean;
X=mod(_n_, 3);
If x=1 then class="name";
Else if x =2 then class = "address";
Else if x=0 then class = "type";

Retain record;
If x=1 then record+1;
Run;

Proc transpose data=have out=want(drop=CLASSIFICATION _name_ record);
By Id record;
Var var;
Id class;
Run;

VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

that was a good one, changing source data to have addental issue not see in the sample.

Please post data that represents the data for testing.

Is the next set of data going to be missing the row 2 address line or the row 1 name line and just have the row 3?

 

cleaning data is our job but have data that represents the data needing cleaning is a must.

this is a Q that management would ask because they would only tell me part of the story a little at a time.  Maybe that is because they just like to reconnect.

 

UniversitySas
Quartz | Level 8

Is the next set of data going to be missing the row 2 address line or the row 1 name line and just have the row 3?

I'm hoping this isn't the case! The data I'm working with isn't as predictable as I was expecting, unfortunately. Although, I think with Datasp's response I should be able to come up with some creative solution to handle any other unforeseen irregularities (fingers crossed).


Thanks again for all your help guys, I really appreciate it! I'll try out the latest solution.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 20 replies
  • 1672 views
  • 0 likes
  • 5 in conversation