BookmarkSubscribeRSS Feed
vgriggs
Fluorite | Level 6

I was given an excel file where someone stored all of the information in a single column (var1). I need to pull information but it will be in random orders. Good thing is the person gave the information and then put a period after it. 

 

Example of Var1:

Type = 2. Size = 4 in x 12 in. Group = ABC grouping.

 

Group = A and B Holdings. Type = 1. 

 

Group = Mark H and Company. 

 

The variable I need to pull is group. It always starts with "Group = " and has a period in the end. But will be anywhere within the var1 (so you can't name a specific period. Sometimes it may not exist. This variable can be any length in words. I just need to pull the string between "Group = " and the period. 

 

This can't be done in excel due to the size of the dataset.

 

I have tried scan, find, splitting at the period, and I am not sure what to do at this point to organize it.  

4 REPLIES 4
yabwon
Amethyst | Level 16

Try this:

data have;
infile cards4 dlm = "0A0D"x;
input Var1 : $ 100.;
cards4;
Type = 2. Size = 4 in x 12 in. Group = ABC grouping.
Group = A and B Holdings. Type = 1. 
Group = Mark H and Company. 
;;;;
run;
proc print;
run;

data want;
  set have;

  do _N_ = 1 to countw(Var1,".");
    part = strip(scan(Var1,_N_,".")); drop part;

    if upcase(part) =: "GROUP" then
      do;
        group = strip(substr(part, index(part, "=")+1));
      end;
  end;
run;
proc print;
run;

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



data_null__
Jade | Level 19

You could read the data using NAMED Input.

 

Capture.PNG

 

data have;
   infile cards4;
   length type size group $48;
   input @;
   _infile_= transtrn(_infile_,' =','=');
   input (_all_)(=);
   list;
   cards4;
Type = 2. Size = 4 in x 12 in. Group = ABC grouping.
Group = A and B Holdings. Type = 1. 
Group = Mark H and Company. 
;;;;
run;
proc print;
run;
Ksharp
Super User
data have;
infile cards4 truncover;
input Var1 $ 100.;
cards4;
Type = 2. Size = 4 in x 12 in. Group = ABC grouping.
Group = A and B Holdings. Type = 1. 
Group = Mark H and Company. 
;;;;
run;

data want;
 set have;
pid=prxparse('/(?<=group)\s*=[^\.]+(?=\.)/io');
call prxsubstr(pid,var1,p,l);
if p then  want=scan(strip(substr(var1,p,l)),1,'=');
drop pid p l;
run;
vgriggs
Fluorite | Level 6
Thank you so much! This worked!!!

I have new code to learn about today!

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1354 views
  • 2 likes
  • 4 in conversation