BookmarkSubscribeRSS Feed
vgriggs
Fluorite | Level 6

I was given an excel file where someone stored all of the information in a single column (var1). I need to pull information but it will be in random orders. Good thing is the person gave the information and then put a period after it. 

 

Example of Var1:

Type = 2. Size = 4 in x 12 in. Group = ABC grouping.

 

Group = A and B Holdings. Type = 1. 

 

Group = Mark H and Company. 

 

The variable I need to pull is group. It always starts with "Group = " and has a period in the end. But will be anywhere within the var1 (so you can't name a specific period. Sometimes it may not exist. This variable can be any length in words. I just need to pull the string between "Group = " and the period. 

 

This can't be done in excel due to the size of the dataset.

 

I have tried scan, find, splitting at the period, and I am not sure what to do at this point to organize it.  

4 REPLIES 4
yabwon
Amethyst | Level 16

Try this:

data have;
infile cards4 dlm = "0A0D"x;
input Var1 : $ 100.;
cards4;
Type = 2. Size = 4 in x 12 in. Group = ABC grouping.
Group = A and B Holdings. Type = 1. 
Group = Mark H and Company. 
;;;;
run;
proc print;
run;

data want;
  set have;

  do _N_ = 1 to countw(Var1,".");
    part = strip(scan(Var1,_N_,".")); drop part;

    if upcase(part) =: "GROUP" then
      do;
        group = strip(substr(part, index(part, "=")+1));
      end;
  end;
run;
proc print;
run;

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



data_null__
Jade | Level 19

You could read the data using NAMED Input.

 

Capture.PNG

 

data have;
   infile cards4;
   length type size group $48;
   input @;
   _infile_= transtrn(_infile_,' =','=');
   input (_all_)(=);
   list;
   cards4;
Type = 2. Size = 4 in x 12 in. Group = ABC grouping.
Group = A and B Holdings. Type = 1. 
Group = Mark H and Company. 
;;;;
run;
proc print;
run;
Ksharp
Super User
data have;
infile cards4 truncover;
input Var1 $ 100.;
cards4;
Type = 2. Size = 4 in x 12 in. Group = ABC grouping.
Group = A and B Holdings. Type = 1. 
Group = Mark H and Company. 
;;;;
run;

data want;
 set have;
pid=prxparse('/(?<=group)\s*=[^\.]+(?=\.)/io');
call prxsubstr(pid,var1,p,l);
if p then  want=scan(strip(substr(var1,p,l)),1,'=');
drop pid p l;
run;
vgriggs
Fluorite | Level 6
Thank you so much! This worked!!!

I have new code to learn about today!

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1309 views
  • 2 likes
  • 4 in conversation