BookmarkSubscribeRSS Feed
vgriggs
Fluorite | Level 6

I was given an excel file where someone stored all of the information in a single column (var1). I need to pull information but it will be in random orders. Good thing is the person gave the information and then put a period after it. 

 

Example of Var1:

Type = 2. Size = 4 in x 12 in. Group = ABC grouping.

 

Group = A and B Holdings. Type = 1. 

 

Group = Mark H and Company. 

 

The variable I need to pull is group. It always starts with "Group = " and has a period in the end. But will be anywhere within the var1 (so you can't name a specific period. Sometimes it may not exist. This variable can be any length in words. I just need to pull the string between "Group = " and the period. 

 

This can't be done in excel due to the size of the dataset.

 

I have tried scan, find, splitting at the period, and I am not sure what to do at this point to organize it.  

4 REPLIES 4
yabwon
Onyx | Level 15

Try this:

data have;
infile cards4 dlm = "0A0D"x;
input Var1 : $ 100.;
cards4;
Type = 2. Size = 4 in x 12 in. Group = ABC grouping.
Group = A and B Holdings. Type = 1. 
Group = Mark H and Company. 
;;;;
run;
proc print;
run;

data want;
  set have;

  do _N_ = 1 to countw(Var1,".");
    part = strip(scan(Var1,_N_,".")); drop part;

    if upcase(part) =: "GROUP" then
      do;
        group = strip(substr(part, index(part, "=")+1));
      end;
  end;
run;
proc print;
run;

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



data_null__
Jade | Level 19

You could read the data using NAMED Input.

 

Capture.PNG

 

data have;
   infile cards4;
   length type size group $48;
   input @;
   _infile_= transtrn(_infile_,' =','=');
   input (_all_)(=);
   list;
   cards4;
Type = 2. Size = 4 in x 12 in. Group = ABC grouping.
Group = A and B Holdings. Type = 1. 
Group = Mark H and Company. 
;;;;
run;
proc print;
run;
Ksharp
Super User
data have;
infile cards4 truncover;
input Var1 $ 100.;
cards4;
Type = 2. Size = 4 in x 12 in. Group = ABC grouping.
Group = A and B Holdings. Type = 1. 
Group = Mark H and Company. 
;;;;
run;

data want;
 set have;
pid=prxparse('/(?<=group)\s*=[^\.]+(?=\.)/io');
call prxsubstr(pid,var1,p,l);
if p then  want=scan(strip(substr(var1,p,l)),1,'=');
drop pid p l;
run;
vgriggs
Fluorite | Level 6
Thank you so much! This worked!!!

I have new code to learn about today!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 407 views
  • 2 likes
  • 4 in conversation