I was given an excel file where someone stored all of the information in a single column (var1). I need to pull information but it will be in random orders. Good thing is the person gave the information and then put a period after it.
Example of Var1:
Type = 2. Size = 4 in x 12 in. Group = ABC grouping.
Group = A and B Holdings. Type = 1.
Group = Mark H and Company.
The variable I need to pull is group. It always starts with "Group = " and has a period in the end. But will be anywhere within the var1 (so you can't name a specific period. Sometimes it may not exist. This variable can be any length in words. I just need to pull the string between "Group = " and the period.
This can't be done in excel due to the size of the dataset.
I have tried scan, find, splitting at the period, and I am not sure what to do at this point to organize it.
Try this:
data have;
infile cards4 dlm = "0A0D"x;
input Var1 : $ 100.;
cards4;
Type = 2. Size = 4 in x 12 in. Group = ABC grouping.
Group = A and B Holdings. Type = 1.
Group = Mark H and Company.
;;;;
run;
proc print;
run;
data want;
set have;
do _N_ = 1 to countw(Var1,".");
part = strip(scan(Var1,_N_,".")); drop part;
if upcase(part) =: "GROUP" then
do;
group = strip(substr(part, index(part, "=")+1));
end;
end;
run;
proc print;
run;
Bart
You could read the data using NAMED Input.
data have;
infile cards4;
length type size group $48;
input @;
_infile_= transtrn(_infile_,' =','=');
input (_all_)(=);
list;
cards4;
Type = 2. Size = 4 in x 12 in. Group = ABC grouping.
Group = A and B Holdings. Type = 1.
Group = Mark H and Company.
;;;;
run;
proc print;
run;
data have; infile cards4 truncover; input Var1 $ 100.; cards4; Type = 2. Size = 4 in x 12 in. Group = ABC grouping. Group = A and B Holdings. Type = 1. Group = Mark H and Company. ;;;; run; data want; set have; pid=prxparse('/(?<=group)\s*=[^\.]+(?=\.)/io'); call prxsubstr(pid,var1,p,l); if p then want=scan(strip(substr(var1,p,l)),1,'='); drop pid p l; run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.