Dear Experts:
Hope you're doing great, asking for your help again; let me explain you the data.
I have a table like this:
string1 | type |
CUS5784John Smith5757 | CUS |
ACC67858394857684393 | ACC |
PER57483948576849303 | PER |
CUS4893Paul Jones5784 | CUS |
ACC5849605948304953 | ACC |
PER6758495060794843 | PER |
SHR7849586093837483 | SHR |
The string1 field contains the data I need to extract, the first three letters is the code to identify what kind of record is about (type), they can be CUS, ACC, PER, SHR and many others. The CUS code will be the first of the group of records until another CUS code is found.
What I'm trying to do is to get the client name in the CUS record and write it in a new field (name) for every record after but until a new CUS code is found, like this:
string1 | type | name |
CUS5784John Smith5757 | CUS | John Smith |
ACC67858394857684393 | ACC | John Smith |
PER57483948576849303 | PER | John Smith |
CUS4893Paul Jones5784 | CUS | Paul Jones |
ACC5849605948304953 | ACC | Paul Jones |
PER6758495060794843 | PER | Paul Jones |
SHR7849586093837483 | SHR | Paul Jones |
And the code I already have is:
%let vfile = "c:\tmp\file_20210617.Out";
data
work.prueba ;
infile
&vfile. ;
input
@1 string1 $char1638.;
type = substr(string1,1,3);
name = ifc( type = 'CUS', substr(string1,29,100), "");
run;
*Test1;
data work.prueba1 ;
do _n_ = 1 by 1 to (final) ;
set work.prueba end=final;
tag = ifc(type = 'CUS', name, "");
name = tag;
end;
run;
*Test2;
data work.prueba1 ;
set work.prueba ;
retain tag name ;
if type = 'CUS' then tag1 = tag ;
else tag1 = "" ;
run;
*Test3;
data work.prueba1 ;
set work.prueba ;
retain name ;
tag = name ;
run;
But I don't know if retain is usefull here or I need another command.
Any help would be greatly appreciated, thanks in advance.
Fernando
@japfvg wrote:
Dear Experts:
Hope you're doing great, asking for your help again; let me explain you the data.
I have a table like this:
string1 type CUS5784John Smith5757 CUS ACC67858394857684393 ACC PER57483948576849303 PER CUS4893Paul Jones5784 CUS ACC5849605948304953 ACC PER6758495060794843 PER SHR7849586093837483 SHR
The string1 field contains the data I need to extract, the first three letters is the code to identify what kind of record is about (type), they can be CUS, ACC, PER, SHR and many others. The CUS code will be the first of the group of records until another CUS code is found.
What I'm trying to do is to get the client name in the CUS record and write it in a new field (name) for every record after but until a new CUS code is found, like this:
string1 type name CUS5784John Smith5757 CUS John Smith ACC67858394857684393 ACC John Smith PER57483948576849303 PER John Smith CUS4893Paul Jones5784 CUS Paul Jones ACC5849605948304953 ACC Paul Jones PER6758495060794843 PER Paul Jones SHR7849586093837483 SHR Paul Jones
And the code I already have is:
%let vfile = "c:\tmp\file_20210617.Out"; data work.prueba ; infile &vfile. ; input @1 string1 $char1638.; type = substr(string1,1,3); name = ifc( type = 'CUS', substr(string1,29,100), ""); run; *Test1; data work.prueba1 ; do _n_ = 1 by 1 to (final) ; set work.prueba end=final; tag = ifc(type = 'CUS', name, ""); name = tag; end; run; *Test2; data work.prueba1 ; set work.prueba ; retain tag name ; if type = 'CUS' then tag1 = tag ; else tag1 = "" ; run; *Test3; data work.prueba1 ; set work.prueba ; retain name ; tag = name ; run;
But I don't know if retain is usefull here or I need another command.
Any help would be greatly appreciated, thanks in advance.
Fernando
Please see:
data have; infile datalines truncover; input string1 $ 1-22 ; datalines; CUS5784John Smith5757 ACC67858394857684393 PER57483948576849303 CUS4893Paul Jones5784 ACC5849605948304953 PER6758495060794843 SHR7849586093837483 ; data want; set have; type= substr(string1,1,3); length name $ 15; retain name ; if type='CUS' then name= substr(string1,8,10) ; run;
I realize that we don't have your entire string1 value but this works with the provided example.
Personally, I would hope there was some documentation about the source file and would have been able to read the values into proper variables from the beginning instead of having a single string variable.
The flakiest part of this, and your example, is the code to get name from the middle of the string. Names are seldom all the same length and without additional information you end up with extra stuff.
@japfvg wrote:
Dear Experts:
Hope you're doing great, asking for your help again; let me explain you the data.
I have a table like this:
string1 type CUS5784John Smith5757 CUS ACC67858394857684393 ACC PER57483948576849303 PER CUS4893Paul Jones5784 CUS ACC5849605948304953 ACC PER6758495060794843 PER SHR7849586093837483 SHR
The string1 field contains the data I need to extract, the first three letters is the code to identify what kind of record is about (type), they can be CUS, ACC, PER, SHR and many others. The CUS code will be the first of the group of records until another CUS code is found.
What I'm trying to do is to get the client name in the CUS record and write it in a new field (name) for every record after but until a new CUS code is found, like this:
string1 type name CUS5784John Smith5757 CUS John Smith ACC67858394857684393 ACC John Smith PER57483948576849303 PER John Smith CUS4893Paul Jones5784 CUS Paul Jones ACC5849605948304953 ACC Paul Jones PER6758495060794843 PER Paul Jones SHR7849586093837483 SHR Paul Jones
And the code I already have is:
%let vfile = "c:\tmp\file_20210617.Out"; data work.prueba ; infile &vfile. ; input @1 string1 $char1638.; type = substr(string1,1,3); name = ifc( type = 'CUS', substr(string1,29,100), ""); run; *Test1; data work.prueba1 ; do _n_ = 1 by 1 to (final) ; set work.prueba end=final; tag = ifc(type = 'CUS', name, ""); name = tag; end; run; *Test2; data work.prueba1 ; set work.prueba ; retain tag name ; if type = 'CUS' then tag1 = tag ; else tag1 = "" ; run; *Test3; data work.prueba1 ; set work.prueba ; retain name ; tag = name ; run;
But I don't know if retain is usefull here or I need another command.
Any help would be greatly appreciated, thanks in advance.
Fernando
Please see:
data have; infile datalines truncover; input string1 $ 1-22 ; datalines; CUS5784John Smith5757 ACC67858394857684393 PER57483948576849303 CUS4893Paul Jones5784 ACC5849605948304953 PER6758495060794843 SHR7849586093837483 ; data want; set have; type= substr(string1,1,3); length name $ 15; retain name ; if type='CUS' then name= substr(string1,8,10) ; run;
I realize that we don't have your entire string1 value but this works with the provided example.
Personally, I would hope there was some documentation about the source file and would have been able to read the values into proper variables from the beginning instead of having a single string variable.
The flakiest part of this, and your example, is the code to get name from the middle of the string. Names are seldom all the same length and without additional information you end up with extra stuff.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.