I am working on a project to help children learn faster and better. I have a raw data file containing dictionary data. The following "have" and "want" should be self-explanatory:
Mostly the data is created systematically. Here are some characteristics I have found which may help in programming:
Word: begins with no leading space
Pronunciation: in "(" and ")"
Etymology: in "[" and "]"
If pronunciation and etymology are given, the word is repeated for the definition
Definition: starts with a space character
Examples: start with a year
Can someone please help? (Please see the screenshot for clarity)
Jijil Ramakrishnan
That is really not easy.
/* Step 1: read every line of the raw file into X and drop blank lines */
data x;
infile '/folders/myfolders/samplerawfile.txt' lrecl=32767 length=len;
input x $varying32767. len;
if missing(x) then delete;
run;
/* Step 2: look ahead one line (_X); when the next line is "(...)[...]", the current line starts a new entry */
data temp;
merge x x(firstobs=2 rename=(x=_x));
if prxmatch('/^\(.+\)\[.+\]$/',strip(_x)) then group+1;
drop _x;
run;
/* Step 3: label each line within an entry and delete the repeated word line */
data temp;
set temp;
by group;
length name temp $ 80;
retain temp;
if first.group then do; n=0;call missing(temp);end;
n+1;
if n=1 then do;name='word';temp=x;end;
else if prxmatch('/^\(.+\)\[.+\]$/',strip(x)) then name='pronounce';
else if prxmatch('/^[\w\p]+:/',strip(x)) then name='example';
else name='meaning';
if temp=x and n ne 1 then delete;
drop temp n ;
run;
/* Step 4: number the meanings within each entry (meaning_1, meaning_2, ...) */
data temp;
set temp;
length _name $ 80;
by group;
if first.group then do;idx=0;n=0;end;
if name='meaning' then do;
idx+1;
_name=catx('_',name,idx);
end;
drop n;
run;
/* Step 5: number the examples under each meaning (meaning1_example1, ...) */
data temp;
set temp;
by group idx notsorted;
if first.idx then m=-1;
m+1;
if name='example' then _name=cats(cats('meaning',idx),'_',cats(name,m));
else if missing(_name) then _name=name;
run;
/* Step 6: turn the labelled lines into one row per entry */
proc transpose data=temp out=want;
by group;
id _name;
var x;
run;
proc print data=want(obs=50) noobs;run;
What is the question here?
Any suggestion to program this?
Program what? You haven't explained what the question is or what you're trying to do. We can't read minds.
My bad. You wrote on the images. That wasn't clear at first.
That's going to need regular expressions, which are not my forte.
I thank you very much, Mr. KSharp! You have done more than what I had requested. Is there any study indicating a correlation between knowledge and magnanimity? Thank you very much!
I cannot guarantee that my code will give you what you want.
You need to know which rules you should take into account.
For example,
prxmatch('/^[\w\p]+/')
could also be changed to
prxmatch('/^\w+/')
It depends on your data.
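As one more illustration of a data-dependent rule: the question says example lines start with a year, so a year-based pattern might be worth testing on a few lines before it goes into the full program. The sketch below is only that; the three sample lines, the data set name CHECK, and the pattern '/^\d{4}\b/' are assumptions made up for illustration and are not taken from the real file.

```sas
/* Sketch only: the sample lines are invented, not from the real raw file */
data check;
  length x $ 200 name $ 10;
  infile datalines truncover;
  input x $char200.;   /* $CHAR keeps the leading blank of a definition line */
  /* "(...)" followed by "[...]": pronunciation and etymology line */
  if prxmatch('/^\(.+\)\[.+\]$/', strip(x)) then name='pronounce';
  /* assumed rule: an example line begins with a four-digit year */
  else if prxmatch('/^\d{4}\b/', strip(x)) then name='example';
  else name='other';
  datalines;
(ab-uh-kuhs)[Latin]
1387 Trevisa: the abacus was used
 a counting frame
;
run;

proc print data=check noobs; run;
```

If the printed NAME values look right for your own sample lines, the same pattern could replace the example test in the third data step; if not, adjust the pattern to your data.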
"Is there any study indicating a correlation between knowledge and magnanimity?"
Do you mean building some statistical model?
Please post it in the Statistical forum.
@Ksharp I believe that was a compliment of sorts, not a question. Smarter people are helpful 🙂
Actually, I don't know what the OP means.
I was thinking of some categorical data analysis, such as contingency table analysis.
You mean it is a compliment?