deleted_user
Not applicable
Hi, everyone.

I have a text file with millions of observations. Each observation has three variables: name, subname, and number.
However, subname is optional, and nothing in the file marks when it is missing.

Part of the data:

John (tech) (43) Johnson(econ) (32) Julian (24) Justin (34) Jo (math) (32)
Julia(econ) (33) June (93)
....

How can I manipulate this data?

Thank you for your time.

Jun
LinusH
Tourmaline | Level 20
This can be done in a number of ways. One is to read your data into three variables, then check whether the second variable contains any digits (using the ANYDIGIT function); if it does, move its contents to the third variable.
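
To make the logic of that check concrete, here is a minimal sketch in Python rather than SAS. The function name `parse_line` is hypothetical, and `any(ch.isdigit() ...)` only stands in for what ANYDIGIT would test. It assumes every record is well formed, a name is always present, and subnames contain no digits or spaces.

```python
def parse_line(line):
    """Split one raw line into (name, subname, number) tuples.

    Assumption: records look like `name (subname) (number)` or
    `name (number)`, and a subname never contains a digit.
    """
    # Put a space before every "(" so "Johnson(econ)" splits like "Johnson (econ)".
    tokens = line.replace("(", " (").split()
    records = []
    i = 0
    while i < len(tokens):
        name = tokens[i]
        second = tokens[i + 1].strip("()")
        if any(ch.isdigit() for ch in second):
            # The second field holds digits, so it is really the number
            # and the subname is missing.
            records.append((name, None, int(second)))
            i += 2
        else:
            # The second field is the subname; the number follows it.
            number = int(tokens[i + 2].strip("()"))
            records.append((name, second, number))
            i += 3
    return records


if __name__ == "__main__":
    for record in parse_line("John (tech) (43) Johnson(econ) (32) Julian (24)"):
        print(record)
```

A malformed record (for example a missing name) would raise an error or shift the fields, so this is only suitable once the raw structure has been confirmed.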

/Linus
Data never sleeps
Patrick
Opal | Level 21
Hi Jun

Although the problem is generally not too difficult to solve, I can think of a few challenges that might arise depending on what your raw data looks like.
Is the record structure really the way you show it to us (several 'observations' in one line)? Could it be that the name is missing (quite possible if there are millions of records) and that therefore you could have 2 to 4 consecutive values in brackets belonging to 2 different 'observations'?

Please let me know, as the concrete solution will depend on what the data looks like.

The easiest way I can think of right now is to use Regular Expressions (the PRX functions in SAS) to decide which substring makes up an 'observation', but Regular Expressions also take some practice to use and understand.
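
As a rough idea of what such a pattern could look like, here is a small sketch in Python rather than SAS. The pattern and the names `RECORD`, `parse_records` and `leftover` are illustrative assumptions, not a tested solution for the real file. It assumes names are made of word characters and subnames contain no digits or brackets.

```python
import re

# One 'observation': a name, an optional "(subname)", then "(number)".
# Assumption: a subname never contains digits or brackets.
RECORD = re.compile(r"(\w+)\s*(?:\(([^()\d]+)\)\s*)?\((\d+)\)")


def parse_records(line):
    """Return (name, subname, number) for every substring that matches."""
    return [
        (m.group(1), m.group(2), int(m.group(3)))
        for m in RECORD.finditer(line)
    ]


def leftover(line):
    """Return the text that no record matched, e.g. values with a missing name."""
    return RECORD.sub("", line).strip()


if __name__ == "__main__":
    sample = "Julia(econ) (33) June (93)"
    print(parse_records(sample))
    # A bracketed value without a name is not matched and shows up here.
    print(repr(leftover("John (tech) (43) (32)")))
```

Checking `leftover` on each line is one way to spot records where the name is missing, since those pieces match nothing and would otherwise be skipped silently.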

Cheers, Patrick
deleted_user
Not applicable
A big thank you to Linus and Patrick.
With your recommendations, I found my problem. It comes down to the structure of the raw data, because the raw data is too messy.
It is solved now. Thank you for your time!

Jun
