BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Hi,

I've trawled help files and internet articles to no avail on this subject and wondered if anyone has successfully carried out what I'm trying to do.

I have a field in a data set that contains information about the variable names, size and values in a long string. I'm trying to parse this data out into actual variables and values within the same observation.

For example:
The variable (event_data) I'm trying to parse contains the value;
26CRDT2010-05-13-07.30.00.000000

In the above, the first 2 characters always denote the length which in this case is 26. The characters that always start at the 3rd for 4 characters is the field name and from the 7th character to the length denoted from the first 2 (26 in this case) is always the value of the field.

So I'm trying to create the field of CRDT with the value of 2010-05-13-07.30.00.000000 in this observation.

I feel that SAS should be able to do this but I've not managed to do it successfully yet and would be most appreciative if anyone has done or can work out how to do it.

Thanks.
3 REPLIES 3
Peter_C
Rhodochrosite | Level 12
the main problem is that you seek to define column names in data (normally the job of syntax).
If the data volume allows you to read it all twice, then
1 first pass
read through the file, collecting names and expected widths
2 second pass
read event data into relevant names

How big is your data file?
How often is it refreshed and needing to be reloaded?

PeterC
sfsdtegsdsdgdffhgfh
Fluorite | Level 6
Neil,

try this


data t1;
format a $100.;
input a $;
datalines;
26CRDT2010-05-13-07.30.00.000000
23ABCD2010-05-13-07.30.00.000
;
run;

data t2;
set t1;
format l 8. field $10. value $30. ;
l = substr(a,1,2);
field = substr(a,3,4);
value = substr(a,7,l);
drop a;
run;


proc transpose data = t2 out = t3 (drop = _name_);
id field;
var value;
run;


GL.
deleted_user
Not applicable
Thank you for your comments on this, they were a massive help. I considered the TRANSPOSE but didn't realize it would be able to do it. I've managed to do it although I simplified the data above so a bit more has gone into it.

The data is more like this;

EVENT_ID // EVENT_DATA
event1 // 26CRDT2010-01-01-07.00.00.000000
event2 // 10AGDT2010-05-06
event3 // 26CRDT2010-02-28-12.35.30.658125
event4 // 08STATPROCESSD26CRDT2005-09-13-16.55.32.349572

The same fields within the EVENT_DATA could appear against different EVENT_IDs. In addition, as with the last observation here, event4, the EVENT_DATA field can contain multiple fields.

I basically used the logic you posted in the T2 dataset above to parse out each variable but then put it into a DO UNTIL loop which cut down the EVENT_ID each iteration and output observations within the loop thus the resultant dataset for the above data had 5 observations because event 4 was repeated with the second variable in the EVENT_ID.

I then simply used the TRANSPOSE procedure using ID as FIELD, VAR as VALUE and did it BY the EVENT_ID.

Thanks again for your help.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 720 views
  • 0 likes
  • 3 in conversation