DATA Step, Macro, Functions and more

How to categorize data into different variables?

Occasional Contributor
Posts: 9

Hi, I am a beginner working with the student edition of SAS. I exported six months of WhatsApp chat data from a group. Part of it is shown below:

 

1/18/17, 6:57:00 PM: TSV: Everyone claps for Diana who got selected for job 💃🏻🤗🎉
1/18/17, 6:58:32 PM: Christopher Walken (techguy): πŸ‘πŸΌ
1/18/17, 6:58:55 PM: TSV: I'm so happy for you Diana More reasons to go shopping 💃🏻💃🏻💃🏻
1/18/17, 7:02:37 PM: Gaggan Anand (next desk): Awesome Diana
Congrats πŸ‘πŸ½πŸ‘πŸΌ

 

I want to split it into 3 or 4 variables: Date and Time (separate or combined), Name, and Message.

 

I tried to do this but was not able to increase the length of the data in the name and message variables. The last line, "Congrats", also ends up in the date variable.

 

Can someone help me with the code so that I can separate these fields and apply it to the whole 6 months?

 

I tried as shown in the pic below.

 

 

 


Accepted Solutions
Solution
11-15-2017 02:59 PM
Super User
Posts: 13,941

Re: How to categorize data in different variables ?

When your code generates an error, it is good practice on this forum to post the log with the code and the errors. Paste it into a code box opened with the forum menu icon {I}.

SAS date and time informats do not like the comma in the middle of the date and time, so you need to parse the data a bit.
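To see the comma handling on its own, here is a small sketch that converts a single timestamp. The variable names raw, cleaned and dt are only for illustration, and how ANYDTDTM. reads ambiguous dates can depend on your DATESTYLE setting, so check the result in your own log.

```sas
data _null_;
   /* one timestamp in the form used by the export */
   raw = '1/18/17, 6:57:00 PM';
   /* remove the comma between the date and the time */
   cleaned = compress(raw, ',');
   /* let ANYDTDTM. read the remaining text as a datetime value */
   dt = input(cleaned, anydtdtm.);
   /* write all three values to the log for inspection */
   put raw= cleaned= dt= datetime20.;
run;
```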

 

One way:

DATA WHATSAPP;
   /* truncover: do not read into the next line when a record ends early */
   infile datalines dlm=":" truncover;
   /* NAME widened to 40 so longer sender names are not cut off */
   INFORMAT dttxt $19. NAME $CHAR40. MESSAGE $CHAR100.;
   INPUT dttxt 1-19  NAME MESSAGE;
   /* drop the comma so ANYDTDTM. can read the date and time together */
   date_time = input(compress(dttxt,','),anydtdtm.);
   format date_time datetime20.;
   drop dttxt;
DATALINES;
1/18/17, 6:57:00 PM: TSV: Everyone claps for Diana who got selected for job
1/18/17, 6:58:32 PM: Christopher Walken (techguy):
1/18/17, 6:58:55 PM: TSV: I'm so happy for you Diana More reasons to go shopping
1/18/17, 7:02:37 PM: Gaggan Anand (next desk): Awesome Diana Congrats
;
RUN;

I changed the code to use : as the delimiter because the posted example did not actually line up with the columns specified by @21 and @26 in your code when I copied it to my editor.
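The original question also had a message that wrapped onto a second line ("Congrats"), and timestamps are not always 19 characters wide (two-digit months or hours are longer). One possible way to handle both is sketched below. It is only a sketch under assumptions: the data set name lines, the variable names msg_id, p1, p2 and text, and the lengths 40, 300 and 1000 are all made up for illustration, and it assumes every new message starts with a date followed by a comma and that the timestamp and the sender name each end with a colon and a space. For the real export you would replace DATALINES with an INFILE statement pointing at your file.

```sas
/* Step 1: read whole lines and tag each one with a message number */
data lines;
   infile datalines truncover;
   input line $char300.;
   length dttxt $25 name $40 text $300;
   retain date_time name;
   format date_time datetime20.;
   /* a new message starts with a date such as 1/18/17 followed by a comma */
   if prxmatch('/^\d{1,2}\/\d{1,2}\/\d{2,4}, /', line) then do;
      msg_id + 1;                        /* sum statement: retained counter */
      p1 = find(line, ': ');             /* colon that ends the timestamp */
      p2 = find(line, ': ', p1 + 2);     /* colon that ends the sender name */
      dttxt = substr(line, 1, p1 - 1);
      date_time = input(compress(dttxt, ','), anydtdtm25.);
      if p2 > 0 then do;
         name = substrn(line, p1 + 2, p2 - p1 - 2);
         text = substrn(line, p2 + 2);
      end;
      else do;                           /* no sender found on this line */
         name = ' ';
         text = substrn(line, p1 + 2);
      end;
   end;
   /* any other line is treated as a continuation of the previous message */
   else text = line;
   drop p1 p2 dttxt line;
datalines;
1/18/17, 6:57:00 PM: TSV: Everyone claps for Diana who got selected for job
1/18/17, 6:58:32 PM: Christopher Walken (techguy):
1/18/17, 6:58:55 PM: TSV: I'm so happy for you Diana More reasons to go shopping
1/18/17, 7:02:37 PM: Gaggan Anand (next desk): Awesome Diana
Congrats
;
run;

/* Step 2: glue the pieces of each message back together */
data whatsapp;
   length message $1000;
   retain message;
   set lines;
   by msg_id;
   if first.msg_id then message = text;
   else message = catx(' ', message, text);
   if last.msg_id then output;
   keep date_time name message;
run;

proc print data=whatsapp;
run;
```

Splitting on the colon-plus-space pair instead of using dlm=":" means the colons inside the time are left alone, and only the first two separators are used, so a message that itself contains a colon should stay in one piece. Doing it in two steps avoids having to detect the last record inside the first DATA step.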

 



All Replies
Occasional Contributor
Posts: 9

Re: How to categorize data in different variables ?

I tried the code shown below but did not get the output I wanted.


DATA WHATSAPP;
INFORMAT DATE_TIME MDYAMPM18. NAME $CHAR20. MESSAGE $CHAR100.;
INPUT DATE_TIME @21 NAME @26 MESSAGE;
DATALINES;
1/18/17, 6:57:00 PM: TSV: Everyone claps for Diana who got selected for job
1/18/17, 6:58:32 PM: Christopher Walken (techguy):
1/18/17, 6:58:55 PM: TSV: I'm so happy for you Diana More reasons to go shopping
1/18/17, 7:02:37 PM: Gaggan Anand (next desk): Awesome Diana
Congrats
;
RUN;

PROC PRINT DATA = WHATSAPP;
FORMAT DATE_TIME MDYAMPM18.;
FORMAT NAME $CHAR20.;
FORMAT MESSAGE $CHAR100.;
RUN;

☑ This topic is solved.
