HabAM
Quartz | Level 8

Hello,

I am trying to convert a SAS dataset containing nucleotide sequences to FASTA format. See a sample FASTA-formatted file here: https://github.com/veg/hivtrace/blob/master/test/rsrc/TEST.FASTA

 

What I have is a table with two columns:

ID Seq

 

What I need is to write each ID on its own line, followed by the corresponding Seq value on the next line:

>ID1
Seq1
>ID2
Seq2

 

Any advice is much appreciated.

 

HabAM
ACCEPTED SOLUTION
Reeza
Super User

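You don't list the individual values; you list the variable names. The DATA step loops over the dataset automatically, so each PUT statement runs once for every row read by SET.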

 

As a starting point, run this after changing the file path to one that is valid on your system, and see what is generated.

 

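data _null_;
  /* Read every row; the PUT statements below run once per row */
  set sashelp.class;
  /* Change this path to a location you can write to */
  file '/folders/myfolders/demo.txt' linesize=80 flowover;
  put Name;  /* first output line for this row */
  put Age;   /* second output line for this row */
run;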
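
Applied to the ID/Seq table from the question, the same pattern could look like the minimal sketch below. The dataset name `have`, the variable lengths, and the output path `sequences.fasta` are assumptions for illustration; the two rows are taken from the sample data in the thread, where the ID values already start with ">". LRECL= is set generously so that a long sequence is not split across lines.

```sas
/* Hypothetical input dataset built from two rows of the thread's sample data */
data have;
  length ID $10 Seq $200;
  ID = '>12121';
  Seq = 'aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag';
  output;
  ID = '>343434';
  Seq = 'cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg';
  output;
run;

/* Write one header line and one sequence line per row */
data _null_;
  set have;
  file 'sequences.fasta' lrecl=32767;  /* assumed path; change as needed */
  put ID;   /* header line, e.g. >12121 */
  put Seq;  /* sequence on the following line */
run;
```

However many rows `have` contains, the two PUT statements stay the same, because the DATA step executes them once for each row.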

@HabAM wrote:

Thank you Reeza. The only issue is I have thousands of row values to individually list as suggested. 

Sample set:

ID Seq
>12121 aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag
>343434 cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>4545e cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>1111 cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>2321 cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg

 

 

expected text file:

>12121
aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag
>343434
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>4545e
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>1111
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>2321
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg

 

Thank you


 


4 REPLIES
Reeza
Super User
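Just use a PUT statement.

data _null_;
  set have;  /* placeholder name: replace with your dataset */
  file 'myfile.txt';
  put ID;    /* writes the ID on its own line */
  put Seq;   /* writes the sequence on the next line */
run;

You may want to control the line length, for example with the LINESIZE= option on the FILE statement; by default, output that does not fit on the current line flows over to the next one. Otherwise, that should be all you need to get started.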
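
If long sequences need to be wrapped at a fixed width, one option is to write each sequence in chunks rather than relying on the line-size settings. The sketch below is only an illustration under stated assumptions: the dataset name `have`, the output path `wrapped.fasta`, and the width of 60 characters are all hypothetical choices, and the single input row reuses a sequence from the sample data in this thread.

```sas
/* Hypothetical one-row input dataset */
data have;
  length ID $10 Seq $200;
  ID = '>12121';
  Seq = 'aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag';
  output;
run;

data _null_;
  set have;
  file 'wrapped.fasta';   /* assumed path; change as needed */
  length chunk $60;       /* 60 is an assumed wrap width */
  put ID;                 /* header line */
  /* Walk through the sequence 60 characters at a time */
  do pos = 1 to length(Seq) by 60;
    /* MIN keeps the last chunk from reading past the end of the value */
    chunk = substr(Seq, pos, min(60, length(Seq) - pos + 1));
    put chunk;            /* each chunk goes on its own line */
  end;
run;
```

Changing the wrap width means changing the three occurrences of 60 together.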
HabAM
Quartz | Level 8

Thank you Reeza. The only issue is I have thousands of row values to individually list as suggested. 

Sample set:

ID Seq
>12121 aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag
>343434 cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>4545e cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>1111 cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>2321 cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg

 

 

expected text file:

>12121
aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag
>343434
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>4545e
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>1111
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>2321
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg

 

Thank you

HabAM