Hello,
I am trying to convert a SAS dataset containing nucleotide sequences to FASTA format. See a sample FASTA-formatted file here: https://github.com/veg/hivtrace/blob/master/test/rsrc/TEST.FASTA
What I have is a table with two columns:
ID Seq
What I need is to write each ID on its own line, followed by the corresponding Seq value on the next line, like this:
>ID1
Seq1
>ID2
Seq2
Any advice is much appreciated.
You don't list the individual values, you list the variable names.
Run this as a starting point, changing the file path to one that is valid on your system, and see what is generated.
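data _null_;
    set sashelp.class;   /* replace with your own dataset */
    file '/folders/myfolders/demo.txt' linesize=80 flowover;
    put Name;   /* first output line for this row */
    put Age;    /* second output line for this row */
run;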
Thank you Reeza. The only issue is I have thousands of row values to individually list as suggested.
Sample set:
ID | Seq |
>12121 | aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag |
>343434 | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
>4545e | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
>1111 | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
>2321 | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
Expected text file:
>12121
aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag
>343434
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>4545e
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>1111
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>2321
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
Thank you
Add the necessary SET statement to @Reeza's code so that it reads your own SAS dataset.
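For illustration, here is a rough sketch of what that adaptation might look like. It assumes a dataset called work.sequences (a made-up name) with character variables ID and Seq, and an output file name that is also just a placeholder. Since the sample IDs already begin with ">", the sketch only adds the prefix when it is missing. The LRECL= value is an assumption meant to keep each sequence on a single line; adjust or drop it to suit your data.

```sas
/* Sketch only: work.sequences and the output path are hypothetical. */
data _null_;
    set work.sequences;               /* one row per sequence: ID, Seq */
    /* LRECL= is set generously here (an assumption) so that long
       sequences are not split across several lines. */
    file '/folders/myfolders/sequences.fasta' lrecl=32767;
    /* Header line: write ID as-is if it already starts with '>',
       otherwise put '>' in front of it. */
    if ID =: '>' then put ID;
    else put '>' ID;
    /* Sequence line: each PUT statement starts a new output line. */
    put Seq;
run;
```

Because the PUT statements name the variables rather than the values, the same step handles every row in the dataset, whether there are five or several thousand.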