GCG vs EMBOSS - PowerPoint PPT Presentation

1
GCG vs EMBOSS
  • Gary Williams

2
Which is better: GCG or EMBOSS?
  • You must decide for yourselves
  • You may find other packages that do what you want
  • Use the tools that do the job
  • This is a comparison of GCG and EMBOSS to help
    you decide

3
Interfaces
  • Web
  • W2H available for both
  • EMBOSS W2H still has rough edges
  • PISE
  • Others under development
  • X-Windows
  • GCG - Seqlab
  • EMBOSS - SPIN (others coming)
  • Telnet/xterm/Character-based
  • emnu

4
Command line is very similar
  • The UNIX command line interfaces of GCG and
    EMBOSS are very similar.
  • You type the name of the program
  • You can add any options you want to the
    command-line
  • Press the RETURN key
  • Any mandatory information that was not on the
    command-line will be prompted for.

5
GCG command-line
  • name -other=thing
  • This is the name program that reads a sequence
    and writes out something.
  • NAME what sequence ? embl:hsfau1
  • Begin ( 1 ) ?
  • End ( 2016 ) ?
  • Reverse ( No ) ?
  • What should I call the output (
    hsfau.name ) ?

6
EMBOSS command-line
  • name -other thing
  • Reads in sequences and writes a thing
  • Input sequence(s) embl:hsfau1
  • Output data hsfau1.name
  • Use -ask to make EMBOSS programs prompt for the
    start and end of sequences

7
Some common options
  • Running in scripts: don't prompt, just fail if
    the command line is insufficient
  • GCG -default
  • EMBOSS -auto
  • Help on options
  • GCG -check
  • EMBOSS -help or -help -verbose
  • Boolean options (Yes/No, True/False)
  • GCG -thing, -nothing
  • EMBOSS -thing, -nothing, -thingT, -thingF,
    -thing1, -thing0, -thingY, -thingN
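
The EMBOSS boolean spellings listed above can be illustrated with a short Python sketch. This is not EMBOSS's own parser; the function name `parse_bool_option` is hypothetical, and `thing` is the placeholder option name used on the slide.

```python
def parse_bool_option(arg, name):
    """Return True/False if `arg` is a boolean spelling of `-name`, else None.

    Illustrative only: it mirrors the forms listed on the slide,
    not the real EMBOSS command-line parser.
    """
    true_suffixes = {"", "T", "1", "Y"}
    false_suffixes = {"F", "0", "N"}
    if arg == "-no" + name:          # -nothing
        return False
    prefix = "-" + name
    if arg.startswith(prefix):
        suffix = arg[len(prefix):]   # "", "T", "F", "1", "0", "Y" or "N"
        if suffix in true_suffixes:
            return True
        if suffix in false_suffixes:
            return False
    return None                      # not a spelling of this option


for spelling in ["-thing", "-nothing", "-thingT", "-thingF",
                 "-thing1", "-thing0", "-thingY", "-thingN"]:
    print(spelling, parse_bool_option(spelling, "thing"))
```

GCG accepts only the first two spellings; the sketch shows how the extra EMBOSS forms all reduce to the same true/false value.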

8
Sequence options in EMBOSS
  • "-sequence" related qualifiers
  • -sbegin    integer  first base used
  • -send      integer  last base used, def=seq length
  • -sreverse  bool     reverse (if DNA)
  • -sask      bool     ask for begin/end/reverse
  • -slower    bool     make lower case
  • -supper    bool     make upper case
  • -sformat   string   input sequence format
  • -ufo       string   UFO features

9
Sequence options in EMBOSS
  • "-outseq" related qualifiers
  • -osformat  string   output sequence format
  • -ossingle  bool     separate file for each entry
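
As a rough illustration of how the input and output sequence qualifiers combine on one command line, the Python sketch below assembles (but does not run) a command for `seqret`, the EMBOSS program that reads and rewrites sequences. The helper name `build_seqret_command` and the output file name are assumptions made for this example; `seqret` itself is not mentioned on the slides.

```python
import shlex


def build_seqret_command(usa, outfile, begin=None, end=None,
                         reverse=False, osformat=None):
    """Assemble an argument list for seqret using the qualifiers from the slides."""
    argv = ["seqret", usa, outfile]
    if begin is not None:
        argv += ["-sbegin", str(begin)]    # first base used
    if end is not None:
        argv += ["-send", str(end)]        # last base used
    if reverse:
        argv.append("-sreverse")           # reverse (if DNA)
    if osformat is not None:
        argv += ["-osformat", osformat]    # output sequence format
    argv.append("-auto")                   # turn off prompts
    return argv


cmd = build_seqret_command("embl:hsfau1", "hsfau1.gcg",
                           begin=1, end=100, osformat="gcg")
print(shlex.join(cmd))
```

Building the argument list separately from running it makes it easy to inspect the command before passing it to something like `subprocess.run`.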

10
EMBOSS general options
  • -debug    bool  write debug output to program.dbg
  • -auto     bool  turn off prompts
  • -stdout   bool  write standard output
  • -filter   bool  read standard input, write standard output
  • -options  bool  prompt for required and optional values
  • -verbose  bool  report some/full command line options
  • -help     bool  report command line options

11
Data files
  • GCG uses .. to divide comments from data
  • EMBOSS does not use ..
  • In general, EMBOSS uses # to mark a comment
    line
  • Use embossdata to extract and check on data
    files.
  • As in GCG, data files copied into the current or
    home directory are used in preference to the
    originals.

12
List files (files of file names)
  • Similar to GCG list files, but without the ..
  • Comment lines start with #
  • Can contain the names of other list files
  • # This is my list file
  • embl:hsfau
  • embl:ggg
  • myfile.seq:clone10
  • file.seq
  • @list2
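
The way a list file is flattened can be sketched in Python as follows. This is an illustration, not EMBOSS's implementation: it assumes comment lines start with `#` and that another list file is referenced as `@name`. The function name `expand_list_file`, the in-memory `files` dictionary standing in for the filesystem, and the contents of `list2` are all hypothetical.

```python
def expand_list_file(name, files, _seen=None):
    """Flatten a list file into a list of sequence references.

    `files` maps a list-file name to its text, standing in for the filesystem.
    """
    seen = set() if _seen is None else _seen
    if name in seen:                 # guard against list files that include each other
        return []
    seen.add(name)
    entries = []
    for raw in files[name].splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                 # skip blank lines and comments
        if line.startswith("@"):
            # a nested list file: expand it in place
            entries.extend(expand_list_file(line[1:], files, seen))
        else:
            entries.append(line)
    return entries


files = {
    "mylist": "# This is my list file\nembl:hsfau\nfile.seq\n@list2\n",
    "list2": "# second list (hypothetical contents)\nembl:ggg\n",
}
print(expand_list_file("mylist", files))
```

The `seen` set is a design choice of this sketch, so that two list files naming each other do not recurse forever.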

13
File formats
  • GCG
  • only GCG format, MSF and RSF
  • EMBOSS
  • many formats
  • automatically recognised
  • can specify using format:: or -osf
  • e.g.
  • clustal::globin.aln
  • -osf gcg

14
One file, many sequences
  • GCG
  • Only one sequence per GCG file
  • EMBOSS
  • One or more sequences per file
  • Default is to write all sequences to one file
  • -ossingle will change to writing many files
  • GCG, Staden and plain format files can only hold
    one sequence per file.

15
Features
  • GCG
  • No concept of feature tables
  • EMBOSS
  • Many programs now write out results as GFF
  • Soon, all programs that find things will write
    the results as GFF
  • GFF will become another sequence format
  • Programs to manipulate and display sets of
    features are planned
  • c.f. showfeat, coderet, maskfeat, diffseq

16
Databases
  • EMBOSS is poor at grouping many databases under
    one name
  • E.g. we need a way of referring to embl and
    emblnew as one database.
  • This will be addressed, but for now a list file
    containing the following seems the best workaround
  • embl
  • emblnew

17
Command line wildcards
  • GCG
  • embl:* - no problem
  • EMBOSS
  • embl:* - UNIX complains it can't find the files
  • the solution is to quote or escape the wildcard
  • "embl:*"
  • or
  • embl:\*
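
When EMBOSS commands are generated from a script, the quoting can be done automatically. The sketch below uses Python's standard `shlex` module to protect a wildcard reference such as `embl:*` from the UNIX shell; `seqret` and the output file name `out.seq` are assumptions made for the example.

```python
import shlex

usa = "embl:*"                 # wildcard reference: all entries in embl
quoted = shlex.quote(usa)      # wrapped in single quotes so the shell leaves * alone
print(quoted)

# A reference without shell metacharacters is left unchanged.
print(shlex.quote("embl:hsfau1"))

# shlex.join quotes each argument only where it is needed.
command = shlex.join(["seqret", usa, "out.seq"])
print(command)
```

`shlex` uses single quotes rather than the double quotes shown on the slide; for a lone `*` either form stops the shell from expanding it.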

18
HELP
  • GCG
  • genman, genhelp
  • EMBOSS
  • tfm

19
What program does what?
  • See David Martin's list of equivalences
  • http://www.no.embnet.org/Programs/SAL/EMBOSS/fromGCG.php3
  • NB this doesn't list EMBOSS programs with no
    equivalent in GCG!

20
What EMBOSS does NOT do
  • The major gaps in the EMBOSS package are
  • BLAST, FASTA and sequence assembly
  • Use the publicly available software for these
  • Blast - NCBI, HGMP, many other sites
  • Fasta - HGMP
  • Assembly - Staden package

21
What EMBOSS does do
  • Giving stdout as the output file name makes
    output go to the screen.
  • Much effort is put into removing arbitrary
    limits.
  • E.g. Max. sequence length 2Gb
  • Many programs limited only by available memory
  • Source code available for inspection, change and
    writing your own programs
  • EMBOSS is FREE!
  • GNU General Public Licence (GPL)
  • Open Source Software

22
THE END