question

Three major DNA databases

answer

EMBL
GenBank
DDBJ

question

Flat-file database

answer

Simplest form of a database. Information such as nucleotide or amino acid sequences is stored either as a single large text file or as a collection of separate text files

question

Accession number

answer

Label used to identify a sequence.
Ex: X102275 (GenBank, genomic DNA sequence)

question

FastA

answer

Simple sequence format used in flat-file databases
Ex: a header line beginning with ">" that describes the DNA, followed by one or more lines of sequence
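
A minimal sketch of reading this format in Python, assuming each record starts with a ">" header line followed by one or more sequence lines. The function name `parse_fasta` and the sample records are hypothetical and not taken from any database.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical example records, for illustration only.
example = """>seq1 hypothetical DNA record
ATGCGTAC
GGTTAA
>seq2 another hypothetical record
TTGACA
"""
records = parse_fasta(example)
```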

question

PDB

answer

Protein Data Bank. Also the name of its file format for 3D structures of macromolecules such as proteins

question

Structured Query Language

answer

Computer language used with relational databases
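
A small illustration of SQL against a relational table, run through Python's built-in `sqlite3` module. The table name, column names, and accession values are made up for the example.

```python
import sqlite3

# In-memory database with a hypothetical table of sequence records.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sequences ("
    "accession TEXT PRIMARY KEY, organism TEXT, length INTEGER)"
)
conn.executemany(
    "INSERT INTO sequences VALUES (?, ?, ?)",
    [
        ("ACC0001", "Homo sapiens", 1200),
        ("ACC0002", "Mus musculus", 850),
        ("ACC0003", "Homo sapiens", 430),
    ],
)

# SQL query: all human records, longest first.
rows = conn.execute(
    "SELECT accession, length FROM sequences "
    "WHERE organism = ? ORDER BY length DESC",
    ("Homo sapiens",),
).fetchall()
conn.close()
```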

question

INDEL

answer

Insertion or deletion mutations

question

Block

answer

Highly conserved, ungapped local regions of aligned protein sequences that are used to derive BLOSUM substitution matrices

question

Multiple Sequence Alignment

answer

Collection of three or more sequences that are partially or completely aligned. Residues in the same column are inferred to be homologous

question

Feng-Doolittle

answer

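Progressive method of constructing MSAs: pairwise alignments are used to build a guide tree, and sequences are then added to the alignment in order of similarity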

question

BLAST steps

answer

1: Compile a list of words (short subsequences) from the query
2: Scan the database for entries that match the compiled list
3: When a hit on a word pair is found, the hit is extended in both directions until the score drops below a certain cutoff
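
The three steps can be sketched as a toy seed-and-extend search. This is a simplification under stated assumptions: seeds are exact word matches, extension is ungapped, and the scoring values (`match`, `mismatch`, `drop_off`) are arbitrary. Real BLAST also uses scored neighborhood words, substitution matrices, and significance statistics, none of which appear here. All function names are hypothetical.

```python
def _extend(query, subject, qi, sj, step, match, mismatch, drop_off):
    """Extend in one direction; return (best score gain, length at best)."""
    score = best = 0
    length = best_len = 0
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        score += match if query[qi] == subject[sj] else mismatch
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score > drop_off:
            break  # score fell too far below the best seen so far
        qi += step
        sj += step
    return best, best_len


def seed_and_extend(query, subject, word_size=3, match=1, mismatch=-1, drop_off=2):
    """Return sorted (query_start, subject_start, length, score) hits."""
    # Step 1: compile the list of query words and their positions.
    words = {}
    for i in range(len(query) - word_size + 1):
        words.setdefault(query[i:i + word_size], []).append(i)

    hits = set()
    # Step 2: scan the subject for words that are in the list.
    for j in range(len(subject) - word_size + 1):
        for i in words.get(subject[j:j + word_size], []):
            # Step 3: extend the seed to the right and to the left.
            r_gain, r_len = _extend(query, subject, i + word_size,
                                    j + word_size, 1, match, mismatch, drop_off)
            l_gain, l_len = _extend(query, subject, i - 1, j - 1, -1,
                                    match, mismatch, drop_off)
            hits.add((i - l_len, j - l_len,
                      word_size + l_len + r_len,
                      word_size * match + l_gain + r_gain))
    return sorted(hits)


hits = seed_and_extend("GATTACA", "CCGATTACACC")
```

Seeds that lie on the same diagonal extend to the same segment, so storing hits in a set collapses them into one result.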

question

Psi-BLAST

answer

Position-specific iterated BLAST. Iteratively searches a protein sequence database, using the matches from each round to construct a PSSM that is used to search the database in the next round

question

Delta-BLAST

answer

Searches a database of pre-constructed PSSMs before searching a protein database to yield better homology detection.

question

HMM

answer

Hidden Markov Model

question

Pfam

answer

Database with a large collection of protein families, each represented by multiple sequence alignments (MSAs) and Hidden Markov Models (HMMs)

question

Profile Hidden Markov Model

answer

Can represent a sequence alignment profile similar to how a PSSM (position-specific scoring matrix) does.

A profile HMM includes information on amino acid consensus at each position in the alignment like a PSSM.

A profile HMM also has position-specific scores for gap insertions and deletions

question

Things needed to build an HMM

answer

Need to determine two things
1: The structure/topology of the HMM: states and transitions.
2: The values of the parameters: emission and transition probabilities

question

How to build an HMM

answer

1: Pick HMM structure/topology
2: Estimate initial parameters
3: Train the HMM by running sequences through it
4: Transitions that get used are given higher probabilities, those rarely used are given lower probabilities
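
Step 4 can be illustrated by counting how often each transition is used and normalizing the counts. This sketch assumes the state path of every training sequence is already known and adds a pseudocount so unused transitions keep a small probability. Actual training usually does not know the paths and relies on algorithms such as Baum-Welch. The function name and the state labels (M, I, D for match, insert, delete) are illustrative.

```python
from collections import Counter


def estimate_transitions(paths, states, pseudocount=1.0):
    """Estimate transition probabilities from known state paths by counting."""
    counts = Counter()
    for path in paths:
        for a, b in zip(path, path[1:]):
            counts[(a, b)] += 1  # one use of the transition a -> b

    probs = {}
    for a in states:
        total = sum(counts[(a, b)] + pseudocount for b in states)
        probs[a] = {b: (counts[(a, b)] + pseudocount) / total for b in states}
    return probs


# Two hypothetical state paths through a profile-like model.
probs = estimate_transitions(["MMMIM", "MMDMM"], ["M", "I", "D"])
```

In this example the M to M transition is used four times and ends up with the highest probability out of state M, while the rarely used M to I and M to D transitions get lower values.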

question

Databases that use HMMs

answer

Pfam & SMART

question

Unrooted tree

answer

Fully resolved phylogenetic tree with each node connecting ancestors and descendants, but direction of evolution (which ancestor evolved from which) is undetermined

question

Rooted tree

answer

Phylogenetic tree in which one node is designated as the "root", representing the last common ancestor of all taxa in the tree

question

Internal nodes

answer

Represent hypothetical ancestors of taxa

question

Terminal nodes

answer

Represent the taxa (genes, proteins, species) used to infer the phylogeny

question

Cladogram

answer

Branch lengths have no meaning

question

Additive tree

answer

Branch lengths are a measure of evolutionary divergence

question

Ultrametric tree

answer

Branch lengths are a measure of evolutionary divergence
Same constant rate of mutation assumed along all branches

question

Ortholog

answer

Genes in different species that evolved from a common ancestral gene through speciation. Usually retain the same function

question

Paralog

answer

Genes in the same species that evolved from a common ancestral gene by gene duplication. Often develop different functions, though these are usually related to the original function

question

What can be learned from character analysis using phylogenies?

answer

When did specific episodes of positive Darwinian selection occur during evolutionary history?
Which genetic changes are unique to the human lineage?
What was the most likely geographical location of the common ancestor of the African apes and humans?

question

Bootstrap Procedure

answer

Assigns values to individual branches that indicate the percentage of resampled (bootstrap) replicate trees in which that branch occurs
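
A rough sketch of the resampling step, assuming the input is a list of equal-length aligned sequences. Alignment columns are drawn with replacement to make one pseudo-replicate. Tree building for each replicate is left out, and the function names are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one pseudo-replicate)."""
    n_cols = len(alignment[0])
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[c] for c in cols) for seq in alignment]


def bootstrap_support(replicates_with_branch, n_replicates):
    """Percentage of replicate trees that contain a given branch."""
    return 100.0 * replicates_with_branch / n_replicates


# Hypothetical three-sequence alignment; the seed makes the run repeatable.
alignment = ["ACGT", "ACGA", "TCGA"]
replicate = bootstrap_alignment(alignment, random.Random(0))
support = bootstrap_support(87, 100)
```

A branch found in 87 of 100 replicate trees would be labeled with a bootstrap value of 87.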

question

Consensus tree

answer

Shows only features that are consistent between multiple possible trees

question

P-distance

answer

This distance is the proportion (p) of nucleotide sites at which two sequences being compared are different. It is obtained by dividing the number of nucleotide differences by the total number of nucleotides compared.
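
As a worked example, a short Python sketch of the calculation, assuming the two sequences are already aligned, have equal length, and contain no gaps. The function name `p_distance` is illustrative.

```python
def p_distance(seq1, seq2):
    """Proportion of sites at which two aligned sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    if not seq1:
        raise ValueError("sequences must not be empty")
    differences = sum(a != b for a, b in zip(seq1, seq2))
    return differences / len(seq1)


# Two differences over eight compared sites.
p = p_distance("ACGTACGT", "ACGTTCGA")
```

Here the sequences differ at 2 of 8 sites, so p = 2/8 = 0.25.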

question

Transition

answer

Changing purine to purine, or pyrimidine to pyrimidine
More common than transversion

question

Transversion

answer

Changing purine to pyrimidine, or pyrimidine to purine
Less common than transition
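
The two definitions can be checked with a short classifier, assuming DNA bases A, C, G, T (purines A and G, pyrimidines C and T). The function names are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(a, b):
    """Classify a base change as 'identical', 'transition', or 'transversion'."""
    a, b = a.upper(), b.upper()
    valid = PURINES | PYRIMIDINES
    if a not in valid or b not in valid:
        raise ValueError("expected DNA bases A, C, G, or T")
    if a == b:
        return "identical"
    if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
        return "transition"  # purine <-> purine or pyrimidine <-> pyrimidine
    return "transversion"  # purine <-> pyrimidine


def count_ts_tv(seq1, seq2):
    """Count (transitions, transversions) between two aligned sequences."""
    kinds = [classify_substitution(a, b) for a, b in zip(seq1, seq2)]
    return kinds.count("transition"), kinds.count("transversion")


ts, tv = count_ts_tv("AACG", "GTCA")
```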

question

Positive selection

answer

Greater number of non-synonymous mutations observed than expected, indicating that amino acid changes are advantageous and more likely to be retained

question

Negative selection

answer

Smaller number of non-synonymous mutations observed than expected, indicating that amino acid changes are being selected against and the sequence is conserved

question

COGs

answer

Clusters of Orthologous Genes
Used to find paralogs and orthologs.
All genes in a species' genome are compared against each other and against all genes in another species. If a gene's best-scoring BLAST hit (BeT) is within the same genome, the two genes are paralogs. If the BeT is in the other species, the genes are candidate orthologs.

question

DSSP

answer

Method for the assignment of secondary structure in a protein, uses hydrogen bond patterns

question

STRIDE

answer

Method for the assignment of secondary structure in a protein, uses both hydrogen bond energy and backbone dihedral angles

question

DEFINE

answer

Method for the assignment of secondary structure in a protein, matches the interatomic distances within the protein to those from idealized secondary structures.

question

1st method of protein attachment to membrane

answer

Attachment due to ionic interactions between protein and cytosolic face of the lipid bilayer

question

2nd method of protein attachment to membrane

answer

Attachment via an anchor such as a lipid. Added to the protein post-translationally, meaning that these types of proteins have no specialized structural or sequence features that can be identified

question

3rd method of protein attachment to membrane

answer

Bitopic membrane protein, in which the protein chain crosses the membrane exactly once

question

4th method of protein attachment to membrane

answer

Polytopic membrane protein, in which the protein chain threads back and forth across the membrane multiple times.

question

X-ray crystallography

answer

Used to determine most protein structures, requires crystals with a high protein concentration

question

NMR

answer

Used to determine some protein structures. Limited to smaller proteins

question

Threading method

answer

Method of predicting protein structure by using a library of folds and comparing the energies of different folds for the target sequence. These folds are then scored and the best-scoring ones are used in the model.

question

Homology Method

answer

Based on the assumption that homologous proteins have similar structures. Uses structure of known homologue to model target protein. More closely related sequences give better models.

question

What do structurally reliable alignments depend on?

answer

Sequence identity and alignment length

question

SCR

answer

Structurally Conserved Region

question

Swiss Model

answer

Automated protein structure homology-modeling server, used to model protein structures.

question

Pearson Correlation Coefficient

answer

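Simple and fundamental similarity measure used to cluster microarray data. Ranges from -1 (perfectly anti-correlated expression profiles) to +1 (perfectly correlated).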
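
A minimal sketch of the coefficient for two expression profiles, assuming each profile is a list of measurements taken under the same conditions in the same order. The function name and the sample values are illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression values for two genes across four conditions.
r_same = pearson([1, 2, 3, 4], [2, 4, 6, 8])
r_opposite = pearson([1, 2, 3], [3, 2, 1])
```

Genes whose profiles rise and fall together score close to +1 and would be grouped in the same cluster.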

question

SOM

answer

Self-organizing map.

question

2D gel

answer

Separates proteins in two dimensions: by isoelectric point (along a pH gradient) and by size.

question

BIND

answer

Database of components and interactions, where each interaction includes information on cellular location, experimental conditions, conserved sequence, molecular location of interaction, and so on.

question

KEGG Pathway

answer

Draft metabolic reconstructions

question

Steps in making KEGG Pathway

answer

1. Draft reconstruction of metabolic network
2. Curate the reconstruction (add and correct information)
3. Convert to a computable metabolic model.

Bioinformatics Final – Flashcards
