Bioinformatics Glossary





E


EMBL (European Molecular Biology Laboratory)
European Molecular Biology Laboratory. Maintains the EMBL database, one of the major public nucleic acid sequence databases.

EMBnet (European Molecular Biology Network)
European Molecular Biology Network (http://www.embnet.org/). Established in 1988, it provides services including local molecular databases and software for molecular biologists in Europe. There are several large outposts of EMBnet, including ExPASy.

Entropy
From information theory, a measure of the unpredictable nature of a set of possible elements. The higher the level of variation within the set, the higher the entropy.
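
As a minimal sketch of this definition, the Shannon entropy of a set of symbols can be computed from their observed frequencies. The function name `shannon_entropy` and the example strings are illustrative assumptions, not part of the glossary.

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """Return the Shannon entropy, in bits, of the symbols in a sequence."""
    counts = Counter(symbols)
    total = sum(counts.values())
    # Sum p * log2(1/p) over the observed symbols, where p = count / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())

low = shannon_entropy("AAAAAAAA")   # one symbol only: no variation
high = shannon_entropy("ACGTACGT")  # four equally frequent symbols
print(low, high)
```

The column of identical characters gives the lowest possible entropy, while the column with four equally frequent characters gives the highest possible entropy for a four-letter alphabet.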

Erdős and Rényi law
In a series of tosses of a “fair” coin, the length of the longest run of heads that can be expected is approximately the logarithm to the base 2 of the number of tosses. The law may be generalized to more than two equally likely outcomes by changing the base of the logarithm to the number of outcomes. This law was used to analyze the number of matches and mismatches that can be expected between random sequences as a basis for scoring the statistical significance of a sequence alignment.
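
The relationship can be sketched with a small simulation, assuming equally likely outcomes. The helper names `expected_longest_run` and `longest_run` are hypothetical, and the simulated value varies from run to run around the logarithmic estimate.

```python
import math
import random

def expected_longest_run(n_trials, n_outcomes=2):
    """Approximate longest run of one outcome: log of n_trials to the base n_outcomes."""
    return math.log(n_trials) / math.log(n_outcomes)

def longest_run(tosses, target):
    """Length of the longest uninterrupted run of target in tosses."""
    best = current = 0
    for toss in tosses:
        current = current + 1 if toss == target else 0
        best = max(best, current)
    return best

rng = random.Random(0)  # fixed seed so the simulation is repeatable
tosses = [rng.choice("HT") for _ in range(1024)]
print("estimate:", expected_longest_run(1024))  # about log2(1024)
print("observed:", longest_run(tosses, "H"))
```

For random DNA sequences with four equally likely bases, `expected_longest_run(n, 4)` gives the corresponding estimate with base 4.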

EST (abbreviation for Expressed Sequence Tag)
See Expressed Sequence Tag

Expect value (E) (E value)
E value. The number of different alignments with scores equivalent to or better than S that are expected to occur in a database search by chance. The lower the E value, the more significant the score. In a database similarity search, E is the number of alignments scoring as well as the one found between a query sequence and a database sequence that would be expected in as many comparisons between random sequences as were made to find the matching sequence. In other types of sequence analysis, E has a similar meaning.
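
One widely used way to compute E, not stated in this glossary, is the Karlin-Altschul equation E = K * m * n * exp(-lambda * S), where m and n are the query and database lengths. The sketch below assumes that form; the values of `lam` and `k` are illustrative placeholders, since real values depend on the scoring system.

```python
import math

def expect_value(score, query_len, db_len, lam, k):
    """E = K * m * n * exp(-lambda * S), assuming the Karlin-Altschul form."""
    return k * query_len * db_len * math.exp(-lam * score)

def p_value_from_e(e_value):
    """Probability of at least one chance hit, P = 1 - exp(-E)."""
    return -math.expm1(-e_value)

# Placeholder parameters, chosen only for illustration
lam, k = 0.318, 0.134
for score in (40, 60, 80):
    e = expect_value(score, query_len=250, db_len=50_000_000, lam=lam, k=k)
    print(score, e, p_value_from_e(e))
```

Higher scores give lower E values, and for small E the probability P is nearly equal to E.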

Expectation maximization (sequence analysis)
An algorithm for locating similar sequence patterns in a set of sequences. A guessed alignment of the sequences is first used to generate an expected scoring matrix representing the distribution of sequence characters in each column of the alignment. This pattern is then matched to each sequence, and the scoring matrix values are updated to maximize the alignment of the matrix to the sequences. The procedure is repeated until there is no further improvement.
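
The procedure above can be sketched as follows. This is a simplified illustration, not a reference implementation: it assumes DNA sequences containing only A, C, G and T, one pattern occurrence per sequence, a uniform background of 0.25 per base, and a random initial guess. All function names and parameters are hypothetical.

```python
import random

ALPHABET = "ACGT"

def build_matrix(weighted_sites, width, pseudocount=0.5):
    """Column-wise character frequencies from (site, weight) pairs."""
    counts = [{ch: pseudocount for ch in ALPHABET} for _ in range(width)]
    for site, weight in weighted_sites:
        for col, ch in enumerate(site):
            counts[col][ch] += weight
    return [{ch: col[ch] / sum(col.values()) for ch in ALPHABET} for col in counts]

def site_weights(seq, matrix, background=0.25):
    """Match the matrix to every start position; return normalized weights."""
    width = len(matrix)
    odds = []
    for start in range(len(seq) - width + 1):
        ratio = 1.0
        for col in range(width):
            ratio *= matrix[col][seq[start + col]] / background
        odds.append(ratio)
    total = sum(odds)
    return [o / total for o in odds]

def em_motif(seqs, width, max_iter=100, tol=1e-6, seed=0):
    rng = random.Random(seed)
    # Guessed alignment: one random start position per sequence
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    matrix = build_matrix([(s[i:i + width], 1.0) for s, i in zip(seqs, starts)], width)
    for _ in range(max_iter):
        # Weight every candidate site by how well it matches the current matrix
        weighted = [(s[i:i + width], w)
                    for s in seqs
                    for i, w in enumerate(site_weights(s, matrix))]
        updated = build_matrix(weighted, width)
        change = max(abs(updated[c][ch] - matrix[c][ch])
                     for c in range(width) for ch in ALPHABET)
        matrix = updated
        if change < tol:  # stop when the matrix no longer changes
            break
    best_starts = []
    for s in seqs:
        weights = site_weights(s, matrix)
        best_starts.append(weights.index(max(weights)))
    return matrix, best_starts

seqs = ["GCGTATAATGC", "TTATAATCCGG", "CCGGATATAAT"]
matrix, starts = em_motif(seqs, width=6)
print(starts)
```

Because the result depends on the initial guess, a run may settle on a locally best pattern; trying several seeds is a common precaution.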

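Exon
Region of a gene that is retained in the mature RNA after splicing. Exons contain the coding sequence, although not every exon, or every part of an exon, is coding. See CDS.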

Expressed Sequence Tag (EST)
Randomly selected, partial cDNA sequence; represents its corresponding mRNA. dbEST is a large database of ESTs at GenBank, NCBI.

Extreme value distribution
Some measurements are found to follow a distribution with a long tail that decays at high values much more slowly than the tail of a normal distribution. This slow-falling type is called the extreme value distribution. The alignment scores between unrelated or random sequences are an example. These scores can reach very high values, particularly when a large number of comparisons are made, as in a database similarity search. The probability of a particular score may be accurately predicted by the extreme value distribution, which follows a double negative exponential function described by Gumbel.
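
The double exponential form can be written as P(S >= x) = 1 - exp(-exp(-lambda * (x - u))), where u is the location of the peak and lambda sets the decay rate. The sketch below assumes that parameterization and compares the tail with a normal distribution of the same mean and standard deviation; the function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def gumbel_tail(x, location, lam):
    """P(S >= x) for an extreme value (Gumbel) distribution."""
    return -math.expm1(-math.exp(-lam * (x - location)))

def normal_tail(z):
    """P(Z >= z) for a standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Gumbel parameters that give mean 0 and standard deviation 1
lam = math.pi / math.sqrt(6)
location = -EULER_GAMMA / lam

for z in (2, 4, 6):
    print(z, gumbel_tail(z, location, lam), normal_tail(z))
```

At several standard deviations above the mean, the extreme value tail probability remains far larger than the normal one, which is why high chance scores are common in large database searches.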

