
Monday, October 19, 2009
UCSC Genome browser..

Monday, October 12, 2009
What are Databases..??
Databases
At the beginning of the "genomic revolution," a central bioinformatics concern was the creation and maintenance of databases to store biological information, such as nucleotide and amino acid sequences. Development of this type of database involved not only design issues, but also the development of complex interfaces through which researchers could both access existing data and submit new or revised data.
Ultimately, however, all of this information must be combined to form a comprehensive picture of normal cellular activities so that researchers may study how these activities are altered in different disease states. Therefore, the field of bioinformatics has evolved such that the most pressing task now involves the analysis and interpretation of various types of data, including nucleotide and amino acid sequences, protein domains, and protein structures. The actual process of analysing and interpreting data is referred to as computational biology. Important sub-disciplines within bioinformatics and computational biology include:
- The development and implementation of tools that enable efficient access to, and use and management of, various types of information;
- The development of new algorithms (mathematical formulas) and statistics with which to assess relationships among members of large data sets, such as methods to locate a gene within a sequence, predict protein structure and/or function, and cluster protein sequences into families of related sequences.
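As a rough illustration of what "a method to locate a gene within a sequence" can look like, here is a minimal sketch that scans the forward strand of a DNA string for open reading frames (a start codon followed by an in-frame stop codon). The function name `find_orfs`, the `min_codons` parameter, and the toy sequence are assumptions made for this example. Real gene finders rely on statistical models and also scan the reverse strand, so treat this only as a starting point.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end) index pairs of ORFs on the forward strand.

    An ORF here starts at ATG and runs to the first in-frame stop codon.
    `end` is exclusive and includes the stop codon. `min_codons` counts the
    codons before the stop, including the ATG itself.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):                      # three forward reading frames
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                       # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                    # close it and keep scanning
    return orfs


toy_sequence = "CCATGAAATTTTAGGG"               # made-up sequence, not real data
orfs = find_orfs(toy_sequence)
print(orfs)
```

Each returned pair can be used to slice the original string, for example `toy_sequence[start:end]`, to recover the candidate coding region.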
Biological databases
A biological database is a large, organised body of persistent data, usually associated with computerised software designed to update, query, and retrieve components of the data stored within the system. A simple database might be a single file containing many records, each of which includes the same set of information. For example, a record associated with a nucleotide sequence database typically contains information such as contact name; the input sequence with a description of the type of molecule; the scientific name of the source organism from which it was isolated; and, often, literature citations associated with the sequence. For researchers to benefit from the data stored in a database, two additional requirements must be met:
- Easy access to the information;
- A method for extracting only that information needed to answer a specific biological question.
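The idea of "many records, each with the same set of information" plus "a method for extracting only the information needed" can be sketched in a few lines. The record fields below follow the ones listed above (contact name, molecule type, source organism, sequence, citations). The `accession` field, the `query` helper, and all sample values are hypothetical and exist only for illustration; real databases use far richer schemas and indexes.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceRecord:
    """One record in a toy nucleotide sequence database."""
    accession: str            # hypothetical identifier, not from the text
    contact_name: str
    molecule_type: str        # e.g. "DNA" or "mRNA"
    organism: str             # scientific name of the source organism
    sequence: str
    citations: list = field(default_factory=list)


def query(records, **criteria):
    """Return only the records whose fields equal every given criterion."""
    return [
        r for r in records
        if all(getattr(r, name) == value for name, value in criteria.items())
    ]


# Made-up sample data: every record carries the same set of fields.
records = [
    SequenceRecord("TOY0001", "A. Researcher", "DNA", "Homo sapiens", "ATGGCC"),
    SequenceRecord("TOY0002", "B. Researcher", "mRNA", "Mus musculus", "ATGAAA"),
    SequenceRecord("TOY0003", "A. Researcher", "DNA", "Mus musculus", "ATGTTT"),
]

# Extract only what answers a specific question: mouse DNA records.
mouse_dna = query(records, organism="Mus musculus", molecule_type="DNA")
print([r.accession for r in mouse_dna])
```

Keeping every record in the same shape is what makes a generic `query` like this possible: any field can serve as a filter without special-case code.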
Entrez
On the NCBI site, many of the databases are linked through a single search and retrieval system called Entrez. Entrez allows a user not only to access and retrieve specific information from a single database, but also to access integrated information from many NCBI databases. For example, the Entrez protein database is cross-linked to the Entrez taxonomy database, which allows a researcher to find taxonomic information for the protein of interest. An overview of the most important databases is given in the Databases section of this site.
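The protein-to-taxonomy cross-link can also be requested programmatically. The sketch below only builds a request URL for ELink, part of NCBI's E-utilities web interface, which is not mentioned in the text above and is brought in here as an assumption about how one might script the lookup. The helper name `elink_url` and the ID `12345` are placeholders, and nothing is actually fetched.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities (assumed entry point for this sketch).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def elink_url(record_id, dbfrom="protein", db="taxonomy"):
    """Build an ELink URL asking for `db` records linked to a `dbfrom` record."""
    params = {"dbfrom": dbfrom, "db": db, "id": record_id}
    return f"{EUTILS_BASE}/elink.fcgi?{urlencode(params)}"


url = elink_url("12345")   # "12345" is a placeholder, not a real lookup
print(url)
```

Opening the resulting URL in a browser, or fetching it with an HTTP client, should return the linked taxonomy identifiers for a valid protein ID. Check NCBI's current usage guidelines before sending requests in bulk.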
UniProt
Sunday, October 11, 2009
PROTEIN DATABANKS: Protein Information Resource (PIR)

PIR was established in 1984 by the National Biomedical Research Foundation (NBRF) as a resource to assist researchers in the identification and interpretation of protein sequence information. Prior to that, the NBRF compiled the first comprehensive collection of macromolecular sequences in the Atlas of Protein Sequence and Structure, published from 1965 to 1978 under the editorship of Margaret O. Dayhoff. Dr. Dayhoff and her research group pioneered the development of computer methods for the comparison of protein sequences, for the detection of distantly related sequences and duplications within sequences, and for the inference of evolutionary histories from alignments of protein sequences.
Dr. Winona Barker and Dr. Robert Ledley assumed leadership of the project after the untimely death of Dr. Dayhoff in 1983. In 1999 Dr. Cathy H. Wu joined NBRF, and later Georgetown University Medical Center (GUMC), to head the bioinformatics efforts of PIR. She has served first as Principal Investigator and, since 2001, as Director.
For over four decades, beginning with the Atlas of Protein Sequence and Structure, PIR has provided protein databases and analysis tools, including the Protein Sequence Database (PSD), that are freely accessible to the scientific community.
In 2002 PIR, along with its international partners, EBI (European Bioinformatics Institute) and SIB (Swiss Institute of Bioinformatics), was awarded a grant from NIH to create UniProt, a single worldwide database of protein sequence and function, by unifying the PIR-PSD, Swiss-Prot, and TrEMBL databases.
In 2009 Dr. Wu accepted the Edward G. Jefferson Chair of Bioinformatics and Computational Biology at the University of Delaware (UD).
Today, PIR maintains staff at UD and GUMC and continues to offer world-leading resources to assist with proteomic and genomic data integration and the propagation and standardization of protein annotation.
DNA DATABANK OF JAPAN (DDBJ)

DDBJ is organized by the Center for Information Biology and DNA Data Bank of Japan (CIB-DDBJ) of the National Institute of Genetics (NIG), with the endorsement of the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT). 99% of the data submitted by Japanese researchers to the International Nucleotide Sequence Database (INSD) goes through DDBJ.
The principal purpose of DDBJ operations is to improve the quality of INSD as a public-domain resource. When researchers release their data to the public through INSD so that it is shared worldwide, we at DDBJ strive to describe the data as richly as possible, following the unified rules of INSD, and to make submission through DDBJ as stress-free as possible.
Saturday, October 10, 2009
NCBI (National Center For Biotechnology Information)

The National Center for Biotechnology Information (NCBI) is part of the United States National Library of Medicine (NLM), a branch of the National Institutes of Health. The NCBI is located in Bethesda, Maryland (38.994994°N 77.099339°W) and was founded in 1988 through legislation sponsored by Senator Claude Pepper. The NCBI houses genome sequencing data in GenBank and an index of biomedical research articles in PubMed Central and PubMed, as well as other information relevant to biotechnology. All these databases are available online through the Entrez search engine.
The NCBI is directed by David Lipman, one of the original authors of the BLAST sequence alignment program and a widely respected figure in bioinformatics. He also leads an intramural research program, including groups led by Stephen Altschul (another BLAST co-author), David Landsman, and Eugene Koonin (a prolific author on comparative genomics).
Wednesday, September 23, 2009
NCBI (National Center For Biotechnology Information)
Bioinformatics was first started as a field of study under the department of Biotechnology at the National Center For Biotechnology Information... Estb