Life in BIOINFORMATICS......: GENOMICS


Tuesday, October 13, 2009

GENOME SEQUENCING

Sequencing is the method used to determine the order of the base pairs in a DNA fragment. This fragment can be small (around 500 bp) or the whole genome of an organism. One of the major methods of DNA sequencing is known as chain termination sequencing, dideoxy sequencing, or Sanger sequencing, after its inventor, the biochemist Frederick Sanger. The method is elegantly simple. While DNA chains are normally made up of deoxynucleotides (dNTPs), the Sanger method also uses dideoxynucleotides.

Dideoxynucleotides (ddNTPs) are missing the hydroxyl (OH) group at the 3' position. This position is normally where one nucleotide attaches to the next to form a chain. If there is no OH group at the 3' position, no further nucleotides can be added to the chain, so chain elongation stops. In each reaction, a small fraction of one of the four bases is supplied as these stop nucleotides. This means that every time that base is added, a fraction of the growing strands stop and keep the length they have reached at that point. If you first divide your sample over four tubes, you can run this procedure four times, once for each of the four bases. When you run these four samples side by side on a gel, you can read the sequence from the smallest fragment to the largest.
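The read-out principle can be sketched in a few lines of Python. This is only an idealised illustration, not a model of a real gel: it assumes that every possible termination point is represented by at least one fragment, and the function names `sanger_fragments` and `read_gel` are made up for this example.

```python
def sanger_fragments(new_strand):
    """Return, per base, the lengths of the chains terminated at that base."""
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for position, base in enumerate(new_strand, start=1):
        # A ddNTP of this base can stop the chain here, leaving a fragment
        # whose length equals the position of the base in the new strand.
        lanes[base].append(position)
    return lanes


def read_gel(lanes):
    """Read the sequence from the shortest fragment to the longest."""
    bands = [(length, base) for base, lengths in lanes.items() for length in lengths]
    return "".join(base for _, base in sorted(bands))


strand = "GATTACAGGC"  # toy sequence, chosen only for illustration
lanes = sanger_fragments(strand)
print(lanes)            # one list of fragment lengths per tube/lane
print(read_gel(lanes))  # the sequence recovered from the band order
```

Each of the four lists plays the role of one tube (one lane on the gel); sorting all bands by length and noting which lane each band sits in gives back the sequence.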

Since 1986 the process of reading the sequence can be done with an automated fluorescence sequencer. The automated sequencer works on the same principle as the Sanger method (dideoxynucleotide chain termination), but here a laser constantly scans the bottom of the gel, detecting the bands as they move down the gel. Where the manual method uses radioactive labeling, automated capillary sequencing uses fluorescent tags on the ddNTPs (a different dye for each nucleotide). This makes it possible for all four reactions (ddGTP, ddATP, ddCTP, and ddTTP) to be run in one lane, which makes the process four times faster. The runs are fully automated nowadays, and the gels have been replaced by capillaries. This is a very efficient method and is very useful for fast, automated sequencing of large DNA fragments.

More about sequencing on the history page.

There are two methods of dividing the genome into smaller parts for large-scale sequencing:

The Conventional Method
Shotgunning

The Conventional Method

Once scientists have used PCR to create many copies of a single strand of the DNA fragment, they begin to determine the position of each letter. The original method involves the following:

Step 1: Place identical DNA strands into four test tubes, each containing one ddNTP (a chain-terminating version of one of the four DNA nucleotides A, T, C, G) and lots of dNTPs (the free-floating nucleotides from which DNA chains are built). The ddNTP resembles the corresponding DNA letter, except that it cannot be extended into a functioning DNA chain.
Step 2: Then add the polymerase enzyme and a known primer DNA, similar to the one used in PCR, to the test tubes. The primer marks the beginning of each sequenced string of DNA. In each test tube, the dNTPs, which act as the letters of the new DNA, bond with their complementary nucleotides, thereby copying the original strand. However, the ddNTP in each test tube also bonds with the growing strands, in roughly 1 out of every 100 cases where it could bond. Each time this happens the copy terminates, creating millions of DNA strings of differing lengths that all start with the same primer and all end with the same ddNTP nucleotide. Which nucleotide that is follows from which of the four test tubes is being analysed, since only one of the ddNTPs is in each test tube.
Step 3: Then use gel electrophoresis to arrange the DNA pieces from largest to smallest and X-ray detection to determine the length of the strings.
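To get a feeling for the "1 in 100" ratio in Step 2, here is a small back-of-the-envelope calculation. It assumes, as a simplification, that every opportunity to incorporate the ddNTP is independent and has the same probability p = 0.01; the function names are invented for this sketch.

```python
def fraction_terminating_at(k, p=0.01):
    """Fraction of chains that stop exactly at the k-th occurrence of the base."""
    # The chain must survive the first k - 1 occurrences and stop at the k-th.
    return (1 - p) ** (k - 1) * p


def fraction_still_growing(k, p=0.01):
    """Fraction of chains that have passed k occurrences without stopping."""
    return (1 - p) ** k


for k in (1, 10, 100):
    print(k, fraction_terminating_at(k), fraction_still_growing(k))
```

Under these assumptions the fragments become rarer the further they are from the primer, but a useful fraction of chains is still growing after many occurrences of the base, which is why a low ddNTP ratio gives fragments of many different lengths.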

This is the method that was first developed. Fortunately, contemporary institutes no longer use this exact method but one that is four times faster. By using ddNTPs tagged with fluorescent dyes they no longer need four test tubes, only a single one containing all four fluorescent ddNTPs. They then rely on computers to detect the different colours of the end pieces of the DNA segments after gel electrophoresis to determine the letters and length of the DNA chain. More information is given in the History part of this website.

The Shotgun Method

To use the shotgunning method, a genomic library is first made by cutting a whole genome with restriction enzymes and inserting each piece into a bacterium, which is then cloned. These segments are then detected and ordered by computers.

Step 1: Break the post-PCR DNA string that is to be sequenced into little fragments.
Step 2: Place the new segments into a test tube filled with the polymerase enzyme and a primer bit of DNA.
Step 3: Now, as in the original method, let the DNA be copied with the dNTPs and the fluorescently tagged ddNTPs in the test tube until all the strings have terminated.
Step 4: Use gel electrophoresis to sort the fragments by size and a computer to record the lengths of the many DNA fragments. Lastly, the computer should process and realign these fragments into the original string, thereby sequencing it.

The advantage of using smaller fragments of the larger DNA chain is that the time required to sequence the DNA is greatly shortened. Machines can therefore sequence the fragments many times in order to achieve a high level of accuracy, using sequencing software that lines up the DNA by finding overlapping letter sequences in the many pieces after the gel electrophoresis. However, scientists have experienced a few problems when 'shotgunning' DNA strings with many common and recurring sequences; researchers using the 'shotgunning' process therefore often sequence the DNA both backwards and forwards to return more accurate results. The following graphic shows how the small DNA fragments are realigned to assemble the full sequence.

In reality, billions of overlapping DNA pieces need to be aligned to reach an acceptable accuracy. Fortunately, by highly automating the 'shotgunning' method of sequencing, scientists can quickly organise enough DNA pieces to return precise chains faster than competitors using more conventional techniques.
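The idea of lining up fragments by their overlapping letters can be sketched with a toy "greedy" assembler. This is not how any particular sequencing software works; it is a minimal illustration that assumes error-free reads from one strand, and the names `overlap`, `greedy_assemble` and the `min_len` parameter are made up for the example.

```python
def overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the two fragments that share the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps any more; the pieces stay separate
        merged = frags[best_i] + frags[best_j][best_size:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags


reads = ["ACGTTAGC", "ATGGCGTA", "GCGTACGT"]  # toy reads, in arbitrary order
print(greedy_assemble(reads))
```

With repeated sequences, several fragments can overlap equally well in more than one place, and a simple strategy like this one can join the wrong pieces. That is the kind of problem with common and recurring sequences mentioned above.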

ESTs (expressed sequence tags) are also very useful in the mapping of a genome. The 3' ESTs serve as a common source of STSs (sequence tagged sites) because they are likely to be unique to a particular species and have the additional feature of pointing directly to an expressed gene. These ESTs therefore provide reliable genomic landmarks for genome mapping.

There are two types of shotgunning:

Hierarchical (clone by clone)
Whole genome

In hierarchical shotgunning, the genome is broken up into overlapping segments whose relative locations are known; each segment is then shotgun-sequenced. In whole genome shotgunning, the whole genome is broken into pieces several times and all pieces are sequenced. Both types detect and order the segments by computer after sequencing each segment. Whole genome shotgunning was pioneered by Craig Venter's TIGR, and the technique was used to sequence several genomes, such as the bacterium Haemophilus influenzae, Drosophila melanogaster and Venter's part of the human genome.
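Why break the whole genome "several times"? A rough sketch, under the simplifying assumption that each round cuts a complete copy of the genome at random positions (the function names and numbers below are invented for illustration):

```python
import random


def break_genome(genome_length, n_pieces, rng):
    """Break a genome of the given length into n_pieces at random positions.

    Returns a list of (start, end) intervals that together cover the genome.
    """
    cuts = sorted(rng.sample(range(1, genome_length), n_pieces - 1))
    edges = [0] + cuts + [genome_length]
    return list(zip(edges[:-1], edges[1:]))


def whole_genome_shotgun(genome_length, rounds, n_pieces, seed=0):
    """Break the genome independently `rounds` times and pool all pieces."""
    rng = random.Random(seed)
    pieces = []
    for _ in range(rounds):
        pieces.extend(break_genome(genome_length, n_pieces, rng))
    return pieces


def depth_per_base(genome_length, pieces):
    """Count how many pieces cover each position of the genome."""
    depth = [0] * genome_length
    for start, end in pieces:
        for position in range(start, end):
            depth[position] += 1
    return depth


pieces = whole_genome_shotgun(genome_length=1000, rounds=5, n_pieces=10)
depth = depth_per_base(1000, pieces)
print(len(pieces), min(depth), max(depth))
```

In this idealised setting every base is covered once per round, but the breakpoints differ from round to round, so pieces from different rounds overlap each other. Those overlaps are what the computer uses afterwards to order the pieces.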

Monday, October 12, 2009

MICROARRAY TECHNOLOGY

Microarrays exploit the preferential binding of complementary single-stranded nucleic acid sequences. A microarray is typically a glass slide onto which DNA molecules are attached at fixed locations (spots). There may be tens of thousands of spots on an array, each containing a huge number of identical DNA molecules (or fragments of identical molecules), with lengths from twenty to hundreds of nucleotides. (According to quick napkin calculations by Wilhelm Ansorge and John Quackenbush in Schnookeloch in Heidelberg on 4 October 2001, the number of DNA molecules in a microarray spot is 10⁷-10⁸). For gene expression studies, each of these molecules should ideally identify one gene or one exon in the genome; in practice, however, this is not always so simple and may not even be generally possible due to families of similar genes in a genome.

Microarrays that contain all of the approximately 6000 genes of the yeast genome have been available since 1997. The spots are either printed on the microarrays by a robot, or synthesised by photolithography (similar to computer chip production) or by ink-jet printing. The spot diameter is of the order of 0.1 mm, and for some microarray types it can be even smaller.

There are different ways in which microarrays can be used to measure gene expression levels. One of the most popular microarray applications allows the comparison of gene expression levels in two different samples, e.g., the same cell type in a healthy and a diseased state. This is called a cDNA microarray. Another array technique is the oligonucleotide array.

cDNA Microarray

In the preparation of a cDNA microarray, the total mRNA from the cells in two different conditions is extracted and reverse transcription PCR (RT-PCR) is used to convert the RNA transcripts into cDNA. The cDNAs are usually 500-2000 base pairs long. The complete pool of cDNA is representative of the transcriptional events in the tissue source of the RNA: the genes that were being actively transcribed in the sample will have mRNA copies, which are first purified and then copied into cDNA during the RT-PCR step. The reverse transcription procedures for the control and experimental mRNA are identical in every step except one, and it is this step that enables differential gene expression to be determined. Nucleotides labelled with the green fluorescent dye Cy3 are incorporated into the control cDNA, while nucleotides labelled with the red fluorescent dye Cy5 are incorporated into the experimental cDNA. After preparation, both probes are mixed and allowed to hybridise to the glass slide. Following an overnight incubation, excess hybridisation buffer is washed off and the slides are then ready to be scanned. Labelled gene products from the extracts hybridise to their complementary sequences in the spots because of preferential binding: complementary single-stranded nucleic acid sequences tend to attract each other, and the longer the complementary parts, the stronger the attraction.
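What "complementary" means here can be made concrete with a short sketch. It treats hybridisation purely as perfect Watson-Crick pairing between two antiparallel strands and ignores all chemistry (temperature, salt, partial mismatches); the function names are invented for this illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the strand that pairs perfectly with `seq` (read 5' to 3')."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_complementary_stretch(probe, target):
    """Length of the longest run of bases over which probe and target can pair."""
    # Pairing with `target` is the same as matching its reverse complement,
    # so we look for the longest common substring of the two strings.
    partner = reverse_complement(target)
    best = 0
    previous = [0] * (len(partner) + 1)
    for p in probe:
        current = [0] * (len(partner) + 1)
        for j, q in enumerate(partner, start=1):
            if p == q:
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


spot = "ATGCCGTA"                      # toy sequence attached to the slide
labelled = reverse_complement(spot)    # a labelled molecule that fits the spot
print(labelled, longest_complementary_stretch(spot, labelled))
```

The longer the stretch returned by `longest_complementary_stretch`, the stronger the binding would be expected to be, in line with the rule of thumb above.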

Oligonucleotide Microarray

The physical chemistry of hybridisation in oligonucleotide microarrays is clearly different from that of cDNA microarrays. Oligonucleotides range in size from 10 to 25 bases, so the DNA fragments in the spots are much smaller than cDNA fragments. Oligonucleotide microarrays are used to detect point mutations (the deletion, insertion or substitution of a single base) in a known DNA sequence. Single-base mismatches have much more influence on binding to an oligonucleotide than on binding to a cDNA. For example, a small genome can be synthesised on a chip as a set of thousands of 20 bp long fragments. When a single base pair mismatch exists, the fluorescence intensity decreases significantly. This technique makes it possible to find most of the point mutations in a known DNA sequence.
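The tiling idea can be sketched as follows. This toy version only handles substitutions (sample and reference have the same length), compares same-strand sequences instead of modelling real probe-target pairing, and stands in for "lower fluorescence" with a simple mismatch count. All names and the example sequence are made up.

```python
def tile(reference, probe_len=20):
    """Cut the reference into consecutive probes of `probe_len` bases."""
    return [(start, reference[start:start + probe_len])
            for start in range(0, len(reference) - probe_len + 1, probe_len)]


def mismatches(a, b):
    """Number of positions at which two equally long sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def probes_with_mismatch(reference, sample, probe_len=20):
    """Start positions of probes whose region in the sample differs."""
    flagged = []
    for start, probe in tile(reference, probe_len):
        region = sample[start:start + probe_len]
        if mismatches(probe, region) > 0:
            flagged.append(start)  # this spot would light up less brightly
    return flagged


reference = "ACGT" * 15                             # 60-base toy "genome"
sample = reference[:27] + "A" + reference[28:]      # one base changed at position 27
print(probes_with_mismatch(reference, sample))
```

Only the probe covering positions 20 to 39 is flagged, which narrows the point mutation down to that 20-base window.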

Data quantification

The dyes enable the amount of sample bound to a spot to be measured by the level of fluorescence emitted when a laser excites it. If the RNA from the sample in condition 1 is in abundance, the spot will be green; if the RNA from the sample in condition 2 is in abundance, it will be red. If both are equal, the spot will be yellow, while if neither is present it will not fluoresce and will appear black. Thus, from the fluorescence intensities and colours for each spot, the relative expression levels of the genes in both samples can be estimated.
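A minimal sketch of how the colour of a spot could be called from the two intensities. The two-fold threshold, the detection limit of 50 and the "+1" used to avoid dividing by zero are arbitrary choices for this illustration, not values from any scanner software.

```python
import math


def spot_colour(cy3, cy5, fold=2.0, detection_limit=50.0):
    """Classify a spot from its green (Cy3) and red (Cy5) intensities."""
    if cy3 < detection_limit and cy5 < detection_limit:
        return "black"                      # neither sample bound to the spot
    # log2 ratio: positive means more red (condition 2), negative more green.
    log_ratio = math.log2((cy5 + 1.0) / (cy3 + 1.0))
    if log_ratio >= math.log2(fold):
        return "red"
    if log_ratio <= -math.log2(fold):
        return "green"
    return "yellow"                         # roughly equal in both samples


for cy3, cy5 in [(1000, 100), (100, 1000), (500, 520), (10, 5)]:
    print(cy3, cy5, spot_colour(cy3, cy5))
```

Working with the log ratio rather than the raw ratio treats "twice as much red" and "twice as much green" symmetrically.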

The raw data that are produced from microarray experiments are the hybridised microarray images. To obtain information about gene expression levels, these images should be analysed: each spot on the array is identified, and its intensity is measured and compared to the background. This is called image quantification and is done by image analysis software. To obtain the final gene expression matrix from the spot quantifications, all the quantities related to a given gene (either on the same array or on arrays measuring the same conditions in repeated experiments) have to be combined, and the entire matrix has to be scaled to make different arrays comparable.
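The steps from spot quantifications to a scaled expression matrix can be sketched like this. The choices made here (subtracting the local background, averaging replicate spots, dividing each array by its median) are just one simple possibility, and the gene names and numbers are invented.

```python
from collections import defaultdict
from statistics import mean, median


def quantify_array(spots):
    """Average the background-corrected intensity per gene for one array.

    `spots` is a list of (gene, foreground, background) tuples.
    """
    per_gene = defaultdict(list)
    for gene, foreground, background in spots:
        per_gene[gene].append(max(foreground - background, 0.0))
    return {gene: mean(values) for gene, values in per_gene.items()}


def expression_matrix(arrays):
    """Build {array: {gene: value}} with every array scaled to median 1."""
    matrix = {}
    for name, spots in arrays.items():
        values = quantify_array(spots)
        scale = median(values.values()) or 1.0  # avoid dividing by zero
        matrix[name] = {gene: v / scale for gene, v in values.items()}
    return matrix


arrays = {
    "array1": [("geneA", 1200, 200), ("geneA", 1000, 200),
               ("geneB", 700, 200), ("geneC", 300, 200)],
    # Same biology, but this array was scanned about twice as brightly.
    "array2": [("geneA", 2100, 100), ("geneA", 1700, 100),
               ("geneB", 1100, 100), ("geneC", 300, 100)],
}
matrix = expression_matrix(arrays)
print(matrix)
```

After scaling, the two arrays give the same values for each gene even though the second one was brighter overall, which is the point of making different arrays comparable.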

Microarrays are already producing massive amounts of data. These data, like genome sequence data, can help us to gain insights into underlying biological processes only if they are carefully recorded and stored in databases, where they can be queried, compared and analysed by different computer software programs. The EBI as well as the NCBI are establishing a public repository for microarray gene expression data analogous to banks for DNA sequence data.

Microarray analysis is fundamentally a technique to identify complete gene expression profiles in selected tissues. Microarray experiments can give false positive and false negative results. Additional means of analysing gene expression (Northern blotting or RNAse protection assays) should be used to verify microarray conclusions.

CLICK ON THE LINK BELOW FOR ANIMATED EXPLANATION:

http://www.youtube.com/watch?v=ePFE7yg7LvM&feature=related

GENE EXPRESSION ANALYSIS

Northern blotting

Northern blotting is a laboratory technique to analyse RNA expression. The experiment takes several steps. First the total RNA is isolated. Then it is separated by fragment length using gel electrophoresis. Then it is transferred to nitrocellulose or nylon filter paper. The filter can then be used to search for a particular RNA by several probing techniques, for instance radioactive labeling of the probe. The probe should be complementary to the RNA you are looking for. This simple procedure can indicate in which tissues or cell types a particular gene is expressed. In this way a Northern blot is often used for diagnostic purposes. It can also be used to confirm results from other experimental techniques, such as microarrays.

Micro Array Analysis: (see Pictorial representation):

http://8e.devbio.com/images/ch04/06.NB.01.thumb.jpg

ESTs, Physical maps, Cytogenetic maps

ESTs: ESTs (expressed sequence tags) are small pieces of DNA sequence (usually 200 to 500 nucleotides long) that are generated by sequencing either one or both ends of an expressed gene. The 3' ESTs serve as a common source of STSs (sequence tagged sites) because they are likely to be unique to a particular species and have the additional feature of pointing directly to an expressed gene.

ESTs as Gene Discovery Resource: Because ESTs represent a copy of just the interesting part of a genome, that which is expressed, they have proven themselves again and again as powerful tools in the hunt for genes involved in hereditary diseases. ESTs also have a number of practical advantages: their sequences can be generated rapidly and inexpensively, and only one sequencing experiment is needed for each cDNA generated. ESTs are powerful tools in the hunt for known genes because they greatly reduce the time required to locate a gene. Using this method, scientists have already isolated genes involved in Alzheimer's disease, colon cancer, and many other diseases.

Cytogenetic Map:

A cytogenetic map is the visual appearance of a chromosome when stained and examined under a microscope. Particularly important are visually distinct regions, called light and dark bands, which give each of the chromosomes a unique appearance. This feature allows a person's chromosomes to be studied in a clinical test known as a karyotype, which allows scientists to look for chromosomal alterations.

Physical map:

A physical map is a collection of overlapping clones that have been arranged into a tiling path based on either fingerprinting (digestion of clones with restriction enzymes and comparison of the fragment sizes) or hybridisation.
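The fingerprinting comparison can be sketched as counting restriction fragments of similar size that two clones have in common. The 2% size tolerance and the fragment sizes below are invented for the example, and real fingerprinting software uses statistical scores rather than a bare count.

```python
def shared_fragments(sizes_a, sizes_b, tolerance=0.02):
    """Count fragments of clone A that match a fragment of clone B in size.

    Two fragments match if their sizes differ by at most `tolerance`
    (as a fraction of the larger one). Each fragment is used only once.
    """
    remaining = sorted(sizes_b)
    shared = 0
    for size in sorted(sizes_a):
        for i, other in enumerate(remaining):
            if abs(size - other) <= tolerance * max(size, other):
                shared += 1
                del remaining[i]
                break
    return shared


clone_a = [1200, 3400, 560, 7800, 2100]   # fragment sizes in bp (toy data)
clone_b = [3410, 560, 9100, 2090, 450]
clone_c = [15000, 640]

print(shared_fragments(clone_a, clone_b))  # several fragments in common
print(shared_fragments(clone_a, clone_c))  # nothing in common
```

Clones that share many fragments probably overlap and can be placed next to each other in the tiling path; clones that share none probably come from different regions.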

The genetic markers can help to integrate these three maps mentioned above.

For Arabidopsis thaliana, TAIR's comprehensive MapViewer is an integrated graphic display of each Arabidopsis chromosome. TAIR is the internet site where all information and data about Arabidopsis is combined. MapViewer shows genetic, physical, and sequence maps in one site and allows users to search, browse, and align different maps in a region of interest. In the future, all the maps will be fully integrated into a genome map for the organism.

An Important concept of "GENE MAPPING"

Genetic map

Well, let's use some imagination: just as interstate maps have cities and towns that serve as landmarks, genetic maps have landmarks known as genetic markers, or "markers" for short. The term "marker" is used very broadly to describe any observable variation that results from an alteration, or mutation, at a single genetic locus. A marker may be used as a landmark in a map if, in most cases, that stretch of DNA is inherited from parent to child according to the standard rules of inheritance. Markers can be within genes that code for a noticeable physical characteristic such as leaf colour, or for a not so noticeable trait such as a disease. The greater the distance between two linked genes, the greater the chance that two non-sister chromatids will cross over in the region between the genes, and the greater the proportion of recombinants that will be produced. Thus, by determining the frequency of recombinants, we can obtain a measure of the map distance between the genes. Today, several other genetic markers are used to detect linkage. There are several genetic markers:

RFLPs (Restriction Fragment Length Polymorphisms): They were among the first DNA markers to be developed. RFLPs are defined by the presence or absence of a specific site, called a restriction site, for a bacterial restriction enzyme. This enzyme cuts strands of DNA wherever they contain a certain nucleotide sequence.
VNTRs (Variable Number of Tandem Repeat polymorphisms): They mainly occur in non-coding regions of DNA. This type of marker is defined by the presence of a nucleotide sequence that is repeated several times. In each case, the number of times a sequence is repeated may vary.
Microsatellite polymorphisms: Defined by a variable number of repeats of a very small number of base pairs. Oftentimes, these repeats consist of the bases cytosine and adenine. The number of repeats for a given microsatellite may differ between individuals, hence the term polymorphism, meaning the existence of different forms within a population.
SNPs (Single Nucleotide Polymorphisms): They are individual point mutations, or substitutions of a single nucleotide, that do not change the overall length of the DNA sequence in that region. SNPs occur throughout an individual's genome.
AFLPs (Amplified Fragment Length Polymorphisms): This is a DNA fingerprinting technique that detects DNA restriction fragments by means of PCR amplification.
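Whatever type of marker is used, the mapping step is the one described under "Genetic map": score two markers in a set of offspring, count the recombinants, and turn the frequency into a distance. A minimal sketch with invented data follows. It uses the usual convention that 1% recombinants corresponds to 1 centimorgan (cM), which is only a reasonable approximation for markers that lie fairly close together.

```python
def recombination_frequency(offspring):
    """Fraction of offspring in which the two markers have been recombined.

    Each offspring is a pair (allele_at_marker_1, allele_at_marker_2), where
    "A" and "B" say which parental chromosome the allele came from.
    """
    recombinants = sum(1 for first, second in offspring if first != second)
    return recombinants / len(offspring)


def map_distance_cM(offspring):
    """Map distance in centimorgans, taking 1% recombinants as 1 cM."""
    return 100 * recombination_frequency(offspring)


# Toy data: 90 parental types and 10 recombinants out of 100 offspring.
offspring = ([("A", "A")] * 44 + [("B", "B")] * 46
             + [("A", "B")] * 6 + [("B", "A")] * 4)

print(recombination_frequency(offspring), map_distance_cM(offspring))
```

Two markers that recombine in 10 out of 100 offspring would thus be placed about 10 cM apart on the genetic map.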

Currently, the most powerful mapping technique, and one that has been used to generate many genome maps, relies on Sequence Tagged Site (STS) mapping: An STS is a short DNA sequence that is easily recognisable and occurs only once in a genome (or chromosome).
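The "occurs only once" requirement is easy to check on a known sequence. The sketch below looks at one strand only and uses a made-up toy sequence; the function names are hypothetical.

```python
def count_occurrences(genome, candidate):
    """Count occurrences of `candidate` in `genome`, including overlapping ones."""
    count = 0
    start = genome.find(candidate)
    while start != -1:
        count += 1
        start = genome.find(candidate, start + 1)
    return count


def is_sts(genome, candidate):
    """A candidate qualifies as an STS here if it occurs exactly once."""
    return count_occurrences(genome, candidate) == 1


genome = "ATATATGGCCTTAGATAT"          # toy sequence
print(is_sts(genome, "GGCCTTAG"))      # unique stretch
print(is_sts(genome, "ATAT"))          # repeated stretch
```

A repeated stretch such as "ATAT" is useless as a landmark, because finding it does not tell you where in the genome you are.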

What is GENOMICS?

I must tell you that most of the topics in Bioinformatics that you will come across are simple in meaning... I mean, for GENOMICS you can use your grey cells passively and, without scratching your head, say "it's basically the study of a particular organism's genome". Of course you're right...!!!

But again, my blog would be a failure if I can't get you to understand what exactly is going on... RIGHT!!!

So then, what exactly makes up our (or any organism's) genome?

The chromosomes: DNA sequences with genes embedded in the chromosomes/DNA sequence (both are possible). Ref-GENE 8...

Primarily a genome study (GEnomiCS) involves the study of these sequences...

Higher-level genomics, though, involves sequencing, comparative genomics, genome annotation, microarray technology....

Some of these techniques (the microarray technique) I have kept as video tutorials, so that they are easy to understand in relatively little time...

GENOMICS

Now then, once we are done with the "what's, the who's and where's"... I feel I can start on the topics in BIOINFORMATICS.... how about GENOMICS.....??

Saturday, October 10, 2009

Human Genome Project

Must see,.......