Prof. of crop genetics and breeding Department of Agronomy College of Agriculture and Biotechnology, Zhejiang University

Samankaltaiset tiedostot
Sivuston tiedotlunji.com

Tarvitseeko minun toimittaa Do I have alkuperäiset to show copies o dokumentit vai kopiot? the original documents the 询问你是否需要提供原件还是复印件 Mitkä ovat ylio

Viittaan ilmoitukseenne... I refer päiväykseltä... to your advertisem 用于解释在何处看到招聘信息的标准格式 Luin ilmoituksenne kokeneesta I read your... lehdestä adverti

文件 - 个人信息 Mikä sinun nimesi on? 询问某人的名字 Mikä sinun nimesi on? Voisitko kertoa minulle Voisitko syntymäpaikkasi kertoa minulle ja sy syntymäaikasi? syn

Bewerbung Zeugnis. Chinesisch

我写这封信是看到您在... 上登的招聘信息 Kirjoitamme teille liittyen... Standard formula used when responding to an advertisement posted online 我看到您于... 在... 上登的招聘信息 Vii

On ollut ilo toimia päällikkönä / kollegana... sillä... 做... 的老板 / 上司 / 同事很愉快, 因为... Kirjoitan mielelläni tämän suosituskirjeen henkilön... puolesta.

Kilpailuohje / instructions / 说明

Formal, to open on behalf of the whole company Kirjoitamme teille liittyen... Formal, to open on behalf of the whole company 我们因... 写这封信 Koskien... 因贵

Celia Jones 47 Herbert Street Floreat Perth WA 6018 Osoitteen ulkoasu Australiassa: osavaltion nimi kaupungin nimi + postinumero Celia Jones Herbert 街

State of the Union... Functional Genomics Research Stream. Molecular Biology. Genomics. Computational Biology

Mun nimi kirjoitetaan S-A-R-I. Miten sun nimi kirjoitetaan? Mun nimi kirjoitetaan F-A-N.

Koko hakuprosessin ajan pidämme Sinut sähköpostitse ajan tasalla - tiedät, missä vaiheessa viisumihakemuksesi on ja milloin viisumisi on valmis.

Onnittelut kihlauksesta! Congratulations Oletteko päättäneet on your joen hääpäivän? decided upon big day yet? 用于恭喜你很熟悉的最近刚订婚的夫妇并询问婚礼何时举行 祝福 - 生日和纪念日

Bezugnehmend auf Ihre Anzeige in... vom... 我看到您于... 在... 上登的招聘信息 Standardimalli, jolla selitetään miten työpaikasta on kuullut Mit großem Interesse ha

Master's Programme in Life Science Technologies (LifeTech) Prof. Juho Rousu Director of the Life Science Technologies programme 3.1.

Kirjoitamme teille liittyen... Vă scriem cu privire la... 正式, 代表整个公司 Kirjoitamme teille liittyen... Vă scriem în legătură cu.. 正式, 代表整个公司 Koskien... 正

Sähkötarkastusviranomaisten ilmoitukset ja turvallisuus- ja ympäristöohjeet KÄYTTÖOPAS

Sivuston tiedotcomiket.co.jp

漢字入門. Johdatusta kanji-järjestelmään Kie 春 編著 : 大倉純一郎. Kirjoittanut ja toimittanut: Junichiro Okura

7. Product-line architectures

CS284A Representations & Algorithms for Molecular Biology. Xiaohui S. Xie University of California, Irvine

Functional Genomics & Proteomics

Genome 373: Genomic Informatics. Professors Elhanan Borenstein and Jay Shendure

Efficiency change over time

Histoire illustrée des la construction navale dans la Chine ancienne. Qin et Han (de 221 av.j.-c. à 221 ap.j.-c.) 秦 汉时代

Constructive Alignment in Specialisation Studies in Industrial Pharmacy in Finland

中国科学数据论文数据样例 2014( 数据样例描述 )

Bioinformatics in Laboratory of Computer and Information Science

Improving advisory services through technology. Challenges for agricultural advisory after 2020 Jussi Juhola Warsaw,

NBE-E4510 Special Assignment in Biophysics and Biomedical Engineering AND NBE-E4500 Special Assignment in Human. NBE-E4225 Cognitive Neuroscience

Bioinformatics. Sequence Analysis: Part III. Pattern Searching and Gene Finding. Fran Lewitter, Ph.D. Head, Biocomputing Whitehead Institute

ICT-markkinoiden murros Venäjällä, Intiassa ja Kiinassa Liiketoiminnan strategiat ja mahdollisuudet uudella vuosikymmenellä

ジリナ自治区. 面積 : 6,788 km 2 国境 : チェコ ( 北西 ) ポーランド ( 北 北東 ) プレショウ地区 ( 東 ) バンスカービストリツ地区 ( 南 ) トレンチン地区 ( 南西 ) 人口 : 693,499 人. 人口密度 : 102 人 / km 2

Vertailua eri käännösten välillä.

ProAgria. Opportunities For Success

M E N U B U S I N E S S C L A S S

WP3 Decision Support Technologies

Arctic food from Finland

Windows Phone. Module Descriptions. Opiframe Oy puh Espoo

Bioinformatiikan maisteriohjelma

Land-Use Model for the Helsinki Metropolitan Area

Basset: Learning the regulatory code of the accessible genome with deep convolutional neural networks. David R. Kelley

Other approaches to restrict multipliers

Kemi Kemi news and events -uutislehti n:o 001 / 2017

Faculty of Agriculture and Forestry. Forestry

Konetekniikan koulutusohjelman opintojaksomuutokset

Fungi infecting cultivated moss can also cause diseases in crop plants

M E N U B U S I N E S S C L A S S

Use of spatial data in the new production environment and in a data warehouse

Supporting Information for

Joki 4 Kanerva 8 Koli 10 Kuusamo 12 Mori 13 Notte 14 Seniori 20 Moottoripohjat Motor frames 26 Patjat Mattress 26 Mittapiirrokset Measurements 27

FETAL FIBROBLASTS, PASSAGE 10

OP1. PreDP StudyPlan

AGS IP65/ ø30 HNG STK. 1 ø10 ø20 ø14 ø26 5Y7/1 *1 IP65 IEC JIS C Y7/1 600V *2 AM AC2000V C 45 85%RH.

BDD (behavior-driven development) suunnittelumenetelmän käyttö open source projektissa, case: SpecFlow/.NET.

MALE ADULT FIBROBLAST LINE (82-6hTERT)

TIEKE Verkottaja Service Tools for electronic data interchange utilizers. Heikki Laaksamo

On instrument costs in decentralized macroeconomic decision making (Helsingin Kauppakorkeakoulun julkaisuja ; D-31)

Additions, deletions and changes to courses for the academic year Mitä vanhoja kursseja uusi korvaa / kommentit

Chinese language studies

Kemi Kemi news and events -uutislehti n:o 002 / 2017

Capacity Utilization

The CCR Model and Production Correspondence

Predicting evolutionarily conserved regions (ECRs) in the Xenopus tropicalis genome using a MultiPipMaker-based bioinformatic strategy

Tekes the Finnish Funding Agency for Technology and Innovation. Copyright Tekes

Paikkatiedon semanttinen mallinnus, integrointi ja julkaiseminen Case Suomalainen ajallinen paikkaontologia SAPO

Informaatioteknologia vaikuttaa ihmisten käyttäytymiseen ja asenteisiin

FROM VISION TO CRITERIA: PLANNING SUSTAINABLE TOURISM DESTINATIONS Case Ylläs Lapland

Mitä mahdollisuuksia tuloksemme tarjoavat museoille?

ARENEN TOIMISTA KOULUTUSVIENNIN EDISTÄMISEKSI

7.4 Variability management

Page 1 of 9. Ryhmä/group: L = luento, lecture H = harjoitus, exercises A, ATK = atk-harjoitukset, computer exercises

SOA SIG SOA Tuotetoimittajan näkökulma

Augmented Reality (AR) in media applications

Indoor Environment

TSSH-HEnet : Kansainvälistyvä opetussuunnitelma. CASE4: International Master s Degree Programme in Information Technology

Flexbright Oy Embedded software/hardware engineer

Bachelor level exams by date in Otaniemi

Viral DNA as a model for coil to globule transition

LYTH-CONS CONSISTENCY TRANSMITTER

Plasmid Name: pmm290. Aliases: none known. Length: bp. Constructed by: Mike Moser/Cristina Swanson. Last updated: 17 August 2009

BIOENV Laitoskokous Departmental Meeting

Bachelor level exams by subject in Otaniemi

Integration of Finnish web services in WebLicht Presentation in Freudenstadt by Jussi Piitulainen

Kemi Kemi news and events -uutislehti n:o 001 / 2018

KANTO CHEMICAL CO., INC.

WAMS 2010,Ylivieska Monitoring service of energy efficiency in housing Jan Nyman,

3 9-VUOTIAIDEN LASTEN SUORIUTUMINEN BOSTONIN NIMENTÄTESTISTÄ

Security server v6 installation requirements

Älykäs erikoistuminen. Kristiina Heiniemi-Pulkkinen

Millaisia mahdollisuuksia kyberturva tarjoaa ja kenelle? Ja mitä on saatu aikaan?

M E N U B U S I N E S S C L A S S

Network to Get Work. Tehtäviä opiskelijoille Assignments for students.

Tehostettu kisällioppiminen tietojenkäsittelytieteen ja matematiikan opetuksessa yliopistossa Thomas Vikberg

Heisingin kaupungin tietokeskus Helsingfors stads faktacentral City of Helsinki Urban Facts 0N THE EFFECTS 0F URBAN NATURAL AMENITIES, ARCHITECTURAL

Computing Curricula raportin vertailu kolmeen suomalaiseen koulutusohjelmaan

Eukaryotic Comparative Genomics

Käytännön kokemuksia osallistumisesta EU projekteihin. 7. puiteohjelman uusien hakujen infopäivät 2011

Transkriptio:

Bioinformatics Longjiang Fan ( 樊龙江 ) Prof. of crop genetics and breeding Department of Agronomy College of Agriculture and Biotechnology, Zhejiang University http://cab.zju.edu.cn/nxx/ Prof. of bioinformatics James D. Watson Institute of Genome Sciences, Zhejiang University http://wigs.zju.edu.cn Institute of Bioinformatics, Zhejiang University http://ibi.zju.edu.cn The Bioinplant Lab (Bioinformatics and Plant Genomics Lab) http://ibi.zju.edu.cn/bioinplant/

Contents LE C # TOPICS DETAILS TEACHER 1 Historical introduction Functional genomics QTL/LD mapping etc. ZHU 2 Network analysis Microarray, proteomic analysis, pathway analysis ZHU 3 Sequence alignment 4 Sequence pattern 5 Genome analysis 6 Evolutionary analysis Sequence collection; alignment and database searching for similar sequences Domain finding; pattern and functional element finding, etc. Annotation, transcriptomic analysis, comparative genomics Genetic variation and phylogenetic analysis FAN FAN FAN FAN 7 Practice 8 Course activities Utility and realization of bioinformatics tools and databases) References reading, case study and discussion/ Invited speeches QIU/FAN FAN

Grading Mini tests during the course Term paper

Readings Bioinformatics----Sequence and Genome Analysis (D.W. Mount, Cold Spring Harbor Lab Press, 2004) 生物信息学札记 ( 第 3 版 )( 樊龙江,PDF,Online Course: http://ibi.zju.edu.cn/bioinplant/ ) 生物信息学手册 ( 第 2 版 )( 郝柏林, 张淑誉, 上海科技出版社,2002) Developing Bioinformatics Computer Skills (C. Gibas and P. Jambeck, O REILLY, 2001) Nucleic Acids Research Database Issue (Jan. 1, from 1993) Program Issue (July 1, from 2003)

Google bioinformatics

1. Historical Introduction and Overview 1.1 What s bioinformatics? 1.2 Bioinformatics milestones

Bioinformatics is the science of using information to understand biology. It s s the discipline of obtaining information about genomic or protein sequence data. This may involve similarity searches of databases, comparing your unidentified sequence to the sequences 1.1 What s in a database, Bioinformatics or making predictions? about the sequence based on current knowledge of similar sequences. A A term coined in 1990 to define the use of computers in sequence analysis (Claverie, 2000)

Bioinformatics is a branch of biological science which deals with the study of methods for storing, retrieving and analyzing biological data, such as nucleic acid (DNA/RNA) and protein sequence, structure, function, pathways and genetic interactions. It generates new knowledge that is useful in such fields as drug design and development of new software tools to create that knowledge. Bioinformatics also deals with algorithms, databases and information systems, web technologies, artificial intelligence and soft computing, information and computation theory, structural biology, software engineering, data mining, image processing, modeling and simulation, discrete mathematics, control and system theory, circuit theory, and statistics. Major research areas: Sequence analysis Genome annotation Computational evolutionary biology Literature analysis Analysis of gene expression Analysis of regulation Analysis of protein expression Wikipedia_bioinformatics Analysis of mutations in cancer Comparative genomics Modeling biological systems High-throughput image analysis Structural bioinformatic approaches Prediction of protein structure Molecular Interaction Docking algorithms

Role of bioinformatics Control and management of the high-through data Analysis of primary data e.g. Base calling from chromatograms Mass spectra analysis DNA microarrays images analysis Statistics Database storage and access Results analysis in a biological context

Example of high-throughput Complete genome sequencing (Sanger and highthroughput sequencing etc.) Large-scale sampling of the transcriptome (EST, small RNA populations) Simultaneous expression analysis of thousands of genes (DNA microarrays, SAGE) Large-scale sampling of the proteome Protein-protein analysis large-scale 2-hybrid (yeast, worm) Large-scale 3D structure production (yeast) Metabolism modelling Simulations Biodiversity

Illumina 公司先后推出的高通量测序仪测序通量 ( 能力 ) 增长情况 (2010/4/15) A:HiSeq 2000 B:Genome Analyzer Ⅱx C:Genome Analyzer Ⅱe www.illumina.com C A B

Bioinformatics: Genome-driven! A thaliana Rice In 1983 and 1984 HGP Meetings supported by NIH and DOE

An International Crop & Livestock Genomics Campaign For The Future Throughput Building: More Machines Efficient Process Powerful Computing Versatile Strategies Systematic Analysis worm ALL major pests and pathogens aphidall other farmed plant sp. grasshopper Tough weeds Invertebrate ABC Med. Herds Vertebrate ABC Marine species Mammal ABC pines Garden vari. Primate ABC bee poplarmajor tree sp. chimp cattle sheep Major crop species zebrafish frog turtle Solanaceae XYZ mouse rat cotton potato Rosaceae XYZ A. thal human pig wheat barley Leguminosae XYZ fruitfly rice corn B. nap soy tomato Graminae XYZ 1995 2000 2005 2010 Jun Yu, 2002

The massive quantities of data generated by genomic research have given rise to the field of bioinformatics--an emerging discipline that seeks to integrate computer science with applications derived from molecular biology. We are swimming in a rapidly rising sea of data...how do we keep from drowning? Roos DS. Bioinformatics----Trying to swin in a sea of data. Science, 2001,291(5507):1260-1261

CACTAGTCTCTGTACTAGCCACTAGAAGTACTAACCTTTCACACTAATATATCTATCTCCTGCTGCATTTAGTACACAAGTTCATAAAAGCACCCTATTTCTATAAAAAAAATACGGTAAATGTAGCAACTTAC TAGTACCATAAGAAATTTTGCTGATCTAGCTAACTTATTACTAGCTACTTGCTAGGTCTGAACACTATTAAAATGTAACAATACACTTACCTCCTTGATCTGTGCAGCCCTGTTCTCACGCTGGCTTCTATGG TGCGAGTAGTATTCCTAGGTTTTCGTAGGCTTTTATAGCAACAGCTTTCTTCGGACCGAATGAGACACCTGCCTTGTTTATGAGAGGGATGGATAGCTTTCACCTGCTGGACATTTATTTGTTTTTTTTTACT GGTCACTACATTCCTATCCACTGGTGCATATCTATCCTATCCCCTTTGGTCAGTAAAATATACTGCCTCCCCCATTCTCTTTCTTTCTCTATCTTTCTCTAAGCTTAACACACTTTAAGTTCACAAAATTATTAT TATTATTATTATTATTATTATTATTATTATTATTATTATTATTATTATTATTATTATTATTATTAGCAGGCTTCCCTCCTTTAGAAATTTCATCGTCGAAATTATTATACCTTGGTGATGGAAAAACTGAGGCTAGT TTTTTCTGGAGATCATCTTCCTTCTCCCATGTGGCCTCATCCATGGTGTGATGACTCCATTGTACCTTTAAAAATCTAATTGTTTGGTTCCTTGTTTTTAGATCTTTAATATCCAAGATACAAACAGGATATTC CTGATATGTCAAATCGTTATGCAACTCAGCCATAGGAATTTCAACTTAATCACTTGGCCTCCGAAGGCATTTACGAAGCATGGAGATGTGGAATACATCATGTACCCCGGTGAAAGCATCTGGTAGCTTTA GCATGTAAGGCACTTCTCCTATTTGCTTAACAATTGTAAATGGTCCAACATATCTGGAACTTATTTTTTTTCCAAGTCCGAATCGCTTAATTCCCTTTATAGGTGATACTTTTAAATATACCCAGTCACCTATAT CAAAGTTAAGATCCCTTCTCCTATTATCTGCATAACTTTTTTTGTCTATTTTGAGCTGTTTGCAGTCGTTCCCGTATCAGTCGTATTGTTTCTTCTATCTGTTGTATTATATCCGGTCCTAACAATTTTCTTTCTC CTACTTCGTTCCAGCAAACAGGTGTTCTGCATTTCCTTCCATATAAGGCTTCATACGGAGCCATTTGTATACTAGATTGATAACTATTGTTATATGCAAATTCTGCTAATGGCATAAATTCTTTCCATGATCCT TTAAATTCTAGGATGCAAGATCGTAAAATATTTTCAATTATTTGATTCACCCTTTCAGTTTGTCCATCGGTTTGGGGGTGATACGCTGCACTGAAATCTAATGTTGTTCCCACGGGCTTGTGTAGTCTTTTCT AGAAATTGGACAGAAACTGTGTATCTCTGTCTGACACAATCCTTCTTGGAACACCATGTAAAGATACTATTTCTTTGACATATAGTTTAGCTAACCTTTCCAAAGAAAATTTGCTTTTAACGGGTATGAAATGA GCAGATTTTGTTAACCGATCCACTATTCAGATACTATCATTTCCTGGAGGTGTGGTAGGTAATCCTTGAACAAAGTCCATACTGATTTCTTCTCATTTCCATAGTGGAATACTTAAGGGTTGTAACAGTCTTG CCGGCCTTTGATGTTCAACTTTTACGCATTGGCAGATATCACATTCTGCAATGAATTTTGCAATTTCTATTTTCATGATACATTTTGGTACTTCCTGGATGTATGGTATAGGGAGAGAAATGTGATTCTTCCAA TATTCTCTGTTTTAAATTAGGGTCGTTAGGCACACACAATCTATTTTTGAAACATATAGCACCATTATGATCAATTCGAAATTCAGACACCTTCCCTTCTTCAATATTTTTCTTTGCCTTTTGCAATCCACTGTC GTCTCTTTGTTTCTCTAGAATATTTTCTTCTAAAGTAGGCTTTATTTGAAGCACGGGTAATAATACTCTGGGTTCATGGATCTTTAATTCCACATCCAATCTTTCCAAGTCTCTAAGTATATGTTGATCCTGTGT GATCTGAATAGCCATATTACAAAGAGCTTTTCGACTTAGAGCATCTTCCACAATGTTGGCTTTCAGAGGGTGATAATGAATATTCAAATCATAATCTTTCAATAATTCTAACCATCCCCTTTATCTCATATTCA ATTCCTTCTGAGTAAATATGTACTTTAAACTTTTGTGGTCAGTAAATATTTCACAATGCTCACCATATAGGTAATGTCTCCAGATTTTTAAGGCAAAAATAACAGCAGCTAATTCCATATCATGGGTTGGATAA TTTTGCTCGTATGGCTTTAATTGACGCGAAGCATAGGCAATTACCTTAGCTTTTTGCATGAGAACACAACCTAATCCAATTTTTGAAGCATCACAGTAAATAGTAAATTCTTCTCCCATTATAGGCAAGGCAA GAATAGTAAATTCTTTGCAATTCTGAGTCCACTCATATTTTACTCCCTTTTGTGTCAACCGGGTTAGAGGAGCTGCAATTCTAGCGAAGTTACTAATAAATCGACGGTAATATCCCGCCAACCCAAGAAAAC TTCGTATCTCGGTTACCGATGAGGGCCTTTTCCACTCTGAGACGGTTTTGACCTTTTCAGGGTCCACTGATATACCTTCACCCGAAATAACATGACCAAGCAAAAATACTTTATCCATCCAGAAATCGCATT TCTTTAATTTGGCAAATAGTTTATGATCTCGCAATGTCTTGTAGTACTATTCTCAAATGATTTGCATGATCTTCCTTAGTCTTGGAATATATCAAAATATCATCTATATATAAATACAACTACAAATTAATCAAGA TAAGGCTTGAATTTACGATTCATTAAATCCATAAAAGCTGCCGGTGCATTAGTCAAACCAAATGGCATTACTAGATATTCATAGTGTCCATAGCATGCACGGAAAGCAGTCTTGGGTATATCACTAGGTTTA ATCTTTAGTTGATGGTAGCCTGATTGAAGATCAATTTTTGAGAAAACCCGAGCTCCTTGTAGTTGATCAAATAGATCGTCTATCCTTGGTAAAGGATATTTGTTTTTGATAGTCACCTTATTCAGTTCTCGGTA ATCCGTGCATAATCGCATAGTTCCATCCTTTTTCTTGACAAATAGAACAGGAACACCCCCACGGGGAGACACTAGGACAAATGAATCCTTTATCTTCTAATTCTTTTAATTGTACATTTAGTTCCTTTAGCTC AACAGGGGCCATTATGTAGGGTGCCTAATAAATCGGAGTAGTTCCTGGTCCTATTTCAATACCAAATTCAATCTCTCGATCTAGTGCTAATCCTGGTAATTCAGCTGGAAAAACTGGAAACTCATTCACAAT TGGCATTCCTTCCCAACTTGCTTCCTTTCTCATGATTTCTGCCACTAAAGGTCTTGGTAAATTGTTTTAATCTCCATGGTAAGTAATTTGGTTTTGATCCCATGGTTTAAGTGTAATTTGTTTTTCATGGCAATC AATATTTGCTTTGTTCTTACATAACCAATCCATACCAAGTATAATATCAAAATCATGCATATCCAAGGGTATGAGGTCAGCAGTTAATTCCCATCCATCAATAGTAATTGGACACAATTTGCAAATTAAATTAG TTATTTGGCTATCCAAAGGAGTTTCTATGCAAATCCTTTCTTTTAATTGACTAGTAGGGATGGTGTATTTTCTCACGAAGTTGGTGGAGATAAACGAATGTGTTGCGCCAGAATCAAATAAAACTTTACCAGG ATAAGAGCACACTAAGACATTACCTGTAACCACGGTGTTGGATTTTTCGGCTGTGCTCTTAGTTAAGTTGTATACCCCAAGCGCGATTCCCACCTTGTGAATTATTCGACCGTATTCCTCATGTAGTATTAG TATTTGCAGGTGGCTTTCCTTGATTTGGCCCATTATTATTTGCTGAAGATGGTCCAGGTAAATAAAGCGACGGTACTGAAGTCAATACTTTAGTACTTGGCTGAGTAGTTCAATTAACTCGATTTTTACCCTT CTGTAACAGAGGACAAAGGTATCTAGTATGTCCTGCTTCTCCACACTCAAAGCACCTTCCCCACCGATTAGGACAAATTGATGGAACATGGCCACCTTGGCATATTGGACATTTTCTGTCTTGATTTTCTAA AGATTCCCTCTACATTTTTCCAGAGTAGTTTCCACGGAATCTTCCCTGGTTTTGTTGATTATTTGTCTTGAATTTCTTTTGGGGTTGTCCGTGTTCTATTCTTTGTTCATGATACCCCTTCTCAAGAAGTTGTG CTTTACTTACTACCTCCCTGAATATGGTTAATTCAAAGGCTTCGACACACCTTTTGAGAGGTTGGCGTAATCCACTTTCAAATCGTCGAGCTTTAGAGCCGTCCGTTTGTACAAATTCAGGAGCAAATCTTG CAAGTCTCGAAAATTCTATTTCATATTCTACTACAGATTTATTACCTTACTTAAGCTCTAGAAATTCCTTCTTCATTCTCTTCACACTTTCTGGAAAATATTTCTTGTAAAAAGCTTCTTTGAATATTTCCCATGT AATAGAGATACGTTCCGAATATGACTTTTTGTGAGCATCCCACCATTCAAAAGCACTAGACTGAAGCATATAGGTAGCATATGTAATCTTTTCTTTATCTGTACAACCCATAGCTTCAAATGCCTTTTCCATT GCTACTATCCAAACTTCCGCTTCAAGTGGATTGGTAGTTCCTGAAAGGAAAAAGTATGAATTACCCCCTGAACTATTGCGAGAGTATGAATTACCCCCCCCCCCCAAAACCACAAAACCAGACATATTAAAC CTCAAACTATTGAAATCGGATTACCCCCCCTGATTCAATCCGGAGCGGTTTGGTCCTACGTGGCATACACGTGGCACCGCCATGGAAATCCAATCAGCAATATTAGGTGGTCCCACATGTCATGATCATGT ATTTCTTCCACTTTCCCCTCTCTTCATCTCCTCCAGGGCAAATAGAAAGCGGCGCGGTGGTGGCGCTCTCCAGGGCGGCCGGGGGAAGCGGCGGCGGCGGCGTCCAGGGCGGGTGGGGGAAGCGGC GGCGTCCAGGGCGGCTGCGGAAGCGACGGCGGCGTCCAGGGTGGGCTAGGGAAGCGGCGGCTTCTAGGGCAAGCTGGGGAAGTGGCGGCGGTGGCGGCGACGGCGGCGTCCAGGGCGGGCTGG GGAAGCAGCGGCGTCCAGGGCAGGCGGGGAAGTGGCGGTGATGACGGCGCCCTCCAGGTCGAACTGGGGTGGTGGCGGGGAAGTGACGGCAGCGACGGCGCCCTCCAGGGCAGGTAGGGGAAGC GGTGGCGGCGGGTGTGGCGGGAGCGCTCGTGCGGTGGGCGCGGCGGGAGCGGGAGCGGGCGCGGCGAGGAGCAGGCGCTTGTGCTCCTCCTCCGTGGCGCCAGAGATGGAGCGGGCGCTCGTG AGCGGGTCGGCCGCCGCTGCGAGCTCGCCGTGGAGGCGGCGAGAATCGAGATCGACGGCGAGCTCCACGGAGATGGAGAGAAGAAGGGAAGGGGCAAAGAGGAGGGGGAGAAGAGGAGGGTTGG GCAGACAGTGGGCCCCACCATATTTATTTGTTGTGGCTGACAAGTGGGTCCTATATATTTTTCTTTTGTTTTAGCTGACCAGACTGCCACATGGGCATCCACGTAGGACCGAAACCACCCTATATCGATCTA GGGGGTAATTCATCCGGTTTGTAAAGTTCAGGGTTAAAAATAACTGGTATTGGAGTTCAGGGTTAAAAATCGGACGACCGTAATTGTTGAGGGGGTAATTCGTACTTTTTCCTTCTTGAAAATGTTGGTGG CTTCAATTTCTGAAATTCCCCAAGTCCATTCCGGTTAGCATCACTTTTAGTAGTACGTTCTAAAATCTCCATCTATCGTTGTTGGGTTTCCTGTTGCTTGCCCAATATATTCGCGAGTAAGTTAGCCCAAGGG TCTTGACTACTTGCACTAGGTATTATTGATCCAGTGGCACCATTACTAGTATTATTTCCATCCTGACTAGTACCATTGTTGTCGTTGTTTTGCTCCATCTATCATATTCAACTCATTAGCCAGAATACATAAAT GATCATTGGATGGATCTCAAAATGGTAACAAAAATCAGATTTACTATAAAATATTCAATATAGGTAATATTAAAATAAAACTATTTAGTTATATTATCATCATTATACTTTTCTCTTCTTATTTTAGTCTTATCATT ATTCTTAACATGCACCAGTTAAAAAATAAATAAATAAAATTAGTACAAACCACAAGCACCACAGCACTAGTGCATTACGGTCATGTTTAGATTCAAATTTTTTTCTTCAAACTTCTAACTTTTCCGTCACATCAA ATGTTTGGACACATGCATGGAGCATTAAATGTGGAGAAAAAAACAATTGCACAGTTTGCATGTAAATTGTGAGACGAATCTTTTGAGCCTAATTACACCATGATTTGACAATGTGATGCTATAGTAAACATTT GTTAATGATAGATTAATTAGTCTTAATAAATTCATCTCGCAGTTTACAGGTGAAATCTGTAATTTGTTTTGTTATTAGTCTACATTTAATACTTCAAATGTATATCCATATACTTGAAAAAAAATTTGGCACACG AACTAAACACAGCCTACTTCGACGAAAAGAAAGTGCAGGAGCCTATCATGCTACACAAACACTAAGGCAAACACCTACTGGTGTACTAGTGCCACATACAGAGCTCTGGTTGTTTACACAAGATGTCTAGA AAGACATCACCATGAGTTCTGATGTTAACTCTTCAGTTCTAAAAGCTCCTTTGGCTGTCTCGTGACCCATCCACACATGCTACTAACACTAAGGGTGTGTAGGGTGTGTTTAGTTCACACCAAAATTGAAAG TTTGGTTGAAATTGAAACGATGTGACGGAAAAGTTGAAGTTTACGTGTGTAGGAGAGTTTTGATGTGATGAAAAAGTTAAAAGTTTGAAGAAAAATTTTGGAACTAAACTCAGCCTAAAGGACTTATTATAGT GGAGTACATCCCATCCCAAGGGAAAACAAAACCCATACTGACACCACTCCTACATCTCACACACTGCCACTAGAGCTGTCACTACCCCCAACCCCACTCTGCAGAACAGTAAATGGTTTCACTCAGGTAG CAGACGCGGTGGTACAGGCGATAGGTGAGGCGCTCCAGAAACATAGGCTGTGTTTAGATGGTGGAAAAGTTGGGAGGTTGGGAGAAAGTTAGTAGTTTGGAGAAAAAGTTGGTAGTTTATGTGTGTACG AAAGTTTTCGATGTGATGTGATGTGATGGAAAGTTAGGAATTTGGGGGGAACTAAACACGGCCATAACTTCATTCTCACTGGAGCGAACAATAGTCGGCAGTTATTTTTATATACATATTTGTTAAAGAAGA AATATTACTGTCCATGGATATTAATGGCCGATAAATAGTATAAAAAACATTAAATATAGTAAGTGATTTAAATACATTCTGCAGAGGTATTAAAATAATTGTCATAATCTCGTTCCTTCAATCCATTTTTTTCCA ACTAGTGATACCTCATCTGAGAATCACGGCGCCGAATTCCCTACTTGTGTGAGGCATTCCTTCTCTCACACTGATATCAGCCGACCCGATATCGTTGTTTCAGGTATCGGCCGTCTCAGGCTAAGTATCAA AATCATGTTCCATGATTATGACGTTATTATTCTCACTGATAAAATCATCAATCAATTATTCGGGAGTTAATAATATTTACCGTTAGATCGTTAGTATCATCATCCCAATATATAATACAGGTAAGCGAATTTAGT TAGAGATGATTAAGTAAAATAGTTGATGGACACAGTCTTGCCTTCTCTTTTGTTGTTCTTCCTCTGCATCCCACCTAATCAAATATACATGTCTTTGGTATTAATTTATATCTATATTTGTTATGCAGGACATTA GCTACTGGAACCAGCTACTAGGACCATAGATAGCTAGTTGATGTGACTCTACTGGAGAAAGAAAACCAACATGTAGGCCTAGTTTATTTCCCCCAAAATTTTTCCCAAAAACATCACATTGAATCTTTGGAC ATATGCATGGAGCATTAAATATAGATTAAAAAAACTAATTGCACAGTTAGGGGGAAAATCACGAGACGAATCTTTTGAGCCTTATTAATCCATGATTAGCCATAAGTGCTACAGTAATGCCAGCTGGGCGAG GAGAGGTGGCAGTGGTGGTGAGCCCAGCTGGGTGGATGTGTGGAGGGTGGAGAGGAGACGGGGAGGGAGGGAGGGAGGGAGAGAGGACTAGG

Genome Analysis Genome Analysis 基因组分析 簊圁組奮橀

Expression data Non-coding gene finding Genomewide comparison Genomewide SNP analysis

Bioinformatics and Computational Biology Strictly speaking, bioinformatics is a subset of the large field of computational biology, the application of quantitative analytical techniques in modeling biological system. (Gibas and Jambeck, 2001) From Bioinformatics to Computational Biology (Claverie, 2000)

Bioinformatics and Systems biology Systems biology is a biology-based interdisciplinary field of study that focuses on complex interactions within biological systems, using a more holistic perspective (holism instead of the more traditional reductionism) approach to biological and biomedical research.

Genomics and Its Sons Improve Health And Design Better Crops TOXICOGENOMICS PHARMACOGENOMICS Link Networks In A Background of Organisms PHYSIOGENOMICS STRUCTURAL GENOMICS PROTEOMICS Define Cellular Processes Into Systematic Networks Map Gene Products To Cellular Processes Map Genes And Their Expression Identify Genes And Other Elements Determine Genome Sequence FUNCTINAL GENOMICS Bioinformatics GENOMICS Jun Yu, 2002

History of Bioinformatics

Key developments in bioinformatics (by D.W. Mount, Bioinformatics---Sequence and Genome Analysis, Second edition, 2004) The first sequences to be collected were those of proteins (M. Dayhoff) DNA sequence databases date from the early 1980s (W. Goad) Sequences can be easily retrieved from public databases (D. Lipman) Sequence alignment programs have proliferated The dot matrix or diagram method for comparing sequences (A.J. Gibbs and G.A. McIntyre, 1970) Alignment of sequences by dynamic programming (S.B. Needleman and C.D. Wunsch, 1970) Finding local alignments between sequences (T. Smith and M. Waterman, 1981) Multiple sequence alignment

There are several methods for predicting RNA secondary structure Evolutionary relationships are discovered using sequences Database searches for similar sequences can reveal gene function FASTA and BLAST enable rapid database searches (B. Pearson and S. Altschul) Protein sequences can be predicted by translation of DNA sequences Protein structure can be predicted The first complete genome sequence was H. influenzae AceDB was the first genome database Genome analysis methods are being developed Gene expression analysis is done using microarrays Data storage and mining technologies are used for large biological data sets Sequence analysis can be automated using Bioperl and the internet

Bioinformatics milestones 1996 Yeast genome completely sequenced 1997 PSI-BLAST 1998 Worm (multicellular) genome completely sequenced 1999 Fly genome completely sequenced

1956 美国召开首次生物学中的信息理论研讨会 1962 Pauling 提出分子进化理论 1965 Dayhoff 构建世界上第一个蛋白质序列数据库 1970 Needleman-Wunsch 算法被提出 1977 Staden 利用计算机软件分析 DNA 序列 1981 Smith-Waterman 算法出现 1981 序列模序 / 模体 (motif) 的概念被提出 (Doolittle) 1982 GenBank 数据库 (Release3) 公开 1982 λ- 噬菌体基因组被测序 1983 Wilbur 和 Lipman 提出序列数据库的搜索算法 (Wilber- Lipman 算法 ) 1985 快速序列相似性搜索程序 FASTP/FASTN 发布 1988 美国家生物技术信息中心 (NCBI) 创立 1988 欧洲分子生物学网络 EMBnet 创立 三大核酸数据库 (GenBank EMBL 和 DDBJ) 开始国际合作

图 1.4 1956 年美国田纳西首次生物学中的信息理论研讨会论文集 (Yockey H P, Platzman, Quastler H. Symposium on information theory in biology. Gatlinburg Tennessee, 1956. New York: Pergamon Press,1958)

1990 快速序列相似性搜索程序 BLAST 发布 1991 表达序列标签 (EST) 概念被提出, 从此开创 EST 测序 1993 英国 Sanger 中心在英国休斯顿 (Hinxton) 建立 1994 欧洲生物信息学研究所在英国休斯顿成立 1995 第一个细菌基因组测序完成 1996 酶母基因组测序完成 1997 PSI-BLAST(BLAST 系列程序之一 ) 发布 1998 Phil Green 等人研制的自动测序组装系统 Phred-Phrap- Consed 系统正式发布 1998 多细胞线虫基因组测序完成 1999 果蝇基因组测序完成 2000 人类基因组测序基本完成 2000 拟南芥基因组测序完成 2001 人类基因组初步分析结果公布 2002 水稻基因组测序完成 * 主要引自美国家生物信息中心 (NCBI)Education-Bioinformatics Milestone(2000), 原文截止至 1999 年果蝇基因组测序完成, 有关人类基因组 PhilGreen 等的自动测序组装系统和三大核酸数据库的合并等内容为作者补入

纵观生物信息学的发展历史, 可分为 4 个主要阶段 : History of bioinformatics: Four stages: the emergence (1960-70); the initial stage (1980s); genome sequencing (1990-) and high-throughput sequencing (2005-) (1) 萌芽期 (60-70 年代 ): 以 Dayhoff 的替换矩阵和 Neelleman- Wunsch 算法为代表, 它们实际组成了生物信息学的一个最基本的内容和思路 : 序列比较 它们的出现, 代表了生物信息学的诞生 ( 虽然 生物信息学 一词很晚才出现 ) ; (2) 形成期 (80 年代 ): 以分子数据库和 FASTA 等相似性搜索程序为代表 在这一阶段, 生物信息学作为一个新兴学科已经形成, 并确立了自身学科的特征和地位 ; (3) 基因组测序时代 (90 年代 - 至今 ): 以模式基因组测序与 BLAST 为代表 ; (4) 高通量测序时代 (2005-): 以第二代测序技术和基因组重测序为代表