Index of /download

      Name                    Last modified       Size  Description

[DIR] Parent Directory        30-Jul-2008 01:23      -
[DIR] Latest_Release/         04-Mar-2007 15:47      -
[DIR] Supplementary_data_t..> 16-Aug-2006 16:20      -

Data Release Notes
  1. "Supplementary data..." contains the data accompanying Yong Zhang, XS Liu, Qing-Rong Liu and Liping Wei. Nucleic Acids Res., 34: 3465-3475.
  2. We recently improved our pipeline and developed NATsDB, a web-based Natural Antisense Transcripts DataBase. The latest release is based on UniGene and GoldenPath as of June 2006.
    Yong Zhang, Jiong-Tang Li, Lei Kong, Ge Gao, Qing-Rong Liu and Liping Wei. NATsDB: Natural Antisense Transcripts DataBase. Nucleic Acids Res., in press.
  3. "Latest release" provides lists of representative Sense/Antisense (SA) pairs and NOn-exonic Bidirectional (NOB) pairs in 11 organisms. All lists are tab-delimited flat files that can be imported directly into MySQL or Microsoft Excel. The files are described in detail below.
  4. Names of organisms, given as abbreviation (common name): hs (human), mm (mouse), dm (fly), cel (worm), cin (sea squirt), gga (chicken), rn (rat), str (frog), dr (zebrafish), bt (cow), cfa (dog).
  5. All of these files share the same columns, listed here in groups and defined below:
    cluster_id, overlap_length;
    plus_acc, plus_type, plus_gene_name, plus_gene_id, plus_tname, plus_tstart, plus_tend;
    minus_acc, minus_type, minus_gene_name, minus_gene_id, minus_tname, minus_tstart, minus_tend.
    'cluster_id' uniquely identifies one sense/antisense or NOB pair.
    'overlap_length' is the sum of all overlapping exonic regions for SA clusters, and the sum of all overlapping genomic regions for NOB clusters.
    The prefixes 'plus_' and 'minus_' indicate the transcript encoded by the plus strand and the minus strand, respectively.
    'acc' is the GenBank or RefSeq accession number of the transcript.
    'type' is the UniGene division of the transcript, mRNA or EST. For mouse, a relatively larger share of the representative sequences may be high-throughput cDNAs (HTC).
    'gene_name' and 'gene_id' are the Entrez Gene name and Entrez Gene ID, respectively, when available; otherwise they are shown as 'NULL'.
    'tname', 'tstart' and 'tend' give the location of the SA genes: chromosome name, chromosome start coordinate and chromosome end coordinate, respectively.
  6. sa_organism or nob_organism: files containing the representative SA or NOB pairs, respectively, for the given organism.
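
The column layout in note 5 can be read with a few lines of Python. The following is a minimal sketch, not the authors' own loader. It assumes that the columns appear in exactly the order listed above, that coordinates and 'overlap_length' are integers, and that a header row, if present, starts with 'cluster_id'. The helper name read_pairs and the sample row are made up for illustration and are not real NATsDB data.

```python
import csv
import io

# Column order as listed in note 5 of the release notes.
COLUMNS = [
    "cluster_id", "overlap_length",
    "plus_acc", "plus_type", "plus_gene_name", "plus_gene_id",
    "plus_tname", "plus_tstart", "plus_tend",
    "minus_acc", "minus_type", "minus_gene_name", "minus_gene_id",
    "minus_tname", "minus_tstart", "minus_tend",
]

# Assumption: these fields hold integers in the released files.
INT_COLUMNS = {
    "overlap_length",
    "plus_tstart", "plus_tend",
    "minus_tstart", "minus_tend",
}


def read_pairs(handle):
    """Yield one dict per SA/NOB pair; the literal 'NULL' becomes None."""
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and an optional header row (assumption).
        if not row or row[0] == "cluster_id":
            continue
        record = {}
        for name, value in zip(COLUMNS, row):
            if value == "NULL":
                record[name] = None
            elif name in INT_COLUMNS:
                record[name] = int(value)
            else:
                record[name] = value
        yield record


# Made-up row for illustration only.
sample = "\t".join([
    "c1", "120",
    "ACC_PLUS", "mRNA", "GENE_A", "1001", "chr1", "1000", "2000",
    "ACC_MINUS", "EST", "NULL", "NULL", "chr1", "1880", "3000",
]) + "\n"

pairs = list(read_pairs(io.StringIO(sample)))
print(pairs[0]["cluster_id"], pairs[0]["overlap_length"])
```

For a downloaded file, the same function can be given an open file handle instead of the in-memory sample, for example one obtained with open(path, newline=""), where path points at one of the sa_organism or nob_organism files.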
Center for Bioinformatics, Peking University