/usr/lib/R/site-library/GenomicFeatures/NEWS

CHANGES IN VERSION 1.26
-----------------------

NEW FEATURES

    o makeTxDbFromGRanges() now recognizes features of type lnc_RNA,
      antisense_lncRNA, transcript_region, and pseudogenic_tRNA as transcripts.

    o Add 'intronJunctions' argument to mapToTranscripts().

SIGNIFICANT USER-VISIBLE CHANGES

DEPRECATED AND DEFUNCT

    o The 'vals' argument of the "transcripts", "exons", "cds", and "genes"
      methods for TxDb objects is now defunct (was deprecated in BioC 3.3).

    o The "species" method for TxDb objects is now defunct (was deprecated in
      BioC 3.3).

BUG FIXES


CHANGES IN VERSION 1.24
-----------------------

NEW FEATURES

    o Add mapRangesToIds() and mapIdsToRanges() for mapping genomic ranges to
      IDs and vice-versa.

    o Support makeTxDbFromUCSC("hg38", "knownGene") (gets "GENCODE v22" track).

    o Add pmapToTranscripts,GRangesList,GRangesList method.

SIGNIFICANT USER-VISIBLE CHANGES

    o Rename the 'vals' argument of the transcripts(), exons(), cds(), and
      genes() extractors -> 'filter'. The 'vals' argument is still available
      but deprecated.
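
      A minimal sketch of the renamed argument, assuming the toy
      hg19_knownGene_sample.sqlite database shipped in this package's
      extdata folder is available. The chromosome and strand values are
      only illustrative:

```r
library(GenomicFeatures)

# Load the small sample TxDb bundled with the package (assumed installed).
txdb_file <- system.file("extdata", "hg19_knownGene_sample.sqlite",
                         package="GenomicFeatures")
txdb <- loadDb(txdb_file)

# Restrict the extraction with 'filter' (formerly 'vals').
tx <- transcripts(txdb, filter=list(tx_chrom=c("chr3", "chr5"),
                                    tx_strand="+"))
```

      The same kind of list should be accepted by exons(), cds(), and
      genes().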

    o Rename the 'filters' argument of makeTxDbFromBiomart() and
      makeTxDbPackage() -> 'filter'.

    o When grouping the transcripts by exon or CDS, transcriptsBy() now returns
      a GRangesList object with the "exon_rank" information (as an inner
      metadata column).

    o For transcripts with no exons (like in the GFF3 files from GeneDB),
      makeTxDbFromGRanges() now infers the exons from the CDS.

    o For transcripts with no exons and no CDS (like in the GFF3 files from
      miRBase), makeTxDbFromGRanges() now infers the exon from the transcript.

    o makeTxDbFromGRanges() and makeTxDbFromGFF() now support GFF/GTF files
      with one (or both) of the following peculiarities:
      - The file is GTF and contains only lines of type transcript but no
        transcript_id tag (not clear this is valid GTF but some users are
        working with this kind of file).
      - Each transcript in the file is reported to be on its own contig and
        spans it (start=1) but no strand is reported for the transcript.
        makeTxDbFromGRanges() now sets the strand to "+" for all these
        transcripts.

    o makeTxDbFromGRanges() now recognizes features of type miRNA,
      miRNA_primary_transcript, SRP_RNA, RNase_P_RNA, RNase_MRP_RNA, misc_RNA,
      antisense_RNA, and antisense as transcripts. It also now recognizes
      features of type transposable_element_gene as genes.

    o makeTxDbFromBiomart() now points to the Ensembl mart by default instead
      of the central mart service.

    o Add some commonly used alternative names (Mito, mitochondrion,
      dmel_mitochondrion_genome, Pltd, ChrC, Pt, chloroplast, Chloro, 2uM) for
      the mitochondrial and chloroplast genomes to DEFAULT_CIRC_SEQS.

DEPRECATED AND DEFUNCT

    o Remove the makeTranscriptDb*() functions (were defunct in BioC 3.2).

    o Remove the 'exonRankAttributeName', 'gffGeneIdAttributeName',
      'useGenesAsTranscripts', 'gffTxName', and 'species' arguments from the
      makeTxDbFromGFF() function (were defunct in BioC 3.2).

BUG FIXES

    o Try to improve heuristic used in makeTxDbFromGRanges() for detecting the
      format (GFF3 or GTF) of input GRanges object 'x'.


CHANGES IN VERSION 1.22
-----------------------

NEW FEATURES

    o Add coverageByTranscript() and pcoverageByTranscript(). See
      ?coverageByTranscript for more information.

    o Various improvements to makeTxDbFromGFF():
      - Now supports 'format="auto"' for auto-detection of the file format.
      - Now supports naming features by dbxref tag (like GeneID). This has
        proven useful when importing GFFs from NCBI.

    o Improvements to the coordinate mapping methods:
      - Support recycling when length(transcripts) == 1 for parallel
        mapping functions.
      - Add pmapToTranscripts,Ranges,GRangesList and
        pmapFromTranscripts,Ranges,GRangesList methods.

    o Add 'taxonomyId' argument to the makeTxDbFrom*() functions.

    o Improvements to makeTxDbPackage():
      - Add 'pkgname' argument to makeTxDbPackage() to let the user override
        the automatic naming of the package to be generated.
      - Support person objects for 'maintainer' and 'author' arguments to
        makeTxDbPackage().

    o The 'chrominfo' vector passed to makeTxDb() can now mix NAs and non-NAs.

SIGNIFICANT USER-VISIBLE CHANGES

    o 2 significant changes to makeTxDbFromGRanges() and makeTxDbFromGFF():
      - They now also import transcripts of type pseudogenic_transcript and
        pseudogenic_exon.
      - They normally get the "gene_id" and "[tx|exon|CDS]_name" columns from
        the Name tag. Now they will also infer these columns from the ID tag
        when the Name tag is missing.

    o Improve handling of 'circ_seqs' argument by makeTxDbFromUCSC(),
      makeTxDbFromGFF(), and makeTxDbFromBiomart(): no more annoying warning
      when none of the strings in DEFAULT_CIRC_SEQS matches the seqlevels of
      the TxDb object to be made.

    o 2 minor changes to makeTxDbFromBiomart():
      - Now drops unneeded chromosome info when importing an incomplete
        transcript dataset.
      - Now returns a TxDb object with 'Full dataset' field set to 'no' when
        makeTxDbFromBiomart() is called with user-supplied 'filters'.

    o makeTxDbPackage() now includes the data source in the package name by
      default (for non-UCSC and non-BioMart databases).

    o The following changes were made to the coordinate mapping methods:
      - mapToTranscripts() now reports mapped position with respect to the
        transcription start site regardless of strand.
      - Change 'ignore.strand' default from TRUE to FALSE in all coordinate
        mapping methods for consistency with other functions that already have
        the 'ignore.strand' argument.
      - Name matching in mapFromTranscripts() is now done with seqnames(x) and
        names(transcripts).
      - The pmapFromTranscripts,*,GRangesList methods now return a GRangesList
        object. Also they no longer use 'UNMAPPED' seqname for unmapped ranges.
      - Remove unneeded ellipsis from the argument list of various coordinate
        mapping methods.

    o Change behavior of seqlevels0() getter so it does what it was always
      intended to do.

    o The order of the transcripts returned by transcripts() has changed: now
      they are guaranteed to be in the same order as in the GRangesList object
      returned by exonsBy().

    o Code improvements and speedup to the transcripts(), exons(), cds(),
      exonsBy(), and cdsBy() extractors.

    o In order to avoid loss of information (and make it reversible with
      makeTxDbFromGRanges()), the "asGFF" method for TxDb objects now
      propagates the "exon_name" and "cds_name" columns to the GRanges object.

DEPRECATED AND DEFUNCT

    o After being deprecated in BioC 3.1, the makeTranscriptDb*() functions
      are now defunct.

    o After being deprecated in BioC 3.1, the 'exonRankAttributeName',
      'gffGeneIdAttributeName', 'useGenesAsTranscripts', 'gffTxName', and
      'species' arguments of makeTxDbFromGFF() are now defunct.

    o Remove sortExonsByRank() (was defunct in BioC 3.1).

BUG FIXES

    o Fix bug in fiveUTRsByTranscript() and threeUTRsByTranscript() extractors
      when the TxDb object had "user defined" seqlevels and/or a set of
      "active/inactive" seqlevels defined on it.

    o Fix bug in isActiveSeq() setter when the TxDb object had "user defined"
      seqlevels on it.

    o Fix many issues with seqlevels() setter for TxDb objects. In particular
      make the 'seqlevels(x) <- seqlevels0(x)' idiom work on TxDb objects.
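
      A short sketch of the idiom, assuming the sample
      hg19_knownGene_sample.sqlite database from this package's extdata
      folder. The choice of "chr1" is only illustrative:

```r
library(GenomicFeatures)

# Load the small sample TxDb bundled with the package (assumed installed).
txdb <- loadDb(system.file("extdata", "hg19_knownGene_sample.sqlite",
                           package="GenomicFeatures"))

# Restrict the TxDb to a single sequence ("user defined" seqlevels).
seqlevels(txdb) <- "chr1"

# Restore the original seqlevels stored in the underlying database.
seqlevels(txdb) <- seqlevels0(txdb)
```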

    o Fix bug in makeTxDbFromBiomart() when using it to retrieve a dataset that
      doesn't provide the cds_length attribute (e.g. sitalica_eg_gene dataset
      in plants_mart_26).


CHANGES IN VERSION 1.20
-----------------------

NEW FEATURES

    o Add makeTxDbFromGRanges() for extracting gene, transcript, exon, and CDS
      information from a GRanges object structured as GFF3 or GTF, and
      returning that information as a TxDb object.

    o TxDb objects have a new column ("tx_type" in the "transcripts" table)
      that the user can request through the 'columns' argument of the
      transcripts() extractor. This column is populated when the user makes
      a TxDb object from Ensembl (using makeTxDbFromBiomart()) or from a
      GFF3/GTF file (using makeTxDbFromGFF()), but not yet (i.e. it's set to
      NA) when they make it from a UCSC track (using makeTxDbFromUCSC()).
      However, it seems that UCSC is also providing that information for some
      tracks, so we're planning to have makeTxDbFromUCSC() get it from these
      tracks in the near future. Also, the low-level makeTxDb() now imports
      the "tx_type" column if supplied.

    o Add transcriptLengths() for extracting the transcript lengths from a
      TxDb object. It also returns the CDS and UTR lengths for each transcript
      if the user requests them.
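
      An illustrative call, assuming the sample
      hg19_knownGene_sample.sqlite database from this package's extdata
      folder. The three 'with.*' flags request the optional CDS and UTR
      length columns:

```r
library(GenomicFeatures)

# Load the small sample TxDb bundled with the package (assumed installed).
txdb <- loadDb(system.file("extdata", "hg19_knownGene_sample.sqlite",
                           package="GenomicFeatures"))

# One row per transcript; the CDS and UTR lengths are added on request.
tx_lens <- transcriptLengths(txdb, with.cds_len=TRUE,
                             with.utr5_len=TRUE, with.utr3_len=TRUE)
head(tx_lens)
```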

    o extractTranscriptSeqs() now works on a FaFile or GmapGenome object (or,
      more generally, on any object that supports seqinfo() and getSeq()).

SIGNIFICANT USER-VISIBLE CHANGES

    o Renamed makeTranscriptDbFromUCSC(), makeTranscriptDbFromBiomart(),
      makeTranscriptDbFromGFF(), and makeTranscriptDb() -> makeTxDbFromUCSC(),
      makeTxDbFromBiomart(), makeTxDbFromGFF(), and makeTxDb(). Old names
      still work but are deprecated.

    o Many changes and improvements to makeTxDbFromGFF():
      - Re-implemented it on top of makeTxDbFromGRanges().
      - The geneID tag, if present, is now used to assign an external gene id
        to transcripts that couldn't otherwise be linked to a gene. This is
        for compatibility with some GFF3 files from FlyBase (see for example
        dmel-1000-r5.11.filtered.gff included in this package).
      - Arguments 'exonRankAttributeName', 'gffGeneIdAttributeName',
        'useGenesAsTranscripts', and 'gffTxName', are not needed anymore so
        they are now ignored and deprecated.  
      - Deprecated 'species' arg in favor of new 'organism' arg.

    o Some tweaks to makeTxDbFromBiomart():
      - Drop transcripts with UTR anomalies with a warning instead of failing.
        We've started to see these hopeless transcripts with the release 79 of
        Ensembl in the dmelanogaster_gene_ensembl dataset (based on FlyBase
        r6.02 / FB2014_05). With this change, the user can still make a TxDb
        for dmelanogaster_gene_ensembl but some transcripts will be dropped
        with a warning.
      - BioMart data anomaly warnings and errors now show the first 3
        problematic transcripts instead of 6.

    o 'gene_id' metadata column returned by genes() is now a character vector
      instead of a CharacterList object.

    o Use # prefix instead of | in "show" method for TxDb objects.

DEPRECATED AND DEFUNCT

    o Deprecated makeTranscriptDbFromUCSC(), makeTranscriptDbFromBiomart(),
      makeTranscriptDbFromGFF(), and makeTranscriptDb(), in favor of
      makeTxDbFromUCSC(), makeTxDbFromBiomart(), makeTxDbFromGFF(), and
      makeTxDb().

    o Deprecated species() accessor in favor of organism() on TxDb objects.

    o sortExonsByRank() is now defunct (was deprecated in GenomicFeatures
      1.18).

    o Removed extractTranscriptsFromGenome(), extractTranscripts(),
      determineDefaultSeqnameStyle() (were defunct in GenomicFeatures 1.18).

BUG FIXES

    o makeTxDbFromBiomart():
      - Fix issue causing the download of 'chrominfo' data frame to fail when
        querying the primary Ensembl mart (with host="ensembl.org" and
        biomart="ENSEMBL_MART_ENSEMBL").
      - Fix issue with error reporting code: when some transcripts failed to
        pass the sanity checks, the error message was displaying the wrong
        transcripts. More precisely, many good transcripts were mistakenly
        added to the set of bad transcripts and included in the error message.

    o extractTranscriptSeqs(): fix error message when internal call to
      exonsBy() fails on 'transcripts'.


CHANGES IN VERSION 1.18
-----------------------

NEW FEATURES

    o Add extractUpstreamSeqs().

    o makeTranscriptDbFromUCSC() now supports the "flyBaseGene" table
      (FlyBase Genes track).

    o makeTranscriptDbFromBiomart() now knows how to fetch the sequence
      lengths from the Ensembl Plants db.

    o makeTranscriptDbFromGFF() is now more tolerant of bad strand
      information.

SIGNIFICANT USER-VISIBLE CHANGES

    o Replace toy TxDb UCSC_knownGene_sample.sqlite (based on hg18) with
      hg19_knownGene_sample.sqlite (based on hg19) and use hg19 instead of
      hg18 in all examples (and unit tests).

    o Rename TranscriptDb class -> TxDb.

    o When a GTF file is processed into a TxDb and the exon ranking has to be
      inferred, transcripts whose exons are on separate chromosomes are now
      dropped (since the exon ranking cannot possibly be guessed correctly).

DEPRECATED AND DEFUNCT

    o extractTranscripts() and extractTranscriptsFromGenome() are now defunct.

    o Deprecate sortExonsByRank().

BUG FIXES

    o Bug fixes and improvements to makeTranscriptDbFromBiomart():
      (a) Fix long-standing bug where the code in charge of inferring the
          CDSs from the UTRs would return CDSs spanning all the exons of a
          non-coding transcript.
      (b) Fix an issue that was preventing the function from extracting the
          CDS information added recently to the datasets in the Ensembl Fungi,
          Ensembl Metazoa, Ensembl Plants, and Ensembl Protists databases. 
      (c) Make the code in charge of extracting the CDSs more robust by taking
          advantage of new attributes (genomic_coding_start and
          genomic_coding_end) added by Ensembl in release 74 (Dec 2013),
          and by adding more sanity checks.


CHANGES IN VERSION 1.14
-----------------------

NEW FEATURES

    o The "keys" method now has new arguments to allow for more
      sophisticated filtering.

    o Add genes() extractor.

    o makeTranscriptDbFromGFF() now handles even more different kinds
      of GFF files.

BUG FIXES

    o Better argument checking for makeTranscriptDbFromGFF().

    o Rename the 'cols' arguments and methods -> 'columns'.


CHANGES IN VERSION 1.12
-----------------------

NEW FEATURES

    o Support for new UCSC species

    o Better support for GTF and GFF processing into TranscriptDb objects

    o Methods for making TranscriptDb objects from general sources
      have been made more useful

BUG FIXES

    o Updates to allow continued access to ever changing services like UCSC

    o Corrections for seqnameStyle methods

    o Over 10X performance gains for processing of GTF and GFF files


CHANGES IN VERSION 1.10
-----------------------

NEW FEATURES

    o Add makeTranscriptDbFromGFF().  Users can now use GFF files to
      make TranscriptDb resources.

    o Add *restricted* "seqinfo<-" method for TranscriptDb objects. It only
      supports replacement of the sequence names (for now), i.e., except for
      their sequence names, Seqinfo objects 'value' (supplied) and 'seqinfo(x)'
      (current) must be identical.

    o Add promoters() and getPromoterSeq().

    o Add 'reassign.ids' arg (FALSE by default) to makeTranscriptDb().

SIGNIFICANT USER-VISIBLE CHANGES

    o Updated vignette.

    o Improve how makeTranscriptDbFromUCSC() and makeTranscriptDbFromBiomart()
      assign internal ids (see commit 65144 for the details).

    o 2.5x speedup of fiveUTRsByTranscript() and threeUTRsByTranscript().

DEPRECATED AND DEFUNCT

    o Are now defunct: transcripts_deprecated(), exons_deprecated(), and
      introns_deprecated().

    o Deprecate loadFeatures() and saveFeatures() in favor of loadDb() and
      saveDb(), respectively.

BUG FIXES

    o Better handling of BioMart data anomalies.


CHANGES IN VERSION 1.8
-----------------------

NEW FEATURES

    o Added asBED and asGFF methods to convert a TranscriptDb to a
      GRanges that describes transcript structures according to either
      the BED or GFF format. This enables passing a TranscriptDb
      object to rtracklayer::export directly, when targeting GFF/BED.


CHANGES IN VERSION 1.6
-----------------------

NEW FEATURES

    o TranscriptDbs are now available as standard packages. Functions
      that were made available before the last release allow users to
      create these packages.

    o TranscriptDb objects can now be used with select().

    o Add "select" method for TranscriptDb objects to extract data.frames
      of available annotations. Users can specify keys, along with the
      keytype, and the columns of data that they want extracted from the
      annotation package.

    o keys() now operates on TranscriptDb objects to expose ID types
      as potential keys.

    o keytypes() shows which kinds of IDs can be used as a key by select().

    o cols() displays the kinds of data that can be extracted by select().

    o isActiveSeq() has been added to allow entire chromosomes to be
      toggled active/inactive by the user. By default, everything is
      exposed, but if you wish you can now easily hide everything that
      you don't want to see. Subsequent to this, all your accessors
      will behave as if only the "active" things are present in the
      database.

SIGNIFICANT USER-VISIBLE CHANGES

    o saveDb() and loadDb() are here and will be replacing saveFeatures()
      and loadFeatures(). The reason for the name change is that they
      dispatch on (and should work with) a wider range of object types
      than just TranscriptDb objects (and their associated databases).

BUG FIXES

    o An ORDER BY clause has been added to SQL statements to enforce more
      consistent ordering of returned rows.

    o Bug fixes to enable DB construction to still work even after
      changes in schemas etc. at UCSC and Ensembl sources.

    o Bug fixes to makeFeatureDbFromUCSC() allow it to work more
      reliably (it was being a little too optimistic about what UCSC
      would actually supply data for).