Class AtomCache


  • public class AtomCache
    extends Object
    A utility class that provides easy access to Structure objects. If you are running a script that frequently reuses the same PDB structures, the AtomCache keeps an in-memory cache of the files for quicker access. The cache is a soft cache: it will not cause out-of-memory errors, because cached data is garbage-collected when the Java virtual machine needs to free up space. The AtomCache is thread-safe.
    Since:
    3.0
    Author:
    Andreas Prlic, Spencer Bliven, Peter Rose
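The soft-cache behavior described above can be illustrated with a minimal sketch built only on the Java standard library. The class name SoftCacheSketch and its get method are hypothetical and are not part of BioJava; the sketch only shows the general idea of holding values through SoftReference so the JVM may reclaim them under memory pressure.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical illustration of a soft cache; NOT the AtomCache implementation.
final class SoftCacheSketch<K, V> {
    // Values are held through SoftReference, so the garbage collector may
    // clear them when memory runs low instead of failing with an OutOfMemoryError.
    private final Map<K, SoftReference<V>> entries = new ConcurrentHashMap<>();

    // Returns the cached value, or loads and caches it if it is absent or was reclaimed.
    // Two threads racing on the same key may both call the loader; that is
    // harmless for a cache but means the loader is not guaranteed to run once.
    V get(K key, Function<K, V> loader) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }

    public static void main(String[] args) {
        SoftCacheSketch<String, String> cache = new SoftCacheSketch<>();
        int[] loads = {0};
        Function<String, String> loader = id -> {
            loads[0]++;
            return "structure:" + id; // stand-in for an expensive file parse
        };
        String first = cache.get("4HHB", loader);
        String second = cache.get("4HHB", loader);
        System.out.println(first + " / " + second + " / loader calls: " + loads[0]);
    }
}
```

As long as the caller keeps a strong reference to a returned value, the corresponding entry cannot be reclaimed, so repeated lookups are served from memory.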
    • Constructor Detail

      • AtomCache

        public AtomCache()
        Default AtomCache constructor. Usually stores files in a temp directory, but this can be overridden by setting the PDB_DIR variable at runtime.
        See Also:
        UserConfiguration()
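A directory override of this kind could be resolved roughly as in the sketch below. The class PdbDirSketch, its method names, and the precedence order (system property, then environment variable, then the JVM temp directory) are assumptions made for illustration; the actual lookup is performed by UserConfiguration and may differ.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration; the real lookup lives in UserConfiguration.
final class PdbDirSketch {

    // Picks the first non-empty candidate. The precedence order is an assumption.
    static Path resolve(String systemProperty, String environmentVariable, String tempDir) {
        if (systemProperty != null && !systemProperty.isEmpty()) {
            return Paths.get(systemProperty);
        }
        if (environmentVariable != null && !environmentVariable.isEmpty()) {
            return Paths.get(environmentVariable);
        }
        return Paths.get(tempDir);
    }

    // Reads the candidates from the running JVM.
    static Path resolveFromRuntime() {
        return resolve(System.getProperty("PDB_DIR"),
                       System.getenv("PDB_DIR"),
                       System.getProperty("java.io.tmpdir"));
    }

    public static void main(String[] args) {
        // Try it with: java -DPDB_DIR=/data/pdb PdbDirSketch
        System.out.println("Files would be stored in: " + resolveFromRuntime());
    }
}
```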
      • AtomCache

        public AtomCache​(String pdbFilePath)
        Creates an instance of an AtomCache that points to a particular path in the file system. It will use the same value for pdbFilePath and cachePath.
        Parameters:
        pdbFilePath - a directory in the file system to use as a location to cache files.
      • AtomCache

        public AtomCache​(String pdbFilePath,
                         String cachePath)
        Creates an instance of an AtomCache that points to a particular path in the file system.
        Parameters:
        pdbFilePath - a directory in the file system to use as a location to cache files.
        cachePath - a directory in the file system in which to cache utility data, such as domain definitions.
      • AtomCache

        public AtomCache​(UserConfiguration config)
        Creates a new AtomCache object based on the provided UserConfiguration.
        Parameters:
        config - the UserConfiguration to use for this cache.
    • Method Detail

      • getBiologicalAssembly

        public Structure getBiologicalAssembly​(String pdbId,
                                               int bioAssemblyId,
                                               boolean multiModel)
                                        throws StructureException,
                                               IOException
        Returns the biological assembly for a given PDB ID and bioAssemblyId, by building the assembly from the biounit annotations found in Structure.getPDBHeader().

        Note, the number of available biological unit files varies. Many entries don't have a biological assembly specified (e.g. NMR structures), many entries have only one biological assembly (bioAssemblyId=1), and some structures have multiple biological assemblies.

        Parameters:
        pdbId - the PDB ID
        bioAssemblyId - the 1-based index of the biological assembly (0 gets the asymmetric unit)
        multiModel - if true, the output Structure will be a multi-model one with one transformId per model; if false, the output Structure will be as the original with added chains with renamed asymIds (in the form originalAsymId_transformId and originalAuthId_transformId).
        Returns:
        a structure object
        Throws:
        IOException
        StructureException - if bioAssemblyId < 0 or other problems occur while loading the structure
        Since:
        3.2
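The effect of the multiModel flag on chain identifiers can be pictured with the following sketch. AssemblyLayoutSketch and chainIds are hypothetical names, and the sketch assumes that in the single-model case every chain copy receives the _transformId suffix; the library may treat the original chains differently. It only models identifiers, not coordinates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the two output layouts; NOT BioJava code.
final class AssemblyLayoutSketch {

    // Returns one list of chain ids per model of the resulting assembly.
    static List<List<String>> chainIds(List<String> asymIds,
                                       List<String> transformIds,
                                       boolean multiModel) {
        List<List<String>> models = new ArrayList<>();
        if (multiModel) {
            // One model per transformId; chain ids stay unchanged inside each model.
            for (int i = 0; i < transformIds.size(); i++) {
                models.add(new ArrayList<>(asymIds));
            }
        } else {
            // A single model; each chain copy is renamed originalAsymId_transformId.
            // Assumption: every copy, including the first, gets the suffix.
            List<String> single = new ArrayList<>();
            for (String transformId : transformIds) {
                for (String asymId : asymIds) {
                    single.add(asymId + "_" + transformId);
                }
            }
            models.add(single);
        }
        return models;
    }

    public static void main(String[] args) {
        List<String> chains = Arrays.asList("A", "B");
        List<String> transforms = Arrays.asList("1", "2");
        System.out.println("multiModel=true : " + chainIds(chains, transforms, true));
        System.out.println("multiModel=false: " + chainIds(chains, transforms, false));
    }
}
```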
      • getBiologicalAssembly

        public Structure getBiologicalAssembly​(PdbId pdbId,
                                               int bioAssemblyId,
                                               boolean multiModel)
                                        throws StructureException,
                                               IOException
        Returns the biological assembly for a given PDB ID and bioAssemblyId, by building the assembly from the biounit annotations found in Structure.getPDBHeader().

        Note, the number of available biological unit files varies. Many entries don't have a biological assembly specified (e.g. NMR structures), many entries have only one biological assembly (bioAssemblyId=1), and some structures have multiple biological assemblies.

        Parameters:
        pdbId - the PDB ID
        bioAssemblyId - the 1-based index of the biological assembly (0 gets the asymmetric unit)
        multiModel - if true, the output Structure will be a multi-model one with one transformId per model; if false, the output Structure will be as the original with added chains with renamed asymIds (in the form originalAsymId_transformId and originalAuthId_transformId).
        Returns:
        a structure object
        Throws:
        IOException
        StructureException - if bioAssemblyId < 0 or other problems occur while loading the structure
        Since:
        6.0.0
      • getBiologicalAssembly

        public Structure getBiologicalAssembly​(String pdbId,
                                               boolean multiModel)
                                        throws StructureException,
                                               IOException
        Returns the default biological unit (bioassemblyId=1, known in PDB as pdb1.gz). If it is not available, the asymmetric unit will be returned, e.g. for NMR structures.

        Biological assemblies can also be accessed using getStructure("BIO:[pdbId]")

        Parameters:
        pdbId - the PDB id
        multiModel - if true, the output Structure will be a multi-model one with one transformId per model; if false, the output Structure will be as the original with added chains with renamed asymIds (in the form originalAsymId_transformId and originalAuthId_transformId).
        Returns:
        a structure object
        Throws:
        IOException
        StructureException
        Since:
        4.2
      • getBiologicalAssemblies

        public List<Structure> getBiologicalAssemblies​(String pdbId,
                                                       boolean multiModel)
                                                throws StructureException,
                                                       IOException
        Returns all biological assemblies for a given PDB ID.
        Parameters:
        pdbId - the PDB ID
        multiModel - if true, the output Structure will be a multi-model one with one transformId per model; if false, the output Structure will be as the original with added chains with renamed asymIds (in the form originalAsymId_transformId and originalAuthId_transformId).
        Returns:
        a list of Structure objects, one per biological assembly
        Throws:
        StructureException
        IOException
        Since:
        5.0
      • getCachePath

        public String getCachePath()
        Returns the path that contains the caching file for utility data, such as domain definitions.
        Returns:
        the path used to cache utility data
      • getPath

        public String getPath()
        Get the path that is used to cache PDB files.
        Returns:
        path to a directory
      • getStructure

        public Structure getStructure​(String name)
                               throws IOException,
                                      StructureException
        Request a Structure based on a name.
                        Formal specification for how to specify the name:
        
                        name     := pdbID
                                       | pdbID '.' chainID
                                       | pdbID '.' range
                                       | scopID
                        range         := '('? range (',' range)? ')'?
                                       | chainID
                                       | chainID '_' resNum '-' resNum
                        pdbID         := [1-9][a-zA-Z0-9]{3}
                                       | PDB_[a-zA-Z0-9]{8}
                        chainID       := [a-zA-Z0-9]
                        scopID        := 'd' pdbID [a-z_][0-9_]
                        resNum        := [-+]?[0-9]+[A-Za-z]?
        
        
                        Example structures:
                        1TIM                 #whole structure
                        4HHB.C               #single chain
                        4GCR.A_1-83          #one domain, by residue number
                        3AA0.A,B             #two chains treated as one structure
                        PDB_00001TIM         #whole structure (extended format)
                        PDB_00004HHB.C       #single chain (extended format)
                        PDB_00004GCR.A_1-83  #one domain, by residue number (extended format)
                        PDB_00003AA0.A,B     #two chains treated as one structure (extended format)
                        d2bq6a1              #scop domain
         
        With the additional set of rules:
        • If only a PDB code is provided, the whole structure will be returned, including ligands, but only the first model (for NMR).
        • Chain IDs are case sensitive, PDB ids are not. To specify a particular chain write as: 4hhb.A or 4HHB.A
        • To specify a SCOP domain write a scopId e.g. d2bq6a1. Some flexibility can be allowed in SCOP domain names, see #setStrictSCOP(boolean)
        • URLs are accepted as well

        Note that this method should not be used in StructureIdentifier implementations to avoid circular calls.

        Parameters:
        name - the name of the structure to load, in the format specified above
        Returns:
        a Structure object, or null if name appears improperly formatted (e.g. too short)
        Throws:
        IOException - The PDB file cannot be cached due to IO errors
        StructureException - The name appeared valid but did not correspond to a structure. Also thrown by some submethods upon errors, e.g. for poorly formatted subranges.
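The name grammar given for getStructure can be approximated with regular expressions, as in the sketch below. StructureNameSketch, its Kind enum, and classify are hypothetical helpers and are not part of BioJava. The sketch reads the recursive range rule as a comma-separated list of one or more ranges, and it does not handle URLs or the relaxed SCOP naming that the real method accepts.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of the name grammar; NOT the parser used by AtomCache.
final class StructureNameSketch {
    enum Kind { WHOLE_STRUCTURE, CHAIN_OR_RANGE, SCOP_DOMAIN, UNRECOGNIZED }

    // pdbID   := [1-9][a-zA-Z0-9]{3} | PDB_[a-zA-Z0-9]{8}
    private static final String PDB_ID = "(?:[1-9][a-zA-Z0-9]{3}|PDB_[a-zA-Z0-9]{8})";
    // resNum  := [-+]?[0-9]+[A-Za-z]?
    private static final String RES_NUM = "[-+]?[0-9]+[A-Za-z]?";
    // One range: chainID, optionally followed by '_' resNum '-' resNum
    private static final String ONE_RANGE = "[a-zA-Z0-9](?:_" + RES_NUM + "-" + RES_NUM + ")?";

    private static final Pattern WHOLE = Pattern.compile(PDB_ID);
    // pdbID '.' followed by comma-separated ranges, with optional parentheses
    private static final Pattern RANGED = Pattern.compile(
            PDB_ID + "\\.\\(?" + ONE_RANGE + "(?:," + ONE_RANGE + ")*\\)?");
    // scopID  := 'd' pdbID [a-z_][0-9_]
    private static final Pattern SCOP = Pattern.compile("d" + PDB_ID + "[a-z_][0-9_]");

    static Kind classify(String name) {
        if (WHOLE.matcher(name).matches()) return Kind.WHOLE_STRUCTURE;
        if (RANGED.matcher(name).matches()) return Kind.CHAIN_OR_RANGE;
        if (SCOP.matcher(name).matches()) return Kind.SCOP_DOMAIN;
        return Kind.UNRECOGNIZED;
    }

    public static void main(String[] args) {
        String[] examples = {
            "1TIM", "4HHB.C", "4GCR.A_1-83", "3AA0.A,B",
            "PDB_00001TIM", "PDB_00004GCR.A_1-83", "d2bq6a1"
        };
        for (String example : examples) {
            System.out.println(example + " -> " + classify(example));
        }
    }
}
```

A check like this is useful for validating user input before calling getStructure, since a malformed name may yield null or a StructureException.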
      • getStructureForDomain

        public Structure getStructureForDomain​(ScopDomain domain,
                                               ScopDatabase scopDatabase,
                                               boolean strictLigandHandling)
                                        throws IOException,
                                               StructureException
        Returns the representation of a ScopDomain as a BioJava Structure object.
        Parameters:
        domain - a SCOP domain
        scopDatabase - A ScopDatabase to use
        strictLigandHandling - If set to false, hetero-atoms are included if and only if they belong to a chain to which the SCOP domain belongs; if set to true, hetero-atoms are included if and only if they are strictly within the definition (residue numbers) of the SCOP domain
        Returns:
        a Structure object
        Throws:
        IOException
        StructureException
      • setCachePath

        public void setCachePath​(String cachePath)
        Set the location at which utility data should be cached.
        Parameters:
        cachePath - the directory in which to cache utility data
      • setObsoleteBehavior

        public void setObsoleteBehavior​(LocalPDBDirectory.ObsoleteBehavior behavior)
        [Optional] This method changes the behavior when obsolete entries are requested. Current behaviors are:
        • THROW_EXCEPTION Throw a StructureException (the default)
        • FETCH_OBSOLETE Load the requested ID from the PDB's obsolete repository
        • FETCH_CURRENT Load the most recent version of the requested structure

          This setting may be silently ignored by implementations which do not have access to the server to determine whether an entry is obsolete, such as if #isAutoFetch() is false. Note that an obsolete entry may still be returned even if this is FETCH_CURRENT, if the entry is found locally.

        Parameters:
        behavior - the ObsoleteBehavior to apply when obsolete entries are requested
        Since:
        4.0.0
        See Also:
        #setFetchCurrent(boolean)
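The three behaviors listed for obsolete entries can be summarized as a small decision function. ObsoletePolicySketch, its Policy enum, and the IDs OLD1 and NEW1 are hypothetical and only mirror the documented constants; the sketch does not contact any server and is not how LocalPDBDirectory is implemented.

```java
// Hypothetical illustration of the three documented behaviors; NOT BioJava code.
final class ObsoletePolicySketch {
    // Mirrors the names of the documented ObsoleteBehavior constants.
    enum Policy { THROW_EXCEPTION, FETCH_OBSOLETE, FETCH_CURRENT }

    // Decides which ID to load. replacementId is null when the entry is not obsolete.
    static String resolve(String requestedId, String replacementId, Policy policy) {
        if (replacementId == null) {
            return requestedId; // current entry, nothing special to do
        }
        switch (policy) {
            case FETCH_OBSOLETE:
                return requestedId;   // load the old entry from the obsolete repository
            case FETCH_CURRENT:
                return replacementId; // load the entry that superseded it
            case THROW_EXCEPTION:
            default:
                // The library throws a StructureException here; a standard
                // exception keeps this sketch free of dependencies.
                throw new IllegalStateException(requestedId + " is obsolete");
        }
    }

    public static void main(String[] args) {
        // OLD1 and NEW1 are made-up placeholder IDs.
        System.out.println(resolve("OLD1", "NEW1", Policy.FETCH_OBSOLETE));
        System.out.println(resolve("OLD1", "NEW1", Policy.FETCH_CURRENT));
    }
}
```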
      • getObsoleteBehavior

        public LocalPDBDirectory.ObsoleteBehavior getObsoleteBehavior()
        Returns how this instance deals with obsolete entries. Note that this setting may be ignored by some implementations or in some situations, such as when #isAutoFetch() is false.

        For most implementations, the default value is THROW_EXCEPTION.

        Returns:
        The ObsoleteBehavior
        Since:
        4.0.0
      • setFetchBehavior

        public void setFetchBehavior​(LocalPDBDirectory.FetchBehavior fetchBehavior)
        Set the behavior for fetching files from the server.
        Parameters:
        fetchBehavior - the FetchBehavior to use
      • setPath

        public void setPath​(String path)
        Set the path that is used to cache PDB files.
        Parameters:
        path - path to a directory
      • getFiletype

        public StructureFiletype getFiletype()
        Returns the currently active file type that will be parsed.
        Returns:
        a StructureFiletype
      • setFiletype

        public void setFiletype​(StructureFiletype filetype)
        Set the file type that will be parsed.
        Parameters:
        filetype - a StructureFiletype
      • flagLoading

        protected void flagLoading​(PdbId pdbId)
      • flagLoadingFinished

        protected void flagLoadingFinished​(PdbId pdbId)
      • loadStructureFromMmtfByPdbId

        protected Structure loadStructureFromMmtfByPdbId​(PdbId pdbId)
                                                  throws IOException
        Load a Structure from MMTF, either from the local file system or from the web.
        Parameters:
        pdbId - the input PDB id
        Returns:
        the Structure object of the parsed structure
        Throws:
        IOException - error reading from Web or file system