API
Main types
CrystalNets.CrystalNet
— Type
CrystalNet{D,T<:Real}
Representation of a net as a topological abstraction of a crystal.
D
is the dimensionality of the net, which is the number of repeated dimensions of a single connected component. This dimensionality is not necessarily the dimension of the space the crystal is embedded into, which would always be 3 for real space.
T
is the numeric type used to store the exact coordinates of each vertex at the equilibrium placement.
CrystalNets.UnderlyingNets
— Type
UnderlyingNets
Grouping of the connected components of a structure according to their dimensionality.
CrystalNets.TopologicalGenome
— Type
TopologicalGenome
A topological genome computed by CrystalNets.jl
.
Store both the actual genome (as a PeriodicGraph
) and the name of the net, if recognized.
Like for a PeriodicGraph
, the textual representation of a TopologicalGenome
can be parsed back into a TopologicalGenome
:
julia> topology = topological_genome(CrystalNet(PeriodicGraph("2 1 2 0 0 2 1 0 1 2 1 1 0")))
hcb
julia> typeof(topology)
TopologicalGenome
julia> PeriodicGraph(topology) # The actual topological genome, as a PeriodicGraph
PeriodicGraph2D(2, PeriodicEdge2D[(1, 2, (-1,0)), (1, 2, (0,0)), (1, 2, (0,1))])
julia> parse(TopologicalGenome, "hcb") == topology
true
CrystalNets.TopologyResult
— Type
TopologyResult
The result of a topology computation on a structure with different Clustering
options.
Its representation includes the name of the clustering options along with their corresponding genome. The names are omitted if there is only one clustering option and it is Auto
.
Like for a TopologicalGenome
(or a PeriodicGraph
), the textual representation of a TopologyResult
can be parsed back to a TopologyResult
:
julia> mof5 = joinpath(dirname(dirname(pathof(CrystalNets))), "test", "cif", "MOF-5.cif");
julia> topologies = only(determine_topology(mof5, structure=StructureType.MOF, clusterings=[Clustering.Auto, Clustering.Standard, Clustering.PE]))[1]
AllNodes, SingleNodes: pcu
Standard: xbh
PE: cab
julia> typeof(topologies)
TopologyResult
julia> parse(TopologyResult, repr(topologies)) == topologies
true
See also TopologicalGenome
and InterpenetratedTopologyResult
.
CrystalNets.InterpenetratedTopologyResult
— Type
InterpenetratedTopologyResult <: AbstractVector{Tuple{TopologyResult,Int}}
The result of a topology computation on a structure containing possibly several interpenetrated substructures.
An InterpenetratedTopologyResult
can be seen as a list of (topology, n)
pairs where
- topology is the TopologyResult corresponding to the substructures.
- n is an integer such that the substructure is composed of an n-fold catenated net.
The entire structure can thus be decomposed in a series of substructures, each of them possibly decomposed into several catenated nets.
In this context, interpenetration and catenation have slightly different meanings:
- two (or more) substructures are interpenetrated if both are present in the unit cell, and are composed of vertices that have disjoint numbers. They may or may not all have the same topology since they are disjoint and independent subgraphs. For example:
julia> topological_genome(PeriodicGraph("2 1 1 0 1 2 2 0 1 2 2 1 0"))
2 interpenetrated substructures:
⋅ Subnet 1 → UNKNOWN 1 1 1 1
⋅ Subnet 2 → sql
- a net is n-fold catenated if the unit cell of a single connected component of the net is n times larger than the unit cell of the overall net. In that case, the net is actually made of n interpenetrating connected components, which all have the same topology. For example:
julia> topological_genome(PeriodicGraph("3 1 1 2 0 0 1 1 0 1 0 1 1 0 0 1"))
(2-fold) pcu
Both may occur inside a single structure, for example:
julia> topological_genome(PeriodicGraph("2 1 1 0 2 2 2 0 1 2 2 1 0"))
2 interpenetrated substructures:
⋅ Subnet 1 → (2-fold) UNKNOWN 1 1 1 1
⋅ Subnet 2 → sql
Note that catenation is a particular case of interpenetration: an n
-fold catenated net repeated into a supercell n
times larger becomes n
interpenetrated nets.
See also total_interpenetration
to abstract away the difference between interpenetration and catenation.
Example
julia> mof14 = joinpath(dirname(dirname(pathof(CrystalNets))), "test", "cif", "MOFs", "MOF-14.cif");
julia> topologies = determine_topology(mof14, structure=StructureType.MOF, clusterings=[Clustering.Auto, Clustering.Standard, Clustering.PE])
2 interpenetrated substructures:
⋅ Subnet 1 → AllNodes,SingleNodes,Standard: pto | PE: sqc11259
⋅ Subnet 2 → AllNodes,SingleNodes,Standard: pto | PE: sqc11259
julia> typeof(topologies)
InterpenetratedTopologyResult
julia> parse(InterpenetratedTopologyResult, repr(topologies)) == topologies
true
julia> topologies[2]
(AllNodes, SingleNodes, Standard: pto
PE: sqc11259, 1)
julia> topology, n = topologies[2]; # second subnet
julia> n # catenation multiplicity
1
julia> topology
AllNodes, SingleNodes, Standard: pto
PE: sqc11259
julia> typeof(topology)
TopologyResult
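Since an InterpenetratedTopologyResult is an AbstractVector of (topology, n) tuples, it can be traversed like any other vector. The following is a minimal sketch rather than a reference usage: it assumes that PeriodicGraphs.jl is installed alongside CrystalNets.jl to provide PeriodicGraph, and it reuses the graph of the combined interpenetration and catenation example above.

```julia
using CrystalNets, PeriodicGraphs

# Graph taken from the combined interpenetration/catenation example above.
result = topological_genome(PeriodicGraph("2 1 1 0 2 2 2 0 1 2 2 1 0"))

# Each element of the result is a (TopologyResult, Int) tuple.
for (i, (topology, n)) in enumerate(result)
    println("Subnet ", i, ": ", n, "-fold catenated, topology = ", topology)
end

# Collect the catenation multiplicity of every substructure.
multiplicities = [n for (_, n) in result]
```

The variable names result and multiplicities are only illustrative.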
Main functions
CrystalNets.determine_topology
— Function
determine_topology(path, options::Options)
determine_topology(path; kwargs...)
Compute the topology of the structure described in the file located at path
. This is exactly equivalent to calling topological_genome(UnderlyingNets(parse_chemfile(path, options)))
.
Return an InterpenetratedTopologyResult
.
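The two call forms listed above can be compared side by side. This is a hedged sketch: it assumes that the MOF-5.cif test file used earlier on this page ships with the installed package, and that Options can be built from the same keyword arguments (it is accessed as CrystalNets.Options here in case it is not exported).

```julia
using CrystalNets

# CIF file from the test folder of the package, as used earlier on this page.
mof5 = joinpath(dirname(dirname(pathof(CrystalNets))), "test", "cif", "MOF-5.cif")

# Form 1: options given directly as keyword arguments.
res_kwargs = determine_topology(mof5; structure=StructureType.MOF)

# Form 2: the same options bundled in an explicit Options object.
opts = CrystalNets.Options(; structure=StructureType.MOF)
res_opts = determine_topology(mof5, opts)

println(res_kwargs)
println(res_opts)
```

Both calls are expected to describe the same topology; the keyword form is simply more convenient for one-off calls.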
CrystalNets.determine_topology_dataset
— Function
determine_topology_dataset(path, save, autoclean, showprogress, options::Options)
determine_topology_dataset(path; save=true, autoclean=true, showprogress=true, kwargs...)
Given a path to a directory containing structure input files, compute the topology of each structure within the directory. Return a dictionary linking each file name to the result. The result is an InterpenetratedTopologyResult
, containing the topological genome, the name if known and the stability of the net. In case of error, the exception is reported.
Warnings will be toggled off (unless force_warn
is set) and it is strongly recommended not to export any file since those actions may critically reduce performance, especially for numerous files.
If save
is set, the result is also stored in a julia serialized file located at "$path/../results_$i" where i
is the lowest integer such that this path does not already exist at the start of the computation. While processing, this path will be used to create a directory storing the current state of the computation: to continue an interrupted computation, simply pass this temporary directory as the path. If autoclean
is set, this directory is removed at the end if the computation was successful.
If save
is set and autoclean
is unset, the directory of temporary files will be renamed into "$path/../results_$i.OLD$j".
If showprogress
is set, a progress bar will be displayed representing the number of processed files.
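A minimal sketch of a dataset run follows. It makes several assumptions: the MOF-5.cif test file ships with the installed package, the dataset is a throwaway directory created for the occasion, and save=false is chosen so that no serialized result file is requested. The directory layout and variable names are purely illustrative.

```julia
using CrystalNets

# Build a tiny throwaway dataset: one CIF file copied into a nested temporary
# directory, so that anything created next to the dataset stays inside `root`.
mof5 = joinpath(dirname(dirname(pathof(CrystalNets))), "test", "cif", "MOF-5.cif")
root = mktempdir()
dataset = joinpath(root, "cifs")
mkdir(dataset)
cp(mof5, joinpath(dataset, "MOF-5.cif"))

# save=false: do not store a serialized result file.
# showprogress=false: do not display a progress bar.
results = determine_topology_dataset(dataset; save=false, showprogress=false,
                                     structure=StructureType.MOF)

# One entry per input file; the value is the result or the reported exception.
for (file, topology) in results
    println(file, " => ", topology)
end
```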
CrystalNets.parse_chemfile
— Function
parse_chemfile(path, options::Options)
parse_chemfile(path; kwargs...)
Parse a file given in any recognised chemical format and extract the topological information. Such format can be .cif or any file format recognised by Chemfiles.jl that contains all the necessary topological information.
CrystalNets.topological_genome
— Function
topological_genome(net::CrystalNet{D,T})::String where {D,T}
Compute the topological genome of a net. The topological genome is an invariant of the net, meaning that it does not depend on its representation. It is also the string representation of a D-periodic graph such that PeriodicGraph{D}(topological_genome(net))
is isomorphic to net.pge.g
(except possibly if the ignore_types
option is unset).
Return a TopologicalGenome
.
Options must be passed directly within net
.
topological_genome(g::Union{String,PeriodicGraph}, options::Options=Options())
topological_genome(g::Union{String,PeriodicGraph}; kwargs...)
Compute the topological genome of a periodic graph. If given a topological key (as a string), it is converted to a PeriodicGraph
first.
Return a TopologicalGenome
.
topological_genome(group::UnderlyingNets)
Compute the topological genome of each subnet stored in group
.
Return an InterpenetratedTopologyResult
.
Options must be passed directly within the subnets.
Options
CrystalNets.Options
— Type
Options
Different options, passed as keyword arguments.
Basic options
- name: a name for the structure.
- bonding: one of the Bonding options. Default is Bonding.Auto.
- structure: one of the StructureType options. Default is StructureType.Auto.
- clusterings: a list of Clustering options. Default is [Clustering.Auto].
Exports
For each export option, the accepted values are either a string, indicating the path to the directory in which to store the export, or a boolean, specifying whether or not to do the export. If the value is true
, a path will be automatically determined. An empty string is equivalent to false
.
- export_input: the parsed structure, as a .vtf.
- export_trimmed: the parsed structure after iteratively removing all atoms having only one neighbour, as a .vtf.
- export_attributions: the attribution of vertices into SBUs, as a .pdb. Only relevant for the MOF StructureType.
- export_clusters: the clustering of vertices, as a .vtf.
- export_net: the overall extracted net on which the topology is computed, as a .vtf.
- export_subnets: each connected component of the overall net as a separate .vtf file. These subnets are defined after grouping vertices according to their Clustering.
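As an illustration of the two kinds of accepted values, the sketch below disables one export with a boolean and redirects another to a chosen directory with a string. It assumes that the MOF-5.cif test file ships with the installed package; the output directory is a temporary one created only for this example, and no particular file name is assumed for the export.

```julia
using CrystalNets

mof5 = joinpath(dirname(dirname(pathof(CrystalNets))), "test", "cif", "MOF-5.cif")
outdir = mktempdir()  # directory that should receive the requested export

result = determine_topology(mof5; structure=StructureType.MOF,
                            export_input=false,  # boolean: skip this export
                            export_net=outdir)   # string: export into this directory

# List whatever was written to the export directory.
foreach(println, readdir(outdir))
```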
Other options
- ignore_atoms: set of atom symbols to ignore (for instance [:C,:H] will remove hydrocarbon solvent residues).
- ignore_types: disregard atom types to compute the topology, making pcu and pcu-b identical for example (default is true).
- cutoff_coeff: coefficient used to detect bonds. Default is 0.75; higher values will include bonds that were considered too long before.
- skip_minimize: assume that the cell is already the unit cell (default is false).
- dimensions: the set of crystal net dimensions to consider. For instance, putting Set(3) will ensure that only 3-dimensional nets are considered. Default is Set([1,2,3]).
- cluster_kinds: a ClusterKinds. Default separates organic and inorganic SBUs.
- ignore_homoatomic_bonds: a Set{Symbol} such that all X-X bonds of the net are removed if X is an atom whose type is in ignore_homoatomic_bonds.
- max_polyhedron_radius: an integer specifying the maximum number of bonds between two corners of the coordination polyhedron built for the Clustering.PE option. Default is 4.
- Hbonds: set to true to include hydrogen bonds. Default is false.
- Hbonds_dmax: the maximum length of a hydrogen bond. Only used if Hbonds is set. Default is 2.5 Å.
- Hbonds_θmax: the maximum angle of a hydrogen bond. Only used if Hbonds is set. Default is 30°.
- Hbonds_nmax: the maximum number of hydrogen bonds per hydrogen. Only used if Hbonds is set. Default is 1.
Miscellaneous options
These boolean options have a default value that may be determined by Bonding
, StructureType
and Clustering
. They can be directly overridden here.
- bond_adjacent_sbus: bond together SBUs which are only separated by a single C atom.
- authorize_pruning: remove colliding atoms in the input. Default is true.
- wider_metallic_bonds: for bond detections, metals have a radius equal to 1.5× their Van der Waals radius. Default is false, unless StructureType is MOF or Zeolite.
- ignore_homometallic_bonds: remove all bonds between two metal atoms of the same kind.
- reduce_homometallic_bonds: when guessing bonds, do not bond two metallic atoms of the same type if they are up to third neighbours anyway. Default is false, unless StructureType is MOF.
- ignore_metal_cluster_bonds: do not bond two metallic clusters together if they share at least one non-metallic neighbour. Default is false.
- ignore_low_occupancy: atoms with occupancy lower than 0.5 are ignored. Default is false.
- detect_paddlewheels: detect paddle-wheel patterns and group them into an inorganic vertex. Default is true.
- detect_organiccycles: detect organic cycles and collapse all belonging C atoms into a new vertex. Default is true.
- detect_pe: detect organic points-of-extension (organic atoms bonded to another SBU) and transform them into vertices. Default is true.
- cluster_simple_pe: cluster adjacent points-of-extension if they are not part of a cycle. Default is true.
- separate_metals: separate each metal atom into its own vertex (instead of grouping them to form metallic clusters if they are adjacent or bonded by an oxygen). Default is false, unless Clustering is Standard or PEM.
- premerge_metalbonds: when a periodic metallic SBU is detected, cluster together bonded metal atoms of the same kind before splitting the SBU.
- split_O_vertex: if a vertex is composed of a single O, remove it and bond together all of its neighbors, unless removing its hydrogen bonds would make it bivalent. Default is true.
- unify_sbu_decomposition: apply the same rule to decompose both periodic and finite SBUs. Default is false.
- force_warn: force printing warnings and information even during ..._dataset function calls. Default is false.
- label_for_type: use the atom label instead of its type. Default is false. Note that setting this to true will result in an error when detecting bonds if any atom has a label which is not an element of the periodic table.
- track_mapping: track the mapping of vertices from the input to the final genome. To use it, set it to true: at the end of the topology computation, the track_mapping field will hold a list l such that l[i] is the number of the vertex in the result topology that corresponds to atom i in the initial structure. In the case of determine_topology... calls, this initial structure is the file exported through the export_input option. Default is nothing, which does no tracking. track_mapping also accepts being set to Int[] instead of true: the list will then be modified in-place.
- keep_single_track: set to false to modify the behaviour of the track_mapping option. By default (true), track_mapping is only allowed when the structure corresponds to a single topology: this excludes structures with multiple connected components, as well as multiple values in clusterings. This is necessary since otherwise the list held in track_mapping at the end of the computation could refer to any of multiple topologies. Setting keep_single_track to false lifts this requirement; in this case, the mapping will be printed at the end of the topology computation for each topology, but it will not be held in the track_mapping field (and will not be made computationally accessible).
Internal fields
These fields are for internal use and should not be modified by the user:
- dryrun: store information on possible options to try (for guess_topology).
- _pos: the positions of the centre of the clusters collapsed into vertices.
- error: store the first error that occurred when building the net.
- throw_error: if set, throw the error instead of storing it in the error field.
CrystalNets.StructureType
— Module
StructureType
Selection mode for the crystal structure. This choice impacts the bond detection algorithm as well as the clustering algorithm used.
The choices are:
- Auto: No specific structure information. Use Van der Waals radii for bond detection and Input as Clustering, or EachVertex if the input does not provide residues.
- MOF: Use Van der Waals radii for non-metallic atoms and larger radii for metals. Detect organic and inorganic clusters and subdivide them according to AllNodes and SingleNodes to identify underlying nets.
- Cluster: similar to MOF but metallic atoms are not given a wider radius.
- Zeolite: Same as Auto but use larger radii for metals (and metalloids) and attempt to enforce that each O atom has exactly two neighbours and that they are not O atoms.
- Guess: try to identify the clusters as in Cluster. If it fails, fall back to Auto.
CrystalNets.Bonding
— Module
Bonding
Selection mode for the detection of bonds. The choices are:
- Input: use the input bonds. Fail if those are not specified.
- Guess: guess bonds using a variant of chemfiles / VMD algorithm.
- Auto: if the input specifies bonds, use them unless they look suspicious (too small or too large according to a heuristic). Otherwise, fall back to Guess.
- NoBond: do not guess or use any bond. This cannot be used to determine topology.
CrystalNets.Clustering
— Module
Clustering
The clustering algorithm used to group atoms into vertices.
This choice only affects the creation of an UnderlyingNets
from a Crystal
, not the Crystal
itself, and in particular not the bond detection algorithm.
The basic choices are:
- Auto: determined using the StructureType.
- Input: use the input residues as vertices. Fail if some atom does not belong to a residue.
- EachVertex: each atom is its own vertex. Vertices with degree 2 or lower are iteratively collapsed into edges until all vertices have degree 3 or more.
The next clustering options are designed for MOFs but may target other kinds of frameworks. In all cases, the clusters are refinements on top of already-defined clusters, such as the organic and inorganic SBUs defined by the MOF
structure. Except for AllNodes
, infinite clusters (such as the inorganic clusters in a rod MOF) are split into new finite clusters using heuristics.
- SingleNodes: each already-defined cluster is mapped to a vertex.
- AllNodes: keep points of extension for organic clusters.
- Standard: make each metallic atom its own vertex and do not bond those together if they share a common non-metallic neighbour.
- PE: stands for Points of Extension. Keep points of extension for organic clusters, remove metallic centres and bond their surrounding points of extension.
- PEM: stands for Points of Extension and Metals. Keep points of extension for organic clusters and each metal centre as a separate vertex.
CrystalNets.ClusterKinds
— Type
ClusterKinds(sbus, toclassify=Int[])
Description of the different kinds of SBUs there should be when making clusters.
sbus
should be a list of sets of symbols, each set containing the different elements acceptable in this SBU (an empty set designates all remaining elements). All elements of the same category of the periodic table can be grouped together by putting the name of the category. For example, ClusterKinds([[:Au, :halogen, :nonmetal], [:metal, :metalloid], []])
means that there are three kinds of SBUs:
- the first kind can only hold halogens, non-metals and Au atoms
- the second kind can only hold metalloids and metals (except Au)
- the third kind can hold all the other elements.
The list of possible categories is: :actinide, :noble (for noble gas), :halogen, :lanthanide, :metal, :metalloid and :nonmetal.
toclassify
contains the list of SBUs which are not actual SBUs but only groups of atoms waiting to be merged to a neighboring SBU. The neighboring SBU is chosen by order in the sbus
list.
The cluster kinds used by default are CrystalNets.ClusterKinds([[:metal, :actinide, :lanthanide], [:C,], [:P, :S], [:nonmetal, :metalloid, :halogen], [:noble]], [3, 4])
. This means that all atoms that are either metals, actinides or lanthanides are assigned to class 1 and all C atoms to class 2. Afterwards, each group of adjacent P or S atoms is assigned either class 1 if any of its neighbors is of class 1, or class 2 otherwise if any of its neighbors is of class 2. If no such neighbor exists, it is assigned to class 1. Finally, each group of adjacent nonmetals, metalloids and halogens is assigned class 1 or 2 following the same rule as for P and S atoms.
At the end of the procedure, all atoms are thus given a class between 1
and length(sbus)
which is not in toclassify
. See also find_sbus!
for the implementation of this procedure.
To determine which SBU kind corresponds to a given atom, use getindex
:
julia> sbu_kinds = CrystalNets.ClusterKinds([[:nonmetal, :halogen], [:metal, :F]]);
julia> sbu_kinds[:O] # nonmetal
1
julia> sbu_kinds[:Au] # metal
2
julia> sbu_kinds[:F] # specifically F
2
julia> sbu_kinds[:Ne] # no given SBU kind
0
If no empty set has been explicitly added to sbus
and an element falls outside of the included categories, the returned SBU kind is 0.
An exception is made for nonmetals which are part of an aromatic heterocycle: those will be treated separately and put in the SBU of the corresponding carbons.
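The three-kind definition given as an example in the description above can be queried with getindex in the same way. This is a small sketch under the assumption that Cl is classified as a halogen, Fe as a metal and Ne as a noble gas; the variable name kinds is only illustrative.

```julia
using CrystalNets

# Same definition as in the description above: three kinds of SBUs,
# the last one (empty set) collecting all remaining elements.
kinds = CrystalNets.ClusterKinds([[:Au, :halogen, :nonmetal], [:metal, :metalloid], []])

kind_Au = kinds[:Au]  # Au is explicitly listed in the first kind
kind_Cl = kinds[:Cl]  # halogen, also in the first kind
kind_Fe = kinds[:Fe]  # metal other than Au, second kind
kind_Ne = kinds[:Ne]  # noble gas, not listed: falls in the catch-all third kind

println((kind_Au, kind_Cl, kind_Fe, kind_Ne))
```

Such an object can then be supplied through the cluster_kinds option described in the Options section.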
Other utilities
CrystalNets.toggle_error
— Function
toggle_error(to=nothing)
Toggle @error visibility on (if to == true
) or off (if to == false
). Without an argument, toggle on and off repeatedly at each call.
CrystalNets.toggle_warning
— Function
toggle_warning(to=nothing)
Toggle warnings on (if to == true
) or off (if to == false
). Without an argument, toggle on and off repeatedly at each call.
CrystalNets.toggle_export
— Function
toggle_export(to=nothing)
Toggle default exports on (if to == true
) or off (if to == false
). Without an argument, toggle on and off repeatedly at each call.
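Passing an explicit boolean makes the resulting state independent of any earlier call, which is usually preferable in scripts. A minimal sketch, using the qualified names in case the functions are not exported:

```julia
using CrystalNets

# Explicitly switch warnings and default exports off, for instance before
# processing many structures in a row.
CrystalNets.toggle_warning(false)
CrystalNets.toggle_export(false)

# ... topology computations would go here ...

# Explicitly switch warnings back on afterwards, if desired.
CrystalNets.toggle_warning(true)
```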
CrystalNets.export_default
— Function
export_default(c::Union{PeriodicGraph,CrystalNet,Crystal}, obj=nothing, name=nothing, path=tempdir(); repeats=nothing)
Export a VTF representation of an object at the given path
.
obj
is a String
describing the nature of the object, such as "net", "clusters" or "subnet" for example. Default is string(typeof(c))
.
name
is a String
inserted in the exported file name. Default is a tempname
.
repeats
is the maximum distance between a represented atom out of the unit cell and one inside. Default is between 2 and 6, depending on obj
and the size of the graph.
Archive
CrystalNets.REVERSE_CRYSTALNETS_ARCHIVE
— Constant
const REVERSE_CRYSTALNETS_ARCHIVE::Dict{String,String}
Reverse of CRYSTALNETS_ARCHIVE
.
Can be used to query the topological genome of known nets, as in:
julia> REVERSE_CRYSTALNETS_ARCHIVE["dia"]
"3 1 2 0 0 0 1 2 0 0 1 1 2 0 1 0 1 2 1 0 0"
julia> topological_genome(CrystalNet(PeriodicGraph(ans)))
dia
It is also possible to directly access the topological genome as a PeriodicGraph
by parsing the name as a TopologicalGenome
:
julia> PeriodicGraph(parse(TopologicalGenome, "pcu"))
PeriodicGraph3D(1, PeriodicEdge3D[(1, 1, (0,0,1)), (1, 1, (0,1,0)), (1, 1, (1,0,0))])
julia> string(PeriodicGraph(parse(TopologicalGenome, "nbo"))) == REVERSE_CRYSTALNETS_ARCHIVE["nbo"]
true
CrystalNets.parse_arc
— Function
parse_arc(file)
Parse a .arc Systre archive such as the one used by the RCSR. Return a pair (flag, pairs)
.
flag
is set if the archive corresponds to one generated by a compatible release of CrystalNets. If unset, the genomes of the archive may not be the same as those computed by CrystalNets for the same nets.
pairs
is a Dict{String,String}
whose entries have the form genome => id
where id
is the name of the net and genome
is the topological genome corresponding to this net (given as a string of whitespace-separated values parseable by PeriodicGraph
).
CrystalNets.parse_arcs
— Function
parse_arcs(file)
Parse a folder containing .arc Systre archives such as the ones used by the RCSR. Return a pair (flag, pairs)
with the same convention as parse_arc
.
CrystalNets.clean_default_archive!
— Function
clean_default_archive!(custom_arc; validate=true, refresh=true)
Erase the default archive used by CrystalNets.jl to recognize known topologies and replace it with a new one from the file located at custom_arc
.
The validate
parameter controls whether the new file is checked and converted to a format usable by CrystalNets.jl. If unsure, leave it set.
The refresh
optional parameter controls whether the current archive should be replaced by the new default one.
This archive will be kept and used for subsequent runs of CrystalNets.jl, even if you restart your Julia session.
To only change the archive for the current session, use change_current_archive!(custom_arc)
.
See also refresh_current_archive!
for similar uses.
The previous default archive cannot be recovered afterwards, so make sure to keep a copy if necessary. The default archive is the set of ".arc" files located at joinpath(dirname(dirname(pathof(CrystalNets))), "archives")
.
CrystalNets.set_default_archive!
— Function
set_default_archive!()
Set the current archive as the new default archive.
This archive will be kept and used for subsequent runs of CrystalNets.jl, even if you restart your Julia session.
CrystalNets.empty_default_archive!
— Function
empty_default_archive!(; refresh=true)
Empty the default archive. This will prevent CrystalNets from recognizing any topology until it is explicitly added.
The refresh
optional parameter controls whether the current archive should also be emptied.
This empty archive will be kept and used for subsequent runs of CrystalNets.jl, even if you restart your Julia session. If you only want to empty the current archive, do empty!(CrystalNets.CRYSTALNETS_ARCHIVE)
.
CrystalNets.change_current_archive!
— Function
change_current_archive!(custom_arc; validate=true)
Erase the current archive used by CrystalNets.jl to recognize known topologies and replace it with the archive stored in the file located at custom_arc
.
The validate
optional parameter controls whether the new file is checked and converted to a format usable by CrystalNets.jl. If unsure, leave it set.
This modification will only last for the duration of this Julia session.
If you wish to change the default archive and use it for subsequent runs, use clean_default_archive!
.
Using an invalid archive will make CrystalNets.jl unusable. If this happens, simply run refresh_current_archive!()
to revert to the default archive.
CrystalNets.refresh_current_archive!
— Function
refresh_current_archive!()
Revert the current topological archive to the default one.
CrystalNets.add_to_current_archive!
— Function
add_to_current_archive!(id, genome)
Mark genome
as the topological genome associated with the name id
in the current archive.
The input id
and genome
are not modified by this operation.
This modification will only last for the duration of this Julia session.
If you wish to save the archive and use it for subsequent runs, use set_default_archive!
after calling this function.
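A session-only addition could look like the sketch below. Several points are assumptions: the genome is passed as a string in the same whitespace-separated format that CrystalNets prints, the name "mychain" is hypothetical, and CRYSTALNETS_ARCHIVE is taken to map genomes to names (the reverse of REVERSE_CRYSTALNETS_ARCHIVE).

```julia
using CrystalNets

# Genome of the unnamed net displayed as "UNKNOWN 1 1 1 1" earlier on this page.
genome = "1 1 1 1"

# Associate a hypothetical name with it, for the current session only.
CrystalNets.add_to_current_archive!("mychain", genome)

# Look the genome up in the current archive (assumed to map genome => name).
stored_name = CrystalNets.CRYSTALNETS_ARCHIVE[genome]
println(stored_name)

# Discard the session-only change and go back to the default archive.
CrystalNets.refresh_current_archive!()
```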
CrystalNets.export_arc
— Function
export_arc(path, arc=CRYSTALNETS_ARCHIVE)
Export archive arc
to the specified path
. If unspecified, the exported archive is the current one.
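Exporting the current archive and reading it back with parse_arc gives a simple round trip. This is a sketch only: the file name current.arc and the temporary directory are arbitrary choices made for the example.

```julia
using CrystalNets

# Write the current archive to a temporary .arc file.
arcfile = joinpath(mktempdir(), "current.arc")
CrystalNets.export_arc(arcfile)

# Read it back: `flag` tells whether the archive comes from a compatible
# CrystalNets release, and the dictionary maps each genome to the name of its net.
flag, genome_to_name = CrystalNets.parse_arc(arcfile)
println(length(genome_to_name), " nets read back; compatible archive: ", flag)
```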