Mapping of vertices
Principle
CrystalNets.jl extracts the net underlying the input structure, and can identify its name when it exists. Sometimes, it is useful to retrieve the mapping between the input structure and the vertices of the net. This is possible with CrystalNets.jl, using the track_mapping
option (see full list of Options
)
There are multiple ways of using it. One consists in preparing an empty list and passing it to the track_mapping
option. The list will contain the mapping at the end of the computation.
In the case of a call to determine_topology
, the net can be extracted from the InterpenetratedTopologyResult
returned by the function, while the number of the input atoms is that corresponding to the "name" field of the .vtf file exported with the export_input
option. For example:
julia> path_to_PIVCIJ = joinpath(dirname(dirname(pathof(CrystalNets))), "test", "cif", "PIVCIJ.cif");
julia> mapping = Int[]; # pass an empty list to track_mapping
julia> topology = determine_topology(path_to_PIVCIJ; structure=StructureType.MOF, clusterings=[Clustering.SingleNodes], track_mapping=mapping, export_input=true)
Export of input is enabled: saving file at /tmp/input_PIVCIJ.vtf
SingleNodes: fsc
julia> fsc = last(only(first(only(topology))))
fsc
julia> fsc_graph = PeriodicGraph(fsc)
PeriodicGraph3D(2, PeriodicEdge3D[(1, 1, (0,0,1)), (1, 2, (-1,0,0)), (1, 2, (-1,1,0)), (1, 2, (0,0,0)), (1, 2, (0,1,0))])
julia> degree(fsc_graph) # fsc has two vertices, of degree respectively 6 and 4
2-element Vector{Int64}:
6
4
julia> first(Iterators.drop(eachline("/tmp/input_PIVCIJ.vtf"), 4))
"atom 0 type Co name 1 resid 0 atomicnumber 27"
julia> mapping[1] # atom 1, Co, is mapped to the 6-coordinated vertex
1
julia> first(Iterators.drop(eachline("/tmp/input_PIVCIJ.vtf"), 11))
"atom 7 type C name 8 resid 0 atomicnumber 6"
julia> mapping[8] # atom 8, C, is part of the 4-coordinated ligand
2
julia> first(Iterators.drop(eachline("/tmp/input_PIVCIJ.vtf"), 13))
"atom 9 type H name 10 resid 0 atomicnumber 1"
julia> mapping[10] # atom 10, H, has been removed early and has no mapping
0
The operation that goes from the InterpenetratedTopologyResult
, returned by determine_topology
, to the actual genome stored as a PeriodicGraph
, is explained in a dedicated FAQ question
When using the lower-level topological_genome
function, the call will include an Options
, either given explicitly (for PeriodicGraph
input) or within the passed object if it is a CrystalNet
or an UnderlyingNets
. In such cases, the mapping can be fetched from the track_mapping
field of the given Options
after the computation, as long as it was initialized with either an empty list, like in the previous example, or simply the boolean true
. For example :
julia> crystal_PIVCIJ = parse_chemfile(path_to_PIVCIJ; clusterings=[Clustering.SingleNodes], track_mapping=true);
julia> topological_genome(CrystalNet(crystal_PIVCIJ))
fsc
julia> crystal_PIVCIJ.options.track_mapping == mapping
true
Restriction
In order to have a one-to-one correspondence between the input structure and the net, there must be exactly one net that corresponds to the input. Using track_mapping
will thus fail when there is only one track_mapping
field for either multiple interpenetrated substructures, or multiple clusterings.
For example, the following command errors:
julia> determine_topology(path_to_PIVCIJ; structure=StructureType.MOF, track_mapping=mapping)
ERROR: ArgumentError: Cannot keep a single mapping track for multiple sub-nets. Please use keep_single_track=true only on single components with a single clustering.
[...]
The reason is that the structure=StructureType.MOF
argument makes the default Clustering.Auto
value equivalent to both Clustering.SingleNodes
and Clustering.AllNodes
, so the previous call is actually evaluating two different clusterings, hence the error.
Workaround: the keep_single_track=false
option
A quick workaround consists in adding the keep_single_track=false
option. The result of the mapping will then be printed after the computation, but it cannot be retrieved computationally:
julia> mapping2 = Int[];
julia> determine_topology(path_to_PIVCIJ; structure=StructureType.MOF, track_mapping=mapping2, keep_single_track=false)
Mapping for CrystalNets.Clustering._Clustering[CrystalNets.Clustering.AllNodes][1, 1, 1, 1, 1, 0, 0, 2, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 3, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Mapping for CrystalNets.Clustering._Clustering[CrystalNets.Clustering.SingleNodes][1, 1, 1, 1, 1, 0, 0, 2, 2, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 2, 2, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
AllNodes: sqc27
SingleNodes: fsc
julia> mapping2 # untouched, cannot be used to retrieve the mapping
Int64[]
Subnet separation
A more proper solution to this issue consists in creating separate CrystalNet
instances, one for each substructure and for each target clustering, and thus with their own separate track_mapping
field. The UnderlyingNets
constructor can be used for this purpose, starting from a CrystalNets.Crystal
. For example:
julia> path_to_WEBZEK = joinpath(dirname(dirname(pathof(CrystalNets))), "test", "cif", "MOFs", "WEBZEK.cif");
julia> crystal_webzek = parse_chemfile(path_to_WEBZEK; structure=StructureType.MOF);
julia> unets_webzek = UnderlyingNets(crystal_webzek);
julia> topological_genome(unets_webzek) # there are 2 interpenetrated subnets, with 2 clusterings each
2 interpenetrated substructures:
⋅ Subnet 1 → AllNodes,SingleNodes: bcu
⋅ Subnet 2 → AllNodes,SingleNodes: dia
julia> webzekA, webzekB = first.(unets_webzek.D3) # only keep 3D nets here
2-element Vector{Vector{CrystalNet3D}}:
[CrystalNet3D{Rational{Int32}} of WEBZEK_1 with 8 vertices and 32 edges (clustering: AllNodes), CrystalNet3D{Rational{Int32}} of WEBZEK_1 with 8 vertices and 32 edges (clustering: SingleNodes)]
[CrystalNet3D{Rational{Int32}} of WEBZEK_2 with 4 vertices and 8 edges (clustering: AllNodes), CrystalNet3D{Rational{Int32}} of WEBZEK_2 with 4 vertices and 8 edges (clustering: SingleNodes)]
Say we focus on a particular net, for instance the AllNodes
clustering of substructure B. After selecting it, its Options
can be modified by wrapping it in the CrystalNet
constructor and using the appropriate keyword arguments:
julia> subnet = webzekB[1]
CrystalNet3D{Rational{Int32}} of WEBZEK_2 with 4 vertices and 8 edges (clustering: AllNodes)
julia> newmapping = Int[];
julia> newsubnet = CrystalNet(subnet; track_mapping=newmapping);
julia> topological_genome(newsubnet)
dia
julia> newmapping
4-element Vector{Int64}:
1
2
2
1
Note that the call CrystalNet(subnet; track_mapping=newmapping)
does not modify subnet
: it simply creates a copy of subnet
with a modified track_mapping
field. As a consequence, subnet.options.track_mapping
cannot be used to track the result of the computation of topological_genome(newsubnet)
:
julia> newmapping === newsubnet.options.track_mapping != subnet.options.track_mapping
true