Fingerprints
COSMolKit exposes RDKit-style Morgan fingerprints as fixed-length bit vectors.
The Python Fingerprint object is a sparse view over that binary vector:
on_bits() returns the bit indexes whose value is 1. It is not a dense
floating-point neural embedding.
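To make the sparse-versus-dense distinction concrete, here is a minimal pure-Python sketch of how a list of on-bit indexes relates to the underlying fixed-length binary vector. The helper names `on_bits_from_dense` and `dense_from_on_bits` are hypothetical and are not part of COSMolKit; they only illustrate the idea.

```python
def on_bits_from_dense(bits):
    """Return the indexes whose value is 1 in a dense 0/1 sequence."""
    return [i for i, b in enumerate(bits) if b]


def dense_from_on_bits(on_bits, n_bits):
    """Rebuild the dense 0/1 vector of length n_bits from on-bit indexes."""
    dense = [0] * n_bits
    for i in on_bits:
        if not 0 <= i < n_bits:
            raise ValueError(f"bit index {i} is outside 0..{n_bits - 1}")
        dense[i] = 1
    return dense


# Illustrative 8-bit vector (hand-written, not produced by COSMolKit).
dense = [0, 1, 0, 0, 1, 0, 0, 1]
sparse = on_bits_from_dense(dense)
print(sparse)                                            # [1, 4, 7]
print(dense_from_on_bits(sparse, len(dense)) == dense)   # True
```

The sparse form stores only the positions that are set, which is compact when a 2048-bit fingerprint has only a few dozen bits on.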
Single Molecules
from cosmolkit import Molecule
mol = Molecule.from_smiles("c1ccccc1O")
fp = mol.fingerprint_morgan(radius=2, n_bits=2048)
print(fp.n_bits())
print(fp.on_bits())
Tanimoto similarity is computed directly on Fingerprint values:
phenol = Molecule.from_smiles("c1ccccc1O").fingerprint_morgan()
benzene = Molecule.from_smiles("c1ccccc1").fingerprint_morgan()
print(phenol.tanimoto(benzene))
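For reference, the standard Tanimoto coefficient on bit vectors is the number of shared on bits divided by the number of bits set in either vector. The sketch below computes it from two lists of on-bit indexes. The function name `tanimoto_from_on_bits` is hypothetical, and this is the textbook definition rather than a description of COSMolKit's internal implementation.

```python
def tanimoto_from_on_bits(a_bits, b_bits):
    """Tanimoto coefficient |A & B| / |A | B| for two on-bit index lists."""
    a, b = set(a_bits), set(b_bits)
    union = len(a | b)
    if union == 0:
        # Both fingerprints are empty. Returning 0.0 is a choice made for
        # this sketch; libraries differ on this edge case.
        return 0.0
    return len(a & b) / union


# Illustrative index lists (hand-written, not real fingerprints).
print(tanimoto_from_on_bits([1, 4, 7, 9], [1, 4, 8]))  # 2 shared / 5 total = 0.4
```

Both fingerprints must be generated with the same parameters (in particular the same n_bits) for the comparison to be meaningful.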
Additional Output
fingerprint_morgan_with_output() returns a MorganFingerprintResult with
the fingerprint and RDKit-style provenance data:
result = mol.fingerprint_morgan_with_output(radius=2, n_bits=2048)
output = result.additional_output()
print(result.fingerprint().on_bits())
print(output.atom_counts())
print(output.atom_to_bits())
print(output.bit_info_map())
print(output.atoms_per_bit())
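The provenance views are closely related to one another. As a rough illustration, the sketch below inverts a bit-info-style mapping into an atom-to-bits listing. It assumes the RDKit convention in which each bit maps to a sequence of (center atom index, radius) pairs; whether COSMolKit returns exactly this Python shape is an assumption, and the helper name and sample data are hypothetical.

```python
def atom_to_bits_from_bit_info(bit_info, n_atoms):
    """Invert {bit: [(atom_idx, radius), ...]} into one bit list per atom."""
    per_atom = [set() for _ in range(n_atoms)]
    for bit, environments in bit_info.items():
        for atom_idx, _radius in environments:
            per_atom[atom_idx].add(bit)
    return [sorted(bits) for bits in per_atom]


# Hand-written toy mapping for a 4-atom molecule (not real COSMolKit output).
toy_bit_info = {
    12: [(0, 0), (3, 0)],   # bit 12 set by radius-0 environments at atoms 0 and 3
    98: [(1, 1)],           # bit 98 set by a radius-1 environment at atom 1
    507: [(0, 1), (1, 0)],  # bit 507 set by two different environments
}
print(atom_to_bits_from_bit_info(toy_bit_info, n_atoms=4))
# [[12, 507], [98, 507], [], [12]]
```

A mapping like this is what makes it possible to trace a set bit back to the atoms whose environments produced it.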
Supported Parameters
The Python binding exposes the supported RDKit-style Morgan generator branches:
radius and n_bits
include_chirality and use_bond_types
count_simulation and count_bounds
only_nonzero_invariants
include_redundant_environments
from_atoms and ignore_atoms
custom_atom_invariants and custom_bond_invariants
atom_invariants_generator="connectivity" | "morgan" | "feature" | "fcfp"
atom_invariants_include_ring_membership
bond_invariants_generator="morgan" | "default" | "bond"
bond_invariants_use_bond_types
bond_invariants_use_chirality
num_bits_per_feature
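Of the parameters above, count_simulation and count_bounds are probably the least self-explanatory. The sketch below shows the general idea as it is commonly described for RDKit-style generators: each feature is given one bit per count threshold, and a bit is set when the feature's occurrence count reaches that threshold. This is a simplified illustration under that assumption, not COSMolKit's actual code; the function name, the thresholds, and the feature identifiers are hypothetical.

```python
def simulate_counts(feature_counts, n_bits, count_bounds=(1, 2, 4, 8)):
    """Encode feature counts into a bit vector using threshold bits.

    feature_counts maps a (hypothetical) hashed feature id to its count.
    Each feature occupies len(count_bounds) consecutive bits.
    """
    width = len(count_bounds)
    slots = n_bits // width            # number of distinct feature slots
    on = set()
    for feature_id, count in feature_counts.items():
        base = (feature_id % slots) * width
        for offset, bound in enumerate(count_bounds):
            if count >= bound:
                on.add(base + offset)
    return sorted(on)


# Feature 5 occurs once, feature 10 occurs three times (toy values).
print(simulate_counts({5: 1, 10: 3}, n_bits=32))  # [8, 9, 20]
```

With this layout a 32-bit vector only has room for 8 distinct feature slots, which illustrates the trade-off: count information is retained at the cost of fewer slots and therefore more collisions.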
Batch Fingerprints
MoleculeBatch exposes matching batch APIs. Invalid records kept with
errors="keep" produce None in the corresponding output position.
from cosmolkit import MoleculeBatch
batch = MoleculeBatch.from_smiles_list(
    ["CCO", "not-smiles", "CCCO"],
    errors="keep",
).with_parallel_jobs(8)
fps = batch.fingerprint_morgan_list(n_bits=2048)
print([fp.on_bits() if fp is not None else None for fp in fps])
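Because the output list stays aligned with the input list, failed records can be matched back to their source strings by position. The sketch below shows one way to separate successes from failures in plain Python. The helper `split_results` is hypothetical, and the `results` list is a hand-written stand-in for the per-record output rather than real COSMolKit data.

```python
def split_results(inputs, results):
    """Pair each input with its result and separate the None entries."""
    if len(inputs) != len(results):
        raise ValueError("inputs and results must have the same length")
    ok = [(i, s, r) for i, (s, r) in enumerate(zip(inputs, results)) if r is not None]
    failed = [(i, s) for i, (s, r) in enumerate(zip(inputs, results)) if r is None]
    return ok, failed


smiles = ["CCO", "not-smiles", "CCCO"]
# Stand-in values: on-bit lists for valid records, None for the invalid one.
results = [[1, 5], None, [2, 5]]

ok, failed = split_results(smiles, results)
print(ok)      # [(0, 'CCO', [1, 5]), (2, 'CCCO', [2, 5])]
print(failed)  # [(1, 'not-smiles')]
```

Keeping the original index alongside each record makes it straightforward to report which inputs failed without re-parsing them.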