CPC Definition - G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIAL...

CPC Definition - Subclass G16B

Last Updated Version: 2026.01

BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY

Definition statement

This place covers:

Methods or systems for genetic or protein-related data processing in computational molecular biology.

Computational processing of data related to nucleic acids, proteins, peptides, or amino acids.

Bioinformatics methods or systems where the digital data processing is inherent or implicit, although not explicitly mentioned.

Relationships with other classification places

This subclass covers bioinformatics, whereas subclass G16C covers computational theoretical chemistry, chemoinformatics and computational materials science.

To determine whether classification should be directed to this subclass or to subclass G16C, in particular regarding computational theoretical chemistry (G16C 10/00) and chemoinformatics (G16C 20/00), one has to take into account the type of molecule(s) whose characterising features are processed by a computational algorithm.

Following the definition statement, processing of data related to nucleic acids, proteins, peptides and/or amino acids should be classified under G16B.

Processing of data related to any other type of molecule should be classified under G16C.
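
The rule above can be sketched as a simple lookup. This is an illustration only: the function name and the molecule-type strings are hypothetical, and the set merely restates the molecule types named in the definition statement.

```python
# Molecule types named in the definition statement of G16B (restated here as
# lowercase strings; the exact spelling of the keys is an assumption).
G16B_MOLECULE_TYPES = {"nucleic acid", "protein", "peptide", "amino acid"}


def suggest_subclass(molecule_type: str) -> str:
    """Return 'G16B' for molecule types named in the definition statement, else 'G16C'."""
    key = molecule_type.strip().lower()
    return "G16B" if key in G16B_MOLECULE_TYPES else "G16C"


if __name__ == "__main__":
    for example in ("peptide", "small organic molecule"):
        print(example, "->", suggest_subclass(example))
```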

References

Informative references

Attention is drawn to the following places, which may be of interest for search:

Medical diagnosis	A61B 5/00
Genetic engineering involving nucleic acids	C12N 15/00
Nucleic acid analysis, e.g. microarrays, sequencing or PCR	C12Q 1/68
Chromatographic signal analysis	G01N 30/86
Chemical analysis of biological material, e.g. blood or urine	G01N 33/50
Chemical analysis of biological material involving proteins, peptides or amino acids	G01N 33/68
Computer input/output arrangements	G06F 3/00
Computer architectures or program control	G06F 9/00
Information retrieval; Database structures thereof; File system structures therefor	G06F 16/00
Complex mathematical operations	G06F 17/10
Pattern recognition	G06F 18/00
Computer systems using neural network models per se	G06N 3/02
Computer systems using knowledge representation per se, e.g. expert systems	G06N 5/02
Computer systems using probabilistic models per se	G06N 7/00
Machine learning	G06N 20/00
3D image rendering	G06T 15/00
3D modelling for computer graphics	G06T 17/00
Manipulating 3D models or images for computer graphics	G06T 19/00
Computational chemistry; Chemoinformatics	G16C
Computational materials science	G16C 60/00
Healthcare informatics	G16H
Mass spectrometry apparatus per se	H01J 49/00

Glossary of terms

In this place, the following terms or expressions are used with the meaning indicated:

systems biology	Simulation and mathematical modelling of relationships and interactions between molecular entities in sub-cellular systems integrating genetic and/or protein-related data to describe the dynamic behaviour of, for example, protein-protein/protein-ligand interactions, regulatory networks and metabolic networks
phylogeny	Reconstruction of an evolutionary development and history of a species or higher taxonomic grouping of organisms; typically represented as a phylogenetic tree; methods for creating phylogenetic trees
phylogenetic tree	Tree-like graphical representation of phylogenetic relationships
molecular structure	2-dimensional or 3-dimensional arrangement of atoms, groups of atoms or domains in nucleic acids, proteins, peptides and amino acids
structure alignment	Form of alignment to establish structural and functional equivalences between two or more proteins based on their secondary or tertiary structures
protein folding	Process by which a polypeptide chain folds into a specific 3-dimensional structure
domain	Element of the overall molecular structure of a protein that is self-stabilising and often folds independently of the rest of the polypeptide chain
drug targeting	Drug design strategy aiming at optimising the properties of a medicinal compound, based on the 3-dimensional structure of a target, for delivery to a particular tissue or organ in the body
functional genomics	Experimental analyses aiming at assessing the function of genes in determining traits, physiology and/or development of an organism, making use of computational and high-throughput technologies
proteomics	Large-scale study of the functions of proteins and their interactions with other molecular entities in a biological system
genotype	Genetic makeup or profile of an organism with respect to a trait
ploidy	Number of sets of chromosomes in a cell/cells of an organism
allele	Alternative form of a gene (one member of a pair) that is located at a specific position (locus) on a specific chromosome
SNP	Single nucleotide polymorphism: a DNA sequence variation that involves a change in a single nucleotide and is commonly present in a part of a population
motif	Specific nucleotide or amino acid sequence pattern
population genetics	Study of genetic variation and genetic evolution of populations
linkage disequilibrium	Tendency of alleles located close to each other on the same chromosome to be inherited together
mutagenesis	Process by which the genetic information of an organism is changed, resulting in a mutation
gene expression	Process by which gene products, e.g. RNA transcripts and the proteins translated from them, are made from the instructions encoded in DNA
gene expression profiling	Determination of the pattern of genes expressed, i.e. transcribed, under specific circumstances or in a specific cell line
probe design and optimisation for microarrays	Designing and selecting (i) optimal, highly specific probes, e.g. oligonucleotides, cDNA, fragments for hybridisation experiments with microarrays and (ii) optimal sets of probes, e.g. oligonucleotides, cDNA, to be chemically attached to a solid support to form an array
microarray	Plurality of nucleic acid probes attached to a substrate, which form an ordered pattern
sequence alignment	Process of comparing nucleic acid or amino acid sequences, generally by a linear alignment in which equivalent positions in adjacent sequences are brought into the correct alignment with each other by introducing insertions in suitable positions, in order to identify similarities and/or differences amongst the compared sequences
sequence assembly	Method by which linear portions of sequence information are assembled to obtain full length gene sequence data
in silico	Performed on a computer or via computer simulations
ontology	Classification methodology for formalising a subject's knowledge in a structured and controlled vocabulary

Synonyms and Keywords

In patent documents, the following words/expressions are often used with the meaning indicated:

systems	includes apparatus

G16B 5/00

ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks

Definition statement

This place covers:

Simulation or mathematical modelling of relationships and interactions between molecular entities on a subcellular level, integrating genetic and/or protein-related data to describe the dynamic behaviour of protein-protein/protein-ligand interactions, regulatory networks or metabolic networks.

Mere mention of modelling or simulation is not sufficient to justify classification in this place.
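
As an illustration of the kind of dynamic modelling meant here, the following minimal sketch integrates a single negatively autoregulated gene with the forward Euler method. The model, dx/dt = beta / (1 + (x/K)^n) - gamma * x, and all parameter values are assumptions chosen for the example; they are not taken from this definition.

```python
def simulate_autorepressor(beta=2.0, K=1.0, n=2, gamma=1.0,
                           x0=0.0, dt=0.01, steps=2000):
    """Forward-Euler integration of dx/dt = beta / (1 + (x/K)**n) - gamma * x.

    x is the concentration of a gene product that represses its own
    production. Returns the list of concentrations, one per time point.
    """
    x = x0
    trajectory = [x]
    for _ in range(steps):
        production = beta / (1.0 + (x / K) ** n)  # Hill-type repression
        degradation = gamma * x                   # first-order decay
        x += dt * (production - degradation)
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    traj = simulate_autorepressor()
    print("final concentration:", traj[-1])
```

With the default parameters the steady state satisfies 2 / (1 + x^2) = x, i.e. x = 1, which the trajectory approaches.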

G16B 10/00

ICT specially adapted for evolutionary bioinformatics, e.g. phylogenetic tree construction or analysis

Definition statement

This place covers:

Analysis of orthologous, paralogous, syntenic, or taxonomic relationships.

Generation of pedigrees and phylogenetic trees.

Mere mention of evolutionary data is not sufficient to justify classification in this place.
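
Phylogenetic tree construction commonly starts from pairwise distances between aligned sequences. The sketch below computes the proportion of differing sites (p-distance) for each pair; the function names and the input format (a dict of equal-length, pre-aligned strings) are assumptions made for the example.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of aligned positions at which two sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(sequences: dict) -> dict:
    """Map every ordered pair of sequence names to its p-distance."""
    names = list(sequences)
    return {(i, j): p_distance(sequences[i], sequences[j])
            for i in names for j in names}


if __name__ == "__main__":
    aligned = {"taxon1": "ACGTACGT", "taxon2": "ACGTACGA", "taxon3": "ACCTACGA"}
    for pair, d in distance_matrix(aligned).items():
        print(pair, d)
```

Such a matrix could then be passed to a clustering method, e.g. UPGMA or neighbour joining, to build a tree.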

G16B 15/00

ICT specially adapted for analysing two-dimensional [2D] or three-dimensional [3D] molecular structures, e.g. structural or functional relations or structure alignment

Definition statement

This place covers:

Structural architecture of proteins, peptides, amino acids, and nucleic acids and the prediction thereof.

Processes including structural alignment, protein folding, domain topology, molecular modelling, receptor-ligand modelling, docking methods, structural-functional relationships and drug targeting using structure data, as well as two- and three-dimensional structure prediction and/or analysis.

The structure types include secondary, tertiary, and quaternary structures.

Mere mention of structural data is not sufficient to justify classification in this place.
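
A basic quantity in structure comparison is the root-mean-square deviation (RMSD) between equivalent atoms of two structures. The sketch below assumes the two coordinate lists are already superposed and listed in the same atom order; it performs no optimal superposition, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b) -> float:
    """Root-mean-square deviation between two equally long lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
    reference = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0)]
    print("RMSD:", rmsd(model, reference))
```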

G16B 20/00

ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations

Definition statement

This place covers:

Assessment of the function of genes and proteins in determining traits, physiology and/or development of an organism, making use of computational and large scale, high-throughput technologies.

Genotypic-phenotypic associations, including genotyping and genome annotation, linkage disequilibrium analysis and association studies, population genetics, alternative splicing, and small interfering RNA (siRNA, RNAi) design.

Binding site identification, mutagenesis analysis, protein-protein or protein-nucleic acid interactions.

Mere mention of gene or protein function is not sufficient to justify classification in this place.
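
Linkage disequilibrium between two biallelic loci is often summarised by the coefficient D = p_AB - p_A * p_B and by r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B)). The sketch below estimates both from a list of phased two-locus haplotypes; the input encoding (two-character strings with "A" and "B" as the alleles of interest) is an assumption made for the example.

```python
def linkage_disequilibrium(haplotypes, allele_a="A", allele_b="B"):
    """Return (D, r_squared) for two loci from phased two-character haplotypes."""
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_a = sum(h[0] == allele_a for h in haplotypes) / n
    p_b = sum(h[1] == allele_b for h in haplotypes) / n
    p_ab = sum(h[0] == allele_a and h[1] == allele_b for h in haplotypes) / n
    d = p_ab - p_a * p_b
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d, d * d / denominator


if __name__ == "__main__":
    sample = ["AB", "AB", "ab", "ab", "Ab", "aB"]
    print(linkage_disequilibrium(sample))
```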

G16B 25/00

ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression

Definition statement

This place covers:

Analysis of hybridisation or gene/protein expression information. This includes microarray analysis, gel electrophoresis analysis and sequencing by hybridisation (SBH). Further covered technologies include modelling polymerase chain reaction (PCR) data, primer or probe design and probe optimisation, microarray design, normalisation, expression profiling, noise correction models, and expression-ratio estimation.

Mere mention of hybridisation or gene/protein expression is not sufficient to justify classification in this place.
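
Expression-ratio estimation and normalisation can be illustrated with a minimal sketch: log2 ratios of sample to reference intensities, centred on their median. Median centring is only one of several normalisation schemes and is an assumption of this example, as are the function name and input format.

```python
import math
import statistics


def normalised_log_ratios(sample, reference):
    """Return median-centred log2(sample / reference) ratios, one per probe."""
    if len(sample) != len(reference) or not sample:
        raise ValueError("intensity lists must be non-empty and of equal length")
    ratios = [math.log2(s / r) for s, r in zip(sample, reference)]
    centre = statistics.median(ratios)  # global shift attributed to technical bias
    return [ratio - centre for ratio in ratios]


if __name__ == "__main__":
    sample_intensities = [200.0, 400.0, 800.0]
    reference_intensities = [100.0, 100.0, 100.0]
    print(normalised_log_ratios(sample_intensities, reference_intensities))
```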

G16B 30/00

ICT specially adapted for sequence analysis involving nucleotides or amino acids

Definition statement

This place covers:

Comparison of sequence information, wherein the sequences are nucleic acids, proteins, or peptides. The comparisons include methods of alignment, homology identification, motif identification, single-nucleotide polymorphism (SNP) discovery, haplotype identification, fragment assembly, and gene finding.

Mere mention of sequence data is not sufficient to justify classification in this place.
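
As an illustration of a method of alignment, the sketch below computes the optimal global alignment score of two sequences by dynamic programming (the Needleman-Wunsch recurrence with a linear gap penalty). The scoring values are arbitrary assumptions, and only the score is returned, not the alignment itself.

```python
def global_alignment_score(a: str, b: str, match=1, mismatch=-1, gap=-1) -> int:
    """Optimal global alignment score of a and b under a linear gap penalty."""
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diagonal,            # align a[i-1] with b[j-1]
                            prev[j] + gap,       # gap in b
                            curr[j - 1] + gap))  # gap in a
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

Keeping only two rows of the scoring matrix reduces memory use to the length of one sequence, at the cost of not being able to trace back the alignment.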

G16B 35/00

ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides

Definition statement

This place covers:

In silico (i.e. computer-based) designing and screening of combinatorial nucleic acid, protein, or peptide libraries.

Mere mention of nucleic acid, protein, or peptide combinatorial libraries is not sufficient to justify classification in this place.
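
A minimal sketch of in silico library design and screening: every peptide is enumerated from the residues allowed at each position and then filtered with a screening criterion. The function names, the one-letter residue strings and the criterion used in the example are hypothetical.

```python
from itertools import product


def enumerate_library(allowed_residues):
    """Build all sequences from a list of per-position allowed residues."""
    return ["".join(combination) for combination in product(*allowed_residues)]


def screen(library, criterion):
    """Keep the library members for which the criterion returns True."""
    return [member for member in library if criterion(member)]


if __name__ == "__main__":
    # Three positions: A or G, then K or R, then D (illustrative choices only).
    library = enumerate_library(["AG", "KR", "D"])
    print(library)
    print(screen(library, lambda peptide: "K" in peptide))
```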

G16B 40/00

ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding

Definition statement

This place covers:

Discovery and/or analysis of patterns within a vast amount of genetic or protein-related data, wherein the emphasis is placed on the method of analysis and is largely independent of the particular type of bioinformatic data.

Methods based on machine learning and statistical models, including supervised and unsupervised learning techniques, e.g. bioinformatic pattern finding, knowledge discovery, rule extraction, correlation, clustering, and classification.

Multivariate analysis of protein or gene-related data, e.g. analysis of variance (ANOVA), principal component analysis (PCA), and support vector machines (SVM).
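
As an illustration of one of the statistical techniques named above, the sketch below computes the F statistic of a one-way ANOVA, e.g. for the expression level of one gene measured in several groups of samples. The function name and the data are assumptions; no p-value is computed.

```python
from statistics import mean


def one_way_anova_f(groups) -> float:
    """F statistic of a one-way ANOVA over a list of groups of measurements."""
    k = len(groups)
    n = sum(len(group) for group in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(group) for group in groups) / n
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if within == 0:
        raise ValueError("F is undefined when there is no within-group variation")
    return (between / (k - 1)) / (within / (n - k))


if __name__ == "__main__":
    expression_by_condition = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
    print("F =", one_way_anova_f(expression_by_condition))
```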

G16B 45/00

ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks

Definition statement

This place covers:

Visual representations specifically adapted to bioinformatic data, wherein the emphasis is placed on the method of visualisation and is largely independent of the particular type of bioinformatic data.

For example: graphics generation, map display (e.g. haplotype maps, linkage maps), and network display (e.g. genetic networks, protein-protein interaction networks, metabolic networks).

G16B 50/00

ICT programming tools or database systems specially adapted for bioinformatics

Definition statement

This place covers:

Software specially adapted to assist in programming procedures within bioinformatics.

Database systems specially adapted for managing bioinformatic data. For example: ontologies, heterogeneous data integration, data warehousing, and computing architectures.

Encryption and compression algorithms for genetic data.
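
A simple example of compression adapted to genetic data is packing each nucleotide into two bits, so that four bases occupy one byte. The sketch below assumes sequences containing only A, C, G and T; the encoding table and function names are hypothetical and ambiguity codes such as N are not handled.

```python
_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
_BASES = "ACGT"


def pack(sequence: str):
    """Return (length, bytes) with four bases packed into each byte."""
    packed = bytearray()
    for start in range(0, len(sequence), 4):
        chunk = sequence[start:start + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | _CODE[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a final partial chunk
        packed.append(byte)
    return len(sequence), bytes(packed)


def unpack(length: int, data: bytes) -> str:
    """Inverse of pack: recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(_BASES[(byte >> shift) & 3])
    return "".join(bases[:length])


if __name__ == "__main__":
    length, data = pack("GATTACA")
    print(len(data), "bytes ->", unpack(length, data))
```

The original length is stored alongside the bytes because the padding bits of the last byte would otherwise decode as extra A bases.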
