lamindb.ULabel

class lamindb.ULabel(name: str, type: ULabel | None = None, is_type: bool = False, description: str | None = None, reference: str | None = None, reference_type: str | None = None)

Bases: DBRecord, HasParents, CanCurate, TracksRun, TracksUpdates

Universal labels.

Parameters:
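  • name (str) – A name.

  • type (ULabel | None, default: None) – Type of ulabel, e.g., "donor", "split", etc.

  • is_type (bool, default: False) – Whether the ulabel is itself a type that groups other ulabels.

  • description (str | None, default: None) – A description.

  • reference (str | None, default: None) – For instance, an external ID or a URL.

  • reference_type (str | None, default: None) – For instance, "url".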

A ULabel record provides the easiest way to annotate a dataset with a label: "My project", "curated", or "Batch X":

>>> my_project = ln.ULabel(name="My project")
>>> my_project.save()
>>> artifact.ulabels.add(my_project)

Often, a ulabel is measured within a dataset. For instance, an artifact might characterize 2 species of the Iris flower ("setosa" & "versicolor") measured by a "species" feature. Use the DataFrameCurator flow to automatically parse, validate, and annotate with labels that are contained in DataFrame objects.

Note

If you work with complex entities like cell lines, cell types, tissues, etc., consider using the pre-defined biological registries in bionty to label artifacts & collections.

If you work with biological samples, the only sustainable way of tracking metadata is likely to create a custom schema module.

See also

Feature()

Dimensions of measurement for artifacts & collections.

features

Feature manager for an artifact.

Examples

Create a new label:

>>> train_split = ln.ULabel(name="train").save()

Organize labels in a hierarchy:

>>> split_type = ln.ULabel(name="Split", is_type=True).save()
>>> train_split = ln.ULabel(name="train", type=split_type).save()

Label an artifact:

>>> artifact.ulabels.add(train_split)

Query an artifact by label:

>>> ln.Artifact.filter(ulabels=train_split).df()
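
The type relationship used in the hierarchy example can be pictured without a database. The following is a minimal sketch in plain Python, not lamindb code: the `Label` dataclass, the `labels_of_type` helper, and the "Donor" and "donor_24" names are hypothetical stand-ins that only mimic how a record created with is_type=True groups other labels through their type field.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """Hypothetical stand-in for a ULabel record (not the lamindb class)."""

    name: str
    type: Label | None = None  # the grouping label, if any
    is_type: bool = False  # True if this label groups other labels


def labels_of_type(labels: list[Label], type_label: Label) -> list[str]:
    """Return the names of all labels whose type is `type_label`."""
    if not type_label.is_type:
        raise ValueError(f"{type_label.name!r} is not a type")
    return [label.name for label in labels if label.type == type_label]


split_type = Label(name="Split", is_type=True)
donor_type = Label(name="Donor", is_type=True)
registry = [
    split_type,
    donor_type,
    Label(name="train", type=split_type),
    Label(name="test", type=split_type),
    Label(name="donor_24", type=donor_type),
]

print(labels_of_type(registry, split_type))  # ['train', 'test']
```

In lamindb itself the grouping lives in the database, so the equivalent lookup would presumably be a filter on the type field rather than a list comprehension.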

Attributes

DoesNotExist = <class 'lamindb.models.ulabel.ULabel.DoesNotExist'>
Meta = <class 'lamindb.models.dbrecord.DBRecord.Meta'>
MultipleObjectsReturned = <class 'lamindb.models.ulabel.ULabel.MultipleObjectsReturned'>
artifacts: Artifact

Linked artifacts.

children: ULabel

Child entities of this ulabel.

Reverse accessor for parents.

collections: Collection

Linked collections.

created_by: User

Creator of record.

created_by_id

Accessor to the related objects manager on the reverse side of a many-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

Parent.children is a ReverseManyToOneDescriptor instance.

Most of the implementation is delegated to a dynamically defined manager class built by create_forward_many_to_many_manager() defined below.

objects = <lamindb.models.query_manager.QueryManager object>
parents: ULabel

Parent entities of this ulabel.

For advanced use cases, you can build an ontology under a given type.

Say, if you modeled CellType as a ULabel, you would introduce a type CellType and model the hierarchy of cell types under it.

property pk
projects: Project

Linked projects.

records: ULabel

DBRecords of this type.

run: Run | None

Run that created record.

run_id
runs: Run

Linked runs.

space: Space

The space in which the record lives.

space_id
transforms: Transform

Linked transforms.

type: ULabel | None

Type of ulabel, e.g., "donor", "split", etc.

Allows grouping ulabels by type, e.g., all donors, all split ulabels, etc.

type_id

Class methods

classmethod from_values(values, field=None, create=False, organism=None, source=None, mute=False)

Bulk create validated records by parsing values for an identifier such as a name or an id.

Parameters:
  • values (list[str] | Series | array) – A list of values for an identifier, e.g. ["name1", "name2"].

  • field (str | DeferredAttribute | None, default: None) – A DBRecord field to look up, e.g., bt.CellMarker.name.

  • create (bool, default: False) – Whether to create records if they don’t exist.

  • organism (DBRecord | str | None, default: None) – A bionty.Organism name or record.

  • source (DBRecord | None, default: None) – A bionty.Source record to validate against when creating records.

  • mute (bool, default: False) – Whether to mute logging.

Return type:

DBRecordList

Returns:

A list of validated records. For bionty registries, also returns knowledge-coupled records.

Notes

For more info, see tutorial: Manage biological registries.

Example:

import bionty as bt
import lamindb as ln

# Bulk create from non-validated values logs warnings & returns an empty list
ulabels = ln.ULabel.from_values(["benchmark", "prediction", "test"])
assert len(ulabels) == 0

# With create=True, records are created for values that don't exist yet
ulabels = ln.ULabel.from_values(["benchmark", "prediction", "test"], create=True).save()
assert len(ulabels) == 3

# Bulk create records from a public reference
bt.CellType.from_values(["T cell", "B cell"]).save()
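
The effect of the create flag can be illustrated with a small stand-alone sketch. It uses plain strings instead of records and is not the lamindb implementation: `from_values_sketch` and the `existing` set are hypothetical names introduced only for this illustration.

```python
def from_values_sketch(values, existing, create=False):
    """Mimic the documented behavior with plain strings instead of records.

    `existing` is a hypothetical set of names already in the registry.
    """
    records = []
    for value in values:
        if value in existing:
            records.append(value)  # validated: an exact match exists
        elif create:
            records.append(value)  # not validated, but creation was requested
        # otherwise the value is skipped (the real method logs a warning)
    return records


registry_names = {"benchmark"}
labels = ["benchmark", "prediction", "test"]

print(from_values_sketch(labels, registry_names))
# ['benchmark']
print(from_values_sketch(labels, registry_names, create=True))
# ['benchmark', 'prediction', 'test']
```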
classmethod inspect(values, field=None, *, mute=False, organism=None, source=None, from_source=True, strict_source=False)

Inspect if values are mappable to a field.

Being mappable means that an exact match exists.

Parameters:
  • values (list[str] | Series | array) – Values that will be checked against the field.

  • field (str | DeferredAttribute | None, default: None) – The field of values. Examples are 'ontology_id' to map against the source ID or 'name' to map against the ontology's name field.

  • mute (bool, default: False) – Whether to mute logging.

  • organism (str | DBRecord | None, default: None) – An Organism name or record.

  • source (DBRecord | None, default: None) – A bionty.Source record that specifies the version to inspect against.

  • strict_source (bool, default: False) – Determines the validation behavior against records in the registry.

    - If False, validation will include all records in the registry, ignoring the specified source.
    - If True, validation will only include records in the registry that are linked to the specified source.

    Note: this parameter won’t affect validation against public sources.

Return type:

InspectResult

See also

validate()

Example:

import bionty as bt

# save some gene records
bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()

# inspect gene symbols
gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
result = bt.Gene.inspect(gene_symbols, field=bt.Gene.symbol, organism="human")
assert result.validated == ["A1CF", "A1BG"]
assert result.non_validated == ["FANCD1", "FANCD20"]
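
What "mappable" means can be shown without a database. The sketch below is an assumption-laden illustration, not the library code: `InspectSketch` and `inspect_sketch` are hypothetical names, and only the two result fields used in the example above are modeled.

```python
from typing import NamedTuple


class InspectSketch(NamedTuple):
    """Hypothetical stand-in for InspectResult with two of its fields."""

    validated: list
    non_validated: list


def inspect_sketch(values, known):
    """Split `values` by whether an exact match exists in `known`."""
    known = set(known)
    validated = [v for v in values if v in known]
    non_validated = [v for v in values if v not in known]
    return InspectSketch(validated, non_validated)


result = inspect_sketch(
    ["A1CF", "A1BG", "FANCD1", "FANCD20"],
    known=["A1CF", "A1BG", "BRCA2"],
)
print(result.validated)  # ['A1CF', 'A1BG']
print(result.non_validated)  # ['FANCD1', 'FANCD20']
```

Matching is exact, so a synonym such as "FANCD1" ends up in the non-validated list even though a record for the same gene exists under another symbol.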
classmethod standardize(values, field=None, *, return_field=None, return_mapper=False, case_sensitive=False, mute=False, source_aware=True, keep='first', synonyms_field='synonyms', organism=None, source=None, strict_source=False)

Maps input synonyms to standardized names.

Parameters:
  • values (Iterable) – Identifiers that will be standardized.

  • field (str | DeferredAttribute | None, default: None) – The field representing the standardized names.

  • return_field (str | DeferredAttribute | None, default: None) – The field to return. Defaults to field.

  • return_mapper (bool, default: False) – If True, returns {input_value: standardized_name}.

  • case_sensitive (bool, default: False) – Whether the mapping is case sensitive.

  • mute (bool, default: False) – Whether to mute logging.

  • source_aware (bool, default: True) – Whether to standardize from public source. Defaults to True for BioRecord registries.

  • keep (Literal['first', 'last', False], default: 'first') –

    When a synonym maps to multiple names, determines which of the duplicate matches to keep (following the convention of pd.DataFrame.duplicated):

    - "first": returns the first mapped standardized name
    - "last": returns the last mapped standardized name
    - False: returns all mapped standardized names

    When keep is False, the returned list of standardized names will contain nested lists in case of duplicates.

    When a field is converted into return_field, keep marks which matches to keep when multiple return_field values map to the same field value.

  • synonyms_field (str, default: 'synonyms') – A field containing the concatenated synonyms.

  • organism (str | DBRecord | None, default: None) – An Organism name or record.

  • source (DBRecord | None, default: None) – A bionty.Source record that specifies the version to validate against.

  • strict_source (bool, default: False) – Determines the validation behavior against records in the registry.

    - If False, validation will include all records in the registry, ignoring the specified source.
    - If True, validation will only include records in the registry that are linked to the specified source.

    Note: this parameter won’t affect validation against public sources.

Return type:

list[str] | dict[str, str]

Returns:

If return_mapper is False – a list of standardized names. Otherwise, a dictionary of mapped values with mappable synonyms as keys and standardized names as values.

See also

add_synonym()

Add synonyms.

remove_synonym()

Remove synonyms.

Example:

import bionty as bt

# save some gene records
bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()

# standardize gene synonyms
gene_synonyms = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
bt.Gene.standardize(gene_synonyms)
#> ['A1CF', 'A1BG', 'BRCA2', 'FANCD20']
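
The mapping from synonyms to standardized names can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the lamindb implementation: `standardize_sketch` and the `synonyms_by_name` table are hypothetical, synonyms are assumed to be stored as a pipe-delimited string, and only the return_mapper and case_sensitive options are modeled (with the first match kept).

```python
def standardize_sketch(values, synonyms_by_name, case_sensitive=False, return_mapper=False):
    """Map synonyms to standardized names; unknown values pass through unchanged.

    `synonyms_by_name` is a hypothetical {name: "syn1|syn2"} table that mimics
    a pipe-delimited synonyms field.
    """

    def norm(text):
        return text if case_sensitive else text.lower()

    lookup = {}
    for name, synonyms in synonyms_by_name.items():
        for synonym in filter(None, synonyms.split("|")):
            lookup.setdefault(norm(synonym), name)  # keep the first match

    names = {norm(name) for name in synonyms_by_name}
    mapper = {
        value: lookup[norm(value)]
        for value in values
        if norm(value) not in names and norm(value) in lookup
    }
    if return_mapper:
        return mapper
    return [mapper.get(value, value) for value in values]


# Only the FANCD1 -> BRCA2 synonym from the example above is modeled here.
table = {"A1CF": "", "A1BG": "", "BRCA2": "FANCD1"}
genes = ["A1CF", "A1BG", "FANCD1", "FANCD20"]

print(standardize_sketch(genes, table))
# ['A1CF', 'A1BG', 'BRCA2', 'FANCD20']
print(standardize_sketch(genes, table, return_mapper=True))
# {'FANCD1': 'BRCA2'}
```

Values without a known synonym, such as "FANCD20", are returned unchanged, which matches the output shown in the example above.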
classmethod validate(values, field=None, *, mute=False, organism=None, source=None, strict_source=False)

Validate values against existing values of a string field.

Note that this is strict validation: only exact matches are asserted.

Parameters:
  • values (list[str] | Series | array) – Values that will be validated against the field.

  • field (str | DeferredAttribute | None, default: None) – The field of values. Examples are 'ontology_id' to map against the source ID or 'name' to map against the ontology's name field.

  • mute (bool, default: False) – Whether to mute logging.

  • organism (str | DBRecord | None, default: None) – An Organism name or record.

  • source (DBRecord | None, default: None) – A bionty.Source record that specifies the version to validate against.

  • strict_source (bool, default: False) – Determines the validation behavior against records in the registry.

    - If False, validation will include all records in the registry, ignoring the specified source.
    - If True, validation will only include records in the registry that are linked to the specified source.

    Note: this parameter won’t affect validation against public sources.

Return type:

ndarray

Returns:

A vector of booleans indicating if an element is validated.

See also

inspect()

Example:

import bionty as bt

bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()

gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
bt.Gene.validate(gene_symbols, field=bt.Gene.symbol, organism="human")
#> array([ True,  True, False, False])

Methods

add_synonym(synonym, force=False, save=None)

Add synonyms to a record.

Parameters:
  • synonym (str | list[str] | Series | array) – The synonyms to add to the record.

  • force (bool, default: False) – Whether to add synonyms even if they are already synonyms of other records.

  • save (bool | None, default: None) – Whether to save the record to the database.

See also

remove_synonym()

Remove synonyms.

Example:

import bionty as bt

# save "T cell" record
record = bt.CellType.from_source(name="T cell").save()
record.synonyms
#> "T-cell|T lymphocyte|T-lymphocyte"

# add a synonym
record.add_synonym("T cells")
record.synonyms
#> "T cells|T-cell|T-lymphocyte|T lymphocyte"
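
The examples show synonyms stored as a single pipe-delimited string. The sketch below illustrates adding and removing entries in such a string. It is not the library code: `add_synonym_sketch`, `remove_synonym_sketch`, and the `taken` set are hypothetical, and the sketch appends new synonyms at the end, whereas the order in real output may differ (as in the example above).

```python
def add_synonym_sketch(synonyms, new, taken=frozenset(), force=False):
    """Return the updated pipe-delimited synonyms string.

    `taken` is a hypothetical set of synonyms already used by other records.
    """
    new_items = [new] if isinstance(new, str) else list(new)
    clashes = sorted(set(new_items) & set(taken))
    if clashes and not force:
        raise ValueError(f"already synonyms of other records: {clashes}")
    current = [s for s in (synonyms or "").split("|") if s]
    # dict.fromkeys drops duplicates in the input while preserving order
    merged = current + [s for s in dict.fromkeys(new_items) if s not in current]
    return "|".join(merged)


def remove_synonym_sketch(synonyms, old):
    """Return the synonyms string without the given entries."""
    old_items = {old} if isinstance(old, str) else set(old)
    kept = [s for s in (synonyms or "").split("|") if s and s not in old_items]
    return "|".join(kept)


syn = "T-cell|T lymphocyte|T-lymphocyte"
syn = add_synonym_sketch(syn, "T cells")
print(syn)  # T-cell|T lymphocyte|T-lymphocyte|T cells
print(remove_synonym_sketch(syn, "T-cell"))  # T lymphocyte|T-lymphocyte|T cells
```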
async adelete(using=None, keep_parents=False)
async arefresh_from_db(using=None, fields=None, from_queryset=None)
async asave(*args, force_insert=False, force_update=False, using=None, update_fields=None)
clean()

Hook for doing any extra model-wide validation after clean() has been called on every field by self.clean_fields. Any ValidationError raised by this method will not be associated with a particular field; it will have a special-case association with the field defined by NON_FIELD_ERRORS.

clean_fields(exclude=None)

Clean all fields and raise a ValidationError containing a dict of all validation errors if any occur.

date_error_message(lookup_type, field_name, unique_for)
delete()

Delete.

Return type:

None

get_constraints()
get_deferred_fields()

Return a set containing names of deferred fields for this instance.

prepare_database_save(field)
query_children()

Query children in an ontology.

Return type:

QuerySet

query_parents()

Query parents in an ontology.

Return type:

QuerySet

refresh_from_db(using=None, fields=None, from_queryset=None)

Reload field values from the database.

By default, the reloading happens from the database this instance was loaded from, or by the read router if this instance wasn’t loaded from any database. The using parameter will override the default.

Fields can be used to specify which fields to reload. The fields should be an iterable of field attnames. If fields is None, then all non-deferred fields are reloaded.

When accessing deferred fields of an instance, the deferred loading of the field will call this method.

remove_synonym(synonym)

Remove synonyms from a record.

Parameters:

synonym (str | list[str] | Series | array) – The synonym values to remove.

See also

add_synonym()

Add synonyms.

Example:

import bionty as bt

# save "T cell" record
record = bt.CellType.from_source(name="T cell").save()
record.synonyms
#> "T-cell|T lymphocyte|T-lymphocyte"

# remove a synonym
record.remove_synonym("T-cell")
record.synonyms
#> "T lymphocyte|T-lymphocyte"
save(*args, **kwargs)

Save.

Always saves to the default database.

Return type:

DBRecord

save_base(raw=False, force_insert=False, force_update=False, using=None, update_fields=None)

Handle the parts of saving which should be done only once per save, yet need to be done in raw saves, too. This includes some sanity checks and signal sending.

The ‘raw’ argument is telling save_base not to save any parent models and not to do any changes to the values before save. This is used by fixture loading.

serializable_value(field_name)

Return the value of the field name for this instance. If the field is a foreign key, return the id value instead of the object. If there’s no Field object with this name on the model, return the model attribute’s value.

Used to serialize a field’s value (in the serializer, or form output, for example). Normally, you would just access the attribute directly and not use this method.

set_abbr(value)

Set value for abbr field and add to synonyms.

Parameters:

value (str) – A value for an abbreviation.

See also

add_synonym()

Example:

import bionty as bt

# save an experimental factor record
scrna = bt.ExperimentalFactor.from_source(name="single-cell RNA sequencing").save()
assert scrna.abbr is None
assert scrna.synonyms == "single-cell RNA-seq|single-cell transcriptome sequencing|scRNA-seq|single cell RNA sequencing"

# set abbreviation
scrna.set_abbr("scRNA")
assert scrna.abbr == "scRNA"
# synonyms are updated
assert scrna.synonyms == "scRNA|single-cell RNA-seq|single cell RNA sequencing|single-cell transcriptome sequencing|scRNA-seq"
unique_error_message(model_class, unique_check)
validate_constraints(exclude=None)
validate_unique(exclude=None)

Check unique constraints on the model and raise ValidationError if any failed.

view_children(field=None, distance=5)

View children in an ontology.

Parameters:
  • field (str | DeferredAttribute | None, default: None) – Field to display on graph

  • distance (int, default: 5) – Maximum distance still shown.

Ontological hierarchies: ULabel (project & sub-project), CellType (cell type & subtype).

Examples

>>> import bionty as bt
>>> bt.Tissue.from_source(name="subsegmental bronchus").save()
>>> record = bt.Tissue.get(name="respiratory tube")
>>> record.view_children()
view_parents(field=None, with_children=False, distance=5)

View parents in an ontology.

Parameters:
  • field (str | DeferredAttribute | None, default: None) – Field to display on graph

  • with_children (bool, default: False) – Whether to also show children.

  • distance (int, default: 5) – Maximum distance still shown.

Ontological hierarchies: ULabel (project & sub-project), CellType (cell type & subtype).

Examples

>>> import bionty as bt
>>> bt.Tissue.from_source(name="subsegmental bronchus").save()
>>> record = bt.Tissue.get(name="respiratory tube")
>>> record.view_parents()
>>> record.view_parents(with_children=True)
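
The distance parameter limits how far the hierarchy is followed from the starting record. One way to picture this is a breadth-first traversal with a hop limit, sketched below in plain Python. Whether lamindb traverses the graph this way is an assumption. `relatives_within`, the `edges` mapping, and the project names are hypothetical.

```python
from collections import deque


def relatives_within(start, edges, distance=5):
    """Breadth-first search returning {node: hops} for nodes within `distance`.

    `edges` is a hypothetical {node: [adjacent nodes]} mapping; pass a
    child -> parents mapping to walk up, or parent -> children to walk down.
    """
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == distance:
            continue  # do not expand beyond the maximum distance
        for neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    del seen[start]  # report relatives only, not the starting node
    return seen


parents = {
    "sub-project A": ["project X"],
    "project X": ["program"],
    "program": ["organization"],
}
print(relatives_within("sub-project A", parents, distance=2))
# {'project X': 1, 'program': 2}
```

With distance=2 the traversal stops at "program", so "organization" (three hops away) is not reported.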