Multi-model dataset for joint histology and omics inference. Wraps a single WSI together with optional bulk RNA data, providing lazy patch access and a PyTorch-compatible __getitem__ interface.

MultiModelData

class MultiModelData()
Wraps a single Whole Slide Image (WSI) together with its extraction plan. On construction, the slide is opened via WSI and the extractor is fitted and executed to produce a list of RegionSpec objects, each describing one tile/patch location. Patches are read lazily via get_patch.
wsi_path
Path to the WSI file (any format supported by WSI).
bulk_rna_path
Optional path to a bulk RNA CSV file.
extractor
A configured TileExtractor. fit_extract is called afresh for every slide, so the same extractor object can be reused across slides.
transform
Optional callable applied to the raw np.ndarray patch before it is returned. Receives (H, W, C) uint8 and should return a transformed array (or tensor).
path
Path
Resolved slide path.
bulk_rna_path
Path | None
Resolved path to the bulk RNA CSV, or None if no path was given.
reader
WSIReader
Open reader for the slide.
specs
List[RegionSpec]
Extraction plan (one entry per patch).
slide_name
str
Stem of the slide filename.
Example:
wsi = MultiModelData("tissue.svs", extractor=extractor)  # keyword avoids binding to bulk_rna_path
patch, meta = wsi.get_patch(0)
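
As a rough sketch of what a transform callable might look like, the function below converts a raw (H, W, C) uint8 patch into a channel-first float32 array scaled to [0, 1]. The name to_chw_float is hypothetical and not part of the library; the only thing taken from the parameter description is that the callable receives an (H, W, C) uint8 np.ndarray and may return any array or tensor.

```python
import numpy as np


def to_chw_float(patch: np.ndarray) -> np.ndarray:
    """Hypothetical transform: (H, W, C) uint8 -> (C, H, W) float32 in [0, 1]."""
    if patch.ndim != 3:
        raise ValueError(f"expected an (H, W, C) array, got shape {patch.shape}")
    # Scale to [0, 1] first, then move the channel axis to the front.
    scaled = patch.astype(np.float32) / 255.0
    return np.ascontiguousarray(scaled.transpose(2, 0, 1))
```

A callable like this would presumably be passed as transform=to_chw_float at construction time, in which case get_patch returns whatever the callable returns instead of the raw array.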

get_patch

def get_patch(idx: int) -> Tuple[Any, Dict[str, Any]]
Reads a single patch from the slide.
idx
int
required
Index into specs.
returns
Tuple[Any, Dict[str, Any]]
A tuple (patch, metadata) where patch is an np.ndarray of shape (H, W, C) (or whatever the transform returns), and metadata is a dict containing at least the keys source, x, y, width, height, slide_name, and tissue_ratio.
Raises:
  • IndexError — If idx is out of range.
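
One way to walk the extraction plan is sketched below: a small generator that reads each patch in turn and keeps only those whose tissue_ratio meets a threshold. The helper name iter_tissue_patches and the 0.5 default are illustrative assumptions, not part of the library; the sketch relies only on the documented specs attribute, get_patch, and the tissue_ratio metadata key.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_tissue_patches(
    slide: Any, min_tissue_ratio: float = 0.5
) -> Iterator[Tuple[Any, Dict[str, Any]]]:
    """Yield (patch, metadata) pairs whose tissue_ratio meets a threshold.

    `slide` is assumed to expose `specs` and `get_patch(idx)` as documented
    above; the 0.5 default is an arbitrary illustrative value.
    """
    # Iterating over range(len(specs)) keeps idx within bounds,
    # so get_patch should not raise IndexError here.
    for idx in range(len(slide.specs)):
        patch, meta = slide.get_patch(idx)
        if meta["tissue_ratio"] >= min_tissue_ratio:
            yield patch, meta
```

Note that this sketch reads every patch before filtering. If the RegionSpec entries already carry a tissue ratio, filtering on specs first would avoid unnecessary reads, but that is not confirmed by the description above.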

close

def close() -> None
Closes the underlying WSI reader and releases resources.
Example:
>>> wsi.close()
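
Because close takes no arguments, the standard library's contextlib.closing can guarantee it runs even if reading a patch fails. The helper below is a minimal sketch under that assumption; read_all_metadata is a hypothetical name, and the slide argument is assumed to expose specs, get_patch, and close as documented above.

```python
from contextlib import closing
from typing import Any, Dict, List


def read_all_metadata(slide: Any) -> List[Dict[str, Any]]:
    """Collect the metadata of every patch, then close the slide.

    `slide` is assumed to expose `specs`, `get_patch(idx)`, and `close()`
    as documented above.
    """
    # closing() calls slide.close() on exit, even if get_patch raises.
    with closing(slide):
        return [slide.get_patch(i)[1] for i in range(len(slide.specs))]
```

Whether MultiModelData itself supports the with statement directly is not stated here, which is why the sketch wraps it in closing rather than using it as a context manager.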