MiDeSeC: A New Dataset to Advance Automated Mitosis Detection in Breast Cancer

TLDR: A new public dataset, MiDeSeC, has been introduced to improve automatic detection and segmentation of mitosis in breast cancer histopathology images. It addresses three challenges: the varied appearance of mitotic cells, their rarity relative to non-mitotic cells, and the subjectivity of manual grading. The dataset provides H&E stained images with precise ground truth coordinates and aims to facilitate the development of more accurate and reliable automated cancer grading systems.

Breast cancer grading is a critical step in diagnosis, and one of the most important factors considered is the mitotic count. Mitosis refers to the process of cell division, and counting these dividing cells in tissue samples helps pathologists understand how quickly a tumor is growing. Traditionally, this involves pathologists manually searching for mitotic cells on glass slides under a microscope. This task is often tedious, time-consuming, and can be subjective, leading to inconsistencies between different pathologists.

The challenges in accurately detecting mitosis are numerous. Mitotic cells appear in various forms, corresponding to different stages of cell division (prophase, metaphase, anaphase, and telophase). Cells in these stages have distinct shapes and structures. Furthermore, mitotic cells are significantly less common than non-mitotic cells, making them harder to find. Adding to the complexity, some other cell types, like apoptotic cells or those with dense nuclei, can look very similar to mitotic cells, making differentiation difficult.

In response to these challenges, there has been a growing interest in developing automated methods for mitosis detection. Several past contests, such as the 2012 ICPR, AMIDA13, 2014 ICPR MITOS-ATYPIA, and TUPAC16, have contributed to advancements in this field. However, most of these contests did not specifically focus on breast cancer, which is a leading cause of cancer-related deaths among women.

To address this gap, a new publicly available dataset called MiDeSeC has been introduced. MiDeSeC is specifically designed for the detection and segmentation of mitosis in Hematoxylin and Eosin (H&E) stained breast cancer histopathology images. The dataset aims to support the development of more robust and reliable automated cancer grading systems.

About the MiDeSeC Dataset

The MiDeSeC dataset was created using H&E stained slides of invasive breast carcinoma (no special type) from 25 different patients. These slides were captured at 40x magnification at the Department of Medical Pathology at Ankara University. The images were scanned using a 3D Histech Panoramic p250 Flash-3 scanner and an Olympus BX50 microscope. Recognizing the diverse appearances of mitotic cells, the dataset includes a total of 50 regions, each measuring 1024 × 1024 pixels, selected from the glass slides. In total, these 50 regions contain over 500 mitoses. For research purposes, two-thirds of these regions are designated for training, and the remaining third for testing.

Each high-power field (HPF) in the dataset comes with a ground truth text file in CSV format. This file precisely indicates the coordinates of all pixels belonging to each mitosis region. The coordinate system is standard, with the origin (0,0) located at the top-left corner of the image. Pixel coordinates are given as integers, and importantly, the dataset accounts for cases where a single mitosis might have gaps in its shape, representing it as a single entity despite the discontinuity.
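As a rough illustration of how such a ground truth file might be read, here is a minimal Python sketch. The article does not spell out the exact column layout, so the sketch assumes that each CSV row describes one mitosis as alternating x,y integer pixel coordinates. The function names `parse_mitosis_rows` and `centroid` and the sample values are hypothetical, not part of the dataset.

```python
import csv
import io


def parse_mitosis_rows(csv_text):
    """Parse ground truth text into one list of (x, y) pixels per mitosis.

    ASSUMPTION: each row holds alternating x,y integers for a single
    mitosis. Check the real MiDeSeC files before relying on this layout.
    """
    regions = []
    for row in csv.reader(io.StringIO(csv_text)):
        values = [int(v) for v in row if v.strip()]
        if not values:
            continue  # skip blank lines
        if len(values) % 2 != 0:
            raise ValueError("expected an even number of values per row")
        # Pair up x and y; origin (0,0) is the top-left corner of the image.
        regions.append(list(zip(values[0::2], values[1::2])))
    return regions


def centroid(pixels):
    """Return the mean (x, y) position of a mitosis region."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return sum(xs) / len(xs), sum(ys) / len(ys)


# Made-up sample: the second row has a gap in x (301 -> 305) but is
# still treated as one mitosis because it sits on a single row.
sample = "10,20,11,20,10,21,11,21\n300,400,301,400,305,400\n"
regions = parse_mitosis_rows(sample)
print(len(regions), centroid(regions[0]))
```

Keeping one mitosis per row means a region with gaps is still treated as a single entity, which matches the behavior described above.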

Evaluating Performance

To assess the effectiveness of mitosis detection and segmentation methods using the MiDeSeC dataset, standard evaluation metrics are employed: recall, precision, and F1-Score. Recall measures how many of the actual mitoses were correctly identified (True Positives relative to all ground truth mitoses). Precision indicates how many of the detected mitoses were actually correct (True Positives relative to all detected mitoses). The F1-Score provides a balance between precision and recall, offering a single metric for overall performance.
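The three metrics can be computed from counts of true positives (TP), false positives (FP), and false negatives (FN). The Python sketch below shows the standard formulas. The function name `detection_scores` is hypothetical, and the rule for deciding when a detection matches a ground truth mitosis is not covered here because the article does not describe it.

```python
def detection_scores(tp, fp, fn):
    """Return (precision, recall, f1) from detection counts.

    tp: detections that match a ground truth mitosis
    fp: detections that match no ground truth mitosis
    fn: ground truth mitoses that were missed
    Zero denominators yield 0.0 (a convention chosen for this sketch).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Made-up counts for illustration only.
precision, recall, f1 = detection_scores(tp=3, fp=1, fn=3)
print(precision, recall, f1)
```

Because F1 is the harmonic mean of precision and recall, a method cannot score well by maximizing one at the expense of the other.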

The MiDeSeC dataset is openly accessible, providing a valuable resource for researchers and developers working on automated diagnostic tools for breast cancer. You can download the dataset from the following link: Dataset available here.

Karthik Mehta
Karthik Mehta is a data journalist known for his data-rich, insightful coverage of AI news and developments. Armed with a degree in Data Science from IIT Bombay and years of newsroom experience, Karthik merges storytelling with metrics to surface deeper narratives in AI-related events. His writing cuts through hype, revealing the real-world impact of Generative AI on industries, policy, and society. You can reach him at: [email protected]
