Challenge & Solution
Allcyte is a biotech company that combines advanced microscopy with artificial intelligence to advance cancer research and support new drug development. Allcyte's innovative Pharmacoscopy® platform enables the assessment of cancer drug efficacy at the single-cell level. In collaboration with CLOUDPILOTS and Google, Allcyte built a flexible GPU cluster to power the next generation of AI-powered high-throughput microscopy and image analysis.
Allcyte: Combining single cell microscopy and AI...
...for profiling anticancer drugs in primary human tissues.
Allcyte's Pharmacoscopy® Platform:
- Captures, stores, and analyzes more than 100 terabytes of image data every year.
- Implements analysis in a fast, reliable, and on-demand manner thanks to Kubernetes Cluster Autoscaling.
- Enables reproducible analysis as well as continuous improvements by leveraging the Cloud Build Continuous Integration system.
Biotech startup Allcyte captures and analyzes tens of thousands of microscopy images per hour, showing cancer cells directly in human tissue samples, to preview the effectiveness of cancer drugs in individual patients.
Allcyte was founded as a spin-off of the Research Center for Molecular Medicine (CeMM) of the Austrian Academy of Sciences. In the start-up phase, the company was able to draw on institutional high-performance computing (HPC) resources. As the need for analysis increased and Allcyte evolved into an independent company, image analysis was moved to Google Cloud Platform (GCP). Since 2017, Allcyte has analyzed millions of images of patient samples to determine the likely clinical efficacy of a large number of cancer treatments simultaneously, before the start of costly and risky clinical trials. Supported by increasingly realistic, patient-centric models, Pharmacoscopy® has proven to be an effective tool in the early stages of the drug development process as well as in precision medicine.
In 2019, Allcyte's computational team was ready for the next challenge: how can Pharmacoscopy® be applied beyond blood cancers to multi-faceted cancer indications? Disseminated tumor cells (DTCs) can occur in the course of several cancer indications, including pancreatic, ovarian, and lung cancer. However, while blood cancer cells are homogeneous and can therefore be analyzed using proven classical image analysis methods, DTCs are morphologically highly heterogeneous. Analyzing them requires precise recognition of complex objects at high throughput, a task for which modern region-based Convolutional Neural Networks (R-CNNs) are well suited. Developing these AI solutions for Pharmacoscopy® required rapid prototyping and deployment of a novel GPU-based analysis pipeline in the Cloud.
Rapid transition from prototype to deployment
GCP's high adaptability, from analysis prototype to deployment, is a key advantage over traditional, on-premises IT infrastructure. After quickly setting up the development infrastructure, Allcyte's researchers were able to access existing data and begin work within days, thanks to Google Workspace and Cloud Storage.
"Compared to inflexible local solutions, our biggest concern with GCP was to be able to scale new analysis streams quickly. Unlike traditional image analysis solutions, the newly developed deep learning platform for disseminated tumor cell analysis required GPU-enabled hardware. By implementing in the Cloud with Kubernetes, we were able to quickly test and deploy the new architecture without a large upfront investment. Working in the Cloud has helped research and development at Allcyte tremendously."
Soon after, a prototype was ready for use: R-CNNs allowed images of complex DTC samples to be analyzed with high fidelity on a single GPU machine, at a rate of 9,600 images every two hours. To achieve the goal of analyzing hundreds of thousands of images within a few hours, Allcyte data scientist Florian Rohrer worked with CLOUDPILOTS to translate the prototype into a scalable cluster solution.
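To put the scaling goal in perspective, the prototype's throughput can be turned into a rough estimate of how many GPU machines a batch would need. The sketch below is a back-of-the-envelope calculation, not Allcyte's actual capacity planning: it assumes perfectly linear scaling across machines, and the batch sizes and deadlines are hypothetical values chosen for illustration.

```python
import math

# From the text: the single-GPU prototype processed 9,600 images every two hours.
IMAGES_PER_GPU_HOUR = 9600 / 2  # 4,800 images per GPU-hour


def gpus_needed(total_images, deadline_hours):
    """Smallest number of GPU machines that finishes the batch in time.

    Assumes perfectly linear scaling, with no startup or I/O overhead.
    """
    return math.ceil(total_images / (IMAGES_PER_GPU_HOUR * deadline_hours))


# Hypothetical batch sizes and deadlines, chosen only for illustration.
for images, hours in [(100_000, 3), (300_000, 3), (500_000, 4)]:
    print(f"{images} images in {hours} h -> {gpus_needed(images, hours)} GPUs")
```

Under these assumptions, a batch of 300,000 images with a three-hour deadline needs 21 machines, which is the order of magnitude a single workstation cannot deliver and a cluster can.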
Flexible clusters with unlimited scaling
Allcyte's new DTC analysis pipeline was not suitable for traditional HPC environments because of its intermittent but demanding workload: an on-premises solution would have required maintaining excess capacity at all times. The Cloud-based Kubernetes solution allows data to be accessed directly from its location in the Cloud, while the machines used can scale flexibly in highly parallel operations without network bottlenecks. Automatic cluster scaling makes dozens of powerful GPU machines available during peak demand and reduces the cluster to a single machine during standby.
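The capacity argument can be illustrated with a toy comparison of GPU-hours consumed by a cluster sized for the peak versus one that follows demand. This is a simplified model, not a description of how Kubernetes Cluster Autoscaling actually makes decisions (it reacts to pods that cannot be scheduled, and nodes take time to start). The hourly demand profile is entirely made up; only the single-machine standby floor comes from the text.

```python
# Hypothetical hourly demand over one day, in GPU machines required.
# The shape (long idle stretches, short bursts) mirrors the intermittent
# workload described above; the numbers themselves are made up.
demand = [0] * 8 + [24] * 3 + [0] * 9 + [12] * 2 + [0] * 2

MIN_NODES = 1  # the text mentions scaling down to a single machine on standby


def fixed_gpu_hours(demand):
    """Capacity sized for the peak and kept running every hour."""
    return max(demand) * len(demand)


def autoscaled_gpu_hours(demand, min_nodes=MIN_NODES):
    """Capacity that follows demand but never drops below min_nodes."""
    return sum(max(d, min_nodes) for d in demand)


fixed = fixed_gpu_hours(demand)
scaled = autoscaled_gpu_hours(demand)
print(f"fixed: {fixed} GPU-hours, autoscaled: {scaled} GPU-hours")
```

With this invented profile, the peak-sized cluster consumes 576 GPU-hours per day against 115 for the autoscaled one. The exact ratio depends entirely on how bursty the real workload is.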
In addition to the performance requirements of the analysis, Allcyte places high demands on traceability and reproducibility, and each continuously improved version of its AI models must remain traceable as well. To keep track, Allcyte relies on integrated monitoring by Stackdriver. In addition, Cloud Build streamlines the development, integration, and deployment cycle, so that a new version of the software on GitHub can be brought into a deployable form within minutes. As a result, every single analysis is traceable and consistently reproducible down to the specific version of the software environment.
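One common way to make an analysis traceable to its software environment is to store a small provenance record alongside each result. The sketch below shows the general idea only; it is not Allcyte's implementation. The class, its field names, and the example values are all hypothetical.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AnalysisProvenance:
    # All field names are hypothetical, for illustration only.
    git_commit: str        # commit of the analysis code
    container_digest: str  # digest of the built container image
    model_version: str     # version tag of the trained model
    input_uri: str         # location of the analyzed image batch

    def fingerprint(self):
        """Stable SHA-256 hash over the record's fields, sorted by name."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


# Placeholder values; none of them refer to real artifacts.
record = AnalysisProvenance(
    git_commit="0123abc",
    container_digest="sha256:placeholder",
    model_version="rcnn-v2",
    input_uri="gs://example-bucket/batch-001/",
)
print(record.fingerprint())
```

Two runs with identical code, container, model, and input produce the same fingerprint, while changing any single field produces a different one, so results can be grouped or compared by the exact environment that generated them.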
The release of the DTC Pharmacoscopy® pipeline represents the second major innovation in Allcyte's analytical infrastructure in the last two years. As the company continues to expand its scope, the ability to rapidly develop and deploy new analytical streams will be a key factor in its success. As a data-driven company, Allcyte is also continually evolving its data storage infrastructure. Google Cloud Storage provides a robust, high-performance foundation for this. In the future, Allcyte also intends to leverage data warehousing solutions such as BigQuery to make Pharmacoscopy® pipeline results directly available for integration with other analytical streams, providing valuable insights into the mechanisms and efficacy of novel cancer drugs.