Skip to content
brickster.ai
All videos
newsDatabricks·August 26, 2021

Empowering Real Time Patient Care Through Spark Streaming

Description

Takeda’s Plasma Derived Therapies (PDT) business unit has recently embarked on a project to use Spark Streaming on Databricks to empower how they deliver value to their Plasma Donation centers. As patients come in and interface without clinics, we store and track all of the patient interactions in real time and deliver outputs and results based on said interactions. The current problem with our existing architecture is that it is very expensive to maintain and has an unsustainable number of failure points. Spark Streaming is essential for allowing this use case because it allows for a more robust ETL pipeline. With Spark Streaming, we are able to replace our existing ETL processes (that are based on Lamdbas, step functions, triggered jobs, etc) into a purely stream driven architecture. Data is brought into our s3 raw layer as a large set of CSV files through AWS DMS and Informatica IICS as these services bring data from on-prem systems into our cloud layer. We have a stream currently running which takes these raw files up and merges them into Delta tables established in the bronze/stage layer. We are using AWS Glue as the metadata provider for all of these operations. From the sta

Description from YouTube. Full content on the video page.

More from Databricks