Empower Your Life Sciences Organization with airSlate SignNow's Pipeline Management System
See airSlate SignNow eSignatures in action
Our user reviews speak for themselves
Why choose airSlate SignNow
- Free 7-day trial. Choose the plan you need and try it risk-free.
- Honest pricing for full-featured plans. airSlate SignNow offers subscription plans with no overages or hidden fees at renewal.
- Enterprise-grade security. airSlate SignNow helps you comply with global security standards.
Pipeline management system for Life Sciences
Experience the benefits of airSlate SignNow's pipeline management system for Life Sciences by simplifying your document processes and improving efficiency. Take advantage of the user-friendly interface and customizable features to streamline your workflows today.
Sign up for a free trial of airSlate SignNow today and see how easy it is to manage your pipeline processes in the Life Sciences industry.
airSlate SignNow features that users love
Get legally-binding signatures now!
FAQs: online signature
- What does Veeva actually do?
Veeva's cloud solutions provide data, software, services, and an extensive ecosystem of partners to support its customers' most critical business processes, from research and development through commercialization.
- What is a CRM in pipeline management?
Pipeline CRM describes a system for keeping track of everyone in your sales pipeline. CRM is an abbreviation for Customer Relationship Management, and although the leads in your pipeline may not yet be customers, they need to be tracked in much the same way.
- What type of software is Veeva?
Veeva is a provider of customer relationship management (CRM) software and other enterprise cloud software for the life sciences industry.
- What sector is Veeva in?
Veeva is a global provider of industry-specific, cloud-based software solutions. Founded in 2007, Veeva Systems addresses the unique operating challenges and regulatory requirements of companies in the life sciences and consumer products industries.
- Is Veeva a healthcare company?
Veeva Systems Inc. is an American cloud-computing company focused on pharmaceutical and life sciences industry applications. Headquartered in Pleasanton, California, it was founded in 2007 by Peter Gassner and Matt Wallach and delivers its products as software as a service (SaaS) to the life sciences industry.
- What is pipeline management?
Pipeline management is the process of identifying and managing all the moving parts within a supply chain, from manufacturing to your sales team. The best-performing companies learn to identify where their cash is flowing and then direct that money where it is most productive; this is called "pipeline management."
- What type of system is Veeva?
Veeva Vault is a true cloud enterprise content management platform and suite of applications specifically built for life sciences. Traditionally, companies have had to deploy applications for content and separate applications to manage associated data.
- Is Veeva a CRM software?
The Veeva CRM suite of products is the proven and most advanced CRM solution for the life sciences industry.
Trusted e-signature solution — what our customers are saying
Video transcript: unifying R&D lab data with TetraScience on AWS
Hello everyone, and thank you for being here with us today at the AWS Healthcare and Life Sciences event. We are delighted to talk to you about TetraScience. TetraScience is an R&D Data Cloud platform that unifies R&D data in the cloud; we work with biopharma companies around the world to make their data truly accessible and actionable. My name is Rachel Darasak, I'm the Vice President of Marketing at TetraScience, and I'm here with my colleague Punya Biswal, our Chief Technology Officer.

Our talk has three parts. First, we'll talk about the biopharma digital lab data landscape; no one attending today will be surprised to hear that this landscape is siloed and fragmented. Then Punya will talk about the TetraScience platform and how it provides the hub that unifies this digital lab data, built on AWS. Finally, we'll go through a case study about how we worked with some innovators at Biogen using our platform in the real world.

When we think about this fragmented, siloed landscape in the life sciences, we distill the R&D lab down to two things: data sources and data targets. The data sources produce the data. This is where scientists and third-party collaborators use instruments to run experiments and collect all of the data needed to discover and develop new therapeutics. There are many types of instruments, tens of manufacturers, hundreds of instruments, and new makes, models, and software updated regularly, so there are a lot of different ways to generate data. Increasingly, there is also a trend toward third-party collaborators called CROs, contract research organizations, to which many of these experiments are outsourced.

The data targets are where the data is consumed, and this side of the equation is also a very busy landscape. There is a virtual alphabet soup of software platforms: ELNs (electronic lab notebooks, which I call out specifically because they will come up throughout the presentation), LIMS and SDMS covering sample management and process and document management respectively, data warehouses for archiving, and newer data visualization and data science tools like Tableau. This side of the equation is also becoming more and more fragmented and siloed.

As a result, pharma R&D labs experience all the usual big data challenges most of us are familiar with. Forget about the data sources talking to the data targets; they often don't even talk to each other. When a scientist conducts an experiment, they often need more than one instrument, so the data needs to flow from one instrument to another. A couple of those data points likely need to go to the ELN so the scientist can do their immediate analysis and figure out what experiment to do next, but the complete data set probably needs to go to a data archive, and the data scientists are left trying to access all of the data any way they can. Usually, once the data goes into an ELN or an archive, it does not come back out again.

So the landscape is highly siloed. It results in lots of manual processing and transcription by the scientists, which means it is prone to error, because humans are doing the processing and transcription by hand.
There are heterogeneous formats because of all the different players, and none of the players are incentivized to share data or to make it easy for data to flow from one place to another. All of this results in data that is not accessible and not reusable, which is not good for any of us, because it severely impacts how quickly we can discover and develop new therapeutics.

So what are companies doing about it today? From what we can tell, most companies are either building custom solutions in-house or hiring a consulting company to do it for them. Either way, it is not scalable and not efficient. Think about it: if you have 10 data sources and 10 data targets (and I can assure you there are far more than 10 of each), you would need to manage 100 integrations to be sure that everything integrated with everything else. That is a lot of work, and there is no need for it. Very few of these vendors are going to take it upon themselves to integrate with each other, so without a neutral third-party partner you are stuck with manual, custom, point-to-point integration.

That is where TetraScience comes in. We focus first, and only, on the data: connecting all of the sources with all of the targets so that the data can flow freely from one place to another, and scientists at pharmaceutical and biotech companies ultimately have the data they need, in the place they want it, in the right format, so they can unlock the power of AI and data science. This approach solves all of those big data challenges: it automates the data collection, it centralizes everything in the cloud (based on AWS in our case), and it harmonizes it, which means your data is now prepared for data science and advanced analytics.

A lot of people who talk about this problem focus on the first two of these challenges, automation and centralization, which are probably the most acute pain points right now, but they only get you part of the way there. I have two small children at home, so when we have guests over we are often scrambling to clean up the house quickly before they arrive, and we might shove the mess in a closet and close the door. You can't see the mess anymore; we have centralized it. But it is still a mess, just in one location. If you don't actually take the time to clean it up, to harmonize it and put things where they are supposed to be, it is never going to be ready for use.

In case I haven't yet convinced you that this is a major problem that can only be solved by a neutral partner, let's look at some numbers. We did a quick back-of-the-envelope calculation: if you assume that a big pharma company has 1,500 scientists, that it pays them $125 per hour, and that they spend 15 hours per week doing manual data wrangling (processing, movement, transcription, and analysis), that quickly adds up to roughly a million hours and $125 million per year spent on data wrangling instead of on actual experiments. And this is a very rough calculation; it does not include things like the cost of data errors or of the inability to reproduce experimental results.
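To make that back-of-the-envelope arithmetic explicit, here is a small sketch of the calculation. The figures for scientists, hourly rate, and weekly wrangling hours come from the talk; the 48-week working year is my own assumption, added only to show how the totals land near the quoted "million hours" and "$125 million".

```python
# Figures quoted in the talk; the working-year length is an added assumption.
scientists = 1_500
hourly_rate_usd = 125
wrangling_hours_per_week = 15
working_weeks_per_year = 48  # assumption, not stated in the talk

hours_per_year = scientists * wrangling_hours_per_week * working_weeks_per_year
print(f"{hours_per_year:,} scientist-hours per year")        # 1,080,000 -> "roughly a million"
print(f"${hours_per_year * hourly_rate_usd:,} per year")     # ~$135M; rounding hours down to 1M first gives the quoted ~$125M

# The integration-count argument: point-to-point grows multiplicatively,
# while a neutral hub grows additively.
sources, targets = 10, 10
print(sources * targets, "point-to-point integrations vs", sources + targets, "hub connections")
```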
So what I'd like to know is: what could your scientists do with a million more productive hours in a year? And what about your data scientists; what could they do with accessible, clean data at their fingertips? I would really like to find out, and that is what TetraScience is here to do. With that, I will turn it over to Punya, who is going to tell you a whole lot more about the platform itself.

Thanks, Rachel. That is a super exciting charge to give us as the software team: how can we enable these scientists and data scientists, these highly specialized, precious members of our society, to be much more productive with the time they have? I'm excited to show you how we have built software, leveraging the power of AWS, to move that mission forward.

To quickly recap what Rachel said, we think of the lab as consisting of data producers and data consumers. Sources might include lab instruments of varying sophistication, from a mass balance all the way to an HPLC, a sophisticated instrument with a supporting data system (a Windows computer attached to the instrument), or something outside the lab like a contract research organization. The first step in what TetraScience does is to deeply understand the data formats coming out of each of these sources. We build data connectors that engage with these sources, pull the data from them, and send it into the cloud, into the data lake. The data lake is where the data gets, as Rachel said, centralized but not yet harmonized. That is the next step: we have a set of data pipelines that operate on the data ingested into the data lake and convert it into a standardized format, the intermediate data schema, which I'll talk more about later. Finally, once the data has been standardized, we prepare it and make it available to all kinds of downstream applications, the data targets we talked about: electronic lab notebooks, data warehouses, databases, line-of-business applications, you name it. We power the data flow into all of these different targets. In a nutshell, that is what TetraScience does: connectors, the lake, pipelines, APIs, targets.

This is still pretty abstract, so I'd love to walk you through what it looks like concretely on the ground at one of our implementations. The slide shows a lot of icons, and if I tried to walk you through everything happening on it we would still be here an hour from now, so to keep the discussion focused, let's follow the journey of one piece of data as it travels from left to right across the system. We start by ingesting some chromatography data and follow it as it gets centralized, harmonized, and eventually sent to a lab notebook all the way on the right.

The first step, as I mentioned, is the connector. In this case we are talking about an HPLC, a chromatography system. Chromatography systems are very widely used across pharma and biopharma, so these were among the first instruments we chose to integrate with. The way we think about this as a company is that we pursue deep, lasting relationships with the companies that produce these instruments and the software systems that control them and their data. In this case, Waters, one of the biggest vendors of chromatography systems, is actually one of our investors; we have a long relationship with them, and they provide SDKs to access the data produced by their chromatography systems.
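To make the connector idea concrete, here is a minimal, hypothetical sketch of an agent that watches the export folder of an instrument PC and ships new raw files to an S3-backed data lake. The folder path, bucket name, and file pattern are invented for illustration; this is not TetraScience's actual connector code, and a real agent would use the vendor's SDK rather than plain file polling.

```python
"""Hypothetical sketch of an instrument connector agent.
It polls a directory where the vendor's data system writes raw result files
and ships new files to an S3-backed data lake with basic source metadata."""
import time
from pathlib import Path

import boto3

WATCH_DIR = Path(r"C:\HPLC\Exports")   # hypothetical export folder on the instrument PC
BUCKET = "example-raw-data-lake"       # hypothetical data lake bucket
s3 = boto3.client("s3")
seen: set[Path] = set()

while True:
    for raw_file in WATCH_DIR.glob("*.raw"):
        if raw_file in seen:
            continue
        key = f"raw/hplc/{raw_file.name}"
        # Metadata travels with the blob so downstream pipelines know its origin.
        s3.upload_file(
            str(raw_file), BUCKET, key,
            ExtraArgs={"Metadata": {"source": "hplc-01", "vendor": "example-vendor"}},
        )
        seen.add(raw_file)
    time.sleep(30)  # poll gently; this is a shared lab workstation, not a dedicated server
```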
To many folks in the audience, if you are building on AWS you are probably very used to running Linux-based systems in the cloud, and that is a lot of what we do, but the instrument landscape is different. Instruments tend to have control software written in .NET on Windows. These are very mature pieces of software that have existed since the 90s, and we have built up a lot of expertise in their nooks and crannies. In addition to getting the core information the software provides, the actual physical numbers measured for an injection or a sample, we are also able to extract additional metadata such as logs and errors produced by the device while processing the sample. You might want to know why your chromatography system is degrading over time; if it turns out there is a firmware problem emitting extra logs, you want to know that.

Going beyond the basic functionality of these connectors, we also want to make sure that when we install them on a lab workstation we are not getting in the way of the scientists' ability to get their work done. These lab workstations are not single-purpose devices like servers in the cloud; they exist to be used for a variety of ad hoc reasons by scientists. We want to be respectful of what they are doing and meet the scientists where they are. I have focused heavily here on the approach where we install an agent onto a computer connected to an instrument, but there is a whole other path where we use IoT approaches: we put a physical TetraScience device on a port on the instrument and extract data that way. We won't talk more about that, but it is something to keep in mind.

So we have taken the data from the instrument; it has been turned into a big, messy blob of binary information, and we are ready to put it into the cloud. We centralize your data in the data lake, powered by S3, everyone's favorite data lake technology, and now we can start doing a lot more with it. The very first thing we do once the data is written down is figure out how to make it accessible to a lot of different tools. Currently it is locked away in some binary format whose schema is poorly understood. We reached for what is probably the preeminent data format of our times, JSON, and designed a set of strongly typed JSON schemas representing different categories of scientific data. We try to design these schemas so they capture the essence of an instrument category or a type of scientific data without being tied to any one vendor. If you take a Waters HPLC's readouts and convert them into our HPLC IDS, you can compare those results across different HPLC vendors. You can begin to ask questions like: which vendor's columns degrade faster or slower? How do these experimental results reproduce across different sites? These are the cross-cohort analyses that are much harder to do when you are siloed by vendor.
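As a toy illustration of what a strongly typed, vendor-neutral schema buys you, here is a sketch that validates a simplified chromatography result against a JSON Schema. The field names and structure are invented for this example; they are not the actual TetraScience intermediate data schema.

```python
"""Toy illustration of a vendor-neutral, strongly typed JSON schema for a
chromatography result. Field names are invented, not the real IDS."""
from jsonschema import validate  # pip install jsonschema

injection_schema = {
    "type": "object",
    "required": ["instrument", "sample_id", "peaks"],
    "properties": {
        "instrument": {
            "type": "object",
            "properties": {"vendor": {"type": "string"}, "model": {"type": "string"}},
        },
        "sample_id": {"type": "string"},
        "peaks": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["retention_time_s", "area"],
                "properties": {
                    "retention_time_s": {"type": "number"},
                    "area": {"type": "number"},
                },
            },
        },
    },
}

# The document shape is the same regardless of which vendor produced the raw binary
# file, which is what makes cross-vendor and cross-site comparisons straightforward.
doc = {
    "instrument": {"vendor": "example-vendor", "model": "example-hplc"},
    "sample_id": "S-0042",
    "peaks": [{"retention_time_s": 312.4, "area": 15832.7}],
}
validate(instance=doc, schema=injection_schema)
```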
Finally, given the JSON-formatted data we have extracted from these vendor-specific binary blobs, we are able to power a variety of cloud-native solutions that give you deeper insight into that data: Elasticsearch for search, Athena for SQL queries, all powered by that same intermediate data schema. You get all of this once your data is in the cloud, without any additional work on the scientist's part.

I want to move on to the next stage in the process, our data pipelines. We talked about how the data gets transformed into a standard, widely accessible format; starting from there, we are able to push that data to a variety of targets, which can be on-premises or cloud-based. We are going to focus on ELNs in this example. Typically ELNs have REST APIs, and each ELN may have a totally different REST API, but because we are starting from JSON it is easy to adapt the input into whatever specific REST format the ELN requires. One of the most compelling use cases is that we can link an ELN's slice of the data back to the full, rich data set that is now preserved in the cloud. One of the popular ELNs we work with has a 25 MB limit on data sets that can be attached to an ELN result, whereas mass spec or flow cytometry data can easily run to gigabytes or even terabytes, so being able to link back to the full data set from a visualization or a scientist's notes can be invaluable in an audit setting or when you are investigating results more deeply.

Hopefully that gives you some picture of what we are able to do when we create this network linking all of these nodes together with TetraScience. To recap: the first piece is connectors, which represent the investment we have made into understanding different data sources; we understand how to write Windows software, how to build deep relationships with vendors, and how to bring in third-party collaborators like CROs. The second piece is the network itself: positioning ourselves in the market as a sort of neutral but heavily armed Switzerland, if you will, that only cares about the data. We are not trying to control the end-user experience or brand how the scientist gets their hands on the data; we want everyone to be able to come together and exchange data with the lowest possible friction. The third piece is the intermediate data schema, which is really the technical foundation of the network; we put a ton of effort into it, and our scientists and engineers think deeply about how to represent different categories of scientific data while keeping it JSON and widely accessible to data scientists and existing tools. Which brings me to the last piece: open standards. The reason we use JSON and other well-defined formats is to meet data scientists and your IT where they are, and make it possible for them to leverage Python, R, pandas, and all of the existing systems they are used to, without needing to relearn and re-tool to take advantage of this data.
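Here is a hedged sketch of the pipeline-to-ELN step described above: push a small harmonized summary to an ELN's REST API and link back to the full data set in the lake. The endpoint, payload shape, and authentication are hypothetical stand-ins, since every ELN's API differs; only the 25 MB attachment limit is taken from the talk.

```python
"""Hedged sketch of pushing a harmonized summary to an ELN REST API.
The endpoint, payload fields, and auth scheme are hypothetical."""
import requests

ELN_API = "https://eln.example.com/api/v2/entries"   # hypothetical ELN endpoint
ATTACHMENT_LIMIT_BYTES = 25 * 1024 * 1024            # attachment cap mentioned in the talk

def push_to_eln(ids_doc: dict, raw_size_bytes: int, full_dataset_url: str, token: str) -> None:
    # The two or three values scientists need right away, pulled from the harmonized JSON.
    summary = {
        "sample_id": ids_doc["sample_id"],
        "peak_count": len(ids_doc["peaks"]),
        "max_peak_area": max(p["area"] for p in ids_doc["peaks"]),
    }
    entry = {"fields": summary}
    if raw_size_bytes > ATTACHMENT_LIMIT_BYTES:
        # Full mass-spec or flow-cytometry data won't fit as an attachment, so store
        # a link back to the complete data set preserved in the data lake instead.
        entry["raw_data_link"] = full_dataset_url
    resp = requests.post(ELN_API, headers={"Authorization": f"Bearer {token}"},
                         json=entry, timeout=30)
    resp.raise_for_status()
```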
Another way to slice how we are doing this is to ask: why AWS? The ideas I spoke about on the previous slide are fairly generic; they transcend any particular implementation strategy. But we have found AWS to be the perfect foundation on which to build this product and this platform. First of all, AWS has hands-down the best integration with enterprise IT and networking. The huge pharma companies we are going after have massive IT investments, and they are looking for a mature cloud partner who can anticipate and work with them at every stage of the transition. Once we are in the cloud, AWS provides the most mature cloud-native service offerings; by taking advantage of them we are able to create a lot of flexibility, minimize cost, and get to market as fast as possible. We don't have to build a lot of the infrastructure for this work ourselves, so we can focus on understanding scientific data formats. The end result is that we are all in on AWS. We don't use any intermediary layer, and we are not hedging our bets on cloud versus on-premises; we try to take full advantage of AWS offerings, and we have found it to be a tremendous accelerant in delivering our product. With that, over to you, Rachel; I'd love to hear more about how this has been working in the market.

Thank you, Punya, for sharing some of the magic behind the scenes; I'm in marketing, so I get to think of it all as magic. We wanted to close out today by talking about a real-world use case of our platform with a couple of innovators we are working closely with at Biogen, who wanted to apply data automation and standards to their cell counter files. Len Blackwell and George Vanden Jerushi came to us, or I'm going to pretend they came to us, because I'm not entirely sure how the opportunity came about. They had a situation in their bioprocessing and biomanufacturing facility where they are using cell counters to figure out how to optimize their protein production, and they discovered that this process was entirely manual, took far too much time, and introduced far too much error into the process; they were tasked with fixing this problem.

Let me talk you through the situation. They have five cell counters, and each cell counter has a local, on-premises PC that captures the data. The way a cell counter works, as I understand it, is that each one runs one sample per day, and running the sample involves taking 50 images over a 24-hour period; at the end of the 24 hours the cell counter compares all of those images and runs a set of calculations on them. The scientists need two or maybe three of those data points immediately, transferred to their ELN, their electronic lab notebook, so they can do their analyses and get ready to run the next set of experiments. Every six months, the informatics or R&D IT team would then go to those on-premises PCs and manually transfer the full six months' worth of data over to the data archive. And that full amount of data covers every sample: five cell counters, times one sample per day, times 50 images and one .txt file per sample. That is a lot of data in a bunch of different formats, and it made it really challenging to find anything; once the data went into the ELN or into the data archive, it was almost impossible to get it back out.
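For a sense of the file volume behind that manual six-month transfer, here is the arithmetic spelled out; the 182-day half year is my assumption, everything else comes from the description above.

```python
counters = 5
files_per_sample = 50 + 1        # 50 images plus one .txt summary per sample
samples_per_counter_per_day = 1
days_per_half_year = 182         # assumption

files_per_day = counters * samples_per_counter_per_day * files_per_sample
print(files_per_day)                              # 255 files per day across the fleet
print(files_per_day * days_per_half_year)         # 46,410 loose files per manual archive run
```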
The picture on your screen doesn't even really do it justice; it is really more like this, because the process was so manual that those point-to-point integrations didn't even exist. I love how Len characterizes this problem: the lab of the future is built on data. Agreed, and their cell counter data was basically inaccessible; the heterogeneous nature of the information made it difficult to analyze without significant manual manipulation. So he tasked his team with making this data accessible and actionable for the scientists and data scientists.

The specific data integrity risks they identified were: the presence of multiple file types (both image files and text files); the volume of data (the 51 files generated for each sample, multiplied by the number of days and the number of instruments); the fact that the files were stored separately (each of the 50 images plus the text file, all stored separately); and the manual manipulation required throughout the entire process. What that all rolls up to is that the data just wasn't accessible. If their data scientists wanted to do secondary or cross-cutting analysis across experiments, runs, or some time period, it was almost impossible for them to do so. The goals for the project we worked on with them were to eliminate these data integrity risks and to better align with FAIR principles: making the data findable, accessible, interoperable, and reusable. Our charge was simple: automate and standardize.

So what did we do? Exactly what Punya described our platform as being set up to do. We had a connector that talked directly to these cell counters, so we were able to grab the data automatically, pretty much as soon as a sample was run, put it into the data lake on AWS, run the pipelines to harmonize it, and then move the required data points, in the required format, to the different data targets. As I said before, just those two or three data points the scientists needed were sent over to the ELN almost immediately. We were also able to work with George and Len and their team to automate the basic analysis the scientists did within the ELN: once the data arrived across sample runs (their experimental period was a week or two), the curve they needed in order to evaluate the results showed up in their ELN automatically. So not only did they no longer have to transfer the data manually, they no longer had to complete this routine analysis manually either.

Then there is the full data set. Anyone familiar with the R&D lab space may have heard of the Allotrope Foundation and the Allotrope Data Format; Len and George wanted the full data going to the data archive to be transformed into this ADF file format for long-term storage. We were able to use another data pipeline to transform the data and package all 50 images plus the text file into a single ADF file that could then be placed into the data archive.
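The Allotrope Data Format itself is a dedicated container format, so this sketch doesn't attempt to reproduce its internals; a plain zip archive stands in for it here, just to show the underlying packaging idea of bundling one sample's 50 images and text file into a single object before long-term storage. Paths, bucket, and naming are hypothetical.

```python
"""Sketch of the 'bundle 51 loose files into one archive object per sample' idea.
A plain zip file stands in for the real Allotrope Data Format container; paths
and bucket names are hypothetical."""
import io
import zipfile
from pathlib import Path

import boto3

def package_sample(sample_dir: Path, sample_id: str, bucket: str) -> None:
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as bundle:
        for f in sorted(sample_dir.iterdir()):       # 50 images + 1 .txt summary
            bundle.write(f, arcname=f.name)
    buf.seek(0)
    # One object per sample keeps the images and summary together for audits and
    # makes the archive addressable by sample_id instead of by loose file name.
    boto3.client("s3").upload_fileobj(buf, bucket, f"archive/cell-counter/{sample_id}.zip")

package_sample(Path("/data/cell_counter_01/2021-06-01"), "CC01-20210601", "example-archive-bucket")
```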
What this opened up was the ability to do data science and advanced analytics that folks at Biogen had not been able to do before. Now that the data scientists could use software they were already familiar with, find the files they were looking for with a simple query, and import them, they had all of the images and all of the information from the various sample runs in their own tools, and they were able to do advanced analytics they could not do before, which led to some interesting conclusions. There is a link in the slides (hopefully they will be made available), and on the TetraScience website, our blog, and our YouTube page you can find a blog post and a couple of videos that go deeper into this use case, the analytics it unlocked, and a demo of how it works.

I also love this quote from Len about how the project, or at least this first phase of it, ended up. He calls out two really important improvements that came out of the process: first, the analysis is fully automated, which made a huge difference in the scientists' day; and second, data integrity is improved by aggregating the multiple files from each sample into a single file, while the storage of the data in the archive was automated. These are key points he wanted to make sure no one overlooked, so we are not overlooking them.

Thank you again for spending the time with us today, and thank you to AWS for having us; we are so glad to be part of this event. Contact information is included in this presentation, so please feel free to reach out. We would love to continue the conversation, online or offline. Thank you. Thanks, everyone.