CellPort Connect - CC 104: Artificial Intelligence and Clean Data

So when we think about the people who have been using Excel to store the information about their cell culturing and banking processes, they are primed for digital transformation. Digital transformation is the combination of a cloud computing platform with big data that lets you do things with machine learning. The platform we've created with CellPort provides that cloud computing platform into which our customers can put their data, and upon which they can build these deep learning AI models.

One of the challenges when we mention the concepts of AI, and it can't be underscored enough, is that consistency and rules need to come into play, especially in the life sciences, and especially when you're dealing with a cell, something that's living and changing all of the time. How we handle it. How we feed it. Who touches it. What materials are used? What instrumentation is used? How we analyze the data. We saw this repeatedly over the time frame in which we worked in the small molecule space, creating prediction algorithms that take a look at a molecular structure and try to predict drug-like properties of the molecule. One of the things we encountered was the sheer volume of data that the neural network people needed in order to make predictive models. It was, as we saw it, an ambitious goal. We were intimately involved in providing data, but the type of data needed to generate these models has to be clean, reproducible, and robust. And the challenge is that's expensive data to come by. It's very expensive. So how do you think this is going to work if most people are under the gun to deliver a report on time, to respond, to produce data that will move something into the clinic, and at the same time try to model this data in an AI format?

One of the things that is wonderful about the CellPort platform is that it allows you to do both. It allows you to track and trace all the information about everything that's happened related to the cells in your system and, from that, create a bolus of information that can be used to train those models. Whether you're looking for trends or trying to spot problems, it all starts with having a data set that is as big and as pristine as you can make it. By definition, we're creating clean data. By definition, we're creating things that have been built upon standardized ontologies, as much as we can, that are community-accepted, and that allows the data to be reusable. Again, the FAIR principles of findable, accessible, interoperable, and reusable: having that data in a consistent format is the foundation for that, and it lets our customers take that data and run with it.
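To make the idea of a consistent, FAIR-friendly record a little more concrete, here is a minimal sketch in Python. The field names and values are purely illustrative assumptions, not CellPort's actual schema; in practice the controlled vocabularies would come from community ontologies.

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class PassageEvent:
    """One standardized record of a cell-passaging step.

    Field names are hypothetical placeholders, not CellPort's schema.
    """
    cell_line_id: str      # which cell line was handled
    passage_number: int
    operator_id: str       # who touched the cells
    media_lot: str         # what materials were used
    instrument_id: str     # what instrumentation was used
    viability_pct: float   # analysis result for this step
    timestamp_utc: str     # ISO 8601, keeps records interoperable

event = PassageEvent(
    cell_line_id="CL-0001",
    passage_number=12,
    operator_id="OP-07",
    media_lot="LOT-2024-113",
    instrument_id="INC-03",
    viability_pct=94.5,
    timestamp_utc="2024-05-01T14:30:00Z",
)

# Serializing to a plain, consistent format is what keeps the record
# findable, accessible, interoperable, and reusable (FAIR).
print(json.dumps(asdict(event), indent=2))
```

The point of the sketch is simply that every event is captured with the same fields in the same format, so the records can later be pooled into a training set without manual cleanup.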

There is a gap right now that is fairly significant, and that gap is between what the technologies have enabled us to create and our ability to understand it. From combinatorial chemistry, we're now getting into concepts of combinatorial biology, where, through genes, constructs, and delivery systems, we're able to create a wide variety of manipulated cells, essentially creating new cell lines. But those cell lines need to be understood and characterized. We have a gap where we're creating new things, which is incredibly exciting, but that next phase is still a little bit slow. What we would like to see is that next phase recorded very carefully and in a structured format and, from that, we will start building the opportunities for testing where the consistencies and variations exist, which is essentially what AI is trying to do. We are in a phase now where the ability to manipulate a cell through editing or splicing or knocking down or knocking in is creating these wonderful opportunities. I mean the work we're doing with stem cells, the work the industry is doing with stem cells, evolving them into whatever cell type they would like. At the end of the day, once I get to my destination, that's where the process begins in our world, and that's where the idea of ontology and structure comes in. If we can start standardizing that, it is possible we can start sharing data in a more consistent fashion, and that's where the volume of data comes from, because AI, machine learning, and all of these other converging technologies need data. I think that's where I see one of the bigger opportunities coming.

One of the areas you've talked about: you mentioned having seen problems that occurred simply by relocating a material from one location to another, and that's something that was probably found out via anecdotal information. Through looking at everything in the system, you were able to discern that that was the variable causing the problem. Machine learning has the ability to take reams of data, far more than is humanly possible, and make inferences from it, make predictions from it. Yogi Berra once said that predictions are difficult to make, especially about the future. Having the ability to sift through all of this data, in ways that are simply humanly impossible, is one of the real values and future capabilities that we can look forward to for people who are using CellPort to store that data in a standardized and repeatable manner.
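As a hedged illustration of that idea, and not CellPort's actual analytics, the toy sketch below uses entirely synthetic records to show how a simple model can surface which recorded variable, here a hypothetical relocation flag, best explains downstream failures once the events are stored in a structured way.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Synthetic, hypothetical records of the kind a structured cell-culture
# system could export: each row is one culture, with a flag for whether
# the material was relocated and whether the culture later failed QC.
records = pd.DataFrame({
    "relocated":   [0, 0, 1, 1, 0, 1, 0, 1, 1, 0],
    "operator_id": [1, 2, 1, 3, 2, 2, 3, 1, 3, 1],
    "media_lot":   [1, 1, 2, 2, 1, 2, 1, 1, 2, 2],
    "failed_qc":   [0, 0, 1, 1, 0, 1, 0, 1, 1, 0],
})

features = records[["relocated", "operator_id", "media_lot"]]
model = DecisionTreeClassifier(random_state=0).fit(
    features, records["failed_qc"]
)

# Feature importances hint at which recorded variable best explains the
# failures; in this toy data it is the relocation flag, by construction.
for name, score in zip(features.columns, model.feature_importances_):
    print(f"{name}: {score:.2f}")
```

In real use the data set would be far larger and the signal far subtler, which is exactly why the conversation keeps returning to clean, consistently structured records as the prerequisite.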
