An Update On RPI's COVID-19 Model And The Coming "War"
We get an update from Dr. Malik Magdon-Ismail, Professor of Computer Science at RPI, who has been working on a machine learning model for predicting the impacts of the pandemic.
In this interview, Dr. Magdon-Ismail refers to a New York State Department of Health website, which you can find here.
What has changed in the model since our last interview?
So the main thing that has changed in the model is that we've gotten a lot more data and we can see the effects of the current sort of status quo, for example, in the Capital District, where we've been doing some pretty aggressive social distancing. And so we sort of have a handle on what's going on. And we also have a handle nationally on how the disease behaves. And I think this can help us a lot in deciding how we want to move forward.
Say more about that. We've social distanced, obviously. We’ve shut down the economy so there are fewer people coming in contact with each other. Is that having the desired effect?
So yeah, that is having the desired effect. If you look, for example, at the models in the Capital District, you can see that they've slightly tailed off from the earlier prediction. So, you know, depending on how aggressive you think social distancing is, we may even have peaked right now. Or we may have a little bit more to go before we peak. But in either case, we've gotten to a point where infections are under control, mortality is more under control, and we understand the disease. So it's accomplished what we intended it to accomplish. I guess the important thing to remember is that that was our opening gambit. We still haven't hit the end game.
Now, before we get to that, when we had our first conversation about the model, you said that the model had predicted a peak in May or June, so it did pretty well. Right?
Well, you know, yes, to some extent, but there was an error bar. So, you know, somewhere between May and June, and many people think we've more or less peaked.
OK, so let's get back to the idea of the endgame. The opening gambit, as it were, was the initial entry of these cases in our region and then the resulting shutdown. What happens now as we look forward?
OK, so I'm gonna compare two cases. One is New York City, and the other is the Capital District. As you may be familiar with what's been going on with antibody testing, you know, we've had a round of antibody testing here in New York State. And it has indicated that a significant number of people have actually been infected, even though the confirmed serious infections have been low. And that's a good thing, because it means that, you know, the disease is not as serious as we once feared. And it means that if many people are infected, those people provide sort of a buffer against the infection spreading, because most people believe that if you've been infected once, you're going to get a little bit of immunity. So in a region that already had a large number of infections, for example, New York City, antibody testing found that just about 20% of people have been infected, and the models are agreeing with that, at least my models. That means that there are fewer people that remain to get infected. And so as you open, your risk of a huge avalanche of new infections is smaller. If you compare that with the Capital District, my model is saying that, as of now, in the shutdown phase, where we've sort of peaked and where we're gradually tailing off, we have a total infection count of about 2-3.5% of the population, which is comparable to the antibody testing of about a couple of weeks ago, where, you know, they said that the Capital District has about 2.5% infection. So that's a much lower number.
And what that sort of means is two things. One thing is, you know, we're not gonna increase that number much by remaining in the current status quo of strong social distancing. So, you know, if and when we do reopen, there's a large population that has not been infected that is, in some sense, available to be infected. So that's sort of the bad news. That's why, in fact, Governor Cuomo mentioned that, you know, the Capital District is high risk. It’s high risk, not because we're not healthy, but because we have a lot more people remaining to be infected than, for example, in New York City. So that's the status quo. If we remain as we are now, things will tail off, maybe now or in a week or two, at about, let's say, a 3-5% total number of infections. By the way, though the model seems to think we are at 3-5% total infections, of those, we have only confirmed about 0.5%. So a large fraction of the total infections go unconfirmed, which means they're not that serious. And so in fact, the model predicts the total number of infections even though we don't observe them. The model is predicting right now in Albany total infections of between 2% and 3.5%.
Which, then, OK, that brings me to the next question, which is, as Governor Cuomo starts reopening regions of the state, what happens to cases if we assume that a certain number of people are now going back out into the public who weren't in the public before?
OK, so yes, that's a good question. So that's that second website I sent you. We’ve seen a lot more of the disease, and if you go to this New York State DOH website, it gives a very nice breakdown of who the disease is affecting the most. In terms of fatalities, the people who need to beware are sort of above 50 or above 60. I think approximately 95% of the fatalities have been aged 50 or above. So that's a very high risk age group. Also, if you look at the same DOH website, it shows you which underlying conditions are indicative that the disease is going to hit you hard. And the main ones are hypertension, diabetes, you know, hyperlipidemia, sort of like cholesterol, artery disease, and so on. So this is all on that website. And what it basically indicates to me is that there's a vulnerable group, which is, you know, people having these underlying conditions, and there's the 50 and older. In a reopening, this group has to be protected, because in a reopening we're going to get more infections. But then, you know, increasing the number of infections is inevitable in a reopening. And we have to get to that point where we've built up a herd immunity slowly so that the hospital system can cope with it. Which means basically that's the endgame. We've reached the endgame when we’ve built that herd immunity slowly, and the rate at which we can build that up without breaking the healthcare system is what the public planners and so on are trying to figure out.
And presumably that's why Governor Cuomo is insisting that hospitals keep 30% of beds and ICU space available for an uptick in cases once the reopening has taken hold, right?
Yeah. So you know, reopening is inevitable because there are two possible end games. One is we stay in that lockdown and drive infections literally down to zero, because this thing can restart from just one infection. Or we reopen. And it doesn't look like we're going to drive infections literally down to zero, so we have to reopen. And when we reopen, we can see from the model that, you know, the infections are going to rise. Infections on their own are not bad, as long as you protect the vulnerable and have enough space in the hospitals. And that's what I was trying to ensure.
So if I'm understanding you correctly, reopening society and having healthy people in their 20s, 30s, and 40s giving each other COVID-19 but in most cases not dying from it is overall a net benefit to all of us.
Correct. So, you know, if we could identify all the people who wouldn't suffer from COVID and could give COVID to all of them, and then be done, we would, society would benefit a lot. It's kind of like a vaccination campaign, which doesn't hurt anyone.
OK, now let me ask you another hard question. How long does that take?
Ah, so if you look on this website I sent you, this COVID warning website, it depends. So as a community, we have to decide what is the tolerable number of daily infections, which in turn, you know, through some multiplier translates to hospitalizations, and that multiplier is going to be relatively small if the only people who are going out are the healthy, the young and so on. So we have to decide what's the tolerable level that we can handle. And then based on that, we try to open at a rate that maintains this tolerable level, and then, sooner or later, enough people get immune to the degree that endows the community with this herd immunity, and then we've won the game. And so, you know, the models are suggesting that if, for the Capital District, the tolerable level is 500 infections a day, then we are looking at fighting this battle for about five to six months. But it's like a flu. It's like a flu season. You know, the flu season starts in October, and gradually every day people get infected, not many are getting hospitalized, hopefully. And you know, by the end of March, beginning of April, we're done with the flu season. Because in some sense, well, it's a winter disease, but also we've built up this immunity to that particular strain.
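As a rough check on the arithmetic behind that estimate, the sketch below multiplies the 500-infections-a-day level by five and six months. The 30-day month and the function name are assumptions made for illustration; this is not Dr. Magdon-Ismail's model, only the multiplication implied by his numbers.

```python
def cumulative_infections(daily_infections, months, days_per_month=30):
    """Total new infections if the daily count is held constant.

    days_per_month=30 is a simplifying assumption, not a figure
    from the interview.
    """
    return daily_infections * months * days_per_month


# 500 infections a day for five and for six months (figures from the interview)
low = cumulative_infections(500, 5)
high = cumulative_infections(500, 6)
print(low, high)  # 75000 90000
```

Under these assumptions, holding the Capital District at the tolerable level for five to six months corresponds to roughly 75,000 to 90,000 additional infections.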
So is that a plan similar to what the governor has outlined? I mean, does his vision of reopening in regions and you know, sort of, doing it by meeting certain criteria in each region, does that comport to what your model shows you is the right way to go about it?
To some extent, yes. I mean, this is not really my expertise, but if I understand correctly, his plan is allowing for a gradual reopening, provided that your infection slash hospitalization count is staying under a specific threshold. It's not clear how that threshold should be picked or how the governor is planning to pick this threshold depending on the region, because that threshold probably should depend on what the population size is of your region and so on and so forth. I believe currently it's set at 16 hospitalizations per day on average. I'm not sure of that exact number, but I believe that's the case. But that number probably should be higher, let's say, in a New York City versus, you know, a much lower density area. But generally speaking, yes. So that's what those kinds of strategies are gearing towards: you know, look and see if your infections are below the tolerable level. If so, reopen a little bit. And that's basically the reopening strategy that my models are using as well. And if you do it right, you can keep the infections at the level you want them to be at.
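The "check the level, then reopen a little bit" rule described here can be sketched as a simple feedback loop. The code below is a toy illustration only: it uses a generic discrete-time SIR-style update, and every parameter (`base_rate`, `recovery_days`, `step`, the starting `openness`, and the population figures in the usage lines) is a hypothetical value chosen for the example. It is not the RPI model and not the state's actual criteria.

```python
def simulate_reopening(population, infected, immune, tolerable_daily,
                       days=365, base_rate=0.25, recovery_days=10,
                       step=0.05):
    """Toy SIR-style simulation with a threshold-based reopening rule.

    Each day: compute new infections, then reopen a little if the daily
    count is below the tolerable level, otherwise close a little.
    All default parameters are hypothetical.
    """
    susceptible = population - infected - immune
    openness = 0.3  # fraction of normal contact; hypothetical starting point
    history = []
    for _ in range(days):
        # New infections scale with contact level and remaining susceptibles
        new_inf = base_rate * openness * infected * susceptible / population
        new_inf = min(new_inf, susceptible)
        recovered = infected / recovery_days

        susceptible -= new_inf
        infected += new_inf - recovered
        immune += recovered

        # Feedback rule: below the tolerable level, reopen a little;
        # at or above it, tighten a little.
        if new_inf < tolerable_daily:
            openness = min(1.0, openness + step)
        else:
            openness = max(0.0, openness - step)
        history.append((new_inf, openness))
    return history


# Hypothetical region: 1,000,000 people, 2,000 currently infected,
# 30,000 already immune, tolerable level of 500 new infections a day.
history = simulate_reopening(1_000_000, 2_000, 30_000, tolerable_daily=500)
peak_daily = max(new_inf for new_inf, _ in history)
final_openness = history[-1][1]
```

Because the rule reacts only after the daily count has crossed the threshold, a loop like this can overshoot the tolerable level before it corrects, which is one reason the size of each reopening step matters.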
You said something at the beginning. And I know that your website and your tool have looked at many regions around New York, but at many other states as well. And you said, you know, gathering this data, as you've been doing since March, has allowed you and the model to understand the spread of the disease in different areas. So what have you learned from looking at other places and the way that it's spread compared to what we're seeing in New York and specifically in the Capital Region?
So, the spreading mechanics are sort of similar from region to region. What is different is how the various regions have been social distancing, and what their population density is and things like that. And the main thing that has affected up to today is what fraction of the population is currently infected. So for example, that varies very drastically as far as the model is saying. So, you know, if you take New York, if you look at, let's say, the Bronx, the model is saying that anywhere from 15% to something like 24% of the population has been infected. If you look at Albany, it says Albany County is anywhere from about 4-4.5%. On the other hand, if you look at something like Columbia County, which is very close to us, Columbia County is a little bit lower, around 3%. And if you look at Saratoga, which is also considered part of the Capital District, the model is saying that only about 1% of the population has been infected. So the difference between Albany County, which has about 4% infected according to the model, and Saratoga, which has about 1% infected, is all part of what's been going on in these various locations with respect to social distancing, and what could be in store. So in some sense, people are focusing on whether or not we've reached the peak now and how quickly we will reach the peak, but, you know, there's this trade-off where if you have been extremely socially distanced then you will have reached a peak early. But then, as Governor Cuomo put it, you're at high risk for further infections as we approach the end game. So there's sort of a trade-off going on, and different regions have achieved a different level in this trade-off. And that's sort of gonna govern what they will see moving forward.
Now, lastly, we don't know if you can get COVID-19 a second time.
So what happens if we follow this path, and then you can be reinfected a second time? How does that change your modeling?
Well, so that's a very good question. My models are assuming that you cannot get infected a second time. So, you know, once you get infected, you can remain a carrier for a certain period of time, that incubation period, and then either you become serious and recover or die, or you recover and have immunity. So, you know, if you can be reinfected, then it means that the results that are being shown by my model are an underestimate of what could happen.
That's a little scary, right?
Well, you know, it's scary if it really is the case, but I think the prevailing wisdom is that, you know, typically, if you've been infected, it's unlikely that you will be reinfected. But you know, I'm not an expert there so I don't know.
Is there anything that I didn't ask you that you want to mention here?
Well, you know, one thing that's sort of come about, and people may be wondering, is why do we care about this antibody testing? And you know, to some extent what the antibody testing tells us is the total infection count. And that's what this model has been built to tell us. So why do we really care about this total infection count? One of the things that this total infection count tells us is that, you know, if you know the total number of infections, and then you look at the fatalities, you can actually get an understanding of how serious this disease is. So if you look at the total infection counts in most areas, and then compare that to the fatalities, you see that it's a lot better than what we thought before. So, you know, before we were seeing numbers like a 5% fatality rate, and that was 5% of the confirmed infections as opposed to the total infections. When you change the denominator from confirmed infections to total infections, it's a lot better. You're looking at something like a 0.5% fatality rate, which is, you know, sort of a little bit relieving.
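The change of denominator can be worked through with a small example. The sketch below uses a hypothetical population of 100,000 and plugs in figures mentioned in the interview: about 0.5% of the population confirmed, about 5% infected in total (the upper end of the 3-5% estimate), and a 5% fatality rate among confirmed cases. The variable names and the population size are assumptions for illustration, and the result is arithmetic on those inputs, not an output of the RPI model.

```python
def fatality_rate(deaths, infections):
    """Deaths as a fraction of the chosen infection count."""
    return deaths / infections


population = 100_000                  # hypothetical region size
confirmed = population * 5 // 1000    # 0.5% of the population confirmed
total = population * 5 // 100         # 5% infected in total (upper estimate)
deaths = confirmed * 5 // 100         # 5% of confirmed infections

rate_of_confirmed = fatality_rate(deaths, confirmed)
rate_of_total = fatality_rate(deaths, total)

print(f"{rate_of_confirmed:.1%}")  # 5.0%
print(f"{rate_of_total:.1%}")      # 0.5%
```

The number of deaths is the same in both lines. Only the denominator changes, and because total infections in this example are ten times the confirmed count, the rate drops by a factor of ten.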