AI comes with many questions and this episode comes with a clear angle to debate them: how do data experts specialized in impact business such as Lou Welgryn and Theo Alves Da Costa - the 2 founders of the NGO Data for Good - and Anastasis Stamatis - the founder of Dataphoria - answer these questions? It's not that much about the training, go for a factor 10, … A lot of counterintuitive answers and great food for thought!
⚠️ Breaking news ⚠️
📣 Green IO launches its first on-site Green IO conference in Paris, December 8th. Join us to get the latest insights on Digital Sustainability with Aurore Stéphant, Perrine Tanguy, Tristan Nitot, Julia Meyer, Theo Alves Da Costa, and many more! Also get feedback from all the teams involved in the 2023 Sustainable Digital Challenge: Allianz, Axa, BlaBlaCar, BNP Paribas Cardif, Ekwateur, Evaneos, Groupama, INSEE, Leboncoin, Norauto, SNCF!
And it's free for our listeners! Register here with the voucher GREENIOVIP. We're looking forward to seeing hundreds of you there.
❤️ Subscribe, follow, like, ... stay connected the way you want to never miss an episode!
📧 Once a month, we deliver carefully curated news on digital sustainability packed with exclusive Green IO content to your mailbox. Subscribe to the Green IO newsletter here.
🫴 Green IO is a free and independent podcast! And so we need your help to keep it that way by supporting us on Tipeee here.
Learn more about our guests and connect
- Anastasis's LinkedIn
- Lou's LinkedIn
- Theo's LinkedIn
- Gaël's LinkedIn
- Gaël's website
- Green IO website
- Green IO newsletter
Anastasis', Lou's and Theo's sources and other references mentioned in this episode
- Climate Q&A
- Data for Good
- Dataphoria
- Hugging Face and its models Zephyr 7b-beta and Bloom
- Code Carbon
- Andrew Ng
- Sasha Luccioni and her paper "Counting Carbon: A Survey of Factors Influencing the Emissions of Machine Learning"
- Pyronear, the NGO using AI to detect forest fires
- Quota climat
- Green IO episode 11 with Elin Hauge and Heloise Nonne
- Climate Fresk
Transcript
[00:00] Gaël: Hello everyone. Welcome to Green IO, the podcast for responsible technologists building a greener digital world one byte at a time. Our guests from across the globe share insights, tools and alternative approaches, enabling people within the tech sector and beyond to boost digital sustainability.
[00:33] Gaël: You might have heard about artificial intelligence last month. Yeah, sorry about the joke. But you might also have heard about the rising concern about the environmental footprint of artificial intelligence and data groups. Actually, I recorded a full episode with Gerry McGovern and Katie Singer in January about what they call the data tsunami. And since then, we've seen new studies about the water consumption of ChatGPT, and about the electricity consumed when training and querying these artificial intelligence models, that have started to raise concerns about the sustainability of these new technologies. But on the other hand, I'm bombarded with articles, posts and discussions with peers and clients about the expected benefits of AI for humankind. So I reckon this question is on a lot of tech practitioners' minds today: how to leverage AI and data to regenerate the planet and our societies rather than destroying them. And to answer it, I decided to ask people I could trust to tell more about data for good. For real, and by real, I mean without overlooking all the impacts of using AI, including the negative ones. And believe me, Lou, Théo and Anastasis are to be trusted.
Lou and Théo founded Data for Good in France five years ago. Today, this community gathers more than 3000 data scientists and data engineers doing pro bono work for NGOs and nonprofits. And Data for Good made headlines in early November with their work on 425 climate bombs worldwide, and which companies and banks are supporting them. But that's something Data for Good is familiar with. Just last year, Lou was listed among the 100 thought leaders who give meaning to technology in France (kudos), and was getting attention with ClimateQ&A, which uses ChatGPT but is trained only with IPCC reports. And it was important to me to have another point of view than a French one. And so for this episode, Anastasis Stamatis was a perfect match. Thanks to the Climate Fresk network, I discovered the founder of Dataphoria in Greece and his amazing track record of projects in the impact business sector, most of them powered by AI. So welcome, Lou, Théo and Anastasis. Thanks a lot for joining Green IO today. That's our third attempt to get a recording, but this one will be the good one.
[03:10] Théo: Hello everyone.
[03:11] Lou: Well, hi.
[03:12] Anastasis: Hi, Gaël. Thanks for having us.
[03:13] Gaël: You're welcome. Before we jump into both the bright and dark side of AI, I would like to ask a very simple question but a very tricky one. Where are we with artificial intelligence in the world? What's really the spread of AI in both tech companies, but in regular companies, government, etcetera. And is the technology moving that fast? How much is hype, and how much is really big acceleration?
[03:46] Théo: It's an interesting question, because it's been a year since ChatGPT came out, and it's changed a bit how people see AI. And with it, at Data for Good, we created a white paper on generative AI, and we actually discovered that a lot of people in the general public didn't really know the history of AI. There are still a lot of misconceptions about the topic, and a lot of people, for example, don't know that AI has actually been around for more than eighty years, has had a lot of waves and evolutions over the years, and has already enabled us to do lots of stuff. For example, I always say when we explain the topic that Google Search has been using the Transformer technology (which is in ChatGPT) for more than seven years in production. So people are using AI every day in their Google searches. And this has been a constant evolution for the past eighty years, even if the pace has increased at some points in history. But still there is something new about today, which is, for me, the general public's adoption of these technologies, which has actually taken a lot of people by surprise, given that the technology was not that different. Of course, we can do a lot of things a lot better since we have access to generative AI technologies. But still, it was only the fact that 200 million people now use ChatGPT every day that signaled a big change. People were now able to actually test it, and that made them want to learn even more about the topic. So for me this is actually not a technological breakthrough or a complete shift in what was feasible, but more something that is now accessible, and people know that it's not something obscure and that they can use it. And with it, there is a question of the exponential. That's when you talk about environmental issues: every time there is an exponential like this, there is a question. So there is a part of AI that is in constant evolution for me.
[06:06] Lou: And I think maybe to complete what you're saying, what's interesting is what you were saying about the fact that people are now more aware that it's here. I feel like in the tools that we're using, there are often functionalities that tell you "ask AI to do this or to do that". And this is a shift from before, where there was AI, but we were not talking about it, and most people were not seeing it. Maybe it's a bit more visible than before because, since there's been this breakthrough in generative AI, we have the feeling, as users, that we're talking to it and that we're using it maybe almost like a relationship. So I think, to complete what Théo was saying, it's more visible than before, and it's breaking through much more, much faster than before.
[07:02] Gaël: Anastasis, is it a trend that you analyze as well? That the breakthrough is more on public adoption than purely the technological side?
[07:12] Anastasis: Absolutely. So it's more the democratization of AI, it has raised massive amounts of users compared to previous applications of all sorts. And I think that's where the risk might be, because our reaction time is limited with regards to the massive adoption that is happening all around us. And yes, we have been using AI over the past years without even realizing it in our phone, in the way we consume the news, even some news we might have read, it was generated by AI. But now it is all around us, and there is all this public sentiment around it. And with it comes a lot more data that needs to be used. And with AI being so massively in production, also the infrastructure requirements and the electricity and technology requirements are becoming bigger as well.
[08:10] Gaël: And you mentioned that it's more [a question of] public adoption than technology. But if you read many articles, if you listen to many thought leaders, it's more like everything has changed, you know, since GPT-3 or even since ChatGPT 4 or whatever. And actually, that's not truly the case. It has been a trend, but it creates a massive hype where, you know, everyone needs to put AI in everything. I mean, it's exactly as you said Lou, stuff that was done by machines before without acknowledging that it was artificial intelligence suddenly needs to be labeled as AI because it sells better. So I still see this massive hype wave, and that leads me to another question, but a similar one: how do you spot a hypist? I've just made this word up but I love it. So how do you spot hypists and how do you spot true thought leaders, and who do you actually follow? And who do you believe is speaking good sense about what is going on in the AI field at the moment?
[09:28] Théo: I think honestly, I have one simple rule that works quite well. Is it a newly labeled expert who has been around for less than a year, which means that he only knows GPT as what he thinks is AI (his own vision), and actually most of the time it's a he and not a she? Or is it someone who has been around for many years, especially scientists and engineers in the field of AI, who have been applying AI in production, which is a bit different from doing a prototype using GPT or just embedding such technology in your product? And at the end, after you have confronted the buzz, there is a question of user adoption, and I think already a year after, there has been a course of adoption by users, but there has also been a lot of disappointment on the topic. People are using it less for mundane tasks. Developers are using it a lot, but there are already a lot of people who have tried it and now feel that they don't have a use for it. And that's normal, and a lot of that comes from people who are not experts on the topic but just saw this trend and put it in their products. So in the end, mostly I love people like Hugging Face, for example. They are probably the ones that have been advancing the field the most in the open source community. And I'm spending one hour on their platform just to see the latest innovations and papers, and following the people that are within, because they are probably gathering a lot of people that actually have the same values that we have at Data for Good, but on a wider scale. I would say that those are the people I'm following on my side.
[11:13] Lou: I would say I'm not following experts. Rather, the important thing for me is really the purpose of the algorithm that's used. What's really important for me, besides the technology, is what is it aimed at? And is it going in the right direction or in the wrong direction? And spoiler: I feel like 95% of the usages that are made today of ChatGPT are just going in a way that makes us buy more stuff or do more things that we don't need to do. So that's why I'm really focused on the usages that are going to benefit all. And I think there are some [usages], and I'm not going to do some internal publicity, but I think the tool that has been developed called ClimateQ&A, which helps you navigate through IPCC reports and makes climate data accessible to citizens and to people that want to understand the problem yet don't have the time to read 14,000 pages of climate reports - it's these kinds of usages to democratize knowledge which are amazing, and this is the kind of thing I follow.
[12:34] Anastasis: Wow. Hats off to Théo for this tool, which I'm definitely gonna be checking out after our session. Just to add to that, I guess the most important thing, from my perspective, is whether someone is being generally objective about AI and does not treat it like a silver bullet. Usually what we get out of the hype is people claiming it will change everything, it will save the world, it will destroy the world. And I don't really like dealing in absolutes in that sense. There are definitely many hard-working people who have been in the industry for loads of years, most of whom know how difficult it is to collect the data, to label the data, to train the right models and how difficult it is to actually get a good, trustworthy result out of AI. And so yeah, AI is hard, it's very burdensome to get the right results, and most people who would tell you that, yeah, with them you would find no silver bullets and no extreme hype…
[13:41] Gaël: And Anastasis, if you had to pick one name, who would it be?
[13:45] Anastasis: I think I would probably go back to basics, back to a person, one of the people who started all of this, who started the hype behind AI, behind data science as well. And that would be Andrew Ng, who is very well known in the community. And he's one of the voices that has been there from the beginning actually.
[14:08] Gaël: Thanks a lot, Anastasis, and the two others know that I'm going to ask the same question. But actually, I'm pretty sure I can do a bit of a mentalist exercise with Théo. So if you had to pick one name, Théo, who would it be? And my guess is it will start with an S…
[14:26] Théo: Yeah, that's a good person to follow for sure, Sasha Luccioni. She was a researcher at Mila, a university in Canada (Quebec), and she was one of the people, and actually the only one in the world, to create a research paper on the carbon footprint of AI. And that is the only resource we have today to estimate what it is actually emitting. And she has been doing a lot on AI ethics and AI carbon footprint. She also created a tool that we use, and now maintain at Data for Good, to measure the carbon footprint of AI. And now she works at Hugging Face. So yeah, that's one of our role models.
[15:05] Gaël: And Lou?
[15:06] Lou: Well, I was going to say the same person. So it's not fun.
[15:10] Théo: Oh my God, that's super bad.
[15:13] Gaël: We stole your hero. I'm sorry about that. OK. So the fun fact is that I really wanted this episode to start on a positive note, and focus on data for good. But what is really interesting is that, despite the fact that you are working on very positive applications of AI, the three of you said right from the start: hey, cool down, there are some downsides as well, etc. So, OK, you know what? You won, I will not have such a positive episode, or at least not yet. So let's talk about AI as a destructive [force].
[15:51] Lou: No, I wanted to complete what you were saying, because I think it's really important, and the reason why we're doing it is that we're facing speeches in the media or people that are all AI experts, in 90% of the cases they're talking about AI and technology in general, like something that is going to save us, and behaving as if we can continue exactly the same pace that we're having, and that we just need to change our technology, and behind it we're going to be saved. And I think especially when you work in this field and you are aware of all the limits of those technologies, well, I feel like I have a role to, each time I have the possibility to talk in public to remember and to remind [everyone] of all those limits, because they're huge and we need to say it. There are positive cases, and they're amazing in science and all the projects that we can do at Data for Good, but they are not the majority. And this is really important to be said when the occasion is happening.
[17:08] Gaël: So technocracy won't save the world. I'm in shock. But could you elaborate a bit with maybe one example? If I understood you well, it's really the question of, with such a powerful tool, where are we accelerating towards? And are we accelerating in the right direction? Or, I would say the other side of the same coin is: well, just drop a pinch of AI everywhere and it will solve the entire world's problems. So do you have some specific examples that you would like to share where you really believe that it is a terrible idea to use AI?
[15:51] Lou: Well, basically, when we're talking about technology, I think what is really important to bear in mind is the materiality of this world that we believe is immaterial. For instance, I think it's really funny that we're talking about the cloud as if it was something in the sky, that was really not material, when we know that, for instance, 99% of the data that we exchange worldwide circulates through submarine cables and is then stored in, like, monsters of metal. So everything is material. And so when we talk about technology, we need to remember that there are two sorts of impacts. The first one is "direct", the materiality of it, that I was talking about. And then there are also all the indirect effects. And one of the main ones is called the rebound effect. And it's the fact that every time we have discovered or improved a technology, or used technology to improve the efficiency of something, well in the meantime, we've increased its usage. And then as a result, the absolute emissions are always bigger. So, for instance, in our daily lives, we have the example of planes. We've reduced the emissions intensity per passenger, but we've hugely increased the number of flights. And so the absolute emissions are way bigger. And so for generative AI and AI in general, it's really the same thing happening. For instance, we can take as an example the fact that AI helps you write emails way faster. But at the same time, it's maybe going to make you able to write many more emails. So if the objective of your emails is to sell stuff, maybe you're going to be able to sell much more stuff. Another example that I like a lot is the example of publicity. So you can make much more [publicity], you can personalize much more the advertising that you're going to make. And so in the end, you can make it faster. But so you're going to sell much more stuff.
And for these examples, actually, I haven't found a counterexample so far. Maybe somebody has one but I don't…
[20:23] Gaël: Anastasis, do you have a counter example? Actually? No, let's stick with the negative force. Would you like to comment on what Lou shared about the rebound effect, about accelerating in the wrong direction, just selling more stuff?
[20:39] Anastasis: Oh absolutely. And I feel this is like another facet of the consumerism that's been evident in our days. And especially since, if we do flood the internet with AI generated copies of emails and advertising texts, then guess what the next models will be trained on, and what degradation that might produce in the future. So, again, some elements that people should consider. When I say people, just to get back to that technocratic aspect of what we discussed, what I'm sort of feeling, which leads to the destructive force behind AI, is limited knowledge of and limited engagement with the risks and the dangers around artificial intelligence. Data centers, cloud computing, the energy they consume, that's the essence of your show, Gaël. So, we've been exploring this throughout other episodes. What happens when we try to apply AI without the relevant data? I mean, all of us are working, at least to a certain extent, on climate change. We have to consider aspects of both the global North and the global South. Some climate change solutions embedded with AI should apply to the global South as well. But most of the data come from the global North. So you end up trying to do good, but you might end up applying solutions where they actually have no application, because you've trained them with the wrong data sets. All these are dangers and risks that we, as a community, should be informed of, and should be informing people of, democratizing this knowledge. When people apply AI or use AI, there should be a way for them to know what to use and what not to use. Just the way when you use electricity or you use water, you're aware of the dangers, you're aware of what to do and what not to do. It should reach a point where it is the same thing with artificial intelligence. That's how we can de-risk it and stop it being a sort of destructive force through its application.
[23:06] Gaël: And actually listening to both Lou and you, Anastasis, I realize that people are seeing the data that is needed to train any algorithm a bit like the cloud. It's some kind of magical [solution]: they don't really challenge the materiality of the cloud, and they don't really challenge the materiality of the data we need to use. And that creates this huge bias that you've mentioned. And actually, I think it's a very important bias. It's not an environmental bias. But yeah, if you train a model mostly on US behaviors and you try to apply it to Madagascar (just to pick my neighbors as an example), well, that can create a lot of issues. And it's funny that this hype and all this magical thinking around AI tends to overlook this aspect. I mean, am I right to rephrase it this way?
[24:03] Anastasis: Absolutely. You're absolutely right. And technology in itself nowadays is political. It has been used as a political tool. So the users of technology have to be political as well, the way we use it makes a statement that leaves its footprint on the world based on our ethics, our responsibility. So we do have to claim this responsibility.
[24:30] Gaël: You mentioned earlier that Sasha Luccioni was the only one who wrote a research paper on the environmental footprint. And both Anastasis and Lou mentioned several times that we also need to pay attention to the electricity consumption, and I guess much more. Could you elaborate a bit? What do we know today about the environmental impact of artificial intelligence?
[24:56] Théo: This is important because that's also where you see what you can do in terms of negative impacts. I think there are three misconceptions about the environmental impact of AI. The first one is that we actually don't know what the environmental impacts of AI are. We don't have the data for it. We have ideas about it. We know that it consumes energy and water and we have a lot of data over there, but almost no one actually measures it, aside from a few people in the world. So we just don't know. The second misconception, I'd say, is that people tend to focus on the carbon footprint and the energy consumption of data centers, which for me, and I will give some figures, is actually not that important. And especially people in AI, because it has been the case for a lot of years, think that it has to come from the training phase, that this is the most important phase. And the third misconception is that they always forget the indirect impact. So let me give you a few figures. We think we know that training GPT 3.5 emitted around 500 tons of CO2. 500 tons of CO2, it's quite big, but it's not huge. It's like a small company of maybe 10 to 30 people, or it's around 250 people flying back and forth to New York, which, if you divide that by 200 million users, is actually nothing. The second thing is the usage of AI. Before, we had a lot of technologies that were trained and used, and the usage phase was actually not that [bad] in consuming energy. But with the transformer technology that we now have in the generative AI techniques, the inference phase, the usage phase, emits as much as the training phase. So if you scale it to 200 million users per day, that's becoming quite a thing. And we did the exercise in our latest white paper at Data for Good.
If you do the exercise, you find that for GPT 3.5 it is around 100,000 tons of CO2. But even so, for me, if we only take that into consideration, we forget a lot of things. There was one study by McKinsey that we actually did some work on, which said that generative AI will add 4000 billion to GDP worldwide every year. And that means that in each sector, you will have an increase in revenues. And if you take the decarbonization trajectory for each sector, you can do a small calculation showing that it would mean emitting around 2 billion tons of CO2 every year. And 2 billion tons of CO2, now, that's not negligible at all. It's around 5% of global emissions. And this is exactly a figure that represents the fact that with generative AI, people think that we will accelerate an economy that is not decarbonizing so much, and that it will actually be emitting 5% more CO2 every year. And this is why we actually tend to focus on the finality of the algorithm. What are you using it for? And especially stopping using it for marketing and publicity, or even fossil fuel extraction actually, and starting to use it only for good.
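To make the orders of magnitude above easier to compare, here is a minimal Python sketch that simply redoes the arithmetic with the figures quoted by Théo. All inputs are the rough estimates given in the conversation, not measurements, and the variable names are illustrative only. The "implied global emissions" line is just the back-calculation of what total the quoted 5% share corresponds to.

```python
# Rough figures as quoted in the conversation (estimates, not measurements).
TRAINING_T = 500                 # tons of CO2, GPT 3.5 training phase
USAGE_T = 100_000                # tons of CO2, GPT 3.5 usage phase at scale
INDIRECT_T = 2_000_000_000       # tons of CO2 per year, indirect (economic) effect
DAILY_USERS = 200_000_000        # users per day
RETURN_FLIGHTS = 250             # people flying back and forth to New York
INDIRECT_SHARE_PCT = 5           # quoted share of global emissions, in percent

GRAMS_PER_TON = 1_000_000

# Training footprint spread over the user base, in grams per user.
training_g_per_user = TRAINING_T * GRAMS_PER_TON / DAILY_USERS

# Tons of CO2 per traveller implied by the flight comparison.
tons_per_traveller = TRAINING_T / RETURN_FLIGHTS

# How much bigger each layer is than the previous one.
usage_vs_training = USAGE_T / TRAINING_T
indirect_vs_usage = INDIRECT_T / USAGE_T

# Global yearly emissions implied by "2 billion tons is around 5%".
implied_global_t = INDIRECT_T * 100 / INDIRECT_SHARE_PCT

print(f"Training: {training_g_per_user} g of CO2 per user")
print(f"Flight comparison: {tons_per_traveller} t of CO2 per traveller")
print(f"Usage is {usage_vs_training:.0f}x training")
print(f"Indirect effect is {indirect_vs_usage:.0f}x usage")
print(f"Implied global emissions: {implied_global_t / 1e9:.0f} billion tons")
```

Laid out this way, the three layers differ by several orders of magnitude, which is the point being made: the training phase is the smallest term, and the indirect effect dominates.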
[28:35] Gaël: Which makes a beautiful transition because we don't know that much on what exactly is the impact, and still we try to use it for good. So can we try to switch now? Because the alerts are crystal clear about bias with data accelerating in the wrong direction, all the political entanglement that you can see. But the three of you, you've managed actually, you tried, to use it as a force for good. So let's get our hands a bit dirty. Can you pick a project where you say, OK, I'm going to use AI, I'm going to challenge myself about the direction I'm going in, and also challenge myself about the intensity of the energy, the water usage, etcetera, trying to reduce it all.
[29:22] Lou: Maybe I can illustrate with an example, which is one of my favorite ones, that we're doing right now at Data for Good. We've been helping an NGO that's called Bloom for several months now. And this NGO is focused on protecting the ocean and marine life. And what we've been doing with them is helping them track the biggest boats on the sea and identify when they were fishing in zones where they were not allowed to. So in this case, we've really used artificial intelligence to develop algorithms that help us detect, depending on the boat's trajectory, whether they were fishing or not, and then match it with the geographical zones in which they were identified to be, to see if they were in a protected area or not. And when we match the two pieces of information, whether they're fishing or not and whether they are in a protected area or not, we can know if they are violating some rules. And since we've been doing this work, we know that Bloom has been filing more than 20 complaints at the international level in order to denounce the illegal fishing. And here, we feel that we haven't totally measured the CO2 impact of those algorithms compared to the benefits that they bring. But what we're sure of is that protecting marine life is way more important than the consumption of electricity that those algorithms are generating.
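The matching step Lou describes (is the boat fishing, and is it inside a protected area?) can be illustrated with a small Python sketch. This is not the project's actual code: the `Position` fields, the speed-based `looks_like_fishing` heuristic and its thresholds are hypothetical stand-ins for the trajectory-based detection mentioned above, and the zone is a plain polygon tested with a standard ray-casting check.

```python
from dataclasses import dataclass

@dataclass
class Position:
    lon: float
    lat: float
    speed_knots: float

def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: is (lon, lat) inside the polygon [(lon, lat), ...]?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def looks_like_fishing(pos, min_knots=0.5, max_knots=4.5):
    """Hypothetical heuristic: slow but moving vessels are flagged as fishing.
    The thresholds are illustrative assumptions, not values from the project."""
    return min_knots < pos.speed_knots <= max_knots

def suspect_positions(track, protected_zone):
    """Keep positions that both look like fishing and fall inside the zone."""
    return [
        pos for pos in track
        if looks_like_fishing(pos)
        and point_in_polygon(pos.lon, pos.lat, protected_zone)
    ]

# Toy data: a square zone and three positions.
zone = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
track = [
    Position(5.0, 5.0, 3.0),    # slow, inside the zone
    Position(5.0, 5.0, 12.0),   # transiting fast, inside the zone
    Position(20.0, 5.0, 3.0),   # slow, outside the zone
]
print(suspect_positions(track, zone))
```

Only the first position is flagged, because it is the only one that satisfies both conditions at once.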
[31:22] Théo: Maybe I can continue on the Data for Good example. We have a framework to see if the project is actually for good or not, not talking about the moral aspect of it. We try to be quite logical, but the one thing that we always see working well, and without which a project won't have a real force for good, is, as we were saying, deploying it in the field and not staying at the level where it's only a geek thing, where we just produced something that could be used, and we put it in open source and we say: we created a great thing, just use it and now you will be for good. What we discovered is that we're actually delivering our impact when we are putting our tools and what we developed in the field. One example, not on the fishing side, is a project that we did with an NGO called Pyronear, which is detecting forest fires before it's too late for the firefighters to intervene. And at first, we created just an algorithm to detect it with computer vision from small cameras that were quite sober and frugal. But the moment that the project completely switched and actually became a force for good is when we made contact with the firefighters and with the people managing forests so that we could actually implement it. And at the end, it raises alerts with the firefighters. And the project now, with the full life cycle, is actually starting to really prevent fires. And if we hadn't been doing that, we would have had nothing. And another example: we worked a lot with another NGO called Quota Climat, which is basically monitoring how people speak in the media about the ecological transition. So it's counting the quantity and the quality of interventions in the media and ranking TV and radio channels together to see who has been talking about what topic, and having some barometers to see the evolutions.
And at first, the same thing, we created algorithms and we created some LinkedIn posts and it was good and we had a lot of buzz with it. But the project actually started to become quite important when, with our work, with the KPIs that we produced, this NGO created a task force to change the regulation around media and the ecological transition. And we always see that aspect where, if we stick to just having a geek thing that we think is a good thing, we're not actually delivering something in the field. So that's why, even if we have a lot of other aspects to consider, if the project is so good, then that's the most important thing for us.
[34:11] Gaël: And Théo in that case, do you completely skip the intensity part, which I would say is reducing the environmental footprint in other aspects as well? Or do you have some basic guidelines that you follow all the time?
[34:29] Théo: Yeah, because we have one good thing at Data for Good, which we were talking about before: we now maintain a library in Python called Code Carbon that is helping us monitor the carbon footprint of what we do. And we're actually trying to do the most that we can; it's not always feasible when you're working with volunteers, because it's not a company. So sometimes people don't use it because they don't know how to use it, and we cannot really train them, but we do that as much as possible. And what we actually found is that, most of the time, those projects were not using generative AI, but other AI algorithms that are basically statistics, which are not that heavy in computation. So when we monitor it, it's cool because we can have this measurement of our actual impact, with the positive and the negative parts of it. But it's actually changing now that we have generative AI. And if we had been creating a product where the generative AI was live and used by millions of users, that would be when the question would become so important that we would actually have some big decisions to make on whether we kill it or not, or whether we invent it at all if we [already] know that it will be deployed at such a high scale. So we try to measure it, and one big part of it is just not using some complicated algorithms in production, or at all. A good way of being sober is just not doing it.
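For readers who have never used Code Carbon, here is a minimal, hedged sketch of what monitoring a computation can look like. It assumes the `codecarbon` package is installed and uses its `OfflineEmissionsTracker` with a country code (France is an arbitrary choice here), so that no network lookup is needed. The `train_toy_model` workload and the `measure` wrapper are hypothetical names introduced for illustration, and the wrapper falls back to returning `None` if the library is missing.

```python
def train_toy_model(n):
    """Stand-in workload (hypothetical): replace with real training code."""
    return sum(i * i for i in range(n))

def measure(fn, *args):
    """Run fn(*args) and return (result, emissions in kg CO2eq or None).

    Assumes codecarbon is installed; if it is not, the workload still runs
    and the emissions value is None.
    """
    try:
        from codecarbon import OfflineEmissionsTracker
    except ImportError:
        return fn(*args), None

    # Offline mode: the carbon intensity is derived from the given country.
    tracker = OfflineEmissionsTracker(country_iso_code="FRA", save_to_file=False)
    tracker.start()
    try:
        result = fn(*args)
    finally:
        emissions_kg = tracker.stop()
    return result, emissions_kg

result, emissions_kg = measure(train_toy_model, 10)
print("result:", result)
print("estimated emissions (kg CO2eq):", emissions_kg)
```

Wrapping the measurement in one helper keeps the barrier low for volunteers, which is the practical difficulty Théo mentions: the workload code itself does not need to know about the tracker.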
[36:05] Gaël: Anastasis, Théo mentioned that to be efficient, to actually achieve its goal, a proper data for good project should not be left only to geeks. Is it something that you've practiced also? And do you have any good stories to tell us about this?
[36:26] Anastasis: Absolutely. So what we usually do, we are more business focused, in the sense that our clients are corporates and businesses of any size that want to showcase and monitor their sustainability KPIs, and we connect to their data systems. So we prepare their data infrastructure and then we are able to extract all the sustainability data and more general ESG data of sorts. Through that, what we hope to achieve is to automate the process of "how" and leave the "why" to them, for example: why do they need to reduce their carbon footprint? Why do they need to improve their ESG performance? And then to help them through this process. In this sense, we are trying to use data to reduce their own footprints and to improve their own environmental, social and governance performance. So their success is our success, and the tools they are using to implement these goals are something that's very close to our heart. Through this process, we found that, more often than not, the need for AI comes in organically at some stage in the project. So we're always careful to avoid using it as a buzzword. I do remember a specific example which shed some light on the way we can use AI at Dataphoria, and it was all about benchmarking. So we had a particular case where the client wanted to see how they were performing against other peers in the industry, with data that could not be found in specific databases or specialized ESG websites, but which was embedded in a number of sustainability reports. Most of these are in PDF format, a couple of hundred pages long each. So when we started looking through them to find the right data, at one point we decided that it was actually going to take a very long time. So we tried to automate the process. We started with a few basic text analytics, one thing led to another, and we found ourselves working on a text classifier and a way to extract this textual data in a very structured format.
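As an illustration of the "basic text analytics" starting point mentioned here, the sketch below pulls emissions figures out of report text into structured records. It is a toy regular-expression approach, not Dataphoria's classifier; the pattern, the units it recognizes, the function name `extract_emissions` and the sample sentence are all assumptions made for the example, and it expects the PDF text to have been extracted already.

```python
import re

# Hypothetical pattern: "Scope N ... <number> tCO2e|ktCO2e" within one sentence.
EMISSIONS_PATTERN = re.compile(
    r"(?P<scope>Scope\s*[123])[^.\n]*?"
    r"(?P<value>\d[\d,]*(?:\.\d+)?)\s*(?P<unit>tCO2e|ktCO2e)",
    re.IGNORECASE,
)


def extract_emissions(text):
    """Return a list of {"scope": int, "tonnes_co2e": float} records."""
    records = []
    for match in EMISSIONS_PATTERN.finditer(text):
        # Drop thousands separators before converting to a number
        value = float(match.group("value").replace(",", ""))
        # Normalize kilotonnes to tonnes
        if match.group("unit").lower() == "ktco2e":
            value *= 1000
        scope = int(match.group("scope")[-1])
        records.append({"scope": scope, "tonnes_co2e": value})
    return records


sample = (
    "Scope 1 emissions were 12,500 tCO2e in 2022. "
    "Scope 2 emissions fell to 3.2 ktCO2e."
)
for record in extract_emissions(sample):
    print(record)
```

Real sustainability reports phrase these figures in many more ways (tables, footnotes, different units), which is presumably why a hand-written pattern like this quickly gives way to a trained classifier.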
And now when we're talking about technical data, like carbon emissions, intensity, base years, materiality topics and all these, of course, this has to come with very specialized knowledge. So that was the point where we had to stop and discuss between ourselves, even at the small scale we were applying it, whether the impact of what we were hoping to achieve could surpass the impact of building and deploying this model itself. And we have kept this as a rule of thumb ever since.
[39:54] Gaël: Yeah, a good rule of thumb makes total sense. It's not easy, it's not easy all the time, to try and assess how much your clients will actually save. But I guess having this order of magnitude of 1 to 10 protects you from minor estimation errors, I would say.
[40:16] Anastasis: Absolutely. That's the engineer in me speaking. So yeah, that's the sort of case "we're never gonna get it right". We just have to be certain.
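The rule of thumb discussed in this exchange, with its 1-to-10 order of magnitude, can be written as a single comparison. This is a minimal sketch, assuming both quantities are estimated in the same unit (kg CO2e here); the function name, the parameter names and the example figures are hypothetical.

```python
def is_worth_building(expected_savings_kg, model_footprint_kg, safety_factor=10.0):
    """Return True if the expected savings are at least `safety_factor` times
    the footprint of building and deploying the model."""
    if expected_savings_kg < 0 or model_footprint_kg < 0:
        raise ValueError("estimates must be non-negative")
    return expected_savings_kg >= safety_factor * model_footprint_kg


# Hypothetical estimates, both in kg CO2e.
print(is_worth_building(expected_savings_kg=5_000, model_footprint_kg=400))  # True: 5000 >= 4000
print(is_worth_building(expected_savings_kg=3_000, model_footprint_kg=400))  # False: 3000 < 4000
```

The margin is what absorbs the estimation errors: if the savings turn out three times lower and the footprint three times higher than estimated, a project that passed the check still comes out net positive, since 3 × 3 = 9 is below 10.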
[40:27] Gaël: You know, when I prepared this episode, I had a very interesting discussion, a very meaningful discussion with Yoann Fol and Buster Franken, thanks to Anastasis' advice. And I remember Yoann telling me, you know, maybe AI will just become like electricity or a fridge, a technology that is so obvious that it will become invisible, and we will not talk that much about it, because we don't talk every day about electricity or the fact that we are able to produce cooling mechanisms, which is pretty amazing when you think about it. So do you foresee this trend that AI will become just AI, and not a cool hype technology boost? It's just AI, you know, it's just electricity. What do you think about it? Starting with Lou?
[41:21] Lou: Well, as we were saying earlier in the podcast, I think it's already happening, as we're already using it every day in our lives. When you're going on social media, there are algorithms behind it, and when you're using the internet, it's algorithms again. So it's already everywhere in our lives. And what frightens me is the pace at which it's going right now, and the fact that nobody is questioning it, especially politicians. I think it's a subject where they're completely out of their depth and they don't understand it, so they're not taking it seriously. I believe our political institutions should take up the subject and ask the question about the usages of those technologies and the needs that we have for them, because they're spreading faster and faster, and we need to ask ourselves about our needs.
[42:23] Théo: Maybe my idea around that is, especially with this new word, generative AI, that it's not only a question of ChatGPT, that is the top 10 that we always talk about. It's a question of generated human-readable content that is multimodal. So that means image plus text plus sound, plus video, plus whatever. That means that it's not a question about being a real intelligence, but it's actually emulating how we are in this world, what we do in this world and how we interact with things. And so as soon as you have that, and you're actually connecting actions, like reading and writing and so on, it becomes something that is natural to have embedded everywhere in our life, if we just follow how [others think] it should go, if we just listen to technocratic people. So it's logical that actually, yes, AI can become obvious and invisible. For example, if you use Notion, there is now the AI part and it's labeled as AI because it's like the buzz, but at some point it will not [be a buzz], it will be something like when you use a keyboard and it writes for you and it helps you write faster. So it will be there, and it's already there, but is it a good thing? That's another question. We always love to talk about Jacques [inaudible] and the people that are and have been studying technology, and there are some researchers and social researchers that are saying that a good technology has to be convivial. That means that you cannot create a dependency on it. Because at some point, there will be a problem, and this is something that we already see. If you ask a developer right now, they are using GitHub Copilot and other AIs to write code, and I'm using it six hours a day to write code faster. And now when I don't have it, because there is a problem with my server, or I'm on a train and I don't have any network, I know less how to code than before, but with it, I'm much faster.
And this is a big question, because as soon as you create dependency, you make something that can create a lot of problems. So I don't know if it has to be obvious and invisible to be more of a force for good than a destructive force, but for sure, it is happening already, and I agree that now it's a buzz. So it's labeled AI, but it will disappear at some point.
[45:02] Anastasis: I would totally agree with Théo on that. And just to go back to the ages before writing was actually democratized, there were all these mnemonic devices and mind palaces and ways of sustaining the most information you could within your own mind. And then writing became easier because of papyrus and parchment, and so with these writing tools it was certainly easier than writing things on stone tablets. And then there was a sort of disruption and a sort of resistance from society back then, because people would tend to forget how to remember stuff if everything was written down. But that was the way we passed knowledge from one generation to the other. That was the way we could learn all the more, and use our mind for creativity instead of remembering lots of things. That's the analogy I would see for AI as well. It is a tool that, as Théo said, can free up our time for creative stuff, that can help us accelerate what we are actually doing and take off the burden. I would see it more as a tool, therefore, more as an integration within our daily lives. The same way we have books, the same way we have phones, the same way we have other tools, calculators, you name it. But in order for it to be a force for good, not a destructive force, we need to set this trend on the right trajectory. That's what the European Union is aiming to do with the regulations about AI. That's what our generation should do as well. So in order to make AI for good, in order to embed it the way it should be in our daily lives, we need more people, like you Théo, like Lou, like Gaël, like you listeners, to become part of this dialogue, part of the discussion, and act for positive change.
[47:15] Gaël: Thanks a lot Anastasis, it resonates a lot, with a lot of the listeners as well, and definitely with me. And now my last question to close the podcast on a positive note. Could you share one piece of good news which made you optimistic recently about a path towards a more sustainable world?
[47:37] Lou: Ok. Well, maybe you're gonna say it's not good news, but I think it is. You were mentioning it at the beginning of the podcast, but today, with Data for Good, we've been working on a project for several months about carbon bombs in the world. So this is not a fun topic, but I think what's positive about it is that we've had media coverage all around the world. And I think a few years ago, this would never have happened, because people didn't care about these problems. The fact that the media are talking about these subjects on their front pages, and selected them as their recommendation of the day and so on, is a positive thing. It shows that the media are becoming more and more aware of these subjects and trying to use the impact that they have to inform people and help us go in the right direction.
[48:39] Théo: I'd say a very geeky piece of good news is that, in the past week, there has been a release from Hugging Face, which I was saying that I loved a lot at the beginning of the podcast: a model called Zephyr 7b-beta, which is basically a geek code name for a new large language model. So it's the kind of algorithm for generative AI that is free, open-source, small, sober, with very fast inference - that means less energy-consuming inference - and that is, for the first time, more performant than the things that I'm doing with GPT in production. So that means that for the people that in the field are called the GPU poor, i.e. the people that don't have a GPU to do AI (that means like 99.999% of the world), if you're not working at a tech giant company, you can now have access for free, and switch the model that you are using for AI from a proprietary closed model to something that is open-source and free, and consuming less energy, and created by people with different kinds of values, and something that you can actually fine-tune, remove bias from, audit and so on. I've been using generative AI every day for the past year, and now, for the first time, I have something that I can put in a product that I actually trust a lot more, because I know I can control it in a way.
[50:23] Gaël: And what about you, Anastasis, what is the good news that you want to share?
[50:29] Anastasis: Right. So a few weeks ago, we had significant news in the corporate sustainability reporting sector. All over the EU, we are waiting for new sustainability reporting standards that would require a lot more companies, five times more than today, to report on and reduce their carbon footprint, and also their general sustainability performance. Part of this was put to a vote in the European Parliament, and there were some moves by certain parties to hinder the vote and to reject the motion, or at least set it back a couple of years. And it was looking quite grim. It was quite uncertain whether this was going to pass. Obviously, if this hadn't passed, the whole environment and sustainability progress would have been set back two years. The very good news is that this passed with a majority. So we are continuing as planned with the vision to become the first net zero continent. So that, for me, is great news. We shouldn't take it for granted. But people all over the place are working towards this same goal.
[51:48] Gaël: Well, it was a lovely chat, and it was great. Actually, it was a bit less technical than I would expect with geeks like you. But maybe it was because the right question to be asked is really the use of AI and this kind of ratio. I love how you, Anastasis, put it, you know, "multiplying by 10, otherwise it doesn't seem worth it". And, you know, starting with a real-life case, and all the tips that you shared about how Lou and Théo, with Data for Good, manage to create massive impact using AI, and using AI doesn't necessarily mean the latest shiniest model. So thanks a lot. I really do mean it, the three of you, it was great talking to you. And as I mentioned, I hope that we will have some room in the Green IO conference hosted by API days to talk about data for good. I will try to convince some of my guests tonight to be with us in Paris. I'll let you know more in the next episode.
[52:59] Gaël: Thanks a lot once again for joining the Green IO podcast tonight about AI and data for good. In episode 29 we will talk about anthropology and geography and Anthropocene and yes, it will still be a tech focused episode, but taking a huge step back in time, thanks to Maxime Blondeau. Yes, the one and only thought leader on geography and ecology and technology, who shares one map a day on LinkedIn with his 100,000 followers.
[53:34] Gaël: And before you leave, a small message from our sponsor. No, I'm kidding, Green IO remains a free and independent podcast, and so we need your help to keep it that way. We have zero marketing budget, so you can support us by spreading the word, by rating the podcast five stars on Apple and Spotify, by sharing an episode on social media or directly with a relative. That's a good idea. Also, thanks for your support. It means a lot to us. Us being me, but also Tani Levitt, our amazing podcast producer, and Jill Tellier, our amazing podcast curator. And stay tuned by subscribing to Green IO on your favorite podcast platform or via the Green IO mailing list. The link is in the episode notes. But you already know the drill - every two weeks, you will get more insights and premium content to help you, the responsible technologists scattered all over the world, build a greener digital world, one byte at a time.