This week, Ryan shares how he's been using GPT-3 for data classification, including what's worked well and where the tools have fallen short.
Show Links
Follow us on Twitter: @thedwwpodcast
Email us: podcast@digitalworkspace.works
Visit us: www.digitalworkspace.works
Subscribe to the podcast: click here
YouTube channel: click here
Ryan Purvis 0:00
Hello, and welcome to the Digital Workspace Works podcast. I'm Ryan Purvis, your host, supported by producer Heather Bicknell. In this series, you'll hear stories and opinions from experts in the field, stories from the front lines, the problems they face and how they solve them, and the areas they're focused on, from technology, people and processes to the approaches they took, that will help you get to grips with the digital workspace's inner workings.
I'll tell you some stuff I've been doing, which I think is quite interesting. I've got a whole lot of data that I wanted to classify. Typically, how I'd classify it is I would go through it line by line and do whatever I need to do. But I got a bit clever last night, and I thought, you know what, I'm just going to use Notion, and I'm going to use ChatGPT, and I'm going to classify this stuff. And it worked really well, to the point that I actually broke Zapier. I put too much data through it, and it wasn't very happy. But I was very impressed. I mean, it's very clunky, the way I was doing it; that's not the way you should be doing these things. But it was actually quite cool that I did like 800 records in about 15 minutes, which would have taken me maybe, I don't know, two hours to go through one by one. And of course, there's a risk of error in both cases. But I felt pretty comfortable with the outputs, in that it was actually quite accurate. And I was tweaking it too. So what I was doing is I gave it a framework. I said, these are the criteria that I want to allocate this data to. So it's about 20 to 25 categories. The data was in an Excel spreadsheet, and every time I sent a request, I had to send the same 25 criteria with it. But basically: every time you see this, can you match any of those criteria to the item I'm sending you? And if you match it, give me a percentage of match or confidence. And if you cannot match it, give me what you recommend we should be putting in. And I thought that was quite amazing. I had matches with percentages, and I had a couple of recommendations for new categories. One or two categories were very close to other ones, so I discarded them.
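A minimal sketch of the kind of prompt Ryan describes, assuming a one-line reply format. The category names, the `MATCH:`/`NEW:` reply convention, and the function names `build_prompt` and `parse_reply` are all hypothetical; the episode does not give the actual criteria or wording, and the call to the model itself is left out.

```python
import re

# Hypothetical category names; the episode does not list the real 20 to 25 criteria.
CATEGORIES = ["Hardware", "Software licensing", "Networking"]


def build_prompt(item, categories):
    """Repeat the full criteria list with every item, since nothing is kept between requests."""
    criteria = "\n".join(f"- {c}" for c in categories)
    return (
        "Match the item below to one of these categories:\n"
        f"{criteria}\n"
        "If one matches, answer 'MATCH: <category> (<confidence>%)'.\n"
        "If none matches, answer 'NEW: <recommended category>'.\n"
        f"Item: {item}"
    )


def parse_reply(reply):
    """Turn the model's one-line answer into (kind, category, confidence)."""
    text = reply.strip()
    m = re.match(r"MATCH:\s*(.+?)\s*\((\d+)%\)", text)
    if m:
        return ("match", m.group(1), int(m.group(2)))
    m = re.match(r"NEW:\s*(.+)", text)
    if m:
        return ("new", m.group(1).strip(), None)
    # Anything else is kept as-is so it can be reviewed by hand.
    return ("unparsed", text, None)
```

Keeping unparsed replies rather than discarding them leaves room for the manual review Ryan mentions, such as dropping recommended categories that are too close to existing ones.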
But, relatively speaking, to get it to work, I had to move from using Zapier to writing some Python, because I was just finding that Zapier was a bit constrained in what you could do. But with the Python running last night, I actually ran out. You only get the $120 that you can use with the API for ChatGPT, so at about midnight I ran out of capacity, and that basically killed my whole classification script. But it was fun. I mean, I was really impressed with it. The only problem with that killing the thing is that all the work it did, it dropped, because it didn't save it to the file, which was just my bad coding. But that's something I can address today when I get back to it. Because that's the only problem with these things: you get into these little exercises, and it becomes almost too consuming, because you're like, I'm almost there, I'm almost there, and then you're not
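One way to avoid losing work when a long run dies partway through, sketched under assumptions: each result is appended to a CSV file as soon as it exists, and a rerun skips items already in the file. The function name `classify_all`, the two-column file layout, and the `classify` callable (standing in for the API request) are hypothetical, not Ryan's actual script.

```python
import csv
import os


def classify_all(items, classify, out_path):
    """Classify items one by one, appending each result to a CSV as soon as it exists."""
    done = set()
    if os.path.exists(out_path):
        # Collect items finished in an earlier run so they are not sent again.
        with open(out_path, newline="", encoding="utf-8") as f:
            done = {row[0] for row in csv.reader(f) if row}
    with open(out_path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for item in items:
            if item in done:
                continue
            label = classify(item)  # may raise, e.g. when the API quota runs out
            writer.writerow([item, label])
            f.flush()  # hand the row to the OS before the next request
```

If `classify` raises at item 600, the first 599 rows are already in the file, and calling `classify_all` again with the same arguments picks up where it stopped.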
Heather Bicknell 3:52
Sounds very interesting. Were you using that new ChatGPT and Zapier connection? I don't know what they call it, but I've been getting their advertisements for it. It seems like they've been pushing that. Have you tried that at all?
Ryan Purvis 4:11
So I haven't used that one specifically. What I've been using is, you can configure Zapier to connect to the ChatGPT API with your own account. So that's all I did: I connected using my own secret key and stuff, so it's using my account. I've got a billing account, and I've set the limits and all that stuff. And I use it as part of my sort of day-to-day thing. I think I was telling someone else about it today. So I've got just a Notion table that I have set up with a few fields, and that builds up my question to the chatbot and sends it, and then I wait. I mean, it used to be instantaneous, but the problem I found with Notion is what happens the minute you type something. The way the Zapier thing works, there's not a clean way to say, I'm capturing this information, now you can send it. What it does instead is, the minute you create the entry in Notion, it triggers the event. So you have to be a little bit clever and put in a delay, and tell it to come back and look in five minutes to see what the end of the question is. So I build the question, you know, what's the best way to write an objective, as an example. And then, in my Notion table, there are other fields for who's the persona this should be targeted at, what format I want the response back in, and that sort of thing. So it's very easy for me to build the question, and then it sends it off to the API, and then I get a response back five minutes later with whatever the answer is. And that's quite cool, because I can sit down and write five or six things easily that I want to know about, and just let the bot go and get that and reply to me. And yeah, some of the stuff is really good, and some of the stuff is a little bit dodgy. But it's still useful in some respect. So yeah, I use the API quite a lot.
And I find that a bit better than logging into the website, because it's more stable, and because I can store the stuff in Notion now. All the queries that I've requested from the API, I've now got them stored, so I can use them in the future. Because that's the other thing: often I'll ask something and I don't have it stored, and I want to get it again, and unfortunately the website's unavailable for whatever reason. And now I need to wait for that stuff to come back, or I need to go and search somewhere else on Google to find it again, which is a pain.
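The build-a-question-from-fields and keep-every-answer ideas can be sketched without Notion or Zapier at all. This is a rough local stand-in, assuming a JSON file as the store; `build_question`, `ask_with_store`, the field names, and the `ask` callable (a placeholder for whatever sends the prompt to the API) are all hypothetical.

```python
import json
import os


def build_question(question, persona, response_format):
    """Combine the table fields into one prompt string."""
    return (
        f"{question}\n"
        f"Answer for this persona: {persona}\n"
        f"Respond as: {response_format}"
    )


def ask_with_store(prompt, ask, store_path):
    """Return a stored answer if this exact prompt was asked before, else ask and store it."""
    store = {}
    if os.path.exists(store_path):
        with open(store_path, encoding="utf-8") as f:
            store = json.load(f)
    if prompt not in store:
        store[prompt] = ask(prompt)  # hypothetical callable wrapping the API request
        with open(store_path, "w", encoding="utf-8") as f:
            json.dump(store, f, indent=2)
    return store[prompt]
```

With something like this, a question asked last week can be answered from the file even when the service is unavailable.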
Heather Bicknell 6:57
Yeah, so do you have access to a newer model, or is it the same version that's available publicly?
Ryan Purvis 7:09
So it's still 3, I think. Actually, I'll tell you from my code. I think it's the Davinci model now that I'm using. So I'm using the Davinci 002 model for most of the stuff. I don't know if that's the latest model, but it works. I mean, I've been so impressed by what it's picked up. And now that I've written the Python, it can be a little more sophisticated. So now I can ask it to classify the data for me as it's going through it. And then I can also ask it, while it's doing the classification, to pull out the keywords and the scoring that it's using to do the classification. So you start building out your own little sub-criteria, if you like, for the classification, so you don't have to keep using the API. Because that's the point in the end: you don't always want to send the data to the API to give the answer, because it slows things down. If you start seeing the common patterns for the words, you can speed it up a little bit using just text analytics, which in turn brings down the cost of using the API too. Not that the cost has been that bad. But it's good to have a mechanism that does a bit of storage for you locally as well, just in case there's a problem with connectivity.
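A sketch of that keywords-first idea, under assumptions: keywords returned with each API classification are remembered per category, and later items are matched locally by simple substring hits before falling back to the API. `KeywordCache`, `classify`, and the `ask_api` callable returning a `(category, keywords)` pair are hypothetical names; real text analytics would likely be more careful than substring matching.

```python
from collections import defaultdict


class KeywordCache:
    """Remember which keywords the model cited per category and reuse them locally."""

    def __init__(self):
        self.keyword_to_category = {}

    def learn(self, category, keywords):
        # Store the keywords that came back alongside an API classification.
        for kw in keywords:
            self.keyword_to_category[kw.lower()] = category

    def lookup(self, text):
        # Count keyword hits per category; return the best one, or None if no hits.
        hits = defaultdict(int)
        lowered = text.lower()
        for kw, category in self.keyword_to_category.items():
            if kw in lowered:
                hits[category] += 1
        return max(hits, key=hits.get) if hits else None


def classify(text, cache, ask_api):
    """Try the local keywords first; fall back to the slower, paid API call."""
    local = cache.lookup(text)
    if local is not None:
        return local
    category, keywords = ask_api(text)  # hypothetical: returns (category, keywords)
    cache.learn(category, keywords)
    return category
```

The trade-off is accuracy for speed and cost: a local hit skips the request entirely, so the keyword list is worth reviewing now and then.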
Heather Bicknell 8:31
Yeah. And is it able to learn from you at all? Like, are you helping train it? Or you're not? It doesn't?
Ryan Purvis 8:44
Well, if you ask the chatbot if it stores any data, it'll say no. And I've got to believe, to an extent, that they are not storing any data at all, because I've asked the same question 10 times and I've got 10 different answers, which you wouldn't expect if it was storing the data; it would be learning as it was going. And in that example of the criteria that I was sending across, it's not storing that. So I've got to send the same criteria each time, otherwise I get some really unhelpful responses, to be honest. But it's definitely working as a tool. It's definitely helping me deliver things. Like just giving me some ideas, you know, what do you call it, I'll say classification stuff, but it just closes the gap sometimes. Because sometimes your biggest problem is that you just can't figure out what you actually want to do. You know the question is due an answer, but you can't actually figure out how to start, so you just push what you know through and almost say, what would you do here? And sometimes it comes back with something that's completely useless, and you're like, yeah, it wasn't that, but now that you've said that, I can go this other route. Or it comes back with exactly what you were thinking, but it's been worded in a nice way. So I find it beneficial in various ways.
Heather Bicknell 10:10
Yeah, it's a bit like having a colleague you can bounce things off of, or like doing a bunch of general research to generate some initial ideas faster.
Ryan Purvis 10:26
Yeah. And that's often what it is. It's like, oh, look at this code thing. You know, when you're writing code, you really want to talk to someone else about it. And you get exceptions and errors and whatever, and you can just paste those exceptions straight in there, and again, it'll give you what it interpreted them as. I got to a point last night where I was tired and the answers weren't that helpful. It kept saying, well, you need to import this library. And I was like, yeah, but the library doesn't exist anymore. The API has changed; give me an alternative API. And it couldn't. It kept going back to the same sample code and kept saying, this is the fix. So it kind of got in its own little loop, excuse the pun, on that. But for having someone to just pose questions to and get answers from, very much the same way as we used to when you wrote code, it's fine. I mean, now, for some of these issues, I'm going to have to speak to someone else who's done this before, to see if they've got any guidance. Otherwise, I'm going to spend a couple of hours Googling around trying to find examples, or someone else that solved this. But I'm actually hopeful that just by having fresh eyes on it tonight, I'll look at it again and go, actually, it's not that difficult. This is what I have to put in.
Heather Bicknell 11:39
Sounds like some interesting experimentation.
Ryan Purvis 11:46
Yeah, it's amazing. It really is fascinating to see how useful this stuff is. I mean, obviously, AI is not a new concept. It's been around since the 50s. But if you look at how it's now arrived, in a sense, it's cool and snazzy, and it works. I mean, you've had the Google one that came out, it was Google Bard, I think, that got the answer wrong and wiped out a substantial amount of the Google share price.
Heather Bicknell 12:18
Was that their recent sort of play to fight back against the AI?
Ryan Purvis 12:23
Yeah, so they launched their Bard thing, I think it was last week or the week before that.
Heather Bicknell 12:30
Their competitor to, yeah, ChatGPT.
Ryan Purvis 12:35
Yeah, and it really just, yeah. It didn't answer the question right. It was a very simple question. And it took off a good, substantial amount of the share price, because it got the answer wrong. And it just shows you, I mean, no one's ever said that ChatGPT is correct. It's always been a prompt mechanism where you ask, and you get an answer whenever it can. Someone called it a fancy autocorrect. In some respects, it'll just work out what the next sentence is. It's not a sentient being that's answering your questions for real. But by the same token, it's mostly right most of the time, because it's obviously consumed data, or patterns, where the problem has been declared and the solution has been provided, and there's enough confirmation that the solution is correct.
Heather Bicknell 13:37
Yeah, I'm sure whatever Google created was pretty rushed in order to try to capture the news cycle. But yeah, as we sort of talked about last week, it can really disrupt Google's business model, at least on the search side, if AI-assisted search becomes the preferred experience and Microsoft becomes the default for that. Still, I think they'll keep experimenting and probably come out with their own version that is a bit more accurate as well. It's just a matter of time. And the rush to get it out, I'm sure, contributed to the poor performance there.
Ryan Purvis 14:32
Well, because I used my Google account for my ChatGPT signup, I've noticed that when I go and search now, there's a little ChatGPT card, if you like, on the right-hand side. And each time I do a search for something, it's giving me a summary of what I've searched for. So it's already part of the search, in some respects. Now, that's not the Bard tool, that's obviously a competitor. But if you think about what we talked about, was it last week? I mean, the biggest thing about search is that it's all about advertising. You need to keep people in your search page, not messing around inside a chat interface, because you can't really, well, you can, but you're not trying to provide somebody with adverts while in a chat.
Heather Bicknell 15:21
Yeah, I don't know how they're going to square that model. So it could change the nature of, you know, where ads get placed on a page or something like that. Or you could even pay to have your results come up more frequently? I don't know. Because I think Google has had search summary mechanisms for a while, things like Featured Snippets or the, you know, sort of FAQ section, and there are tactics that websites will use to rank within those spots to get more attention. And it'll be interesting to see if there's sort of a code to crack on doing the same thing for the AI chat experience, you know, if there's certain back-end SEO or keyword criteria that is stronger in terms of being pulled into the ChatGPT-style response.
Ryan Purvis 16:32
Sorry, I didn't understand that.
Heather Bicknell 16:36
So, you know what Featured Snippets are in Google Search? When you search.
Ryan Purvis 16:45
Yeah, I did hear something about that. I'm trying to think what I heard about that. Yeah, carry on.
Heather Bicknell 16:51
Yeah, so Featured Snippets have been around for three years at this point. But you know when you search something, and you're not getting the list of results, you get sort of a summary or a different widget that's sort of telling you the answer in a different way? Yeah, like a featured panel at the top, that's a Featured Snippet. So websites have done some sort of deconstruction of them, to try to figure out ways to get ranked within that Featured Snippet. So essentially, with Featured Snippets, it's not just about ranking number one on the search results, because the Featured Snippet sort of becomes the top spot on the search result. And when you are optimising your website, you always want to rank as high as you can in order to get someone to click through to your own website, right? So companies or, you know, individuals, whoever is working on these sites, will use different SEO tactics, search engine optimisation techniques, to rank within those Featured Snippets as well. So I'm sort of suggesting that there'll be some experimentation that happens to try to rank within the AI model as well, but I don't know yet. You know, it'll be interesting to see if there's sort of a code to crack there in terms of getting, you know, from searching Bing AI, for example, for, you know, build me a five-day travel itinerary to Aruba or something like that, and it would pull up certain, you know, hotel results or something within the itinerary suggested, right? So if I was a hotel, I could try to optimise to get within that search every time, things like that. Does that make more sense?
Ryan Purvis 18:34
Yeah, and I think that's what we talked about with the sense of context and, you know, really helping your research. But also, AI should help you scale things. You know, the worst part of planning a trip is the planning, because of all the things that you don't think of, especially if you don't travel a lot. Yeah, I mean, it's a very exciting space to see what happens. And I think what's a good thing, but also a scary thing, is, and I was looking through some old data, the growth of the market was already exponential, and it's gotten even more so. You know, it was a hockey stick, and it's going up so quickly now because people are seeing, with what's out there, how possible it all is.
Heather Bicknell 19:26
Yeah, I think this is sort of a very accessible, more powerful model. And what's sort of game changing about it is that the potential feels so widespread, I guess, in terms of how it affects multiple sectors or sort of changes the way we interact with the internet, even. So yeah, it's a fascinating thing to watch unfold, for sure.
Ryan Purvis 20:00
Yeah, yeah. I mean, I wonder, now that we're seeing this. I was looking at some other stuff we were doing, and I think we talked about it last year: there's almost an opportunity now for a marketplace of AI services. So you don't necessarily need to have your own AI; you can just go and buy the service you need to solve your problem, until you get to the point that you see there's value in what you're doing, and maybe you want to do it slightly differently, or you want different results, or whatever it is. But you are getting almost a leg up, because someone else has already done the work.
Heather Bicknell 20:36
Yeah, it certainly makes sense over everyone trying to build their own. And that seems to be the model that OpenAI has taken. So yeah, it's interesting to see all of these new integrations or connections being put out there, whether it's Zapier or the integrations into the Microsoft 365 ecosystem, or others starting to take advantage of the model in that way.
Ryan Purvis 21:09
Yeah, well, that's it. I mean, when you look at the barrier to entry for some of this stuff, it used to be so high, and now it's becoming easier and easier. While I say this, you've still got to put some work in. But a product you could never have built previously, because of the investment you'd need, that's almost going away. You still have to have time and energy to build something, but the technical piece of it is almost removed now, because there are so many services available to you.
Heather Bicknell 21:42
Yeah. Interesting time, for sure, to be able to leverage it in that way. Any final words or thoughts on what you've been playing around with?
Ryan Purvis 21:58
Well, I mean, I'm seeing AI everywhere. So obviously, I use Notion, and Notion's got it now, where you can create your page and start typing, and then you can insert AI into that. Obviously, content generation is still a big thing. I'm actually still looking for that service, I mean, Apple tried this a little bit with Apple News, but it didn't really work for me, that content aggregator that figures out what's interesting to you, finds more of what you're interested in, and summarises that for you. That I'd be interested to see. And there was a website that someone told me about today, I think it's called There's An AI For That dot com, and I need to go look at it to see if there is anything there that could be useful from a, what do you call it, use case point of view. Because that's really the biggest problem we all have right now: there is just so much information. You know, you're looking at your Facebook, your Twitter, your email, you've got LinkedIn, all these different things. And how do you determine what's real, what's right, and where to look, without exhausting yourself? So, that's my final thought on it.
Heather Bicknell 23:17
Okay. Interesting stuff. Yeah, for sure. Let's wrap up there.
Unknown Speaker 23:22
Yeah, sounds good. Super cool. Have a good one.
Ryan Purvis 23:26
Thank you for listening to today's episode. Heather Bicknell is our producer and editor. Thank you, Heather, for your hard work on this episode. Please subscribe to the series and rate us on iTunes or the Google Play Store. Follow us on Twitter at @thedwwpodcast. The show notes and transcripts will be available on the website www.digitalworkspace.works. Please also visit our website www.digitalworkspace.works and subscribe to our newsletter. And lastly, if you found this episode useful, please share with your friends or colleagues.