Django Chat

Authentication - José Padilla

Episode Summary

José has made major contributions to the Django ecosystem, especially around Django REST Framework and authentication. He is now an engineer at Auth0.

Episode Notes

Episode Transcription

Carlton Gibson  0:06  

Spanish introduction...let's do the rest in English.

 

Will Vincent  0:32  

We're thrilled to have you on. You and I have met a couple of times, and I think I first became aware of you through your work with JWT and Django REST Framework. So do you have a quick background? Obviously Carlton and I know about your work in Django, but what's the quick take on your involvement in the Django community?

 

José Padilla  0:51  

Um, so I spent some time building a product called Blimp. We were a small team building project management software. I remember our first MVP was actually built in Flask and, like, MongoDB. And after failing at that, we switched back to Django; that was probably Django 1.2. So I spent at least around five years working with Python and Django. I had a chance to help maintain Django REST Framework and pick up maintaining other libraries like PyJWT. And beyond that I worked on random other bits and pieces. And by far, Python keeps being my favorite community.

 

Will Vincent  1:44  

Yeah, well, I will link to it in the show notes. But I remember seeing, I think it was at DjangoCon 2014, you gave a talk on JWTs, which I think was your first time at a Django conference. Is that right?

 

José Padilla  1:57  

That's right. Yeah, and I think that went well. I didn't actually give that talk anywhere else, probably just at a Python meetup, but not at a conference. The point of that talk was, at that time, visually explaining what a JSON Web Token is, and I think I did a good job at that, and kind of tying that up to authentication with Django REST Framework and Django.
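
Editor's note: for readers who have not seen inside a JSON Web Token, the sketch below builds and checks one using only the Python standard library. It is not taken from José's talk and it is not how PyJWT is implemented; the function names are hypothetical, only the HS256 algorithm is handled, and claims such as `exp` are not validated. A token is three base64url segments joined by dots: header, payload, and signature.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def encode_hs256(payload: dict, secret: bytes) -> str:
    """Return header.payload.signature signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256)
    return signing_input + "." + b64url(signature.digest())


def decode_hs256(token: str, secret: bytes) -> dict:
    """Verify the signature and return the payload, or raise ValueError."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    if header.get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    signing_input = (header_b64 + "." + payload_b64).encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("signature mismatch")
    return json.loads(b64url_decode(payload_b64))
```

The payload is only encoded, not encrypted, so anyone holding the token can read it; the signature only proves it was issued by someone who knows the secret.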

 

Carlton Gibson  2:26  

You know, I have to say, José, because you've been working on Django REST Framework for just ages and ages, for years: you were there, an absolute machine, maintaining the framework. I mean, maybe you were doing it as part of your work, so you had the time, but you were there, making the fixes, doing the updates, an absolute rock for years.

 

José Padilla  2:46  

Thank you. That means a lot coming from you, definitely. And looking back, I think what I contributed the most at that time was kind of the story around third-party extensions, and how that story fits into Django REST Framework and Django and, you know, the Python community as a whole. But yeah, it was definitely not part of my job; it was a free-time kind of thing. But I was definitely glad I was able to help out with things like that.

 

Carlton Gibson  3:15  

Yeah, no, and you were, you know, an absolute star. But that's an interesting point you made about third-party extensions, because it's something that in Django REST Framework we were really keen on and still are keen on, because we have very limited time and limited bandwidth to maintain stuff. And so someone comes along with a new feature idea, and it's like, well, first of all, can we put that into a third-party package that the contributor can maintain themselves? Because that keeps the core maintainable, right.

 

Will Vincent  3:54  

Yep. So what was the quick take on what that service was?

 

José Padilla  4:01  

We were three people working on that project. We got together one day and we were just thinking of things we could possibly build a product around. We had all worked in digital agencies and marketing firms, and mostly they were using Basecamp, like the first, original Basecamp at that point. So we had really strong opinions about how project management should look and work, especially in those creative environments with small teams. So we proposed a structured, kind of forced process for project management. We had a really nice niche: we were targeting small teams like ours, three, four or five people. Most work was around the actual tasks. You know, Basecamp had messaging, and people would just do project management via messaging, which is basically email. So we kind of forced you to use our proposed flow. Then, at some point, Trello came up, Asana came up, and they were way bigger than us, and eventually it made sense for us to just shut it down and focus on other things. Right. Okay.

 

Carlton Gibson  5:33  

What was the run there? Because that was, you know, a five-year period, you said.

 

José Padilla  5:37  

Yeah, like, at least five, six.

 

Carlton Gibson  5:39  

That's a good run, right, for a startup, you know, a small team that's self-funded.

 

José Padilla  5:43  

Yeah. I mean, money-wise, it was completely bootstrapped. We, you know, got together and made it happen. We learned a lot along the way through the whole period, and there were definitely really good opportunities for other things to pop up, one of them being this side project I still maintain called FilePreviews.

 

Will Vincent  6:03  

Yeah, I wanted to ask about that. You and I have spoken about it, and I think it's a really interesting project. So what's the quick take on it? And then we can dive into the tech stack, because that's interesting.

 

José Padilla  6:13  

So when we were building Blimp, we had a file section. You could upload files and share them. And once we designed the whole thing, we noticed that we were missing pretty thumbnails for files. If you upload a JPEG or a PNG, that's easy; if you upload a Word document, that's a little bit harder; if you upload a Photoshop file or an Illustrator file, it becomes harder still. So we built this pipeline where you could upload basically any kind of file and we would just output some PNGs, and that was harder than it sounds. So from the get-go we built that separately from Blimp, as a separate service. And after we built it and were using it, we noticed that we could just put a landing page on it, put a pay button on it, and other people might buy it. And surprisingly, it outlived Blimp as a product. It's been going on since, and I kind of took it over after we shut down Blimp, the product.

 

Carlton Gibson  7:34  

That's great because it's like the original microservice, right? It's

 

José Padilla  7:37  

Yeah, right.

 

Carlton Gibson  7:38  

Five years before the book, five years before the meme.

 

Will Vincent  7:42  

It's microservices. And so what's the tech stack? How much can you tell us about the architecture of it?

 

José Padilla  7:48  

Um, so yeah, it's built with Django, obviously. It does have a simple API, so that's Django REST Framework. We use Celery pretty extensively, with Redis as our broker. Postgres is our database. We host our workers on pretty hefty EC2 instances, and our API is actually still on Heroku. I haven't worked with Python in my day-to-day job for a while now, so it gives me an opportunity to stay in tune with what's going on with Python and Django. But it also gives me a chance to polish my scaling skills. This is pretty CPU-intensive work, and we have pretty interesting traffic patterns. So I have to think about how we scale this, how we improve our availability, and how we make sure we have monitoring and observability up to par.

 

Will Vincent  9:03  

Yeah, because I can imagine that you could, and maybe do, get a big client that just comes in and crushes you with however many requests. So how does the traffic look? Is it common that, out of the blue, you'll just get nailed with a lot of stuff, and otherwise it's flat? Or is there any consistency there?

 

José Padilla  9:18  

So yeah, FilePreviews has customers around the world with different kinds of traffic patterns. We have people doing bulk imports of files that they have, so sometimes we'll get, you know, 5,000 requests, one after the other, to generate previews for different kinds of files. And I guess part of the challenge is that you'd handle that by scaling your instances: you have more instances and you can process the queue faster. But then there are things like larger files. We have customers that sometimes upload a PDF file that is like 20,000 pixels by 30,000 pixels, right? It's like a giant PDF. One of those could hold up the queue for a longer time. So we need to be observing our limits: time limits, memory limits, file size limits.

 

Carlton Gibson  10:32  

Would this be a good example of a use case for serverless? Like, you know, a Lambda functions based approach? Because that kind of looks like the perfect example, right? You've got discrete jobs that can be done one by one. Would it work in that environment, or...

 

José Padilla  10:49  

So maybe, yes. I'm hoping to be able to try that with Lambda, like AWS Lambda. They have some limits on the size of what you can actually install there. Even though our workers run in a Docker container, you could imagine that Docker container is just like a VM. So it'll be interesting to strip that back to be able to fit into whatever the constraint for Lambda is. But it should definitely be possible. At this point this is kind of a side project, and I haven't had the time, but I did try putting specific parts of the pipeline there, and it just added complexity, like with testing, and with how we would compile different dependencies that would fit into that particular environment. But yeah, it definitely seems like the thing that you could probably model in that serverless-like architecture.

 

Carlton Gibson  12:01  

Okay, interesting. So on the one hand, yes, but on the other, it's, hang on, the VM does give me a lot here; I've got this whole infrastructure.

 

José Padilla  12:11  

Yeah. I spent a lot of time tuning what we have to a particular infrastructure model, and that has solved our immediate scaling issues. So I'm not, you know, touching that much.

 

Carlton Gibson  12:29  

Can I ask: currently you're working with Auth0, is that right?

 

José Padilla  12:32  

Yes, I do. I started working at Auth0 as an engineer in August.

 

Carlton Gibson  12:36  

Okay, so in an earlier episode we were talking about authentication, and I sort of said, you know, I gave Auth0 a go and it didn't really fit in with Django, so I probably wouldn't use it, and I don't know what the use case is. So can I ask you to put your Auth0 hat on and tell us the Auth0 story for Django? You've got, yeah, you're wearing the sweatshirt. So if I'm going to use Auth0 in my Django project, if I'm going to consider it, can you just give us the sales pitch? What does it give me as a Django developer?

 

José Padilla  13:08  

Okay, so kind of going back to that episode where you were talking about authentication and authorization: you defined those, and I think you did a very good job. And one question I have as a developer, in the small side projects where I use Django, is: why would I use Auth0 versus something like django-allauth? If we think about Django's batteries-included philosophy, Django does really great work at not letting developers screw up authentication, right? There's a configurable password hashing system with really great defaults, so you don't have to think about whether to use bcrypt or MD5, and it's really hard to even store plaintext passwords. And then there are all these other things that third-party packages help with, like password strength checks, throttling, and authenticating against third-party identity providers like Facebook or Google. But you, the developer, still need to think about those security-related things. You need to know that you're looking for throttling of your login endpoints, and you need to install a package. After you find it, you need to configure it and maintain it, along with whatever other dependencies it might have. You might need to have a Redis server or something else, right? So Auth0 provides authentication and authorization as a service. We give developers and companies building blocks to secure their applications without having to become security experts. We provide a unified, device-agnostic login experience. So it doesn't matter if you're working on a web app or a mobile app; you can still provide this unified login experience, so it looks like one single thing, right?
We provide many social identity providers, and you can add custom ones, plus brute-force protection, breached password detection, single sign-on, passwordless login, multi-factor auth, and user management. These are all things where, if something like Django or django-allauth can't provide them, you can go and possibly find a third-party package that somebody built. You then need to actually assess whether it's well maintained, whether it's well built, whether it's secure, and then we go back to that security expert thing. Auth0 also does something really cool, which is that they allow you to write custom JavaScript code to customize any part of the authentication and authorization flow. So, for example, if you wanted to enrich user profiles with data from something like Clearbit, if you wanted to reuse information from another database or service, or if you wanted to notify another system of logins in real time, you can do that.
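
Editor's note: the point about Django's password hashing defaults can be made concrete with a standard-library sketch. This is not Django's code; `make_password` and `check_password` here are hypothetical stand-ins that borrow the names of Django's helpers, and the iteration count is an illustrative assumption. The stored value follows the `algorithm$iterations$salt$hash` layout that Django's documentation describes for its PBKDF2 hasher, which is what makes it possible to raise the work factor later without breaking existing hashes.

```python
import base64
import hashlib
import hmac
import secrets
from typing import Optional

DEFAULT_ITERATIONS = 600_000  # illustrative assumption, tune for your hardware


def make_password(password: str, salt: Optional[str] = None,
                  iterations: int = DEFAULT_ITERATIONS) -> str:
    """Return 'pbkdf2_sha256$<iterations>$<salt>$<base64 digest>'."""
    salt = salt or secrets.token_hex(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt.encode("ascii"), iterations
    )
    encoded_digest = base64.b64encode(digest).decode("ascii")
    return "pbkdf2_sha256$%d$%s$%s" % (iterations, salt, encoded_digest)


def check_password(password: str, encoded: str) -> bool:
    """Re-derive the hash with the stored salt and iterations, then compare."""
    algorithm, iterations, salt, _digest = encoded.split("$", 3)
    if algorithm != "pbkdf2_sha256":
        raise ValueError("unsupported algorithm: " + algorithm)
    candidate = make_password(password, salt, int(iterations))
    # Constant-time comparison of the full encoded strings.
    return hmac.compare_digest(candidate, encoded)
```

Only the derived value is ever stored, so a plaintext password never reaches the database, which is the property José is describing.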

 

Will Vincent  16:21  

So

 

José Padilla  16:21  

Yeah, so it goes back to this: Django provides you login out of the box, and you can figure out the whole template situation and set it up yourself. And, going back to one of the recent episodes, where you were talking about patterns: after you've built one or two apps that have login, you'll notice the patterns and maybe you'll have your own project templates. But then there are these thousands of other little things around authentication, authorization and security that you either need to be aware of and proactive about, or you'll have to be reactive after a possible incident. So either you delegate all those really important security and trust aspects of your application to somebody like Auth0, or you take that on and just be aware of what taking that on means.

 

Carlton Gibson  17:31  

Okay, that's a super answer. And so, two thoughts came up. One thing you mentioned at the beginning of that was the side project versus the big project. For a side project, yeah, totally, I can see why you'd take it on, but surely it pitches itself for big projects too, right? It's not just for hobby developers?

 

José Padilla  17:48  

Yeah. So it's not only for, you know, somebody building a web app and a mobile app. We are an identity platform, right? So if you're building single sign-on for your whole internal infrastructure and services in a large enterprise setting, Auth0 would still work for you fine.

 

Carlton Gibson  18:10  

Okay. And then the second thought was: I tried this a long time ago, when it first came out, and I gave it a go. And I didn't really find that it integrated with Django in the way I wanted it to. It would do the JWT client-side authentication, and I could see how that would give me the mobile app authentication. That was all super. But what I really wanted was, in my view, for request.user to return a user instance. So is there now a nice story about integrating with a Django project, with, like, remote user, the RemoteUserBackend?

 

José Padilla  18:40  

So one of the things that attracted me to Auth0 a long time ago was their content as a marketing effort; you know, they have really great guides. I was just checking out the getting started guide with Django. And since Auth0 actually provides an OAuth, sorry, an OAuth 2 API, you can use something like django-social-auth or django-allauth, and you would use Auth0 as an identity provider, right? You would handle Auth0 just as you would handle Facebook logins. And part of the quickstart guide I was looking at uses RemoteUserBackend and the RemoteUserMiddleware. You know, it really fits into the whole Django lingo.
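
Editor's note: on the Django side, turning on the two pieces José names is a settings change. The fragment below is a sketch of the stock Django configuration only; the dotted paths are Django's own classes, but how a particular Auth0 quickstart supplies the remote username is not shown here and should be taken from that guide.

```python
# settings.py fragment (sketch): enable Django's REMOTE_USER support.

MIDDLEWARE = [
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.contrib.auth.middleware.AuthenticationMiddleware",
    # Must come after AuthenticationMiddleware, which it depends on.
    "django.contrib.auth.middleware.RemoteUserMiddleware",
]

AUTHENTICATION_BACKENDS = [
    # Trusts an externally authenticated username and maps it to a User,
    # creating the User on first sight by default.
    "django.contrib.auth.backends.RemoteUserBackend",
    # Keep "django.contrib.auth.backends.ModelBackend" in this list as well
    # if local username and password logins should continue to work.
]
```

With this in place, `request.user` in a view is a regular Django user instance, which is what Carlton was asking for.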

 

Carlton Gibson  19:41  

Okay, that sounds super. So I think in the show notes we can put a link to those getting started guides with Django.

 

Will Vincent  19:49  

What I wanted to ask you too: I saw on Twitter, I think just yesterday, there was the North Bay Python conference, and someone tweeted this thing about Jacob Kaplan-Moss, Django co-creator, saying stop using passwords, use login with Google or Facebook, quote, "Google's security team is better than yours." Honestly, my jaw dropped when I saw that. But I'm curious what your thoughts are, as much more of an expert in this realm than I am. I just don't even know where they're coming from with that. You know, I mean, Facebook was storing stuff in clear text. I can see Auth0, but to just hand it over to, you know, one of the couple of monopolies, I don't know, I feel like I'm missing something.

 

Carlton Gibson  20:46  

If this is just a side or weekend project, then absolutely, the more you can outsource the better. But if your user database is a core strategic asset of your business, then you can't just hand it over to a third party. So there's that point, which is important. And yeah, as for Facebook, every time I turn on the news, it's...

 

Will Vincent  21:11  

It's so crafty and creepy. I mean, Google's not much better. So with django-allauth, which is fantastic, I think I still have the top Google spot for a tutorial on that. And I deliberately don't, I think maybe I showed Gmail, but I don't show Facebook because, well, here I go, never working at Facebook: I think Facebook is pretty awful, so I don't feel like doing their job for them. But we got off on a little bit of a rant. So make the case for Auth0 as opposed to Google or Facebook, if someone just says, well, I'm delegating my auth in either case. Why is that, actually?

 

José Padilla  21:45  

Interesting. So, I think one thing to notice is that you can use allauth, I mean, sorry, Auth0, without integrating any social identity providers, right? You don't need to use Facebook or Google or, you know, the other 300 providers that we might have. You can use email and password against the Auth0 database, right?

 

Will Vincent  22:15  

Um, yeah, it's like trust is the name of the game, especially when you're handling such an important path for an application as authentication and authorization. Yeah, no, I mean, it's a deep topic. One part of this, which I've seen with some services recently, is this login by email where there is no password. Every time I log in, I just put in my email address and I get a one-time link. I can see that it's more secure. It's maybe a little annoying, but I'm seeing that more and more.

 

Carlton Gibson  22:50  

Well, you've got to bear in mind as well that a lot of people struggle with passwords. And if you can send them a link where they get a button and they can click it, or tap it these days, that's actually a much better user flow. So it's arguably more secure, probably is, assuming the email's not compromised. But the email's that sort of golden key to all of this, isn't it?
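
Editor's note: the one-time email link Will and Carlton describe can be sketched with the standard library. This is a toy, in-memory illustration under stated assumptions, not how Auth0 or any Django package implements it; `MagicLinkStore` and its methods are hypothetical names, and a real system would persist the pending tokens and send the token inside a URL by email.

```python
import hashlib
import secrets
import time
from typing import Dict, Optional, Tuple


class MagicLinkStore:
    """Issue single-use, expiring login tokens tied to an email address."""

    def __init__(self, ttl_seconds: float = 600.0) -> None:
        self.ttl_seconds = ttl_seconds
        # Maps SHA-256(token) -> (email, expiry time). Only the hash is
        # kept, so a leaked store does not reveal usable tokens.
        self._pending: Dict[str, Tuple[str, float]] = {}

    @staticmethod
    def _digest(token: str) -> str:
        return hashlib.sha256(token.encode("ascii")).hexdigest()

    def issue(self, email: str, now: Optional[float] = None) -> str:
        """Create a token to embed in the emailed link."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._pending[self._digest(token)] = (email, now + self.ttl_seconds)
        return token

    def redeem(self, token: str, now: Optional[float] = None) -> Optional[str]:
        """Return the email for a valid token, or None. Tokens work once."""
        now = time.time() if now is None else now
        entry = self._pending.pop(self._digest(token), None)
        if entry is None:
            return None
        email, expires_at = entry
        return email if now <= expires_at else None
```

As Carlton notes, the scheme is only as strong as the mailbox: whoever can read the email can log in.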

 

José Padilla  23:12  

Yeah, so that's passwordless login, and that's something that Auth0 also allows you to do pretty easily. And we also provide MFA, so you can do multi-factor authentication. You can use, like, Google Authenticator or Authy, or, if you have to, you can use SMS. I think now we're starting to see 2FA become something normal, but, like, I can't get my parents to use 2FA, so can you then make that a hard requirement? And I can see passwordless login, depending on your target, from a UX standpoint, actually hurting your user experience. Like, I can imagine my mom putting in her email, if she remembers which email to use, and then, like, why do I have to go to my email? And, I found the email, now what? Oh, there's a link. So again, it depends on your target. You can see now that there are a lot of players working to improve the whole UX story for security, if you think about security keys, like hardware keys, passwordless login, MFA, and things like that. So I'm hoping that in the coming years we'll have a better story for all of this.

 

Carlton Gibson  24:55  

I think Django needs a better story as well. We don't have two-factor auth built in yet, and we haven't yet got even the hooks where you would add that. There are a couple of third-party solutions, but they need to be brought into Django. I think it's part of the batteries now, that you need to be able to plug in two-factor. Yeah.
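
Editor's note: the codes that authenticator apps such as Google Authenticator or Authy display are time-based one-time passwords. The sketch below follows the published HOTP and TOTP algorithms (RFC 4226 and RFC 6238) using only the standard library. It is not part of Django and is not Auth0's implementation; the function names are hypothetical, and a real deployment also needs secret provisioning, replay protection and rate limiting.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password for a given counter value."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble picks the offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Time-based variant: the counter is the number of elapsed time steps."""
    at = time.time() if at is None else at
    return hotp(secret, int(at) // step, digits)


def verify_totp(secret: bytes, submitted: str, at: Optional[float] = None,
                step: int = 30, window: int = 1) -> bool:
    """Accept codes from the current step and `window` steps either side."""
    at = time.time() if at is None else at
    counter = int(at) // step
    matched = False
    for drift in range(-window, window + 1):
        if counter + drift < 0:
            continue
        # Check every candidate so timing does not reveal which one matched.
        if hmac.compare_digest(hotp(secret, counter + drift, len(submitted)), submitted):
            matched = True
    return matched
```

The `window` parameter tolerates small clock differences between the server and the user's phone, at the cost of each code staying valid slightly longer.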

 

José Padilla  25:14  

So, I love the Django admin, and I never turn it off. For FilePreviews, I've had to work around things like pagination, because, you know, at our scale, most things don't work in the Django admin. But once you get over that hurdle, it's like, okay, so now how do I hide the admin login? Because I already have this whole other login flow that I want to use, and I want to funnel all traffic through it, whether it's for regular users or for admin superusers. And it's been a little bit clunky on that side.

 

Carlton Gibson  25:57  

But the way I do that, which

 

Will Vincent  25:58  

I don't quite understand that. Can I just ask: are you giving admin access to some users so they can see their previews in there? It sounds like there are three levels?

 

José Padilla  26:07  

No, it's just, no, that's fine.

 

Carlton Gibson  26:10  

Okay, so what I do in that situation is I use nginx to lock it down to localhost, and then just SSH into the box and connect via a tunnel.

 

José Padilla  26:20  

And then if you're using Heroku, that becomes, like, harder. Oh, yeah. Okay.

 

Carlton Gibson  26:25  

But like, the reason I do that is because then it's not exposed on a, you know, URL and

 

José Padilla  26:30  

Yep. And, you know, most logins out there don't even have rate limiting or throttling. So you can do credential stuffing attacks, and they probably wouldn't even notice. Yeah.
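
Editor's note: a minimal illustration of what throttling a login endpoint means, using a sliding window over recent failures. This is an in-memory sketch with hypothetical names (`LoginThrottle`, `allow`, `record_failure`), not the mechanism of any particular Django package or of Auth0. A real deployment would keep the counters in shared storage, such as the Redis server José mentioned earlier, so that every web process sees the same state.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class LoginThrottle:
    """Block a key (username, IP, or both) after too many recent failures."""

    def __init__(self, max_attempts: int = 5, window_seconds: float = 60.0) -> None:
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._failures: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        """Return True if another login attempt may be processed for key."""
        now = time.monotonic() if now is None else now
        attempts = self._failures[key]
        # Forget failures that have aged out of the window.
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        return len(attempts) < self.max_attempts

    def record_failure(self, key: str, now: Optional[float] = None) -> None:
        """Call after a failed login so it counts against the key."""
        now = time.monotonic() if now is None else now
        self._failures[key].append(now)

    def reset(self, key: str) -> None:
        """Call after a successful login to clear the key's history."""
        self._failures.pop(key, None)
```

Keying on username alone lets an attacker lock a victim out, and keying on IP alone does little against a distributed credential stuffing run, so the choice of key is a design decision in its own right.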

 

Will Vincent  26:43  

Yeah. Well, I wonder, just from, obviously, the marketing aspect: the more you learn about the web, the more you go, oh my god, how scary it is. But a beginner just wants it to work and has trouble with that. So I wonder at what point someone goes, wow, I really need to switch over to Auth0. Is it that they've already been burned, or they're at a big company where the login is complicated? It feels like you almost need to get burned, or be in a large enough, complicated organization, before you go, oh wow, maybe I want some help with this.

 

José Padilla  27:25  

Yep. I think right now it's kind of reactive. You've had that experience: you've had a leak, you've had security vulnerabilities being reported to you. And I think one of the other great things that Auth0 does well, and again, this is probably a marketing strategy, is putting out content that is relevant on security topics and best practices, and kind of teaching the general audience about security, hoping that we're going to start seeing people be more proactive about this and not have to wait to actually be in a bad position after an incident, and then have to figure, oh, maybe we should have used an identity provider. And, you know, three years in, with millions of users, that might be a harder thing, but still possible.

 

Will Vincent  28:35  

Right. No, I think they have fantastic guides. I remember their ones on JWTs when I was learning about that. And it's similar to what DigitalOcean has done, I think, to differentiate themselves: they have fantastic content, because spinning stuff up is scary for beginners. Though it does seem to me, on the business aspect, that at the end of the day beginners are kind of whatever; it's all about big clients. This is what I have to keep reminding myself, because, I mean, even Heroku: I have Heroku in my books, and almost every day I get some sort of question around Heroku or SendGrid or, you know, Twilio. Just, like, why can't they be bothered to make this friendly? And I think it's because they don't really care, right? I mean, a couple of newcomers isn't the same as having really good docs, as opposed to tutorials, that an experienced developer at a large company can just dive in and use. That's the thesis I have now. I don't know.

 

José Padilla  29:27  

I can't speak to, you know, Twilio's or DigitalOcean's stuff. But having worked on FilePreviews, after some time I decided to cater to a specific target of users, mostly because I don't have the bandwidth to deal with a lot of incoming beginner-level leads, right? At the end of the day, unfortunately, they'll require a bit more time, and, not hand-holding, but, you know, explaining what a webhook is, and things like that. And in my experience, for all that invested time and resources in converting that specific user to a paying customer, the ROI is way less. So I'm kind of vague in some parts. I'm vague in some of the pricing, I'm vague in some of the examples that we put out. And for me, that has worked. So I can definitely speak to that.

 

Carlton Gibson  30:49  

I think that's the standard story for SaaS businesses, right? It's the few big enterprise clients, the ones where you're off the end of the pricing page and it's "call us for a quote"; that's where all the profit comes from.

 

Will Vincent  31:03  

Yeah, yeah. Well, I mean, I'm slow coming to this realization, but it sort of makes sense, because I've always wondered. So what is the stack at Auth0? Because you mentioned you're not using Python there. What sort of technologies are you using day to day?

 

José Padilla  31:16  

It's mostly, um, Node.js. From the ground up, the main core services have been built in Node.js, and we have tons of people who are very experienced in how to make Node scale. There are some services that are built with Java and Go. There's some Python, I think, for some ops-related things.

 

Will Vincent  31:44  

Yeah, just switch over to async. Just do a total rewrite. No big deal.

 

José Padilla  31:48  

Yeah. So when I joined, I joined the platform domain. We are the backbone of trust for our customers, but we also kind of enable other developers at Auth0 to be as productive as they can be. So I'm working on our untrusted code execution platform. All this code that our customers can write in JavaScript to extend different flows of authentication and authorization: we basically execute their code and inject the results in different parts of the flow. That's it.

 

Carlton Gibson  32:36  

Oh, that's a great problem, though, because it's: hey, give me some code, I'm not going to know what it is, and I'm going to run it on my computer. That's a tricky problem.

 

José Padilla  32:49  

Yeah, doing that at scale seems hard. But I'm really excited. I've been here for a couple of months now and I've learned so much from so many smart people. And, you know, coming from small consulting and small products, the experience I've gained doing things at this scale is amazing. And that's kind of why I wanted to switch after all those years of, you know, doing the entrepreneur thing.

 

Will Vincent  33:23  

Yeah. That's great. And this may be an ignorant question, but so it's Node. For the web stuff, is it Express? I mean, I know the web part is just a teeny piece of it. But beyond Node, what is the stack? I know it's hard to define in such a big organization, but...

 

José Padilla  33:42  

There's some Express and there's some hapi.

 

Carlton Gibson  33:45  

hapi. And hapi is a kind of batteries-included web framework, or API framework, right? It provides things like Joi for validation, and the authentication back ends, so you don't have to go looking for them.

 

José Padilla  34:01  

It's definitely a little more than Express, way less than Django. But I think there's work on standardization around, you know, which one we should use.

 

Will Vincent  34:21  

That's all I know. I know that Brave, the browser, which I actually use, and which Brendan Eich is involved with, they're big on the hapi train.

 

José Padilla  34:32  

Yep. And another thing I could probably say about Auth0 is that they actually sponsor a lot of open source projects around authentication and identity. So before I even joined Auth0, they reached out to me one day and asked me if they could sponsor the PyJWT library, and that was amazing. And I learned that they do that with many other libraries. So, you know, we use these libraries internally, and even if we don't use them, the fact that there are other developers working on this, they care about that. It's really good.

 

Will Vincent  35:17  

You know, I like that they're doing the positive education approach, as opposed to the fear approach, probably because fears can fear something you don't understand. Because certainly, you know, as a parent, I look at all the ads, they're all like, your kid might die, or you're a bad parent, like they're pretty unsettled about it for you know, a car seat or a car. So I appreciate them having that approach. I did want to ask, are coming up on time, I think you just gave a talk at pi Gotham. Right on which will is that out yet? Can we link to that? Yes,

 

José Padilla  35:49  

I can definitely get that into the show notes.

 

Will Vincent  35:51  

Um, so can you talk about that?

 

José Padilla  35:54  

Yeah. So it's called Python, Governments, and Contracts. It's a talk I gave in, I think, September at PyGotham, and actually my partner in crime, Froilán Irizarry, gave a version of that talk at North Bay Python. That recording is also out. So, you know, after Hurricane Maria passed through Puerto Rico, I was not in Puerto Rico at the time. And I spent weeks, if not months, without hearing from my parents due to infrastructure issues: the power was out, cell antennas were out, water was out. And so that leads to some questionable contracts that were awarded to companies to, supposedly, rebuild infrastructure in particular and help after the destruction. FEMA was involved in some of that. And that led to me learning that the Office of the Comptroller in Puerto Rico puts out the contracts that government agencies award to contractors. And sometime after that, I had some questions as a civilian. I am not an investigative journalist or anything like that, but I did have some questions about how money was being spent in particular agencies, and I wanted to see relations between different contractors doing business as other companies. And so I basically built a project together with Froilán where we took the data: we pulled and downloaded all this data from the Office of the Comptroller's website. We downloaded the actual PDF documents for the contracts. And we're extracting text from them, you know, the kind of thing that I learned doing FilePreviews. We're extracting that text and we're indexing that data, so now it can be searched and correlated. So this is kind of like a civic hacking project that I'm very passionate about. I've been working on it on and off for almost a year.
And finally, this year, we decided to talk about it, because we thought it would be an interesting story about how we went about it. This is open source. It's built with Django, Django REST framework, and Postgres, and our front end is Next.js. And yeah, I'm definitely very excited to keep talking about this. We're at a point where we're talking about how we can be more transparent about how this is financially sustainable. It does cost some money to run, not that much. So we want to be transparent about that and see if we need help from others. And I'm excited to see where this goes. I want to empower other people like me to know what questions to ask, to be able to visualize, and at the end of the day, just to hold accountable whoever needs to be held accountable.

 

Carlton Gibson  39:36  

With these kinds of things, you see billions of dollars being budgeted for rebuilding, and yet somehow, on the ground, the stuff wasn't rebuilt. So where's that money gone? All it takes is for a light to be shone on where that money goes, and it will be spent in the right places, whereas when there's darkness, it goes into people's pockets. So that's super.

 

José Padilla  39:57  

I mean, we talked about Hurricane Maria and how that leads up to this. And even two-plus years after that happened, you know, recently the FBI arrested a former employee and the CEO of one of the companies involved in, well, these contracts. So this is still relevant today. And there are ongoing investigations into multiple, you know, some politician has a daughter working for $10,000 a month in some unrelated thing in the government. And now we're starting to see these things popping up, and it's like, yes, we're fighting against corruption.

 

Will Vincent  40:42  

So do you know if any of these federal agencies are using your work to help in their investigations, or do they have their own tools? How does...

 

José Padilla  40:51  

So, I don't think so. At the end of the day, these actual contracts are available from the Office of the Comptroller. I do know that there are some journalists using this tool instead of the Comptroller's tool, just because it sucks less. And you get some insights. Even at the simplest level, you get some insights into spending over time. And just doing correlation: say, I don't know, a contractor was entered into the system with a typo. You can still search through that and see, oh, these two contractors are the same entity, and they've been awarded this amount of money, and you can see the actual PDF there without having to download it. Yeah, so in the end, I wouldn't have wanted to build something. I would have wanted the tool put out by the Office of the Comptroller to be this. But, you know, I'm happy to have done it. I spent a little time figuring out how to gather this data. It's definitely not open data. It's on their site somehow, and I managed to just download it very slowly and make it happen. So if now I can empower, I don't know, some data scientists to build cool visualizations on something else, like spending by particular geographic location by fiscal year, they can now do that without having to spend all this time and effort actually gathering data.

 

Will Vincent  42:47  

Well, that's fantastic. I mean, because I have to make this about me, I spent two years of my life on a project with school data in the United States, where every state has all this reporting data on these various tests that everyone has to take in third grade, eighth grade, high school, all these types of things. And I probably just got a teeny glimpse of what you had, where it's just a nightmare getting this information. I mean, first of all, some states had it online and you could just download the CSV files, like Massachusetts and California. In some states, you know, it's like a PDF. I think it was Kansas who said I had to file a Freedom of Information Act request to get a CD with the data. So I guess it's a combination of nobody having the ability and impetus to actually pull all these things together. And you realize how futile all these government agencies are, where they're just checking their box. It doesn't even occur to anyone to use it for these broader purposes, like, well, what if we wanted to analyze the data? What if we wanted to use it for something? They spent millions and tens of millions collecting it, and then it just...

 

Carlton Gibson  43:53  

It goes more deeply sinister than that, which is that there are institutional biases against making the data openly available, because people in positions of power have a vested interest in keeping it quiet. So it's important that people like José do the civil society stuff to bring it out into the light and call them to account. That's how democracy is maintained. I'm welling up.

 

Will Vincent  44:18  

Yeah, well, and that's one thing I love about Django: I think it's unusual, its roots in journalism and investigative journalism, from the beginning, right? I mean, Jacob worked at 18F. Simon's on a fellowship at Stanford on this. I kind of love that non-pure-tech background of Django, where, as you said, you wanted an easy way to display this data, and the Django part didn't stand in your way. It was the collection and all the rest.

 

José Padilla  44:43  

Yeah, I mean, I keep going back to Django because I hate reinventing the wheel. And so, you know, I have a project template that has some of my preferences baked into it, and I feel so comfortable.

 

Carlton Gibson  44:58  

No, I use that. It's one of my favorites.

 

Will Vincent  45:01  

Wait, you don't use mine, Carlton?

 

Carlton Gibson  45:02  

Oh, yours is fine, but I use José's, okay?

 

José Padilla  45:07  

Sorry.

 

Will Vincent  45:09  

I'm competing over free tools.

 

José Padilla  45:11  

Yeah, so it goes back to not having to reinvent the wheel all the time. I just want to make things and do it fast and, you know, get them out the door as fast as possible. And Django and Python are the things to do that.

Will Vincent

As we wrap up, are there any plugs you want to make? You have a personal site, jpadilla.com, which we'll link to. Anything around side projects or Auth0 you want to mention as we head out the door?

José Padilla

So, I mean, Auth0 is always hiring, so I'll put that link in the show notes. I'll also link to that PyGotham talk. It's something I'm really passionate about. I also set up my sponsorship profile on GitHub, so maybe that. I'm hoping to get back on track with some of the projects that I'm working on, that I know I've, over time, let kind of sit and be unmaintained. And I feel really bad about that, because, you know, people actually use them. So I'm hoping to start spending some more of my free time getting them back up to date, especially dropping Python 2 support.

 

Carlton Gibson  46:28  

Right, but let me just say that you mustn't feel guilty, because the time that you give to open source is phenomenal, and it has been phenomenal over a massively sustained period of time. You've got a life, you've got family, you've got work, you've got other commitments. The guilt is not necessary. Your work stands on its own. Thank you.

 

Will Vincent  46:52  

Right. Well, again, I'm so pleased we could have you on.

 

José Padilla  46:55  

My pleasure.

 

Will Vincent  46:57  

It was a great thrill for me to meet you in person, you know, a year ago. So yeah, we'll put links to everything. Thanks again for having me, guys. Thanks for coming on.

 

Unknown Speaker  47:06  

So Joe