Omnindex Journal 27/jan - Too big to encrypt!
This is a journal I write while I build Omnindex, a search app. I post it here without much editing, just writing what comes to mind.
Just when I thought I had solved the encryption problem, I hit a wall: a size limit. Apparently I cannot encrypt large texts; it fails:
I then tried to just increase my key size from 2048 to 4096. It worked for larger text, but still not enough for what I want (10k+ characters). Then I tried 20480, but the code just hangs forever; apparently it's suuuper slow.
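A rough sanity check on why bigger keys don't really help, as a small Python sketch (not the Elixir code from Omnindex). RSA can only encrypt a message shorter than the key's modulus, minus the padding overhead. Which padding ExCrypto uses is not stated above, so the sketch shows both common schemes; the function name is made up for illustration.

```python
# Digest sizes in bytes for the hash used inside OAEP padding.
OAEP_HASH_LEN = {"sha1": 20, "sha256": 32}


def max_rsa_plaintext_bytes(key_bits, padding="oaep", hash_name="sha1"):
    """Largest message (in bytes) a single RSA encryption can take."""
    k = key_bits // 8  # modulus size in bytes
    if padding == "pkcs1v15":
        return k - 11  # PKCS#1 v1.5 reserves at least 11 bytes
    if padding == "oaep":
        return k - 2 * OAEP_HASH_LEN[hash_name] - 2
    raise ValueError(f"unknown padding: {padding}")


for bits in (2048, 4096, 20480):
    print(bits, max_rsa_plaintext_bytes(bits, "pkcs1v15"),
          max_rsa_plaintext_bytes(bits, "oaep", "sha1"))
```

Even a 20480-bit key tops out at about 2.5 KB per encryption, so 10k+ characters would not fit no matter how long the key generation is left running.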
Searching on the interwebs I found this: https://crypto.stackexchange.com/questions/41745/encryption-of-large-files-rsa
Interesting! Hybrid cryptosystem? Is it common? Do people use that?
Apparently yes, they do! For some reason I had never stumbled on that before. So how does GPG do it? I have encrypted large files with my public key using GPG in the past.
PGP can be used to send messages confidentially. For this, PGP uses a hybrid cryptosystem by combining symmetric-key encryption and public-key encryption. The message is encrypted using a symmetric encryption algorithm, which requires a symmetric key generated by the sender. The symmetric key is used only once and is also called a session key. The message and its session key are sent to the receiver. The session key must be sent to the receiver so they know how to decrypt the message, but to protect it during transmission it is encrypted with the receiver's public key. Only the private key belonging to the receiver can decrypt the session key, and use it to symmetrically decrypt the message.
Ooohh so they do use the hybrid approach, interesting!
But how will I save this symmetric key? In a separate column? Together with the symmetrically encrypted content? I think together, I don't want to create another column. But then how do I specify which is which? And should I also specify which algorithm I'm using, to be able to change it later? A JSON maybe? A separator? Like a pipe? |
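One way to answer the "which is which" question without JSON or a separator is a tiny binary envelope: one byte for the algorithm, then each piece prefixed by its length. A pipe would be fragile here because raw ciphertext bytes can contain a "|" themselves. This is a minimal Python sketch, not the format Omnindex ended up with; the algorithm id and function names are hypothetical.

```python
import struct

# Hypothetical one-byte algorithm identifier, for illustration only.
ALG_RSA_AES = 1


def pack_envelope(alg_id, parts):
    """Serialize an algorithm id followed by length-prefixed byte strings."""
    out = [struct.pack(">B", alg_id)]
    for part in parts:
        out.append(struct.pack(">I", len(part)))  # 4-byte big-endian length
        out.append(part)
    return b"".join(out)


def unpack_envelope(blob):
    """Inverse of pack_envelope: returns (alg_id, list of byte strings)."""
    alg_id = blob[0]
    parts, offset = [], 1
    while offset < len(blob):
        (length,) = struct.unpack_from(">I", blob, offset)
        offset += 4
        if offset + length > len(blob):
            raise ValueError("truncated envelope")
        parts.append(blob[offset:offset + length])
        offset += length
    return alg_id, parts


# Placeholder bytes standing in for a real encrypted key and ciphertext.
encrypted_key = b"\x01" * 256
ciphertext = b"not really encrypted"
blob = pack_envelope(ALG_RSA_AES, [encrypted_key, ciphertext])
alg, (key_back, content_back) = unpack_envelope(blob)
```

Because the parts are a list, an extra field can be appended later without changing the layout, and the leading byte leaves room to switch algorithms.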
Wait a minute… JWT just came to mind. I think that's exactly what they do, isn't it? A JSON with all this information? Yeaaaah I think it is, their webpage summarizes it beautifully right on the home page!
I found the Joken lib for Elixir, which handles JWT, let's try it out!
Meh, the more I read about it and try stuff out, the less it seems that JWT is what I want. I don't see much support for asymmetric keys as a first-class concern, and I don't think I want to waste space on the header overhead, or that I need signatures to verify. I think I will roll my own™️
Starting out, unfortunately ExCrypto is not working for symmetric encryption, not even for the main example listed in their docs, for some reason.
I checked the issues on GitHub, and apparently it has indeed been broken since OTP 24, but it works if we point to the master branch of the repo instead of downloading from hex.pm. Let's see, in mix.exs:
(I learned the hard way to always specify the latest commit "ref" if you are downloading from git, otherwise next week it might already be a completely different beast.)
Okay it works now, awesome!
Okay so this is it!
It works! I'm merely concatenating the encrypted key with the encrypted content, awesome!
Can I decrypt it now? Nope, did not work, I need the IV as well. Okay one more try:
Yes! I can now decrypt it no problem!
Cool… but I'm not sure I like having all this work: all this decoding, two algorithms and a larger size. What benefit am I getting from this asymmetry? And what performance cost am I paying for it?
Well, the benefit for me is… not clear. Both encrypting and decrypting happen on the server side, so I have access to the public key in the same place I have access to the private one. I had the idea of sending the public key out to the extension installed in the user's browser, but I'm not sure how much extra safety that would add either, since the extension already communicates with Omnindex over HTTPS.
Let's see… I'm thinking, I think we only need a public/private scheme like that when we don't trust the medium the content is being transmitted over, right? Or when we are going to post the content publicly… I keep thinking of where this would be needed in Omnindex but nothing comes to mind… okay, I think I will get rid of the public/private key and go with a symmetric key instead. But first, let me benchmark both, just out of curiosity:
2.89ms on average for the hybrid version vs 0.0853ms for symmetric only… well, 2.89ms is not bad, but if I want to render 20 search results it's already 20ms extra I'm adding to every search. And that's not counting the other columns I want to encrypt, like url and title; there it goes to 60ms, and I'm already over the 600ms mark on average. Moreover, it will also take more space on disk for the key itself:
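For a back-of-the-envelope feel of that disk cost (an estimate, not a measurement from the Omnindex database): an RSA-encrypted session key is always as long as the RSA modulus, so every encrypted value carries that many extra bytes compared to symmetric-only storage. The Python sketch below assumes raw bytes with no Base64 encoding; the row and column counts are hypothetical.

```python
def wrapped_key_bytes(rsa_key_bits):
    """Size of an RSA-encrypted session key: same as the modulus, in bytes."""
    return rsa_key_bits // 8


def extra_storage_bytes(rows, encrypted_columns, rsa_key_bits=2048):
    """Extra bytes the hybrid scheme stores versus symmetric-only."""
    return rows * encrypted_columns * wrapped_key_bytes(rsa_key_bits)


# Hypothetical table: 1,000,000 rows with 3 encrypted columns each.
extra = extra_storage_bytes(rows=1_000_000, encrypted_columns=3)
print(f"{extra / 1_000_000:.0f} MB of wrapped keys")
```

With a 2048-bit key that is 256 bytes per encrypted value, which can easily be larger than a short title or url on its own.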
Okay, I think I will go back to symmetric keys only. It's not like I can't change my mind later; I can, I would just need to decrypt it all and migrate each user.
Cool, maybe tomorrow I will do that.