RT @mattcantstop
New keyboard. I’ll slow down for a while, but should speed up considerably. I can type at ~80 words per minute now. Hopefully I can get into the many hundreds with some practice. @CharaChorder

RT @doctorow
When you buy a record or CD, you own it, thanks to copyright's "first sale" principle. I have criticisms of copyright law, but at least it's created by a democratically accountable legislature. When you buy a digital download, your use is governed by private ToS, not law. 1/

RT @phil_eaton
I've been hacking on DataStation for a year now so time for a retro and what's next!

4k+ stars in 15 repos. 7 posts on the front of HN. Dozens of failed investor chats (including YC).

tldr; DataStation has a bright future & I'm on the job market. :)


I just started listening to @PJVogt's new podcast about web3, and he so eloquently articulates what I find intriguing about web3 despite my many misgivings.

Pleased to open Hacker News today and see @monicalent on the front page.

My blog post about my homelab NAS server is #2 on Hacker News today!

I submitted the post on the day I published it, and it didn't even make it to the front page. With some posts, it pays to try again on the weekend when there's less competition.

The Morning Show is exactly how I imagine it would be if Aaron Sorkin wrote a series exploring life behind the scenes of a struggling TV show.

RT @allison_seboldt
The results are in for April:

My programmatic SEO experiment is killing it 6 months in! Over 2k visitors in April alone. That's +226% from last month!

All the nitty gritty details in my April retrospective: allisonseboldt.com/april-2022/

Update! @wmertens and @dholth showed me a simpler way to achieve the same performance without storing the file size redundantly. Creating an index achieves the same thing.


The last tricky part was writing a SQL migration to populate the sizes of files that were uploaded and stored in the DB before this change. I've never written an update that derives from other data in the DB before, but it wasn't too hard.
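
That kind of backfill can be sketched with a correlated subquery. The schema here (`entries.file_size`, `entries_data`) is a hypothetical stand-in for the real one:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entries (id TEXT PRIMARY KEY, file_size INTEGER);
    CREATE TABLE entries_data (id TEXT, chunk BLOB);
    INSERT INTO entries (id) VALUES ('file-a'), ('file-b');
    INSERT INTO entries_data VALUES
        ('file-a', zeroblob(1000)),
        ('file-a', zeroblob(500)),
        ('file-b', zeroblob(250));
""")

# Backfill sizes for pre-existing rows by deriving them from the
# chunk data already stored in the database.
conn.execute("""
    UPDATE entries
    SET file_size = (
        SELECT sum(length(chunk))
        FROM entries_data
        WHERE entries_data.id = entries.id
    )
""")

rows = conn.execute(
    "SELECT id, file_size FROM entries ORDER BY id"
).fetchall()
print(rows)  # [('file-a', 1500), ('file-b', 250)]
```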


And we have a winner! For the same 1.1 GB file, latency dropped from 9s to 9ms, a 1,000x speedup.


Next, I tried storing the file size along with the file metadata.
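
A sketch of that approach, with hypothetical table names: compute the total at upload time, write it into the metadata row, and read the size back as a single row lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id TEXT PRIMARY KEY, file_size INTEGER)")
conn.execute("CREATE TABLE entries_data (id TEXT, chunk BLOB)")

def save_file(conn, file_id, chunks):
    """Store chunks and record the total size in the metadata row."""
    total = 0
    for chunk in chunks:
        conn.execute("INSERT INTO entries_data VALUES (?, ?)", (file_id, chunk))
        total += len(chunk)
    conn.execute("INSERT INTO entries VALUES (?, ?)", (file_id, total))

save_file(conn, "file-a", [b"\xff" * 700, b"\xff" * 300])

# Reading the size is now a single indexed lookup: no blob access at all.
(size,) = conn.execute(
    "SELECT file_size FROM entries WHERE id = ?", ("file-a",)
).fetchone()
print(size)  # 1000
```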


This surprised me, and I still don't have a good explanation for it. It's 3,708 rows, so it doesn't seem like it should take SQLite *that* long to calculate the SUM of 3,708 values.

I'm guessing the large blob in each row slows down the query even though we don't read it.


Storing the chunk size worked, and it brought the latency down from 9s to 839ms, a 10x performance boost.

But 839ms to calculate the size of a single file was still pretty slow...


But based on the 9s latency, calculating sizes on the fly wasn't going to work.

My first thought was to store the chunk size alongside the blob in the table containing file data. That had the advantage of keeping size close to the data it described.
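
That first idea can be sketched like this (hypothetical schema again): a `chunk_size` column next to each blob, so the size query sums small integers instead of calling `length()` on each blob.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: per-chunk size stored next to the blob it describes.
conn.execute(
    "CREATE TABLE entries_data (id TEXT, chunk_size INTEGER, chunk BLOB)"
)
for _ in range(3):
    chunk = b"\x00" * 512
    conn.execute(
        "INSERT INTO entries_data VALUES (?, ?, ?)",
        ("file-a", len(chunk), chunk),
    )

# Sum the stored sizes instead of calling length() on each blob.
(size,) = conn.execute(
    "SELECT sum(chunk_size) FROM entries_data WHERE id = ?", ("file-a",)
).fetchone()
print(size)  # 1536
```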


I checked the SQLite docs. They didn't explicitly say that LENGTH() reads the full blob data, but they noted that for strings, SQLite calculates length on the fly by scanning for the first null byte. I'm assuming that for BLOB types, SQLite similarly iterates through the full contents of the column.
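
A quick illustration of LENGTH()'s documented behavior: for TEXT it counts characters before the first null byte, while for a BLOB it returns the full byte count. Here `X'414200434445'` is the six bytes `AB\0CDE`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# For TEXT, length() counts characters before the first null byte.
(n_text,) = conn.execute(
    "SELECT length(CAST(X'414200434445' AS TEXT))"
).fetchone()
print(n_text)  # 2: 'AB' stops at the embedded null

# For BLOB, length() reports the full byte count.
(n_blob,) = conn.execute("SELECT length(X'414200434445')").fetchone()
print(n_blob)  # 6
```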


It worked! Page load time dropped to 8ms.

I was confident that the SUM(LENGTH(chunk)) line was causing the latency.


I tried removing the LENGTH() function and just hardcoding the size calculation to 1.
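
The diagnostic can be sketched as two queries with the same shape (schema names are hypothetical): one that forces SQLite to materialize every blob, and one where each row contributes a constant so the blob contents are never needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries_data (id TEXT, chunk BLOB)")
conn.executemany(
    "INSERT INTO entries_data VALUES (?, ?)",
    [("file-a", b"\x00" * 4096) for _ in range(10)],
)

# Suspected culprit: length() touches every blob.
(real,) = conn.execute(
    "SELECT sum(length(chunk)) FROM entries_data WHERE id = 'file-a'"
).fetchone()

# Diagnostic: identical query shape, but a hardcoded 1 per row,
# so the blob data is never read.
(fake,) = conn.execute(
    "SELECT sum(1) FROM entries_data WHERE id = 'file-a'"
).fetchone()

print(real, fake)  # 40960 10
```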


I was able to reproduce the issue locally by uploading a 1.1 GB file. The page load time jumped to 9.3 seconds.

Michael Lynch's Mastodon