In 9542c6da, I added video support in posts on this website!
My use-case is small looping .mp4 files without controls (without sound, so they can autoplay).
<video
  style={{ display: 'block', margin: '0 auto', maxWidth: '100%' }}
  key={props.src}
  controls={false}
  autoPlay
  playsInline
  loop
  muted
  width={videoMetadata[props.src].width}
  height={videoMetadata[props.src].height}
>
  <source src={`/posts/${id}/${props.src}`} type="video/mp4"></source>
</video>
I get the video metadata from a library called mp4box during my build step.
I added a few features to my nodots programming language (it's just a little tree-walk interpreter, nothing fancy).
These PRs were fun to implement. They're a little hacky.
I was mentioned in Val Town's Restricted Library Mode blog post after reporting a series of critical security vulnerabilities in their JavaScript runtime.
If you are familiar with the challenges of sandboxing user code, you might realize that we set ourselves a herculean task: keeping user secrets sandboxed while also allowing them to pass arbitrarily rich computational objects between each other. After playing and losing this cat-and-mouse game with the fantastic exploit-finder Andrew Healey one too many times, we decided to admit defeat and race to a more securable semantics ASAP. Specifically, we needed semantics that allowed for process isolation and serialization between all user code.
I'm really happy with how they responded to the issues I raised, and I continue to think it's a neat platform!
I found an optimization that speeds up the large benchmark build of Ter by 20-30 seconds!
[Ter is a] tiny wiki-style site builder with Zettelkasten flavor
My PR was just merged. It's a tiny change I found after profiling Deno using --inspect-wait and chrome://inspect.
I got a shoutout in the Val Town newsletter!
The award for best Val Town user this month clearly goes to Andrew Healey (@healeycodes) for his tireless work trying to outwit our and Deno's sandboxing. He's up to maybe 5 exploits so far. So hats off to Andrew for keeping us all secure
Over the past few weeks, I've been poking at some JavaScript sandboxes (including Val Town's Deno runtime before it was released). My experience has confirmed something I already expected, which is that you need to sandbox "from the outside" or there will be obvious exploits.
Some examples of sandboxing from the inside:
- vm2 (see some of the past breakouts)
- Running user code in an isolated WebWorker but relying on fetch being hidden, e.g. so you can limit the amount/type of network access. You can do fetch = fetchWrapper but people will still find a way to access the original.
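To make the "from the inside" problem concrete, here's a toy sketch. It is not the WebWorker setup described above: it models the same idea by shadowing `fetch` with a wrapper while running user code in the same realm as the host. Every name in it (`privilegedFetch`, `fetchWrapper`, `runUserCode`, the example URLs) is hypothetical and made up for illustration.

```typescript
// Stand-in for a privileged API such as the real `fetch` (hypothetical name).
(globalThis as any).privilegedFetch = (url: string): string =>
  `real request to ${url}`;

// The restricted wrapper the host wants user code to be limited to.
function fetchWrapper(url: string): string {
  if (!url.startsWith("https://allowed.example/")) {
    throw new Error(`blocked: ${url}`);
  }
  return `wrapped request to ${url}`;
}

// "Inside" sandboxing: user code runs in the host's realm,
// with the name `fetch` shadowed by the wrapper.
function runUserCode(code: string): unknown {
  const userFunction = new Function("fetch", code);
  return userFunction(fetchWrapper);
}

// Well-behaved code goes through the wrapper...
const polite = runUserCode('return fetch("https://allowed.example/data")');

// ...but hostile code ignores the shadowed name and walks to the global object.
const hostile = runUserCode(
  'return globalThis.privilegedFetch("https://evil.example/")'
);

console.log(polite); // wrapped request to https://allowed.example/data
console.log(hostile); // real request to https://evil.example/
```

The wrapper only controls one name. Anything else reachable from the shared global scope is still available to the code you're trying to contain.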
Some examples of sandboxing from the outside:
- V8 isolates
- Running code in an AWS Lambda (and controlling access to the Lambda, and not caring if hostile code accesses the runtime environment you've created)
- Or using the software that powers AWS Lambda (Firecracker).
Here's a performance-focused refactoring pattern I've applied a few times to great effect. In my head, I call it a "promise pipeline" but there's nothing super special about it; you could also just call it "writing fast concurrent code".
Let's say I've been tasked with speeding up "image jobs" while working at some company that mails framed photographs to customers every month. I'm working with code that already exists and is running in production so I don't have time to rewrite it from scratch or ship new infrastructure. The goal is to just make the code faster.
Since I profiled before optimizing, I've narrowed the issue down to a slow function that glues together calls to external services. This function receives a list of image links and needs to:
- Download them
- Upscale them using AI
- Make a third-party API call to mail the customer each image
Here's what we're starting with:
async function handleImages(user: User, imageURLs: string[]) {
  // Download images
  const images: Blob[] = [];
  for (const imageURL of imageURLs) {
    images.push(await (await fetch(imageURL)).blob());
  }

  // Upscale them
  const upscaledImages: Blob[] = [];
  for (const image of images) {
    upscaledImages.push(await upscale(image));
  }

  // Mail them to the user
  for (const upscaleImage of upscaledImages) {
    await user.mail(upscaleImage);
  }
}
Maybe you've spotted the first problem. This function does one thing at a time! It downloads images one-by-one, and then upscales them one-by-one, and then finally mails them one-by-one. For 50 images, it takes ~8.5 seconds to complete.
Let's add some concurrency, with limits so that we can respect our contracts with external services. For same-process concurrency limits in JavaScript, I like the semaphore pattern (e.g. deno-semaphore, await-semaphore).
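The code in this note calls `acquire()` and `release()` on a `Semaphore`. As a rough idea of what such a class might look like, here's a minimal sketch. It's written for illustration only and is not the implementation (or necessarily the API) of deno-semaphore or await-semaphore; the `peakConcurrency` demo function is also hypothetical.

```typescript
// A minimal counting semaphore: at most `permits` holders at any time.
class Semaphore {
  private available: number;
  private waiters: Array<() => void> = [];

  constructor(permits: number) {
    this.available = permits;
  }

  // Resolves once a permit is free.
  async acquire(): Promise<void> {
    if (this.available > 0) {
      this.available--;
      return;
    }
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  // Hands the permit straight to the next waiter, or returns it to the pool.
  release(): void {
    const next = this.waiters.shift();
    if (next) {
      next();
    } else {
      this.available++;
    }
  }
}

// Demo: run six short tasks with a limit of two and record the peak concurrency.
async function peakConcurrency(): Promise<number> {
  const semaphore = new Semaphore(2);
  let running = 0;
  let peak = 0;
  const tasks = [1, 2, 3, 4, 5, 6].map(async () => {
    await semaphore.acquire();
    try {
      running++;
      peak = Math.max(peak, running);
      await new Promise<void>((resolve) => setTimeout(resolve, 5));
    } finally {
      running--;
      semaphore.release();
    }
  });
  await Promise.all(tasks);
  return peak;
}
```

With a class like this in scope, the limits can be applied to each step: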
async function handleImages2(user: User, imageURLs: string[]) {
  // Through trial and error, we found that other services
  // can handle up to this amount of load
  const downloadSemaphore = new Semaphore(5);
  const upscaleSemaphore = new Semaphore(5);
  const mailSemaphore = new Semaphore(5);

  // Download images (5 at a time)
  // Awaiting Promise.all gives the next step Blobs rather than Promises
  const images = await Promise.all(
    imageURLs.map(async (imageURL) => {
      await downloadSemaphore.acquire();
      // Await the body too, so the permit covers the whole download
      const image = await (await fetch(imageURL)).blob();
      downloadSemaphore.release();
      return image;
    })
  );

  // Upscale them (5 at a time)
  const upscaledImages = await Promise.all(
    images.map(async (image) => {
      await upscaleSemaphore.acquire();
      const blob = await upscale(image);
      upscaleSemaphore.release();
      return blob;
    })
  );

  // Mail them to the user (5 at a time)
  const mailTasks = upscaledImages.map(async (upscaledImage) => {
    await mailSemaphore.acquire();
    await user.mail(upscaledImage);
    mailSemaphore.release();
  });
  await Promise.all(mailTasks);
}
We had to make the function longer. But it's faster. It takes ~1.2s, an improvement of 7x.
Let's say that this isn't enough. Speeding up image jobs is priority zero on the roadmap.
We can take our performance refactoring one step further if we think about how data flows through our function. As it's currently written, there are no guarantees around when images should be mailed to users, only that they should all be mailed by the time the function returns.
Here's the key fact: an image doesn't depend on the progress of another image.
Instead of designing our function with shared steps that block the progress of all images:
- download images → upscale images → mail images
We can instead think about the flow of each individual image:
- download image A → upscale image A → mail image A
- download image B → upscale image B → mail image B
- download image C → upscale image C → mail image C
If we run these flows concurrently, while still respecting contracts with external services, we will achieve optimal concurrency.
We end up needing less code too.
async function handleImages3(user: User, imageURLs: string[]) {
  const downloadSemaphore = new Semaphore(5);
  const upscaleSemaphore = new Semaphore(5);
  const mailSemaphore = new Semaphore(5);

  // Images don't depend on each other!
  // They can flow between steps independently
  const pipeline = imageURLs.map(async (imageURL) => {
    await downloadSemaphore.acquire();
    // Await the body so `image` is a Blob and the permit covers the whole download
    const image = await (await fetch(imageURL)).blob();
    downloadSemaphore.release();

    await upscaleSemaphore.acquire();
    const upscaledImage = await upscale(image);
    upscaleSemaphore.release();

    await mailSemaphore.acquire();
    await user.mail(upscaledImage);
    mailSemaphore.release();
  });
  await Promise.all(pipeline);
}
Instead of taking roughly as long as the sum of the slowest download, the slowest upscale, and the slowest mail call, the function now takes roughly as long as the image with the slowest combination of calls.
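Here's that comparison as a small worked example. It's a simplified model that ignores the semaphore limits (it assumes every image can be in flight at once), and the `Durations` type, function names, and numbers are all made up for illustration.

```typescript
// Time (in ms) each call takes for a single image.
type Durations = { download: number; upscale: number; mail: number };

// Step-by-step: every step waits for its slowest image before the next step starts.
function stagedTime(items: Durations[]): number {
  const slowest = (pick: (d: Durations) => number): number =>
    Math.max(...items.map(pick));
  return (
    slowest((d) => d.download) +
    slowest((d) => d.upscale) +
    slowest((d) => d.mail)
  );
}

// Per-image pipeline: we only wait for the image whose own calls add up to the most.
function pipelinedTime(items: Durations[]): number {
  return Math.max(...items.map((d) => d.download + d.upscale + d.mail));
}

const example: Durations[] = [
  { download: 10, upscale: 90, mail: 20 }, // image A: 120ms in total
  { download: 80, upscale: 10, mail: 30 }, // image B: 120ms in total
];

console.log(stagedTime(example)); // 200 (80 + 90 + 30)
console.log(pipelinedTime(example)); // 120
```

The step-by-step version pays for the slowest call in every step, even when those slow calls belong to different images.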
This final version of the function takes ~700ms, a speed-up of 1.7x.
If you're wondering how I arrived at these numbers, I wrote a test script to measure each version with 50 image URLs, and mocked external calls so they would take Math.random() * 100 milliseconds.
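A test script along those lines might look something like the sketch below. This is not the original script: `sleep`, `mockExternalCall`, `measure`, `oneAtATime`, and the example.com URLs are hypothetical stand-ins, and the mocks pass strings around instead of real image data.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// A mocked external call: resolves with `result` after a random delay
// of up to `maxDelayMs` milliseconds.
async function mockExternalCall<T>(result: T, maxDelayMs = 100): Promise<T> {
  await sleep(Math.random() * maxDelayMs);
  return result;
}

// Times a single run of `run` and logs the elapsed wall-clock time.
async function measure(
  label: string,
  run: () => Promise<unknown>
): Promise<number> {
  const start = Date.now();
  await run();
  const elapsedMs = Date.now() - start;
  console.log(`${label}: ${elapsedMs}ms`);
  return elapsedMs;
}

// 50 placeholder image URLs.
const imageURLs = Array.from(
  { length: 50 },
  (_, i) => `https://example.com/${i}.png`
);

// Mirrors the shape of the first version: one mocked call at a time.
async function oneAtATime(urls: string[]): Promise<void> {
  for (const url of urls) {
    const image = await mockExternalCall(`image:${url}`);
    const upscaled = await mockExternalCall(`upscaled:${image}`);
    await mockExternalCall(`mailed:${upscaled}`);
  }
}
```

Calling `measure("version one", () => oneAtATime(imageURLs))` times a full sequential run. Because the delays are random, the numbers vary from run to run.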
In the real world, calls to other services don't take a uniformly distributed amount of time. Calls have spikes and high P99 latencies, so the impact of going from version two to three is actually much higher!
(In theory, there are rare circumstances where version two and version three can take the same amount of time for some inputs, but in practice version three will almost always be faster.)
The amount of joy I get from making tiny changes to this website is unreasonably high.
I am reflecting on this after adding some border radius to all code blocks in b1dd2af8.
I also made it easy to start writing a note (like the one you're reading). All I have to do is run node createNode.js in the root directory of this repository.
// Creates a note in ./notes/ with the schema `${TIMESTAMP}.md`
const fs = require('fs')
const path = require('path')

const notesDir = './notes'
if (!fs.existsSync(notesDir)) {
  fs.mkdirSync(notesDir);
}

const filepath = path.join(notesDir, `${Date.now()}.md`)
fs.writeFileSync(filepath, '')

// This can be cmd+clicked from VS Code's terminal to open/edit it!
console.log(filepath)
In comparison, my flow for adding a new blog post is fraught with friction.
- Write the blog in Notion
- Get feedback from friends who add Notion comments
- Export from Notion to markdown
- Create an empty markdown file in ./posts/
- Copy frontmatter from an existing post (and edit it)
- Paste in the exported post from Notion (minus the title)
- Fix apostrophes and quote marks to be the ASCII variants (' and ")
- Fix any weird markdown differences (sometimes I need to do stuff like add or delete empty lines)
- Manually move the images from the exported dump to: public/posts/$postId/$imageName.png
- Update the image markdown links in the markdown file I just created
Most of the steps after Export to markdown could be automated by a script that consumes an exported Notion dump. Seeing that I've used this tedious blog creation flow for more than a year, it's probably worth writing a quick script to automate it!
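As a starting point, the text-only steps could be sketched as a few pure functions. This is a guess at what such a script might contain, not something that exists in the repository: the function names are hypothetical, and it assumes the export starts with a `# Title` line and that images should end up linked as /posts/$postId/$imageName.

```typescript
// Drop the leading "# Title" line (and the blank lines after it).
function stripTitle(markdown: string): string {
  return markdown.replace(/^# .*\n+/, "");
}

// Replace curly apostrophes and quote marks with their ASCII variants.
function normalizeQuotes(markdown: string): string {
  return markdown
    .replace(/[\u2018\u2019]/g, "'")
    .replace(/[\u201C\u201D]/g, '"');
}

// Point relative image links at /posts/<postId>/<file name>.
// Absolute http(s) links are left untouched.
function rewriteImageLinks(markdown: string, postId: string): string {
  return markdown.replace(
    /!\[([^\]]*)\]\(([^)]+)\)/g,
    (match: string, alt: string, target: string) => {
      if (/^https?:\/\//.test(target)) return match;
      const fileName = decodeURIComponent(target).split("/").pop();
      return `![${alt}](/posts/${postId}/${fileName})`;
    }
  );
}

// Chain the steps over the exported markdown.
function convertNotionExport(markdown: string, postId: string): string {
  return rewriteImageLinks(normalizeQuotes(stripTitle(markdown)), postId);
}
```

Creating the file in ./posts/, adding frontmatter, and moving the images would still need some file system code on top of this.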
I am sunsetting tags.
All my tag pages (e.g. /tags/go) were getting very low traffic compared to all other pages. Plus, due to the type of content I write, I never really bought into the use case. If I were writing programming tutorials then a tag system might make more sense.
More benefits of removing tags: freeing up a tiny bit of UI space, deleting code, not having to decide what kind of tag a post belongs in.
To be a good netizen, I won't kill any external links. /tags/* will redirect to /articles.
I'm also adding a "star system", where my favorite posts have an asterisk next to them. The visual design of this is identical to Linus Lee's list of posts because I've always been a fan of the design! Check it out on /articles.
Playing chess via voice commands.
I'm often pacing around the room while holding my new baby: putting him to sleep, soothing him, burping him. I've also been playing more chess lately.
I hacked together a script to play lichess games via voice commands (so I could put my laptop on top of my wardrobe while holding my almost-asleep baby). The results were okay: I got it to play some moves for me, but it wasn't very reliable (~40% chance I'd need to repeat myself). So I put it aside for now.
Getting a voice recognizer to parse standard notation ("e2e4") seems harder than getting it to parse normal speech (like nouns, verbs, etc). I used Whisper via speech_recognition, and pyautogui to handle clicking.
As you can see below, I tried some manual parsing to help with the accuracy. It would work better if I used more distinct words for the squares rather than standard notation but I wanted to use standard notation.
import pyautogui
import speech_recognition

r = speech_recognition.Recognizer()

# lichess.org full screen, 14in MacBook Pro
top_left = [472, 217]
bot_right = [991, 738]

# calculate rows
x_space = (bot_right[0] - top_left[0]) / 7
x_rows = [top_left[0] + (x_space * i) for i in range(0, 8)]
y_space = (top_left[1] - bot_right[1]) / 7
y_rows = [top_left[1] - (y_space * i) for i in range(0, 8)]


# e.g "H1" -> (472.0, 217.0), the top-left square with this mapping
def position_to_xy(pos: str):
    letters = list("HGFEDCBA")
    numbers = list("12345678")
    letter, number = pos[0], pos[1]
    xy = (
        x_rows[letters.index(letter)],
        y_rows[numbers.index(number)],
    )
    return xy


while True:
    with speech_recognition.Microphone() as source:
        print("waiting..")
        voice = r.listen(source)

    command = r.recognize_whisper(voice)
    print(f"got command: {command}")

    # examples: "E2, E4.", "E2 E4"
    trimmed = command.upper().strip().replace(".", " ").replace(",", " ")
    positions = trimmed.split()
    if len(positions) != 2 or len(positions[0]) != 2 or len(positions[1]) != 2:
        print(f"warn: didn't get a valid position, got: {positions}")
        continue

    from_square = position_to_xy(positions[0])
    to_square = position_to_xy(positions[1])
    print(f"moving {positions[0]} -> {positions[1]}")

    pyautogui.moveTo(from_square[0], from_square[1])
    pyautogui.click()
    pyautogui.moveTo(to_square[0], to_square[1])
    pyautogui.click()
Hello, World!
The design of this small notes system is ~~stolen from~~ inspired by https://muan.co/notes.