TIL about cuckoo filters, a data structure similar to Bloom filters. Both support fast set membership testing, but cuckoo filters expand on the concept by providing limited counting, deletion, and a bounded false positive probability while maintaining a similar space complexity.

Cuckoo filters do not play nice with concurrency.
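
To make the idea concrete, here is a minimal single-threaded sketch in Ruby. The class name, the 8-bit fingerprints, the 4-slot buckets, and the use of SHA-256 are all assumptions made for illustration, not a reference implementation. Note how insert may relocate fingerprints that are already stored, which is one reason concurrent access is tricky.

```ruby
require 'digest'

# Illustrative cuckoo filter: each item is stored as a small fingerprint in
# one of two candidate buckets.
class CuckooFilter
  BUCKET_SIZE = 4   # fingerprints per bucket (assumed value)
  MAX_KICKS   = 500 # relocation attempts before giving up (assumed value)

  def initialize(num_buckets = 1024)
    # A power-of-two bucket count keeps the XOR trick in alt_index reversible.
    unless num_buckets.positive? && (num_buckets & (num_buckets - 1)).zero?
      raise ArgumentError, 'num_buckets must be a power of two'
    end

    @num_buckets = num_buckets
    @buckets = Array.new(num_buckets) { [] }
  end

  def insert(item)
    fp = fingerprint(item)
    i1 = index_for(item)
    i2 = alt_index(i1, fp)
    return true if place(i1, fp) || place(i2, fp)

    # Both buckets are full: evict a random resident and move it to its
    # alternate bucket, repeating up to MAX_KICKS times.
    i = [i1, i2].sample
    MAX_KICKS.times do
      slot = rand(BUCKET_SIZE)
      fp, @buckets[i][slot] = @buckets[i][slot], fp
      i = alt_index(i, fp)
      return true if place(i, fp)
    end
    false # filter considered full; this sketch drops the last evicted entry
  end

  def include?(item)
    fp = fingerprint(item)
    i1 = index_for(item)
    @buckets[i1].include?(fp) || @buckets[alt_index(i1, fp)].include?(fp)
  end

  def delete(item)
    fp = fingerprint(item)
    i1 = index_for(item)
    [i1, alt_index(i1, fp)].each do |i|
      pos = @buckets[i].index(fp)
      next unless pos

      @buckets[i].delete_at(pos) # removes a single copy only
      return true
    end
    false
  end

  private

  def hash_int(data)
    Digest::SHA256.digest(data)[0, 4].unpack1('N')
  end

  def fingerprint(item)
    Digest::SHA256.digest(item.to_s).getbyte(4)
  end

  def index_for(item)
    hash_int(item.to_s) % @num_buckets
  end

  # XOR with the fingerprint's hash, so applying it twice returns to the start.
  def alt_index(index, fp)
    (index ^ hash_int(fp.to_s)) % @num_buckets
  end

  def place(index, fp)
    return false if @buckets[index].size >= BUCKET_SIZE

    @buckets[index] << fp
    true
  end
end

filter = CuckooFilter.new(64)
filter.insert('apple')
puts filter.include?('apple') # true
filter.delete('apple')
puts filter.include?('apple') # false
```

Because only fingerprints are stored, deleting an item that was never inserted can remove another item's matching fingerprint, so deletion is only safe for items known to be present.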

TIL about the uuencode and uudecode command-line utilities. They allow us to encode and decode binary files, respectively.

$ echo 'foo' | uuencode -o my_binary my_binary
$ file my_binary
my_binary: uuencoded or xxencoded text, ASCII text
$ cat my_binary
begin 644 my_binary

$ uudecode my_binary
$ cat my_binary
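
For comparison, Ruby can apply the same encoding through the 'u' directive of Array#pack and String#unpack1. This is only a sketch of the round trip: it produces the encoded body lines, without the begin/end header and trailer that the command-line tools add.

```ruby
data = "foo\n"

# Array#pack with 'u' UU-encodes the string (body lines only).
encoded = [data].pack('u')

# String#unpack1 with 'u' reverses the encoding.
decoded = encoded.unpack1('u')

puts encoded
puts decoded == data # true
```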

TIL about quickfix-reflector, a plugin that makes the quickfix window modifiable. I’ve been using it together with plugins that write to the quickfix window, like fzf.vim and async-run. Once the window is populated, commands like :w and :x apply your edits to every file listed in it. It’s a great way to do search and replace, for example replacing words with regular expressions across many files.

TIL about async-run, which runs shell commands asynchronously in Vim 8 or NeoVim. The output of the shell command can be read in real time in the quickfix window. Let’s see an example:

" Running the command asynchronously
:AsyncRun git push origin master

" Opening the quickfix window
:copen

An interesting option is g:asyncrun_open: setting it to a value like 8 opens the quickfix window automatically, with that height, whenever an asynchronous command is running.

TIL about the RUM index extension in PostgreSQL. It is a variant of the GIN index that stores additional information in the posting tree, such as lexeme positions and timestamps. This addresses known GIN problems like slow ranking and slow phrase search.

The drawback, of course, is that this additional information incurs extra build and insert time, making it less than ideal for tables that change often.

TIL about the Power of Two Choices algorithm used in load balancing, queuing theory and distributed systems in general.

The main takeaway is that it is better to pick two queues at random and choose the one with the least amount of work than to find the best queue and send the workload there. Why? If we are dealing with many decision makers (as with distributed load balancers, for example) and each one chooses the best queue without regard for the others’ choices, then all of them will pick the same queue, overwhelming it.

NGINX, for example, implements this algorithm in its load balancers. Check it out!
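
The herd effect described above can be sketched with a small simulation. Everything here is an assumption made for illustration: 100 queues with random starting loads, 100 balancers that all see the same snapshot, and a hypothetical burst helper that counts how many jobs the busiest queue receives in a single round.

```ruby
# One dispatch round: every balancer sees the same snapshot of queue loads
# and sends one job without knowing what the other balancers decided.
def burst(snapshot, num_balancers)
  received = Array.new(snapshot.size, 0)
  num_balancers.times { received[yield(snapshot)] += 1 }
  received.max # most jobs any single queue received in this round
end

rng = Random.new(42)
snapshot = Array.new(100) { rng.rand(50) } # hypothetical queue lengths

# Strategy A: every balancer picks the globally least loaded queue.
best = burst(snapshot, 100) do |loads|
  loads.each_index.min_by { |q| loads[q] }
end

# Strategy B: pick two queues at random and keep the less loaded one.
two = burst(snapshot, 100) do |loads|
  [rng.rand(loads.size), rng.rand(loads.size)].min_by { |q| loads[q] }
end

puts "best queue:  busiest queue received #{best} jobs"
puts "two choices: busiest queue received #{two} jobs"
```

With the "best queue" strategy all 100 jobs land on the same queue, while the two-choice strategy spreads them across many queues.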

TIL that we have a built-in bsearch in the Array class. It’s common for Rubyists to default to find, but as we know, bsearch performs much better when we have a sorted Array (O(log n) instead of O(n)).

Let’s see an example:

ary = [0, 4, 7, 10, 12]
ary.bsearch { |x| x >= 4 }  #=> 4
ary.bsearch { |x| x >= -1 } #=> 0

We can also use it to search String objects, but in this case we need to use the <=> operator in order to return -1, 0 or 1. Here is an explanation of the specifics of this operator.

options = ['Deleted', 'Draft', 'In Review', 'Published', 'Unpublished']
options.bsearch { |option| 'Draft' <=> option }  #=> 'Draft'
options.bsearch { |option| 'Random' <=> option } #=> nil

Notice that the term being compared must be on the left side.

TIL about Amdahl’s Law as a way to estimate the overall speedup gained from a given optimization. It’s especially useful for comparing different optimizations and seeing the maximum improvement each one can provide for your system. Another cool thing is that Amdahl’s Law works for parallel or serial programs.

It has a few implications:

  • Make the common case fast
  • Optimizations will have less and less effect (law of diminishing returns)

Here’s an interesting video about it.
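
As a quick worked example, the law says the overall speedup is 1 / ((1 - p) + p / s), where p is the fraction of the runtime being optimized and s is how much faster that part becomes. The helper name below and the 80% figure are made up for illustration.

```ruby
# Overall speedup when `fraction` of the runtime becomes `factor` times faster.
def amdahl_speedup(fraction, factor)
  1.0 / ((1.0 - fraction) + fraction / factor)
end

# Speeding up 80% of the work by ever larger factors.
[2, 4, 8, 1_000_000].each do |factor|
  speedup = amdahl_speedup(0.8, factor)
  puts format('factor %-8d overall speedup %.2fx', factor, speedup)
end
```

Doubling the factor from 2 to 4 to 8 yields overall speedups of about 1.67x, 2.50x and 3.33x, and no factor can push the result past 5x, because the untouched 20% still has to run.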

TIL that Hash#deep_merge (from Active Support) accepts a block in which you can specify how values from the same key should be merged:

h1 = { a: { b: 1 }, c: 2 }
h2 = { a: { d: 3 }, c: 4 }

h1.deep_merge(h2)
# => { a: { b: 1, d: 3 }, c: 4 }

h1.deep_merge(h2) { |k, v1, v2| v1 + v2 }
# => { a: { b: 1, d: 3 }, c: 6 }

TIL about running tests against a database instance spawned off a RAM disk. This technique has been around for a long time, but it completely flew under my radar. An early example of this setup with Rails and MySQL can be seen here.

The idea is to make your test database’s storage live in memory instead of on disk, since that makes for faster reads and writes, even compared with an SSD.
