
Posts

HashFile: A disk-based hash structure

Previously, I introduced a problem I was trying to solve, where a large data structure was being pinned in memory for occasional lookups. This post delves into the implemented solution, which pushes the data onto disk but retains (relatively) fast lookups. I think using a database or a B-Tree is a good solution to this kind of problem in general, but it was fun and inexpensive to implement this utility, and it turned out to be generally useful. Bear with me if you already understand HashMaps pretty well, because I'm basically describing a HashMap here, but it's a disk-based HashMap. Logically, the data consists of a series of key-value pairs. The keys and values are of variable size, because they contain strings. If we were to write only the values to disk in a binary format, we might have something like this for the JSON example in the previous post: There are two records, at offsets 0x00000000 and 0x000000A4. If we had some way to map a key to one of these offsets, we could seek() to…
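To make the offset idea in the excerpt above concrete, here is a minimal sketch, not the HashFile implementation itself. It assumes a length-prefixed UTF-8 record format and keeps the key-to-offset index in an ordinary in-memory map purely for illustration; the class and method names (OffsetLookupSketch, writeRecords, readRecord) and the sample keys are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: values live on disk, and each key maps to a record offset. */
public class OffsetLookupSketch {

    /** Writes each value as [4-byte length][UTF-8 bytes] and returns key -> record offset. */
    public static Map<String, Long> writeRecords(Path file, Map<String, String> entries)
            throws IOException {
        Map<String, Long> offsets = new LinkedHashMap<>();
        try (RandomAccessFile out = new RandomAccessFile(file.toFile(), "rw")) {
            out.setLength(0); // start from an empty file
            for (Map.Entry<String, String> entry : entries.entrySet()) {
                // Remember where this record starts before writing it.
                offsets.put(entry.getKey(), out.getFilePointer());
                byte[] bytes = entry.getValue().getBytes(StandardCharsets.UTF_8);
                out.writeInt(bytes.length);
                out.write(bytes);
            }
        }
        return offsets;
    }

    /** Seeks to the given offset and reads back exactly one record. */
    public static String readRecord(Path file, long offset) throws IOException {
        try (RandomAccessFile in = new RandomAccessFile(file.toFile(), "r")) {
            in.seek(offset);
            byte[] bytes = new byte[in.readInt()];
            in.readFully(bytes);
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("hashfile-sketch", ".bin");
        Map<String, String> entries = new LinkedHashMap<>();
        entries.put("//example:first", "first value");   // made-up keys and values
        entries.put("//example:second", "second value");
        Map<String, Long> offsets = writeRecords(file, entries);
        System.out.println(readRecord(file, offsets.get("//example:second")));
        Files.delete(file);
    }
}
```

The in-memory map is only a stand-in here: the point of a disk-based structure is that the key-to-offset mapping also moves out of the heap, which this sketch does not attempt.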
Recent posts

The case of the unwieldy HashMap

Some data structure was pinning over 70MB of heap space in Android Studio. Our developers have limited memory on laptops, and are often upset about memory consumption in general. This behemoth (retained by our internal plugins) was the second largest allocated and pinned single object in the whole of AS's heap. The buck project command generates IDE project definitions from buck targets. It can be configured to emit a target-info.json file, which contains simple mappings that look something like this: { "//src/com/foo/bar/baz:baz" : { "buck.type": "java_library", "intellij.file_path" : ".idea/modules/src_com_foo_bar_baz_baz.iml", "intellij.name" : "src_com_foo_bar_baz_baz", "intellij.type" : "module", "module.lang" : "KOTLIN", "generated_sources" : [ "buck-out/fe3a3a3/src/com/foo/bar/baz/__gen__", "buck-out/fe3a3a3/src/…

In the lunch line with Larry Page

The Whisper project (which became Nearby) got started as part of the Google+ org. The Google+ team sat in the same building as Larry Page, and we'd often see him in his office or walking around the building.
One day, a few of my teammates and I were standing in line at Cloud Cafe, in the restricted part of building 1900 where the Google+ team sat. Right in front of us in the line was none other than Larry Page himself. One of the engineers on the team struck up a conversation with him, and Larry asked us how Whisper was going. We hadn't launched anything yet, and were deeply in the midst of doing crazy cool things with ultrasound. My teammate answered quite honestly that it was hard, and it was taking much longer than we expected to get it to a dogfoodable and eventually launchable state. Larry asked him why - what made it hard? Probably a fairly innocuous, polite question, but coming from the CEO and founder of Google, it sort of takes on an ext…

Shoes and secret projects with Vic

Google used to have a tradition called TGIF. It still seems sad that I'm talking about this in past tense, but hey ho. At TGIF, the founders and other executives got up on stage, welcomed nooglers (new Google people), introduced a bunch of prepared speakers who talked about interesting things that were going on, then opened themselves and other executives up to pre-voted and audience questions. There are many infamous things that happened at TGIF during my time there that I can't talk about, but I was physically present in Charlies for these:
- the time they let off fireworks inside Charlies to celebrate a Nexus device launch and gave everyone in the company the new device
- the time they announced that everyone at Google was getting a significant pay raise and bonus. People went wild. There was screaming and yipping. It was like a music concert in the 80s.
- the several times Patrick Pichette came by with a huge backpack of cash and everyone got an envelope with 10x$100 bills as a Chr…

The very scientific microkitchen testing event

One of the things I miss most about the office is the microkitchens. Facebook and Google both have fantastic selections of yummy snacks to fuel folks through the day. It's actually quite great for ad hoc conversations and just getting away from your desk for a bit. I've tried to replicate this by purchasing a box of Funyuns from Amazon to keep at home, but it's just not the same. As I'm... you know... pathetically eating my Funyuns at home on my own with my shorts on. At Google, the microkitchens were legendary when I started in 2008. But by the time I left in 2019, they had changed a lot. The stock was intentionally kept low, and the snacks were healthier. This wasn't always a popular thing. For a period of about 3 years or so, the snack selection, which used to rotate fairly regularly, was frozen in time with the same set of things. I think this was due to Google trying to plan the future of the microkitchen program, and it just took a while. Or something like tha…

It's like a startup inside a startup, Topaz

At Google, I knew a great engineer (let's call them Topaz) who, after working on a bunch of different teams and being pretty successful at what they did, decided to join the hot new team that was hiring like crazy. It was pretty exciting for them, they told me many times. For me, I've been in some of those situations where you're anticipating something new in your life so much you have the most vivid dreams about it actually happening. As if it were actually happening now. It was like that for this person. Now, this part of Google was hot at the time; it was in the realm of all things social when Google was trying to do that. It was breathtaking how all-in Google went on the social stuff so quickly, and there was a buzzy aura around the teams who worked on it. The building they were in had restricted access (because reasons), still a relatively rare thing at that time, and its own not-so-secret restaurant. So, the big day came, and Topaz joined the new team (let's c…

Intercepting behavior with java agents

A previous post showed using JVMTI to log method calls in a non-intrusive way, and without having to make modifications to upstream libraries. JVMTI is much more powerful than that post showed - for example, it can replace and modify code in a running JVM altogether, which can be useful for things like logging or performance measurements, but also for intercepting or changing behavior at runtime. It is, however, quite cumbersome to write code for that sort of thing in C or C++ using the JNI interfaces. It turns out Java provides a higher-level interface to instrument or redefine classes using the Java programming language itself. This post will demonstrate a ridiculously simple example of such an agent. You can find the example code on GitHub.
A simple program
Let's start off with the really simple program that we want to instrument. The Greeter class does the time-honored thing of saying Hello World. We've for some reason awkwardly and weirdly moved the World part of that…
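For orientation, the general shape of a Java agent built on the standard java.lang.instrument package can be sketched as follows. This is not the post's example code: the class name LoggingAgentSketch, the matches counter, and the choice to match class names ending in "Greeter" are assumptions for illustration, and the transformer only observes classes rather than rewriting them.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

/** Hypothetical agent sketch: logs when a class whose name ends in "Greeter" is loaded. */
public class LoggingAgentSketch {

    /** Observes class loading; returning null tells the JVM to keep the original bytes. */
    public static class LoggingTransformer implements ClassFileTransformer {
        public int matches = 0; // number of matching classes seen so far

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            // Class names arrive in internal form, e.g. "com/example/Greeter".
            if (className != null && className.endsWith("Greeter")) {
                matches++;
                System.err.println("[agent] loading " + className
                        + " (" + classfileBuffer.length + " bytes)");
            }
            return null; // no modification in this sketch
        }
    }

    /** Called by the JVM before the application's main method when loaded as an agent. */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoggingTransformer());
    }
}
```

To run as an agent, a class like this is packaged in a jar whose manifest names it in a Premain-Class entry, and the JVM is started with the -javaagent option pointing at that jar. An agent that changes behavior would return modified class bytes from transform instead of null.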