
Posts

Showing posts from October, 2020

HashFile: A disk-based hash structure

Previously, I introduced a problem I was trying to solve where a large data structure was being pinned into memory for occasional lookups. This post delves into the implemented solution, which pushes it onto disk but retains (relatively) fast lookups. I think using a database or a B-Tree is a good solution to this kind of problem in general, but it was fun and inexpensive to implement this utility, and it turned out to be generally useful. Bear with me if you already understand HashMaps pretty well, because I'm basically describing a HashMap here, but it's a disk-based HashMap. Logically, the data consists of a series of key-value pairs. The keys and values are of variable size, because they contain strings. If we were to write only the values to disk in a binary format, we might have something like this for the JSON example in the previous post: There are two records, at offsets 0x00000000 and 0x000000A4. If we had some way to map a key to one of these offsets, we could seek() to…
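To make the key-to-offset idea above concrete, here is a minimal sketch of one way a disk-based hash lookup could work. It is not the HashFile format from the post: the class name `DiskHashSketch`, the 16-slot bucket table, and the record layout (next-pointer, then length-prefixed key and value) are all assumptions chosen for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Hypothetical layout: [bucket table of long offsets][chained records]. */
public class DiskHashSketch {
    private static final int BUCKETS = 16;   // assumption: tiny fixed-size table
    private static final long EMPTY = -1L;   // marks an empty bucket / end of chain

    /** Writes all pairs to the file: a bucket table followed by chained records. */
    public static void write(String path, Map<String, String> pairs) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(path, "rw")) {
            f.setLength(0);
            for (int i = 0; i < BUCKETS; i++) {
                f.writeLong(EMPTY);
            }
            for (Map.Entry<String, String> e : pairs.entrySet()) {
                long slot = bucketOf(e.getKey()) * 8L;
                f.seek(slot);
                long oldHead = f.readLong();      // current head of this bucket's chain
                long recordOffset = f.length();   // new record is appended at the end
                f.seek(recordOffset);
                f.writeLong(oldHead);             // link the new record to the old head
                writeString(f, e.getKey());
                writeString(f, e.getValue());
                f.seek(slot);
                f.writeLong(recordOffset);        // the new record becomes the head
            }
        }
    }

    /** Returns the value for key, or null if absent, reading only what it needs. */
    public static String lookup(String path, String key) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(path, "r")) {
            f.seek(bucketOf(key) * 8L);
            long offset = f.readLong();
            while (offset != EMPTY) {
                f.seek(offset);
                long next = f.readLong();
                if (readString(f).equals(key)) {
                    return readString(f);         // the value follows the key
                }
                offset = next;                    // hash collision: follow the chain
            }
            return null;
        }
    }

    private static int bucketOf(String key) {
        return Math.floorMod(key.hashCode(), BUCKETS);
    }

    private static void writeString(RandomAccessFile f, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        f.writeInt(bytes.length);
        f.write(bytes);
    }

    private static String readString(RandomAccessFile f) throws IOException {
        byte[] bytes = new byte[f.readInt()];
        f.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

A lookup such as `DiskHashSketch.lookup(path, "//src/com/foo/bar/baz:baz")` touches one bucket slot plus the records in that bucket's chain, so nothing has to stay pinned in heap between lookups. A real implementation would presumably size the table to the data and keep the file open rather than reopening it per call.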

The case of the unwieldy HashMap

Some data structure was pinning over 70MB of heap space in Android Studio. Our developers have limited memory on laptops, and are often upset about memory consumption in general. This behemoth (retained by our internal plugins) was the second largest allocated and pinned single object in the whole of AS's heap. buck project generates IDE project definitions from buck targets. It can be configured to emit a target-info.json file, which contains simple mappings that look something like this: { "//src/com/foo/bar/baz:baz" : { "buck.type": "java_library", "intellij.file_path" : ".idea/modules/src_com_foo_bar_baz_baz.iml", "intellij.name" : "src_com_foo_bar_baz_baz", "intellij.type" : "module", "module.lang" : "KOTLIN", "generated_sources" : [ "buck-out/fe3a3a3/src/com/foo/bar/baz/__gen__", "buck-out/fe3a3a3/src/…

In the lunch line with Larry Page

The Whisper project (which became Nearby) got started as part of the Google+ org. The Google+ team sat in the same building as Larry Page, and we'd often see him in his office or walking around the building.
One day, a few of my teammates and I were standing in line at Cloud Cafe, in the restricted part of building 1900 where the Google+ team sat. Right in front of us in the line was none other than Larry Page himself. One of the engineers on the team struck up a conversation with him, and Larry asked us how Whisper was going. We hadn't launched anything yet, and were deeply in the midst of doing crazy cool things with ultrasound. My teammate answered quite honestly that it was hard, and it was taking much longer than we expected to get it to a dogfoodable and eventually launchable state. Larry asked him why - what made it hard? Probably a fairly innocuous, polite question, but coming from the CEO and founder of Google, it sort of takes on an ext…

Shoes and secret projects with Vic

Google used to have a tradition called TGIF. It still seems sad that I'm talking about this in past tense, but hey ho. At TGIF, the founders and other executives got up on stage, welcomed nooglers (new Google people), introduced a bunch of prepared speakers who talked about interesting things that were going on, then opened themselves and other executives up to pre-voted and audience questions. There are many infamous things that happened at TGIF during my time there that I can't talk about, but I was physically present in Charlies for these: The time they let off fireworks inside Charlies to celebrate a Nexus device launch and gave everyone in the company the new device. The time they announced that everyone at Google was getting a significant pay raise and bonus. People went wild. There was screaming and yipping. It was like a music concert in the 80s. The several times Patrick Pichette came by with a huge backpack of cash and everyone got an envelope with 10x$100 bills as a Chr…

The very scientific microkitchen testing event

One of the things I miss most about the office is the microkitchens. Facebook and Google both have fantastic selections of yummy snacks to fuel folks through the day. It's actually quite great for ad hoc conversations and just getting away from your desk for a bit. I've tried to replicate this by purchasing a box of Funyuns from Amazon to keep at home, but it's just not the same. As I'm... you know... pathetically eating my Funyuns at home on my own with my shorts on. At Google, the microkitchens were legendary when I started in 2008. But by the time I left in 2019, they had changed a lot. The stock was intentionally kept low, and the snacks were healthier. This wasn't always a popular thing. For a period of about 3 years or so, the snack selection, which used to rotate fairly regularly, was frozen in time with the same set of things. I think this was due to Google trying to plan the future of the microkitchen program, and it just took a while. Or something like tha…

It's like a startup inside a startup, Topaz

At Google, I knew a great engineer (let's call them Topaz) who, after working on a bunch of different teams and being pretty successful at what they did, decided to join the hot new team that was hiring like crazy. It was pretty exciting for them, they told me many times. For me, I've been in some of those situations where you're anticipating something new in your life so much you have the most vivid dreams about it actually happening. As if it were actually happening now. It was like that for this person. Now, this part of Google was hot at the time; it was in the realm of all things social when Google was trying to do that. It was breathtaking how all-in Google went on the social stuff so quickly, and there was a buzzy aura around the teams who worked on it. The building they were in had restricted access (because reasons), still a relatively rare thing at that time, and its own not-so-secret restaurant. So, the big day came, and Topaz joined the new team (let's c…

Intercepting behavior with java agents

A previous post showed how to use JVMTI to log method calls in a non-intrusive way, without having to modify upstream libraries. JVMTI is much more powerful than that post showed - for example, it can replace and modify code in a running JVM altogether, which can be useful for things like logging or performance measurements, but also for intercepting or changing behavior at runtime. It is, however, quite cumbersome to write code for that sort of thing in C or C++ using the JNI interfaces. It turns out Java provides a higher-level interface to instrument or redefine classes using the Java programming language itself. This post will demonstrate a ridiculously simple example of such an agent. You can find the example code on GitHub. A simple program: Let's start off with the really simple program that we want to instrument. The Greeter class does the time-honored thing of saying Hello World. We've for some reason awkwardly and weirdly moved the World part of that…
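For readers who have not seen the higher-level interface mentioned above, here is a minimal sketch of a Java agent built on the standard `java.lang.instrument` package. It is not the example code from the post: the names `LoggingAgent`, `LoggingTransformer`, and `seen` are hypothetical, and the transformer only observes classes as they load rather than rewriting any bytecode.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical agent: records which matching classes pass through the JVM's class loading. */
public class LoggingAgent {
    /** Names of matching classes seen so far (synchronized: transform may run concurrently). */
    public static final List<String> seen = Collections.synchronizedList(new ArrayList<>());

    /** The JVM calls this before main() when started with -javaagent:agent.jar. */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoggingTransformer());
    }

    public static class LoggingTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            // className uses slashes (e.g. "com/foo/Greeter") and may be null.
            // Assumption: we only care about a class whose simple name is Greeter.
            if (className != null
                    && (className.equals("Greeter") || className.endsWith("/Greeter"))) {
                seen.add(className);
                System.err.println("[agent] loading " + className);
            }
            // Returning null tells the JVM to keep the original bytecode unchanged.
            return null;
        }
    }
}
```

To run as an agent, the class has to be packaged in a jar whose manifest contains a `Premain-Class: LoggingAgent` entry, and the target program is started with `java -javaagent:agent.jar ...`. An agent that actually changes behavior would return modified class bytes from `transform` instead of null, typically produced with a bytecode library.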

Aapt2: Tower of Babel

OK, we'd fixed a bug that brought us back to our baseline build speed with aapt2. But could it be made even faster? This is the final part of a three-part series about adventures with aapt2, Android's resource compiler / optimizer. You can read the intro bit and get more context here. The second post explained a small fix that resulted in a nice performance win. Big Android apps like the hypothetical ones from that hypothetical company that I hypothetically work for tend to contain a lot of strings that are translated into hypothetical languages (er, maybe I didn't need that last hypothetical). However, during the compile-run cycle, most developers are usually working with a single language. This is not to understate the importance of testing with a variety of languages, but it's reasonable and normal to restrict things somewhat in dev builds for developer efficiency reasons. There are really a lot of strings in a lot of languages in some of these hypothetical apps. Li…

Aapt2: Please don't delete me!

By making one weird change in aapt2, we sped up our build by 45 seconds. Developers love that stuff. This is the second in a three-part series about adventures with aapt2, Android's resource compiler / optimizer. You can read the first bit and get more context here. Proguard is an optimizer that many Android apps use. It can do nifty things like removing unused code and resources, inlining things that have no real reason to be in separate methods, and even obfuscating symbols so you can pretend like nobody will ever be able to figure out what your clever code is doing. In modern Android, the r8 shrinker has a similar function, and is driven by proguard configuration files. However, r8 / Proguard can't always figure out if something is used, or sometimes optimizes more aggressively than you'd like. Configuration directives can be used to tell it to keep things that would otherwise be removed. aapt2 has options that let it emit configuration files for resource-related code. …

How I learned to love aapt2

The Android Asset Packaging Tool (aapt2) takes all those lovely resources (images, strings, cat pictures, and whatnot) in your Android app, and compiles them into a binary format that the runtime understands. It's also the thing that generates numerical identifiers and constants for them in R.java, which is the class you use to refer to resources in code. You should rarely have reasons to interact with aapt2 directly, since for most Android developers, it's something that happens automatically during a build with Gradle, or your build system of choice (e.g. Bazel, or in our case Buck). Suffice it to say, you're either doing seriously hardcore interesting things, or you're maybe working on a build / developer infra or something like that (we're hiring!) if you're interacting directly with this tool. aapt2 operates in two phases; in the compile phase, it converts individual resources into a binary representation (either a .flat file or a .flata, which is just a zip of…