Hi! I’m Dylan, a full-stack software engineer and journalist. I strive to code new tools to help journalists and the public.
I work at The Washington Post as a lead engineer on the Elections Platforms team. This team is part of the greater newsroom engineering team, which builds tools, graphics, and data pipelines to support the newsroom.
Previously, I worked at MuckRock, where I led development of journalism technology platform DocumentCloud, and at Google on the Machine Perception team. I’ve taught data journalism and studied computational journalism, computer science, and music.
To get in touch, feel free to reach out at firstname.lastname@example.org.
I pursue many projects on and off work. Here are some highlights. For reporting work, see Media below.
Semantra is a multipurpose tool for semantically searching documents. Query by meaning rather than just by matching text. The tool, made to run on the command line, analyzes specified text and PDF files on your computer and launches a local web search application for interactively querying them. The purpose of Semantra is to make running a specialized semantic search engine easy, friendly, configurable, and private/secure.
Textra is a command-line tool for macOS that converts images, PDFs, and audio files to text. Leveraging Apple’s APIs, Textra performs OCR (optical character recognition) and audio transcription entirely on-device. The tool has flexible options and colorful output. It requires macOS 13+.
Crosswalker is a general purpose tool for joining columns of text that don’t perfectly match. It features a custom matching algorithm to populate initial predictions and a spreadsheet-like interface for refining. Crosswalker was built at the Washington Post and has many applications, but was designed for the purpose of matching precinct names released on election day to past elections as quickly as possible to power the Post’s election night model. It is open source.
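To illustrate the kind of problem Crosswalker addresses, here is a minimal sketch of producing initial match predictions between two columns of names. It is not Crosswalker's matching algorithm: it swaps in the standard library's `difflib.SequenceMatcher` as a stand-in similarity score, and the helper names (`normalize`, `best_matches`) and the precinct names are made up for the example.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    # Lowercase and collapse whitespace so trivial differences do not hurt the score.
    return " ".join(name.lower().split())


def best_matches(left: list[str], right: list[str]) -> dict[str, tuple[str, float]]:
    """For each name in `left`, return the most similar name in `right` and its score."""
    predictions = {}
    for source in left:
        scored = [
            (SequenceMatcher(None, normalize(source), normalize(candidate)).ratio(), candidate)
            for candidate in right
        ]
        # max() compares scores first, so the highest-scoring candidate wins.
        score, candidate = max(scored)
        predictions[source] = (candidate, score)
    return predictions


# Hypothetical precinct names: this election's labels vs. a past election's labels.
current = ["Ward 1 Pct 2", "WARD 3 PRECINCT 1"]
previous = ["Ward 3 Precinct 01", "Ward 1 Precinct 2"]

for source, (candidate, score) in best_matches(current, previous).items():
    print(f"{source!r} -> {candidate!r} ({score:.2f})")
```

Predictions like these are only a starting point. A human still has to review and correct the low-scoring pairs, which is the role the spreadsheet-like interface plays.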
FastFEC is a command-line tool and library for parsing U.S. campaign finance data quickly. It is written in C with a focus on being as performant and memory-efficient as possible. FastFEC powers the parsing in the Washington Post’s campaign finance pipeline and has helped with some stories. The tool is open source and there is an online demo.
Covid Map is an interactive, explorable, and zoomable map of current and historical COVID-19 cases and deaths in the United States. The site utilizes deck.gl, a WebGL-based library for displaying large datasets, to performantly map each county’s data. The data is sourced from the New York Times’ open source covid data. First published in March 2020, the site automatically pulls in data updates every day using a custom Google Cloud functions workflow. Source code.
Planet Gallery is a virtual showcase of every known exoplanet displayed as a contour plot of its surrounding starfield. This work was done in collaboration with Lawrence Peirson. The site was featured in Stanford’s Art of Science 2020 Exhibition.
Course Website: datajourn.com
Ripple Plastic is a virtual reality experience I created in conjunction with award-winning photographer Mandy Barker and the Stanford Journalism Program. The interactive exploration highlights the growing plastic pollution in the world’s oceans by juxtaposing plastic debris Barker photographed on beaches with a guided narration.
I coded the experience using the A-Frame WebVR framework (source code). The steps involved include 1) mapping 2D Photoshop layers into 3D objects using a custom-designed Blender pipeline, 2) creating a layout algorithm to place objects randomly on a sphere using a combination of Perlin noise and Poisson-disc sampling, 3) coordinating production with fellow journalism students, and 4) composing custom theme music for the experience. I wrote a blog post about the process. The experience premiered at the Our Plastic Ocean exhibition at Impressions Gallery, England in 2019.
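For readers curious about the layout step, the following is a rough sketch of Poisson-disc-style placement on a unit sphere. It is not the project's actual code: it uses simple dart throwing (rejecting any candidate that lands too close to an accepted point), it leaves out the Perlin noise component entirely, and the function names and parameters are assumptions made for illustration.

```python
import math
import random


def random_unit_vector(rng):
    # Normalizing a 3D Gaussian sample gives a uniformly distributed direction.
    while True:
        x, y, z = (rng.gauss(0, 1) for _ in range(3))
        length = math.sqrt(x * x + y * y + z * z)
        if length > 1e-9:
            return (x / length, y / length, z / length)


def angular_distance(a, b):
    # Angle in radians between two unit vectors; clamp to guard against rounding.
    dot = sum(p * q for p, q in zip(a, b))
    return math.acos(max(-1.0, min(1.0, dot)))


def poisson_disc_on_sphere(count, min_angle, seed=0, max_attempts=10000):
    """Dart throwing: keep a random point only if it is at least
    `min_angle` radians away from every point accepted so far."""
    rng = random.Random(seed)
    points = []
    attempts = 0
    while len(points) < count and attempts < max_attempts:
        attempts += 1
        candidate = random_unit_vector(rng)
        if all(angular_distance(candidate, p) >= min_angle for p in points):
            points.append(candidate)
    return points


points = poisson_disc_on_sphere(count=30, min_angle=0.3)
print(len(points), "points placed")
```

The minimum-angle constraint is what keeps objects from overlapping while the placement still looks random. A noise function could then vary density or object size across the sphere.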
Inferactive is an interactive data tool for finding insights and inferences, which I built as part of my master’s thesis. The entirely web-based platform features a four-step analysis and discovery flow: 1) upload CSV data or select an example dataset, 2) refine by selecting columns to include/exclude from a table/detail view, 3) view statistics and one-dimensional charts for each data column, and 4) discover insights and trends in the data by viewing a shuffleable assortment of auto-generated plots correlating two data columns.
The project is written in Svelte and geared for frontend performance and the ability to handle large datasets.
“Sounds” is an interactive sound wave primer I wrote for fun in 2018. This online computational essay guides readers through the basics of sound wave theory. Part 2 delves into more experimental sound functions and 8-bit chiptunes.
“Auto-complete for music composition.” Tapcompose is a web app I built as part of a Stanford course that lets anyone write sheet music, bar by bar, with auto-generated suggestions. The web app features dynamic sheet music generation and a synced playback bar, with sounds generated using the Web Audio API in-browser. Source code. YouTube demo.
AudioSet is a static website I designed at Google to showcase an ontology of sound events and collection of over 2 million manually annotated YouTube clips. The website presents all the YouTube clip thumbnails, dynamically requesting more as the user scrolls using JSONP. I used Closure Templates to build reusable components for static pages. I also contributed to the paper behind the website, which was presented at the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) in New Orleans.
Sonority is an interactive visualization of the harmonic similarity between all the Beatles’ songs. I built the graphic using D3.js. Using a custom Python processing script, I statically generated thousands of audio clips representing the most aligned excerpt of each pair of songs. This work was part of my undergraduate thesis which I presented at the 2015 International Society for Music Information Retrieval (ISMIR) conference in Malaga, Spain.
For this report, I interviewed record-holder Michael Granville, whose 1996 race performance at the semifinals of the California State Meet has yet to be matched. Beyond his running performance, the video explores his relationship with his father who coached him and unearths never-before-uploaded footage of the actual race. The video features original music I composed in high school when I was an 800m athlete.
In recent years, residents of the Santa Cruz Mountains have organized social media communities on Facebook to share trail camera footage of their wooded backyards. The shared media reveals just how prevalent, and elusive, mountain lions are in the Bay Area. I interviewed leaders of the social media groups and produced a video report and article.
An article I co-authored with Jackie Botts on the impact of the 2017 “Wine Country” fires on undocumented immigrants. The article got picked up by PRI’s The World and features a written component and video report.