I was building a digital hymnal app — lyrics for hundreds of songs, search, setlist management, the works. But something was missing. Musicians don’t just want lyrics. They want sheet music. The problem: there’s no convenient API for hymn notation. What does exist? Thousands of MIDI files scattered across public collections. So I built a pipeline to convert them.
The Pipeline at a Glance
The process breaks into four steps: collect MIDI files from public hymn archives, convert them to ABC notation (a text-based music format), fuzzy-match converted titles against the hymnal database, and render the notation as interactive sheet music in the browser.
Each step had its own challenges, but the trickiest part wasn’t the conversion — it was matching filenames like "It is well with my soul (VILLE DU HAVRE).mid" to database entries that might be stored as "It Is Well With My Soul" or "It Is Well".
MIDI to ABC Conversion
ABC notation is a text-based format for music. A melody that would take kilobytes in MIDI or megabytes in MusicXML fits in a few hundred characters of plain text. It’s perfect for storing in a database column and rendering in the browser.
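To make the size claim concrete, here is a small illustration of what ABC text looks like. The tune is a made-up C major scale, not a hymn from the collection, and `parseAbcHeaders` is a hypothetical helper written only for this example:

```typescript
// A tiny, made-up tune (a C major scale) used purely for illustration.
const exampleAbc = [
  "X:1", // reference number
  "T:Example Scale", // title
  "M:4/4", // meter
  "L:1/4", // default note length
  "K:C", // key
  "C D E F | G A B c |", // the melody itself
].join("\n");

// Hypothetical helper: collect the single-letter header fields (T:, K:, ...)
// that appear before the tune body.
function parseAbcHeaders(abc: string): Record<string, string> {
  const headers: Record<string, string> = {};
  for (const line of abc.split("\n")) {
    const m = /^([A-Za-z]):\s*(.*)$/.exec(line);
    if (!m) break; // the first non-header line starts the tune body
    headers[m[1]] = m[2].trim();
  }
  return headers;
}
```

The whole tune, headers included, is well under a hundred characters, which is why a plain text column is enough to hold it.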
The midi2abc CLI tool (part of the abcmidi suite) handles the heavy lifting. The conversion function shells out to it and cleans up the output:
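import { execFileSync } from "node:child_process";

function midiToAbc(midiPath: string): string | null {
  try {
    // Pass the path as an argument (no shell), so quotes or "$" in a
    // filename cannot break the command. stderr is discarded.
    const output = execFileSync("midi2abc", [midiPath], {
      encoding: "utf-8",
      timeout: 10000, // give up on files that hang the converter
      stdio: ["ignore", "pipe", "ignore"],
    });
    // Drop the stray status line and return the trimmed ABC text.
    const lines = output.split("\n");
    const filtered = lines.filter((l) => l !== "calling midi2abc");
    return filtered.join("\n").trim();
  } catch {
    // Non-zero exit, timeout, or missing binary: treat as "no notation".
    return null;
  }
}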
MIDI filenames carry useful metadata. The English Hymns collection uses the format "Hymn Title (TUNE NAME).mid", so extracting the title means stripping the extension and the parenthetical tune name:
function extractTitleFromFilename(filename: string): string {
  return filename
    .replace(/\.mid$/i, "") // drop the extension (any case)
    .replace(/\s*\([^)]+\)\s*$/, "") // drop the trailing "(TUNE NAME)"
    .trim();
}
The script also extracts the key signature from the ABC output’s K: field — midi2abc outputs lines like K:C % 0 sharps, and I parse just the key letter to store alongside the notation.
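A minimal sketch of that key extraction might look like the following. It is not the project's actual code: `extractKey` is a hypothetical name, and keeping an accidental after the letter (so that B flat is not stored as B) is my assumption, going slightly beyond "just the key letter".

```typescript
// Hypothetical helper: read the tonic from the first K: line of ABC text.
// Assumes lines shaped like "K:C % 0 sharps"; anything after the tonic
// (mode, trailing comment) is ignored.
function extractKey(abc: string): string | null {
  // The "m" flag lets ^ match at the start of any line, not just the first.
  const m = /^K:\s*([A-G][#b]?)/m.exec(abc);
  return m ? m[1] : null;
}
```

Returning `null` when no `K:` line is found lets the caller decide whether to store the notation without a key or skip the file.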
The MIDI collection is organized into letter-range subdirectories (A-C, D-H, I-N, etc.), and I filter out -Hymnal and -Melody variant files to avoid duplicates. After processing all subdirectories, the script typically converts 1,000+ files in under a minute.
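The variant filter can be sketched as a pure function over filenames. This assumes the variants are named with a `-Hymnal` or `-Melody` suffix directly before the `.mid` extension; the post does not spell out the exact naming, and `isPrimaryMidi` and `selectPrimaryMidis` are hypothetical names.

```typescript
// Assumption: variant files end in "-Hymnal.mid" or "-Melody.mid".
function isPrimaryMidi(filename: string): boolean {
  if (!/\.mid$/i.test(filename)) return false; // not a MIDI file at all
  return !/-(Hymnal|Melody)\.mid$/i.test(filename); // skip duplicate variants
}

// Keep only the files that should be converted.
function selectPrimaryMidis(filenames: string[]): string[] {
  return filenames.filter(isPrimaryMidi);
}
```

Keeping the filter separate from the directory walk makes it easy to check against a handful of real filenames before running the full conversion.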
Fuzzy Matching Against the Database
This was the real puzzle. MIDI filenames rarely match database titles exactly. “A Mighty Fortress Is Our God” might be stored as “A Mighty Fortress” in the database, or vice versa. I needed a matching strategy that was flexible enough to catch these variations but strict enough to avoid false positives.
I settled on a three-tier approach:
function fuzzyMatch(
  title: string,
  dbTitles: Map<string, string>,
  alreadyMatched: Set<string>
): string | null {
  const norm = normalize(title);

  // Tier 1: Exact normalized match
  for (const [slug, dbTitle] of dbTitles) {
    if (alreadyMatched.has(slug)) continue;
    if (normalize(dbTitle) === norm) return slug;
  }

  // Tier 2: One contains the other (min 4 words)
  for (const [slug, dbTitle] of dbTitles) {
    if (alreadyMatched.has(slug)) continue;
    const existNorm = normalize(dbTitle);
    const shorter = Math.min(norm.split(" ").length, existNorm.split(" ").length);
    if (shorter >= 4 && (norm.includes(existNorm) || existNorm.includes(norm))) {
      return slug;
    }
  }

  // Tier 3: 90%+ word overlap
  const normWords = norm.split(" ");
  for (const [slug, dbTitle] of dbTitles) {
    if (alreadyMatched.has(slug)) continue;
    const existWords = normalize(dbTitle).split(" ");
    const shorter = Math.min(normWords.length, existWords.length);
    if (shorter >= 4) {
      const overlap = normWords.filter((w) => existWords.includes(w)).length;
      if (overlap / shorter >= 0.9) return slug;
    }
  }

  return null;
}
The normalize() function lowercases, strips punctuation, and collapses whitespace — so "O Come, All Ye Faithful" and "o come all ye faithful" become identical.
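A minimal sketch of a function with that behavior is below. It is not the project's actual code, and it assumes titles are plain ASCII: accented letters would be stripped along with the punctuation.

```typescript
// Sketch of normalize(): lowercase, strip punctuation, collapse whitespace.
// Assumption: titles are ASCII, so anything outside a-z, 0-9, and
// whitespace is treated as punctuation.
function normalize(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "") // strip punctuation
    .replace(/\s+/g, " ") // collapse runs of whitespace
    .trim();
}
```

Because every tier of the matcher compares normalized strings, any change here (for example, keeping apostrophes) shifts which titles count as equal.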
The alreadyMatched set is critical. Without it, a popular title might match to multiple database entries, or a second MIDI file could overwrite a better match. Once a database song is claimed, it’s off the table for subsequent matches.
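The claiming behavior is easiest to see in the loop that drives the matcher. The sketch below is an illustration under assumptions, not the project's script: `assignMatches` and the `Matcher` type are hypothetical, and a toy prefix matcher stands in for `fuzzyMatch` so the example is self-contained.

```typescript
// A matcher takes a title plus the set of claimed slugs and returns a slug or null.
type Matcher = (title: string, claimed: Set<string>) => string | null;

// Hypothetical driver: each database slug can be claimed by at most one title.
function assignMatches(titles: string[], match: Matcher): Map<string, string> {
  const claimed = new Set<string>();
  const assignments = new Map<string, string>(); // slug -> source title
  for (const title of titles) {
    const slug = match(title, claimed);
    if (slug === null) continue;
    claimed.add(slug); // off the table for every later title
    assignments.set(slug, title);
  }
  return assignments;
}

// Toy stand-in for fuzzyMatch: a title matches a slug if it starts with the stored title.
const toyDb: Record<string, string> = { "it-is-well": "it is well" };
const toyMatch: Matcher = (title, claimed) => {
  for (const slug of Object.keys(toyDb)) {
    if (claimed.has(slug)) continue;
    if (title.toLowerCase().indexOf(toyDb[slug]) === 0) return slug;
  }
  return null;
};

const assigned = assignMatches(["It Is Well With My Soul", "It Is Well"], toyMatch);
```

Here the first title claims `it-is-well`, so the second title finds nothing left to match and cannot overwrite the earlier assignment. Processing order therefore matters: whichever file comes first wins the slug.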
A Second Source: Open Hymnal
The English Hymns MIDI collection is large but not exhaustive. For songs it didn’t cover, I pulled from the Open Hymnal project, which distributes hymn arrangements as ABC files directly — no conversion needed.
The Open Hymnal enrichment script uses the same fuzzy matching logic but skips songs that already have notation from the MIDI pipeline. This layered approach — primary source first, fallback second — filled gaps without creating conflicts. Between the two sources, the pipeline matched notation to the majority of songs in the database.
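The "primary first, fallback second" rule can be sketched as a merge in which the fallback only fills slugs the primary source left empty. This is an illustration with hypothetical names (`layerSources`, slug-to-ABC maps), not the enrichment script itself.

```typescript
// Hypothetical merge: both maps go from song slug to ABC text.
// Entries from the primary source always win; the fallback only fills gaps.
function layerSources(
  primary: Map<string, string>,
  fallback: Map<string, string>
): Map<string, string> {
  const merged = new Map(primary); // copy, so the inputs are not mutated
  fallback.forEach((abc, slug) => {
    if (!merged.has(slug)) merged.set(slug, abc);
  });
  return merged;
}
```

Because the fallback never overwrites an existing entry, the script can be re-run after adding more fallback files without disturbing notation that came from the MIDI pipeline.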
Rendering with ABCjs
With ABC notation stored in the database, the hymnal app renders it client-side using the ABCjs library. The core rendering call is straightforward:
const tuneObjects = abcjs.renderAbc(containerRef.current, notation, {
  responsive: "resize",
  staffwidth: 700,
  paddingtop: 0,
  paddingbottom: 0,
  visualTranspose: transpose,
  add_classes: true,
  foregroundColor: isDark ? "#e8e0d0" : undefined,
});
ABCjs turns the text notation into SVG — scalable, crisp on any screen, and styleable. On top of basic rendering, I added several interactive features:
- Playback using ABCjs’s CreateSynth API, which generates audio from the notation without needing sound files
- Transposition controls that shift the notation up or down by semitones, updating both the visual rendering and the audio playback
- Speed adjustment from 0.5x to 1.5x, useful for learning or practicing
- Cursor tracking that highlights notes as they play, so musicians can follow along
- Dark mode support by manipulating SVG fill and stroke attributes to match the app’s theme
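The dark mode item can be sketched roughly as follows. This is an assumption about how such recoloring might work, not the app's actual code: `applyForeground` and `AttrElement` are hypothetical names, and the minimal interface exists only so the sketch runs without a DOM (real SVG elements satisfy it).

```typescript
// Minimal structural type covering the two DOM methods the sketch needs.
interface AttrElement {
  getAttribute(name: string): string | null;
  setAttribute(name: string, value: string): void;
}

// Recolor explicit fill/stroke attributes, leaving "none" and absent
// attributes alone so hollow shapes stay hollow.
function applyForeground(elements: AttrElement[], color: string): void {
  for (const el of elements) {
    for (const attr of ["fill", "stroke"]) {
      const current = el.getAttribute(attr);
      if (current !== null && current !== "none") {
        el.setAttribute(attr, color);
      }
    }
  }
}
```

In a browser, the elements could come from something like `Array.from(container.querySelectorAll("svg path"))`, called again after each re-render since the SVG is rebuilt.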
The combination of text-based storage and client-side rendering means there’s no server load for sheet music — it’s just a string in the database and JavaScript in the browser.
Key Takeaways
- Text-based music formats are powerful. ABC notation stores a full musical score in a few hundred bytes. It’s diffable, searchable, and trivial to store in any database.
- Fuzzy matching needs guardrails. Without minimum word counts and the alreadyMatched set, the matching would produce too many false positives. Each tier adds flexibility while the constraints prevent chaos.
- Layer your data sources. No single collection had everything. Combining the English Hymns MIDI files with Open Hymnal’s ABC files covered far more ground than either alone.
- midi2abc + ABCjs is a powerful free stack. Converting MIDI to ABC with a CLI tool and rendering it with a JavaScript library gives you interactive sheet music without any paid services or proprietary formats.