Abstract
This document discloses a novel method implemented in IKANDY — a music visualizer application — for automatically detecting which audio-producing process on a Windows host is the dominant source, and silently switching the application's metadata retrieval backend to the most appropriate provider for that process, with hysteresis to prevent thrashing. This method is referred to internally as Smart Auto.
This publication is intended to establish prior art. Any subsequent patent claim covering this method or substantially similar methods is anticipated by this disclosure.
Background & Problem Statement
Music visualizer applications that display track metadata (title, artist, album art) face a fundamental challenge: a user's listening environment is rarely uniform. A user may listen via Spotify, then switch to VLC for a local file, then open a browser to stream a set. Each source exposes metadata through a different mechanism:
| Source App | Native Metadata API | Limitations |
|---|---|---|
| Spotify | PKCE OAuth Web API | Requires auth; fails silently if another app is playing |
| VLC | HTTP API (Beefweb-style) | Must be manually enabled; SMTC data is often generic |
| foobar2000 | Beefweb REST Plugin | Plugin must be installed; port varies |
| Browser / other | Windows SMTC | Best-effort; rich for some apps, empty for others |
Prior art (SMTC, last.fm scrobbling, unified Now Playing widgets) reads from a single source or polls all sources indiscriminately with no awareness of which process is currently producing audio at a meaningful level. No existing system arbitrates metadata backends based on per-process audio dominance.
The Novel Method
Smart Auto operates as follows:
Step 1 — Per-Process Audio Enumeration
A native Node.js addon (built with node-addon-api) calls the Windows Core Audio (WASAPI) session interfaces, IAudioSessionManager2 and IAudioSessionEnumerator, which belong to the same API family as ApplicationLoopback process capture, to enumerate all active audio sessions. For each session, it retrieves the owning process ID and the current audio level (peak meter value). The session list is polled at an interval of approximately 1 second.
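The polling side of this step can be sketched in JavaScript. This is a minimal sketch, not the IKANDY implementation: the `enumerateSessions` function stands in for the native addon call, and the `{ pid, peak }` record shape, the `createSessionPoller` name, and the error handling policy are all assumptions made for illustration.

```javascript
// Hypothetical shape of one session record returned by the native addon:
//   { pid: number, peak: number }   // peak assumed to be in the range 0.0 to 1.0

// Creates a poller around an injected enumerate function (an assumption;
// in a real application this would call into the native addon).
function createSessionPoller(enumerateSessions, onSample, intervalMs = 1000) {
  let timer = null;

  // One polling pass: read sessions, drop malformed entries, report the rest.
  function tick() {
    let sessions;
    try {
      sessions = enumerateSessions();
    } catch (err) {
      // A failed enumeration is treated as "no sessions" for this pass.
      sessions = [];
    }
    const valid = (Array.isArray(sessions) ? sessions : []).filter(
      (s) => Number.isInteger(s.pid) && Number.isFinite(s.peak)
    );
    onSample(valid, Date.now());
    return valid;
  }

  return {
    tick,
    start() {
      if (timer === null) timer = setInterval(tick, intervalMs);
    },
    stop() {
      if (timer !== null) {
        clearInterval(timer);
        timer = null;
      }
    },
  };
}
```

Injecting the enumerate function keeps the poller testable without the native layer: a test can call `tick()` directly with canned session lists instead of waiting on the timer.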
Step 2 — Dominance Detection
Sessions are filtered against an audio threshold (e.g., peak > 0.01) to exclude silent or idle processes. The producing process with the highest sustained peak level is identified as the dominant source. Process name is resolved from PID using QueryFullProcessImageName.
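```javascript
// Pseudocode: dominance detection
function getDominantSource(sessions, threshold = 0.01) {
  // Keep only sessions that are audibly producing sound.
  const active = sessions.filter(s => s.peak > threshold);
  // Loudest session first; null when nothing is playing.
  return active.sort((a, b) => b.peak - a.peak)[0] ?? null;
}
```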
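The text above speaks of the highest sustained peak level, while a single poll only yields an instantaneous value. One way to approximate "sustained" is to smooth each process's peak across polls with an exponential moving average. This is a swapped-in illustration rather than the disclosed method: the `createPeakSmoother` name and the `alpha` value of 0.3 are assumptions.

```javascript
// Smooths each process's peak with an exponential moving average so that a
// single loud spike does not immediately outrank steady playback.
// ALPHA is an assumed tuning value, not taken from the disclosure.
const ALPHA = 0.3;

function createPeakSmoother(alpha = ALPHA) {
  const smoothed = new Map(); // pid -> smoothed peak

  // Takes the raw sessions for one poll and returns copies with smoothed peaks.
  return function smooth(sessions) {
    const seenPids = new Set();
    const result = sessions.map((s) => {
      seenPids.add(s.pid);
      // A process seen for the first time starts at its own raw peak.
      const previous = smoothed.get(s.pid) ?? s.peak;
      const value = alpha * s.peak + (1 - alpha) * previous;
      smoothed.set(s.pid, value);
      return { ...s, peak: value };
    });
    // Forget processes whose sessions have disappeared.
    for (const pid of smoothed.keys()) {
      if (!seenPids.has(pid)) smoothed.delete(pid);
    }
    return result;
  };
}
```

The smoothed list has the same shape as the raw one, so it can be passed to the dominance function unchanged. With these values, a process that was silent on the previous poll and spikes to 1.0 is reported at 0.3, which stays below a steady player at 0.4.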
Step 3 — Metadata Backend Mapping
The dominant process name is matched against a priority map of known producing applications:
| Detected Process | Metadata Backend Selected | Fallback |
|---|---|---|
| Spotify.exe | Spotify PKCE Web API | SMTC |
| vlc.exe | VLC HTTP API | SMTC |
| foobar2000.exe | Beefweb REST API | SMTC |
| chrome.exe, msedge.exe | SMTC (browser SMTC hint) | None |
| Unknown | SMTC | — |
If the selected backend's prerequisite is not configured (e.g., VLC HTTP API is disabled), the system falls back to SMTC automatically and surfaces a non-blocking UI hint.
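The mapping and fallback rule can be sketched as a lookup plus a prerequisite check. This is a minimal sketch under stated assumptions: the backend labels (`"spotify"`, `"vlc"`, `"beefweb"`, `"smtc"`), the `selectBackend` name, the injected `isConfigured` predicate, and the hint wording are hypothetical. Because a resolved image name may arrive as a full path, the sketch reduces it to a lowercase base name before matching.

```javascript
// Priority map from lowercase process name to backend and fallback.
// Backend identifiers ("spotify", "vlc", ...) are hypothetical labels.
const BACKEND_MAP = new Map([
  ["spotify.exe", { backend: "spotify", fallback: "smtc" }],
  ["vlc.exe", { backend: "vlc", fallback: "smtc" }],
  ["foobar2000.exe", { backend: "beefweb", fallback: "smtc" }],
  ["chrome.exe", { backend: "smtc", fallback: null }],
  ["msedge.exe", { backend: "smtc", fallback: null }],
]);

// processName: image name such as "vlc.exe", or a full path to it.
// isConfigured: injected predicate reporting whether a backend's
// prerequisite (auth, plugin, HTTP interface) is available.
function selectBackend(processName, isConfigured) {
  const baseName = String(processName ?? "").split(/[\\/]/).pop().toLowerCase();
  // Unknown processes map to SMTC with no further fallback.
  const entry = BACKEND_MAP.get(baseName) ?? { backend: "smtc", fallback: null };

  if (entry.backend === "smtc" || isConfigured(entry.backend)) {
    return { backend: entry.backend, hint: null };
  }
  // Prerequisite missing: fall back and return a non-blocking hint for the UI.
  const fallback = entry.fallback ?? "smtc";
  return {
    backend: fallback,
    hint: `${entry.backend} is not configured; using ${fallback} instead`,
  };
}
```

Returning the hint as data, rather than displaying it from inside the selector, leaves the decision of how and when to surface it to the UI layer.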
Step 4 — Hysteresis (Anti-Thrash)
A new dominant source must sustain dominance for a minimum dwell period (approximately 3 seconds) before a backend switch is committed. This prevents rapid toggling when two apps produce audio simultaneously (e.g., a notification sound during VLC playback).
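```javascript
// Pseudocode: hysteresis guard
const DWELL_MS = 3000;
let candidate = null;     // source currently being timed
let candidateSince = 0;   // time at which the candidate first became dominant
let committedPid = null;  // pid of the source whose backend is currently active

function evaluateSwitch(dominant, now) {
  // A change of dominant process restarts the dwell timer.
  if (dominant?.pid !== candidate?.pid) {
    candidate = dominant;
    candidateSince = now;
  }
  // Commit once per change, and never for a null (silent) candidate.
  const dwellElapsed = now - candidateSince >= DWELL_MS;
  if (candidate && candidate.pid !== committedPid && dwellElapsed) {
    commitBackendSwitch(candidate);
    committedPid = candidate.pid;
  }
}
```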
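To see how the dwell period behaves, the guard can be run against a simulated timeline of 1-second polls. This is an illustrative sketch: the `createHysteresisGuard` factory, the process records, and the timeline are invented for the example, and the guard here returns the committed source instead of calling a switch function so that the result is easy to inspect.

```javascript
const DWELL_MS = 3000;

// Returns an evaluate(dominant, now) function that reports a committed
// switch, or null when nothing changes on this poll.
function createHysteresisGuard(dwellMs = DWELL_MS) {
  let candidate = null;
  let candidateSince = 0;
  let committedPid = null;

  return function evaluate(dominant, now) {
    // A change of dominant process restarts the dwell timer.
    if (dominant?.pid !== candidate?.pid) {
      candidate = dominant;
      candidateSince = now;
    }
    const dwellElapsed = now - candidateSince >= dwellMs;
    if (candidate && candidate.pid !== committedPid && dwellElapsed) {
      committedPid = candidate.pid;
      return candidate;
    }
    return null;
  };
}

// Simulated 1-second polls: VLC plays, a notification briefly dominates
// at t = 5 s, then Spotify takes over from t = 7 s onward.
const vlc = { pid: 100, name: "vlc.exe" };
const notifier = { pid: 200, name: "notifier.exe" };
const spotify = { pid: 300, name: "Spotify.exe" };
const timeline = [vlc, vlc, vlc, vlc, vlc, notifier, vlc, spotify, spotify, spotify, spotify];

const evaluate = createHysteresisGuard();
const commits = [];
timeline.forEach((dominant, second) => {
  const committed = evaluate(dominant, second * 1000);
  if (committed) commits.push({ atMs: second * 1000, name: committed.name });
});

console.log(commits);
// [ { atMs: 3000, name: 'vlc.exe' }, { atMs: 10000, name: 'Spotify.exe' } ]
```

The one-poll notification at t = 5 s never reaches the dwell period, so no switch is committed for it. Spotify is committed only after it has remained dominant for the full 3 seconds.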
Step 5 — Silent Switch + UI Indicator
Backend switches are performed without interrupting playback or visualization. A transient UI indicator briefly surfaces to inform the user (e.g., "Now following: VLC"), then auto-dismisses. The user does not need to manually select a source.
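The transient indicator can be sketched as a small controller that shows a message and dismisses it after a delay. This is a hedged sketch, not the application's UI code: the `createSwitchIndicator` name, the injected `render` and `schedule` functions, and the 2500 ms visibility period are assumptions.

```javascript
// Builds the indicator controller. `render` and `schedule` are injected so
// the sketch stays independent of any UI framework (both are assumptions).
// render(message) shows the message; render(null) hides the indicator.
function createSwitchIndicator(render, schedule = setTimeout, visibleMs = 2500) {
  let generation = 0; // increments on every announcement

  return function announce(sourceLabel) {
    const current = ++generation;
    render(`Now following: ${sourceLabel}`);
    schedule(() => {
      // Only dismiss if no newer announcement has replaced this one.
      if (current === generation) render(null);
    }, visibleMs);
  };
}
```

The generation counter prevents an older timer from hiding a newer message when two switches happen close together.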
What Makes This Novel
Existing solutions (SMTC widgets, last.fm scrobblers, unified Now Playing daemons) read from a single metadata source or poll all sources equally with no audio-level awareness. No prior system uses per-process audio peak level as the arbitration signal for selecting a metadata retrieval backend, nor implements hysteresis on that signal to govern backend switching in a desktop visualizer context.
The combination of these elements is the claimed novel method:
1. Per-process audio session enumeration via WASAPI
2. Peak-level dominance detection with configurable threshold
3. Priority-mapped metadata backend selection per process identity
4. Hysteresis guard preventing thrash on transient audio events
5. Silent backend switch with non-blocking user feedback
Implementation Context
Smart Auto is implemented in IKANDY, an Electron 30-based music visualizer running on Windows. The native audio session enumeration layer is built with node-addon-api and compiled against the Electron runtime. The metadata backend layer communicates via IPC between the Electron main process and renderer. The visualizer renders via Butterchurn WebGL (MilkDrop-compatible).
This technical disclosure was published at ikandy.app and constitutes prior art as of its effective date.
Prior Art Search Summary
A search of npm, GitHub, and general web sources conducted in May 2026 found no implementation combining per-process audio dominance detection with metadata backend arbitration for a music visualizer or similar application. Existing packages such as electron-audio-loopback address system-level audio capture but do not perform process-aware metadata routing. Windows SMTC provides a unified metadata surface but does not select backends based on audio level or process identity.