U-Guitar (FFT+MLP)

U-Guitar is an automated performance feedback system for guitar scales, built as a final project for an Augmented Intelligence course at USC. The system records or accepts a guitar scale performance, analyzes it at the note level, and returns structured feedback on timing and pitch accuracy — giving players the kind of objective, data-driven insight that would otherwise require a human instructor listening in real time.

Type

Machine Learning

Year

2026

Type

Personal Project

Problem

Guitar students consistently struggle to get quality practice feedback between lessons. We observed this firsthand: a student will run a scale twenty times with the same error, have no idea it's happening, and walk into their next lesson having reinforced a bad habit. The feedback loop that should exist — play, evaluate, correct — was broken outside the studio. The core problem wasn't access to learning materials; it was the absence of real-time, objective feedback at the moment of practice.

Solution

U-Guitar is an AI-powered guitar performance analyzer that listens to a student play a major scale and returns immediate, note-level feedback on timing and pitch accuracy. Record a scale, and the system isolates each of the eight notes, scores each one independently from 0 to 10 on timing deviation, pitch deviation, and overall quality, and surfaces specific guidance on what to fix. Rather than a general "good job" or "needs work," students receive a precise diagnostic: which note was flat and by how many cents, and which beat was rushed and by how many milliseconds.
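To make the two measurements concrete, here is a minimal sketch of how per-note timing and pitch deviation could be computed. It is not the project's actual code. It assumes equal temperament with A4 = 440 Hz, one note per beat at a known tempo, and timing measured relative to the first onset. The names `note_deviations`, `cents_off`, and `midi_to_hz` are hypothetical.

```python
import math

# Semitone offsets of the eight notes in an ascending major scale.
MAJOR_SCALE_STEPS = [0, 2, 4, 5, 7, 9, 11, 12]


def midi_to_hz(midi_note):
    """Equal-temperament frequency for a MIDI note number (A4 = 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((midi_note - 69) / 12.0)


def cents_off(measured_hz, target_hz):
    """Signed pitch deviation in cents; negative means flat."""
    return 1200.0 * math.log2(measured_hz / target_hz)


def note_deviations(root_midi, tempo_bpm, played):
    """played: list of (onset_sec, measured_hz) pairs, one per scale note.

    Assumes one note per beat, with the first onset defining beat zero.
    """
    beat_sec = 60.0 / tempo_bpm
    first_onset = played[0][0]
    report = []
    for i, ((onset, hz), step) in enumerate(zip(played, MAJOR_SCALE_STEPS)):
        expected_onset = first_onset + i * beat_sec
        report.append({
            "note": i + 1,
            # Negative timing means the note arrived early (rushed).
            "timing_ms": (onset - expected_onset) * 1000.0,
            "pitch_cents": cents_off(hz, midi_to_hz(root_midi + step)),
        })
    return report


# Usage: a C major scale at 120 BPM, played perfectly except for the third
# note, which is 20 cents flat and 30 ms early.
played = [(i * 0.5, midi_to_hz(60 + s)) for i, s in enumerate(MAJOR_SCALE_STEPS)]
played[2] = (1.0 - 0.030, midi_to_hz(64) * 2.0 ** (-20.0 / 1200.0))
report = note_deviations(60, 120, played)
```

How these raw deviations map onto the 0 to 10 scores is not covered here, since that mapping is produced by the trained model rather than a fixed formula.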

Email: delamazamarcelo@gmail.com

Design Process

The project went through two full architectural versions. V1 used a CNN+LSTM pipeline operating on mel spectrograms. It achieved strong training accuracy but collapsed on real recordings because the model had learned the timbre of the synthetic training audio rather than performance quality. I diagnosed this as a domain gap problem and rebuilt from scratch. V2 replaced spectrogram features with physical measurements derived from FFT analysis (timing deviation in milliseconds and pitch deviation in cents), fed into a Transformer encoder. Switching onset detection from librosa to madmom was enough on its own to push validation accuracy from 79% to 97%, confirming that data quality, not architecture, was the real bottleneck.
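As an illustration of what a "physical measurement derived from FFT analysis" can look like, the sketch below estimates the pitch of a single note segment from its strongest spectral peak and converts it to cents. It is a simplified stand-in, not the project's feature extractor. The small pure-Python `fft` helper is there only to keep the example self-contained (a library routine such as `numpy.fft.rfft` would normally take its place), and `estimate_pitch_hz` is a hypothetical name. It also assumes the strongest peak is the fundamental, which holds for the test tone here but not always for real guitar audio, where a harmonic can dominate.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out


def estimate_pitch_hz(samples, sample_rate):
    """Estimate the dominant frequency of a mono segment from its FFT peak."""
    n = len(samples)
    assert n >= 8 and (n & (n - 1)) == 0, "segment length must be a power of two"
    # Hann window to reduce spectral leakage from the segment edges.
    windowed = [
        s * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)))
        for i, s in enumerate(samples)
    ]
    mags = [abs(c) for c in fft(windowed)[: n // 2]]
    # Strongest bin, excluding DC and the last bin so both neighbours exist.
    peak = max(range(1, n // 2 - 1), key=mags.__getitem__)
    # Parabolic interpolation on log magnitudes refines the peak location
    # to a fraction of a bin.
    a, b, c = (math.log(mags[peak + d] + 1e-12) for d in (-1, 0, 1))
    offset = 0.5 * (a - c) / (a - 2.0 * b + c)
    return (peak + offset) * sample_rate / n


# Usage: a synthetic 440 Hz tone, compared against a 440 Hz target.
sample_rate = 8000
n = 4096
tone = [math.sin(2.0 * math.pi * 440.0 * i / sample_rate) for i in range(n)]
estimate = estimate_pitch_hz(tone, sample_rate)
deviation_cents = 1200.0 * math.log2(estimate / 440.0)
```

Because the output is a frequency in hertz rather than a learned spectrogram embedding, the same measurement applies equally to synthetic and recorded audio, which is the property the V2 redesign relied on.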

Input

I owned the full stack: dataset generation (synthesizing ~12,000 WAV files across 12 major keys using FluidSynth and pretty_midi), feature engineering, model training on an NVIDIA RTX A6000 GPU, and the Streamlit application that powers the student-facing interface. I built a Python 3.10 subprocess bridge to run madmom alongside a Python 3.14 main stack, diagnosed and fixed a double-resampling bug that was corrupting onset timestamps by 40–100 ms, and implemented per-note audio playback so students could hear exactly which moment the system flagged. The project was validated by both my course professor and an independent guitar teacher who confirmed the scale is the right unit of analysis for this type of performance feedback.
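The subprocess bridge can be sketched roughly as follows. This is an assumed shape, not the project's actual implementation: the main process launches a worker under a second interpreter, passes it the WAV path, and reads onset times back as JSON on stdout. The function name `detect_onsets_via_bridge`, the `onsets_sec` key, and the JSON protocol are all hypothetical. In the real setup the interpreter would be the Python 3.10 environment with madmom installed; the usage example substitutes the current interpreter and a stub worker so it runs anywhere.

```python
import json
import subprocess
import sys


def detect_onsets_via_bridge(wav_path, interpreter, worker_args, timeout_s=60.0):
    """Run an onset-detection worker under another interpreter.

    The worker is expected to print a JSON object with an "onsets_sec" list
    to stdout. A non-zero exit status raises CalledProcessError, and a hung
    worker raises TimeoutExpired.
    """
    result = subprocess.run(
        [interpreter, *worker_args, wav_path],
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,
    )
    payload = json.loads(result.stdout)
    return [float(t) for t in payload["onsets_sec"]]


# Stand-in worker for demonstration only. A real worker script would load
# the file and run madmom's onset detection before printing its result.
STUB_WORKER = (
    "import json, sys; "
    "print(json.dumps({'path': sys.argv[1], 'onsets_sec': [0.0, 0.5, 1.0]}))"
)

onsets = detect_onsets_via_bridge("scale.wav", sys.executable, ["-c", STUB_WORKER])
```

Passing timestamps in seconds across the process boundary, rather than sample or frame indices, keeps the two sides from having to agree on a sample rate, which is one way to avoid the kind of timestamp drift a resampling mismatch can introduce.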