Suno CEO Mikey Shulman shared several technical and strategic insights in a recent Sequoia interview that is circulating widely on X. The company has hit $300M ARR by focusing on creation-first experiences where 90% of users actively generate music daily, and Shulman rejects the assumption that bigger models and more data always win in music generation.
Raw Audio Over Music Theory
Shulman revealed that Suno abandoned traditional 12-tone music theory entirely. Instead, its models work directly on raw 48 kHz audio waveforms. This approach captures nuance, timbre, and emotional dynamics that symbolic representations miss. The bet seems to be paying off: the platform is scaling revenue faster than many expected, even after the recent licensing deals with WMG and UMG that require retraining on approved catalogs.
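To get a feel for what working on raw 48 kHz audio implies, here is a rough back-of-the-envelope sketch. The helper name `samples_for` and the example durations are illustrative assumptions, not figures from the interview, and production systems typically compress audio into a shorter sequence before modeling it.

```python
SAMPLE_RATE_HZ = 48_000  # sample rate mentioned in the interview


def samples_for(duration_s: float, channels: int = 1) -> int:
    """Number of raw audio samples for a clip of the given length.

    Hypothetical helper for illustration only.
    """
    return int(round(duration_s * SAMPLE_RATE_HZ)) * channels


clip_samples = samples_for(30)              # a 30-second mono clip
song_samples = samples_for(180)             # a 3-minute mono track
stereo_samples = samples_for(180, channels=2)

print(clip_samples, song_samples, stereo_samples)
```

A three-minute track is millions of values per channel, which helps explain why keeping structure coherent across a whole song is harder than producing a convincing short clip.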
The interview emphasizes that music generation differs fundamentally from language or vision tasks. Preference is the ground truth. Unlike other domains where sycophancy poses risks, user satisfaction with generated tracks provides clear, immediate feedback for model improvement. This has allowed Suno to iterate rapidly with smaller models that outperform larger ones on song coherence.
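One common way to turn "listener preferred track A over track B" into a training signal is a pairwise Bradley-Terry style loss. The sketch below is a generic illustration of that idea; the interview does not say which method Suno uses, and the function names and scores are assumptions.

```python
import math


def preference_probability(score_a: float, score_b: float) -> float:
    """Bradley-Terry probability that A is preferred over B,
    given scalar scores from some (hypothetical) scoring model."""
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))


def preference_loss(score_preferred: float, score_rejected: float) -> float:
    """Negative log-likelihood of the observed preference.
    Small when the preferred track already scores higher."""
    return -math.log(preference_probability(score_preferred, score_rejected))


# The scoring model agrees with the listener: low loss.
print(preference_loss(2.0, 0.0))
# The scoring model disagrees with the listener: high loss.
print(preference_loss(0.0, 2.0))
```

Minimizing this loss over many comparisons pushes the scores toward what listeners actually choose, which is the sense in which preference can act as ground truth.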
Auto-Regression Wins for Full Tracks
One of the sharper takes: auto-regression beats diffusion when the goal is complete songs rather than 30-second clips. Diffusion excels at high-fidelity short segments but struggles with long-term structure. Suno's architecture prioritizes maintaining narrative and musical arc across verses, choruses, and bridges. Shulman noted that music isn't purely a scale game. Clever architecture and high-quality preference data matter more than parameter count.
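For readers unfamiliar with the term, auto-regression means generating a sequence one step at a time, with each step conditioned on everything produced so far. The toy sketch below shows only that loop. It is not Suno's architecture: `toy_next_token_weights` is a hypothetical stand-in for a trained model, and the "tokens" are arbitrary integers, not audio.

```python
import random


def toy_next_token_weights(history: list[int], vocab_size: int) -> list[float]:
    """Hypothetical stand-in for a trained model.

    Favors repeating the token from four steps back, a crude
    imitation of a recurring motif.
    """
    weights = [1.0] * vocab_size
    if len(history) >= 4:
        weights[history[-4]] += 8.0
    return weights


def generate(num_tokens: int, vocab_size: int = 8, seed: int = 0) -> list[int]:
    """Autoregressive sampling: each new token depends on the full history."""
    rng = random.Random(seed)
    history: list[int] = []
    for _ in range(num_tokens):
        weights = toy_next_token_weights(history, vocab_size)
        token = rng.choices(range(vocab_size), weights=weights, k=1)[0]
        history.append(token)
    return history


print(generate(16))
```

Because every step sees the whole history, earlier material can shape later material, which is the property the interview ties to verse-chorus-bridge structure.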
These revelations come as Google DeepMind also pushes forward with Lyria 3 Pro, which reportedly composes full arrangements in seconds and pairs with Veo 3 for natively synchronized video and audio. The X conversation yesterday highlighted both announcements as proof that 2026 is the year AI music moves from experiment to professional workflow. Creators are already discussing new prompting techniques that leverage the improved structure these models deliver.
Workflow Changes for Power Users
Professional Suno users on X are updating their processes based on the CEO's comments. Emphasis has shifted toward detailed emotional descriptors and reference track analysis rather than complex music theory terms that the model ignores anyway. The combination of licensed training data from the new label deals plus these architectural insights should reduce the "uncanny valley" effect common in earlier generations.
Early tests shared in communities suggest the upcoming licensed models maintain the creative spark while adding consistency that clients demand. For AI music professionals, this means faster iteration cycles and higher client acceptance rates. The interview also touched on monetization reality: while many AI apps struggle to convert users, Suno's creator-first product has achieved rare product-market fit with two million paying subscribers.
As the industry digests both the financial numbers and technical philosophy, one theme emerges. The winners in AI music will combine strong licensing positions with models optimized specifically for musical structure rather than generic scaling laws. Suno appears positioned to lead on both fronts following this week's developments.
Bottom line: Suno's success proves music AI rewards targeted architecture and creator focus over brute-force scale, reshaping how professionals build workflows in 2026.
DRULES AI