As I read this chapter, one principle that stood out to me was
Design Principle 5.5: Have Your Machine Learning – And the Human in the Loop!
When I took Music and AI (CS470) last spring, this principle had a profound impact on me. The surge of generative AI tools such as ChatGPT and Stable Diffusion has stirred both excitement and trepidation across many sectors. Creatives, including artists and writers, worry about their place in an AI-dominated world. Startups like Showrunner AI, which attempted to automate episodes of South Park, exemplify these fears. Although I found its initial attempts lacking in authenticity, it is easy to imagine the output improving with further refinement.
It’s no surprise then that organizations like the Writers Guild of America are advocating for boundaries on AI usage. Despite my aspirations in the AI field, I share these concerns. The thought of AI overshadowing human creativity is disconcerting. If AI usurps our innate ability to create, what is left of what makes us uniquely human?
Yet I remain optimistic about AI’s potential. It can make previously impossible tasks feasible and take mundane chores off our hands.
My current research is in audio and AI. While I can’t share specifics, my focus is on making AI audio tools more steerable and improving their interfaces.
Ge’s emphasis in this chapter on the predictability of interfaces resonated deeply with me. By contrast, many neural networks today are intricate black boxes whose behavior is unpredictable and often at odds with user expectations. Maneesh Agrawala’s insightful blog post sheds light on this very challenge. This thought leads to another pertinent design principle:
Design Principle 5.19: Interfaces Should Extend Us
AI’s potential extends beyond mere automation. The real challenge lies in harnessing it to augment and extend human capabilities. In arts and music, where the journey is as fulfilling as the destination, complete automation defeats the purpose. I’m reminded of my experience in CS470, where students, myself included, used machine learning to build tools that retained the human touch. My audio mosaic, for instance, blended the raw emotion of my guitar with the precision of synthesized sounds. That synergy felt genuinely empowering.
Building on this, another design principle comes to mind:
Design Principle 5.18: Re-mutualize! Input + Output + Human
In that project, I set out to blend my guitar’s melodies with AI-generated synth sounds. The AI served not as a replacement but as an enhancer, amplifying the nuances of my playing.