
RedDiss Technical Deep Dive: Complete AI-Powered Diss Track Generation Pipeline with Reddit Sentiment Analysis, LLM Lyrics Crafting, and Audio Production Automation

Detailed technical examination of RedDiss, an end-to-end AI system for generating diss tracks from Reddit discussions, featuring advanced NLP, multimodal audio synthesis, and full-stack development with Streamlit and FastAPI.


Daniel Kliewer

Author, Sovereign AI

Tags: RedDiss, Reddit, AI, LLM, TTS, Beat Sync, Diss Tracks, Streamlit, FastAPI, Music Generation, Audio Processing, Text-to-Speech, Async API

From the Book

This is from Sovereign AI: Building Local-First Intelligent Systems.


Behind the Scenes of RedDiss: Crafting AI-Powered Diss Tracks from Reddit

RedDiss is an AI-powered diss track generator developed by Daniel Kliewer as an entry for the Loco Local LocalLLaMa Hackathon 1.0. It combines Reddit data extraction with a local LLM, text-to-speech, and audio processing to produce personalized diss tracks. This post walks through the architecture, functionality, and inner workings of RedDiss, showing how raw Reddit content becomes a finished audio track.

Table of Contents

  1. Introduction to RedDiss
  2. Project Architecture
  3. Core Components
  4. Streamlit Front-End
  5. Backend Integration with FastAPI
  6. Testing and Quality Assurance
  7. Installation and Deployment
  8. Conclusion and Future Prospects

Introduction to RedDiss

RedDiss sits at the intersection of social media analytics, natural language processing, and audio engineering. It extracts themes and sentiments from a Reddit post and its comments, then uses them to write diss track lyrics aimed at that content. The lyrics are refined for flow, converted to speech, synchronized with a beat, and mastered into a final audio track, all from a Streamlit application.

Project Architecture

RedDiss gives each pipeline stage its own module. The project repository is organized into several key files and directories:

  • agents/: Contains modules responsible for each processing step, from scraping to mastering.
  • models/: Hosts AI models and related files.
  • data/: Stores raw, processed, and generated data, including lyrics and audio files.
  • tests/: Includes test cases to validate the functionality of various components.
  • streamlit_app.py: The front-end interface built with Streamlit.
  • main.py: The FastAPI backend handling API requests.
  • combined_output.txt: Aggregated logs or outputs from the combine script.
  • requirements.txt: Lists all dependencies required to run RedDiss.
  • .env: Stores environment variables, such as Reddit API credentials.

Because each stage lives in its own module and passes plain dictionaries or file paths to the next, a component can be developed, tested, or replaced without touching the others.

Core Components

Let's explore each core component of RedDiss, understanding its purpose and implementation.

1. Reddit Data Scraper

File: agents/scraper.py

The pipeline starts with Reddit's posts and comments. Using asyncpraw, an asynchronous Reddit API wrapper, the scraper fetches the submission behind a user-provided URL. Here's a glimpse into its functionality:

python
import os
from typing import Any, Dict

import asyncpraw


class RedditScraper:
    def __init__(self):
        # Initialize the Reddit client with credentials from the environment
        self.reddit = asyncpraw.Reddit(
            client_id=os.getenv("REDDIT_CLIENT_ID"),
            client_secret=os.getenv("REDDIT_CLIENT_SECRET"),
            user_agent=os.getenv("REDDIT_USER_AGENT"),
        )

    async def extract_post_data(self, url: str) -> Dict[str, Any]:
        # Fetch and process submission data
        submission = await self.reddit.submission(url=url)
        await submission.load()
        # Extract relevant details and comments
        # ...

The client reads its credentials from environment variables, so secrets stay in the .env file rather than in the source code.
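The elided extraction step might reduce a loaded submission to a plain dictionary. The sketch below is an assumption, not the project's code: `summarize_submission`, the `max_comments` cutoff, and the output keys are hypothetical, and the stand-in dataclasses only mirror the `title`, `selftext`, `body`, and `score` attributes that asyncpraw objects expose.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FakeComment:
    """Stand-in for an asyncpraw comment (only the fields used here)."""
    body: str
    score: int


@dataclass
class FakeSubmission:
    """Stand-in for an already-loaded asyncpraw submission."""
    title: str
    selftext: str
    comments: List[FakeComment] = field(default_factory=list)


def summarize_submission(submission: Any, max_comments: int = 3) -> Dict[str, Any]:
    """Keep the title, the body text, and the highest-scoring comments."""
    ranked = sorted(submission.comments, key=lambda c: c.score, reverse=True)
    return {
        "title": submission.title,
        "selftext": submission.selftext,
        "comments": [
            {"text": c.body, "score": c.score} for c in ranked[:max_comments]
        ],
    }


post = FakeSubmission(
    title="My setup is unbeatable",
    selftext="Change my mind.",
    comments=[FakeComment("Nope", 2), FakeComment("Bold claim", 9)],
)
data = summarize_submission(post, max_comments=1)
print(data["comments"])
```

With a real asyncpraw submission, the comment forest can also contain placeholder "load more" entries, so a real implementation would probably resolve or skip those before sorting.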

2. Text Sanitization

File: agents/sanitizer.py

Raw Reddit data often contains noise—URLs, markdown formatting, special characters, and more. The sanitizer cleans and normalizes this content, making it suitable for further processing.

python
from typing import Any, Dict


async def clean_text(content: Dict[str, Any]) -> Dict[str, Any]:
    # Clean title and main text
    cleaned_data = {
        "title": _clean_string(content["title"]),
        "main_text": _clean_string(content["selftext"]),
        # ...
    }
    # Filter and clean comments
    # ...
    return cleaned_data

This step is crucial for ensuring that subsequent analyses, like theme extraction and lyrics generation, operate on clear and concise text.
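The `_clean_string` helper is not shown in the excerpt. A minimal sketch of what such a helper could look like follows; the regular expressions and the order of the steps are assumptions, chosen to agree with the expected output in the project's own sanitizer test (`"test post with urls"`).

```python
import re

# Hypothetical patterns; the real helper may use different rules.
URL_RE = re.compile(r"https?://\S+")
NON_TEXT_RE = re.compile(r"[^a-z0-9\s']")
WHITESPACE_RE = re.compile(r"\s+")


def clean_string(text: str) -> str:
    """Lowercase, drop URLs and punctuation, and collapse whitespace."""
    text = text.lower()
    text = URL_RE.sub(" ", text)       # remove links before stripping symbols
    text = NON_TEXT_RE.sub(" ", text)  # drop markdown and special characters
    return WHITESPACE_RE.sub(" ", text).strip()


print(clean_string("Test [Post] with http://example.com URLs"))
print(clean_string("Some & special characters < here >"))
```

URLs are removed before other symbols so that a link is deleted as a whole instead of leaving fragments such as "http" or "example com" behind.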

3. Theme Extraction

File: agents/theme_extractor.py

Understanding the themes and sentiments within the Reddit content is pivotal for generating relevant diss tracks. Leveraging Hugging Face's transformers library, RedDiss employs a zero-shot classification pipeline to identify dominant themes.

python
from typing import Any, Dict

from transformers import pipeline


class ThemeExtractor:
    def __init__(self):
        self.classifier = pipeline(
            "zero-shot-classification",
            model="facebook/bart-large-mnli",
            device=-1,  # CPU usage
        )
        self.candidate_themes = ["wealth/money", "success/achievements", ...]

    async def extract_themes(self, content: Dict[str, Any]) -> Dict[str, Any]:
        # main_content and themes_data are assembled in code elided from this excerpt
        main_themes = await self._classify_text(main_content)
        # Extract themes from comments
        # ...
        return themes_data

By analyzing both the main content and top comments, the theme extractor ensures a comprehensive understanding of the target's discourse.
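A zero-shot classification pipeline returns a dictionary with parallel `labels` and `scores` lists. The sketch below shows one way to turn that result into a short list of dominant themes. The `top_themes` function, its threshold, and the output keys are hypothetical, the `"appearance"` label is invented for illustration, and the example result is hand-written rather than real model output.

```python
from typing import Any, Dict, List


def top_themes(
    result: Dict[str, Any], threshold: float = 0.3, limit: int = 3
) -> List[Dict[str, Any]]:
    """Pick the highest-scoring labels from one zero-shot result."""
    pairs = sorted(
        zip(result["labels"], result["scores"]),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return [
        {"theme": label, "confidence": round(score, 3)}
        for label, score in pairs[:limit]
        if score >= threshold
    ]


# Hand-written example in the shape the pipeline returns.
example = {
    "sequence": "I made six figures before I turned twenty",
    "labels": ["wealth/money", "success/achievements", "appearance"],
    "scores": [0.62, 0.31, 0.07],
}
print(top_themes(example))
```

A threshold keeps weakly matching labels out of the lyrics prompt; the value 0.3 is arbitrary and would need tuning against real posts.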

4. Lyrics Generation

File: agents/lyrics_generator.py

At the heart of RedDiss lies its ability to craft diss track lyrics. Using Llama 3.3, served through Ollama and called via the litellm library, the generator produces verses tailored to the extracted themes and chosen style.

python
from typing import Any, Dict


class LyricsGenerator:
    def __init__(self):
        self.model = "ollama/llama3.3:latest"

    async def generate_lyrics(self, themes: Dict[str, Any], style: str) -> Dict[str, Any]:
        context = self._build_context(themes, style)
        lyrics = await self._generate_verses(context)
        structured_lyrics = self._structure_lyrics(lyrics)
        return structured_lyrics

The lyrics are scaffolded into structured formats, including verses, chorus, and outro, ensuring a coherent and impactful flow.
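The structuring step can be pictured as a small parser over the model's raw text. The sketch below assumes the model is prompted to emit bracketed section headers such as `[Verse 1]`. That format, the `structure_lyrics` function, and the key naming are all hypothetical, since the project's `_structure_lyrics` is not shown.

```python
import re
from typing import Dict, List, Optional

# Assumed header format: a line containing only "[Section Name]".
HEADER_RE = re.compile(r"^\[(?P<name>[^\]]+)\]\s*$")


def structure_lyrics(raw: str) -> Dict[str, List[str]]:
    """Group lyric lines under the most recent section header."""
    sections: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HEADER_RE.match(line)
        if match:
            current = match.group("name").lower().replace(" ", "_")
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections


raw_output = """
[Verse 1]
You post all day but you never log off
[Chorus]
Downvoted again
Downvoted again
[Outro]
Thread locked
"""
print(structure_lyrics(raw_output))
```

Lines that appear before the first header are dropped, which discards any preamble the model adds ahead of the lyrics.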

5. Flow Refinement

File: agents/flow_refiner.py

Raw lyrics can benefit from refinement to enhance their rhythmic and rhyming quality. The flow refiner employs Llama 3.3 to polish the generated lyrics, focusing on internal rhyme schemes, wordplay, and punchline effectiveness.

python
from typing import Any, Dict


class FlowRefiner:
    def __init__(self):
        self.model = "ollama/llama3.3:latest"

    async def refine_flow(self, lyrics: Dict[str, Any], flow_complexity: int) -> Dict[str, Any]:
        refined_lyrics = {}
        # Refine one section at a time, in order
        for section, content in lyrics.items():
            refined_lyrics[section] = await self._enhance_section(content, section, flow_complexity)
        return refined_lyrics

Each section is refined separately, and the flow_complexity value is passed along with it to steer how elaborate the rewrite should be.
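The loop above awaits one section at a time. Since the sections do not depend on each other, a possible variation is to refine them concurrently with `asyncio.gather`. This is a sketch of that alternative, not the project's approach: `refine_concurrently` and `fake_enhance` are hypothetical, and the fake enhancer stands in for the LLM call.

```python
import asyncio
from typing import Awaitable, Callable, Dict

# Signature of the per-section enhancer: (content, section, flow_complexity).
Enhancer = Callable[[str, str, int], Awaitable[str]]


async def refine_concurrently(
    lyrics: Dict[str, str], flow_complexity: int, enhance: Enhancer
) -> Dict[str, str]:
    """Refine all sections at once and keep the original section order."""
    names = list(lyrics)
    results = await asyncio.gather(
        *(enhance(lyrics[name], name, flow_complexity) for name in names)
    )
    return dict(zip(names, results))


async def fake_enhance(content: str, section: str, flow_complexity: int) -> str:
    """Stand-in for the LLM call; tags the text instead of rewriting it."""
    await asyncio.sleep(0)
    return f"{content} (x{flow_complexity})"


refined = asyncio.run(
    refine_concurrently({"verse_1": "line a", "chorus": "line b"}, 7, fake_enhance)
)
print(refined)
```

Whether this is faster in practice depends on the model server. A single local model may process requests one after another, in which case the sequential loop is just as good and simpler to debug.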

6. Text-to-Speech (TTS) Engine

File: agents/tts_engine.py

Transforming written lyrics into spoken word is achieved through the TTS engine. On macOS, RedDiss leverages the native say command, combined with ffmpeg for audio processing, to generate high-quality vocal tracks.

python
import subprocess
from typing import Any, Dict

import numpy as np
import soundfile as sf


class TTSEngine:
    def __init__(self):
        # Verify availability of 'say' and 'ffmpeg' (a missing binary raises FileNotFoundError)
        subprocess.run(['say', '-?'], capture_output=True)
        subprocess.run(['ffmpeg', '-version'], capture_output=True)

    async def text_to_speech(self, lyrics: Dict[str, Any]) -> str:
        audio_sections = []
        for section, content in lyrics.items():
            # Generate audio for each section (temp file setup is elided in this excerpt)
            subprocess.run(['say', '-v', 'Daniel', '-r', '220', '-f', temp_txt.name, '-o', temp_aiff.name], check=True)
            # Process with ffmpeg
            subprocess.run(['ffmpeg', '-i', temp_aiff.name, '-af', 'acompressor=...', '-ar', '44100', '-ac', '1', '-y', temp_wav.name], check=True)
            # Normalize and append
            audio_sections.append(audio_array)
        # Combine sections and save
        final_audio = np.concatenate(audio_sections)
        sf.write("data/audio/raw_vocals.wav", final_audio, 44100)
        return "data/audio/raw_vocals.wav"

This component ensures that the diss tracks not only look good on paper but also sound compelling to the ear.
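The two external commands can be built and checked separately from the code that runs them, which makes the TTS step easier to test on machines without `say`. The helpers below are hypothetical. They reuse the flags shown in the excerpt and leave out the `-af acompressor=...` filter because its settings are elided there.

```python
import shutil
from typing import List, Sequence


def missing_tools(tools: Sequence[str] = ("say", "ffmpeg")) -> List[str]:
    """Return the required executables that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def build_say_command(
    text_path: str, aiff_path: str, voice: str = "Daniel", rate: int = 220
) -> List[str]:
    """Command that reads text_path aloud into an AIFF file (macOS only)."""
    return ["say", "-v", voice, "-r", str(rate), "-f", text_path, "-o", aiff_path]


def build_ffmpeg_command(
    aiff_path: str, wav_path: str, sample_rate: int = 44100
) -> List[str]:
    """Command that converts the AIFF file to mono WAV at sample_rate."""
    return ["ffmpeg", "-i", aiff_path, "-ar", str(sample_rate), "-ac", "1", "-y", wav_path]


print(build_say_command("verse.txt", "verse.aiff"))
print(missing_tools())
```

Checking with `shutil.which` up front gives a clear message about which tool is absent, instead of a `FileNotFoundError` partway through generation.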

7. Beat Synchronization

File: agents/beat_sync.py

No diss track is complete without the right beat. The beat synchronizer aligns the vocal tracks with the chosen beats, ensuring timed precision and harmonious integration.

python
import librosa


class BeatSynchronizer:
    def __init__(self):
        self.target_tempo = 90  # BPM

    async def sync_to_beat(self, vocals_path: str, beat_url: str) -> str:
        # Load vocals and beat (librosa resamples both to its default rate)
        vocals, sr_vocals = librosa.load(vocals_path)
        beat_path = await self._download_beat(beat_url)
        beat, sr_beat = librosa.load(beat_path)
        # Analyze tempo and synchronize
        # Mix and save the final track
        return "data/audio/synced_track.wav"

By adjusting tempos and aligning beats, this module ensures that the diss tracks maintain a steady and immersive rhythm.
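The tempo adjustment comes down to a single ratio. The sketch below works through that arithmetic in plain Python; the function names are hypothetical, and the excerpt does not show which signal the project stretches. In a librosa-based implementation, the source tempo would typically come from `librosa.beat.beat_track` and the resulting rate would be passed to `librosa.effects.time_stretch`, where a rate below 1 slows the audio down.

```python
def stretch_rate(source_bpm: float, target_bpm: float = 90.0) -> float:
    """Playback-rate factor that moves source_bpm to target_bpm."""
    if source_bpm <= 0 or target_bpm <= 0:
        raise ValueError("tempos must be positive")
    return target_bpm / source_bpm


def stretched_duration(duration_s: float, rate: float) -> float:
    """Length of a clip after time-stretching by rate."""
    return duration_s / rate


rate = stretch_rate(120.0)             # 0.75: slow a 120 BPM beat to 90 BPM
print(rate)
print(stretched_duration(30.0, rate))  # a 30 s clip becomes 40 s
```

Large stretch factors tend to introduce audible artifacts, so a beat whose tempo is far from the 90 BPM target may be better replaced than stretched.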

8. Audio Mastering

File: agents/mastering.py

The final polish comes from the audio mastering component, which enhances the track's quality, balances audio levels, and ensures consistency across platforms.

python
import soundfile as sf


class AudioMaster:
    def __init__(self):
        self.target_lufs = -14.0  # integrated loudness target
        self.target_peak = -1.0   # peak ceiling in dB

    async def master_audio(self, audio_path: str) -> str:
        # Load audio, apply compression, EQ, stereo enhancement, and limiting
        # (processing elided in this excerpt; it produces `processed` and `sr`)
        # Save the mastered audio
        sf.write("data/audio/mastered/final_track.wav", processed, sr)
        return "data/audio/mastered/final_track.wav"

The -14 LUFS loudness target and -1 dB peak ceiling are in line with the levels commonly used for streaming delivery, so the finished track should play back at a consistent volume next to other music.
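The two targets interact: raising loudness to -14 LUFS must not push peaks above -1 dB. A simplified sketch of that trade-off follows. It assumes the loudness and peak of the input have already been measured by some other tool and applies a single static gain. The function names are hypothetical, and the project's chain also includes compression and limiting, which this does not model.

```python
def db_to_linear(db: float) -> float:
    """Convert a gain in decibels to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


def mastering_gain_db(
    measured_lufs: float,
    measured_peak_db: float,
    target_lufs: float = -14.0,
    target_peak: float = -1.0,
) -> float:
    """Largest static gain that approaches the loudness target
    without exceeding the peak ceiling."""
    loudness_gain = target_lufs - measured_lufs
    headroom = target_peak - measured_peak_db
    return min(loudness_gain, headroom)


# A quiet vocal mix: 6 dB below the loudness target, 2 dB of peak headroom.
gain = mastering_gain_db(measured_lufs=-20.0, measured_peak_db=-3.0)
print(gain)                # limited to 2.0 dB by the peak ceiling
print(db_to_linear(gain))  # multiplier to apply to the samples
```

In the example the peak ceiling, not the loudness target, is the limiting factor. Closing the remaining 4 dB gap is the job of a compressor or limiter, which reduces peaks so that more gain can be applied.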

Streamlit Front-End

File: streamlit_app.py

The user-facing interface of RedDiss is built with Streamlit, offering an intuitive platform for users to generate diss tracks effortlessly.

  • Input Section: Users provide a Reddit post URL.
  • Settings: Options to select diss track style (Aggressive, Playful, Sarcastic), adjust flow complexity, and beat intensity.
  • Generate Button: Initiates the diss track creation process.
  • Output: Displays generated lyrics and an audio player for the final track, along with a download option.

python
import streamlit as st


def main():
    st.title("Reddit Diss Track Generator")
    reddit_url = st.text_input("Enter Reddit Post URL", placeholder="https://reddit.com/r/...")
    style = st.selectbox("Diss Track Style", ["Aggressive", "Playful", "Sarcastic"])
    flow_complexity = st.slider("Flow Complexity", 1, 10, 5)
    beat_intensity = st.slider("Beat Intensity", 1, 10, 5)
    if st.button("Generate Diss Track"):
        # Orchestrate the diss track generation process
        # Display lyrics and audio
        pass  # pipeline calls are elided in this excerpt

This seamless user experience ensures that both novices and experts can harness the power of RedDiss with ease.
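The excerpt does not show whether the Streamlit app calls the agents directly or goes through the FastAPI backend. Assuming the latter, the front end would need to turn the form values into a request to `/generate_diss`. The sketch below builds such a URL; `build_generate_url` and the `http://localhost:8000` base address are hypothetical.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def build_generate_url(
    base_url: str,
    reddit_url: str,
    style: str,
    flow_complexity: int = 5,
    beat_url: Optional[str] = None,
) -> str:
    """Build the GET URL for the /generate_diss endpoint."""
    params = {"url": reddit_url, "style": style, "flow_complexity": flow_complexity}
    if beat_url:
        params["beat_url"] = beat_url
    # urlencode percent-escapes the nested Reddit URL so it survives as one parameter
    return f"{base_url.rstrip('/')}/generate_diss?{urlencode(params)}"


request_url = build_generate_url(
    "http://localhost:8000/",
    "https://reddit.com/r/test/comments/abc",
    "Playful",
)
query = parse_qs(urlsplit(request_url).query)
print(request_url)
print(query["url"])
```

The endpoint signature shown in the backend section has no parameter for the Beat Intensity slider, so that setting is left out of the request here.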

Backend Integration with FastAPI

File: main.py

RedDiss's backend is powered by FastAPI, facilitating efficient handling of API requests and orchestrating the diss track generation workflow.

python
from typing import Optional

from fastapi import FastAPI, HTTPException

app = FastAPI(title="Diss Track AI", description="AI-powered diss track generator using Reddit content", version="1.0.0")

@app.get("/generate_diss")
async def generate_diss(url: str, style: str, beat_url: Optional[str] = None, flow_complexity: int = 5):
    try:
        # Sequentially execute scraping, sanitization, theme extraction, lyrics generation, flow refinement, TTS, beat sync, and mastering
        # (the calls that produce refined_lyrics and final_track are elided in this excerpt)
        return {"status": "success", "lyrics": refined_lyrics, "audio_file": final_track}
    except Exception as e:
        # Handle and log errors
        raise HTTPException(status_code=500, detail={"error": str(e), "step": "unknown"})

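Because the endpoint is declared async, FastAPI can serve other requests while one pipeline awaits I/O such as the Reddit API or the LLM. Blocking steps, such as the subprocess calls in the TTS engine, would still hold up the event loop unless they are moved to a worker thread.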
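The error response in the excerpt always reports `"step": "unknown"`. One way to report the actual failing stage is to run the stages through a small loop that remembers each stage's name. The sketch below is an assumption about how that could be done: `run_pipeline`, `PipelineError`, and the stub stages are hypothetical.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Tuple

Step = Tuple[str, Callable[[Any], Awaitable[Any]]]


class PipelineError(Exception):
    """Wraps a stage failure together with the name of the stage."""

    def __init__(self, step: str, cause: Exception):
        super().__init__(f"{step} failed: {cause}")
        self.step = step
        self.cause = cause


async def run_pipeline(steps: List[Step], initial: Any) -> Any:
    """Feed each stage's output into the next, tagging failures by stage."""
    value = initial
    for name, func in steps:
        try:
            value = await func(value)
        except Exception as exc:
            raise PipelineError(name, exc) from exc
    return value


async def scrape_stub(url: str) -> dict:
    return {"title": "Example", "url": url}


async def sanitize_stub(content: dict) -> dict:
    raise ValueError("empty post body")


try:
    asyncio.run(run_pipeline([("scrape", scrape_stub), ("sanitize", sanitize_stub)], "u"))
except PipelineError as err:
    failed_step, message = err.step, str(err.cause)
    print(failed_step, message)
```

In the endpoint, the handler could then catch `PipelineError` and put `err.step` into the `detail` dictionary in place of the fixed string.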

Testing and Quality Assurance

File: tests/test_sanitizer.py

To maintain high-quality outputs, RedDiss incorporates a suite of tests using pytest. For instance, the sanitizer module is tested to confirm that it cleans and preprocesses Reddit content as expected. The @pytest.mark.asyncio marker used here is provided by the pytest-asyncio plugin.

python
import pytest

from agents.sanitizer import clean_text


@pytest.mark.asyncio
async def test_clean_text():
    test_data = {
        "title": "Test [Post] with http://example.com URLs",
        "selftext": "Some & special characters < here >",
        # ...
    }
    result = await clean_text(test_data)
    # Brackets and the URL are removed and the text is lowercased
    assert result["title"] == "test post with urls"
    # Additional assertions

These tests validate the functionality of each component, ensuring that RedDiss operates smoothly and produces accurate results.

Installation and Deployment

File: README.md

Setting up RedDiss is straightforward, guided by comprehensive documentation. Here's a condensed version of the installation steps:

  1. Clone the Repository

    bash
    git clone https://github.com/kliewerdaniel/RedDiss.git
    cd RedDiss

  2. Install Dependencies

    bash
    pip install -r requirements.txt

  3. Set Up Environment Variables

    Create a .env file in the root directory with Reddit API credentials:

    text
    REDDIT_CLIENT_ID=your_client_id
    REDDIT_CLIENT_SECRET=your_client_secret
    REDDIT_USER_AGENT=DissTrackAI/1.0.0

  4. Run the Streamlit App

    bash
    streamlit run streamlit_app.py

This streamlined setup ensures that users can quickly get started, tapping into the full potential of RedDiss without unnecessary hurdles.
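A missing credential otherwise surfaces only when the first Reddit request fails, so a quick check at startup can save some debugging. The sketch below is a hypothetical helper, not part of the repository. Note that `os.getenv` does not read a .env file by itself; how the project loads the file is not shown, and a loader such as python-dotenv is a common choice for that step.

```python
import os
from typing import List, Mapping

REQUIRED_VARS = ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "REDDIT_USER_AGENT")


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
partial = {"REDDIT_CLIENT_ID": "abc123", "REDDIT_USER_AGENT": ""}
print(missing_env_vars(partial))
```

Calling `missing_env_vars()` with no argument checks the real process environment, and an empty list means all three credentials are present.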

Conclusion and Future Prospects

RedDiss exemplifies the harmonious integration of data extraction, natural language processing, and audio engineering. By transforming raw Reddit content into personalized diss tracks, it not only showcases the capabilities of modern AI but also underscores the potential for creative applications in digital entertainment.

Daniel Kliewer's methodical approach—evident in the project's structured architecture and comprehensive testing—lays a solid foundation for future enhancements. Potential avenues for expansion include incorporating more diverse AI models for lyric generation, enhancing beat synchronization with a broader library of beats, and expanding the application's reach to other social media platforms.

As AI continues to redefine the boundaries of creativity, projects like RedDiss pave the way for innovative applications that blend technology with artistic expression, offering users unique and personalized experiences in the digital age.
