Here's an uncomfortable truth: I spend way too much time on Twitter. Or X. Whatever we're calling it now.
Part of it is genuine interest—there's real value in tech Twitter, the discussions around AI, startups, remote work. But part of it is just the dopamine hit of engagement. Someone posts something interesting, I reply, maybe it gets some likes, maybe it starts a conversation.
What if I could automate that last part?
The Idea
Not bot spam. Not generic "Great post!" replies. What if I could build an AI that actually understood my voice, my interests, my way of communicating—and replied to tweets as I would, while I'm doing something else?
The ethical questions are obvious and I'll address them. But the technical challenge was interesting enough that I decided to build it anyway.
Capturing My Voice
The first step was figuring out what makes my tweets mine.
I exported my tweet history and started analyzing patterns. Turns out I have some pretty consistent quirks:
- Heavy use of abbreviations: "imo", "tbh", "def", "idk", "rn"
- Casual openers: "Just...", "Soo...", "I feel like..."
- Questions that aren't really questions: "Anyone else think X?" (rhetorical)
- Bilingual switches: English mostly, German when talking to German accounts
- Specific interests: AI, remote work, Germany vs. other countries, Tesla/EVs
I compiled all of this into a detailed style guide—about 3000 words describing how I write, what topics I engage with, what takes I'd have on various subjects, and crucially, what I wouldn't say.
The style guide became the system prompt for GPT-4o-mini. Every reply request includes this context.
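The generation call itself is nothing fancy. Here's a minimal sketch using the official openai Node SDK (styleGuide stands in for my actual 3000-word guide, and the prompt wording is illustrative, not copied from my repo):

```typescript
import OpenAI from "openai";
import { readFileSync } from "fs";

const openai = new OpenAI(); // picks up OPENAI_API_KEY from the environment
const styleGuide = readFileSync("style-guide.md", "utf8"); // the ~3000-word voice description

// Draft a reply in my voice: the style guide rides along as the system
// prompt, the tweet being replied to goes in the user prompt.
async function draftReply(tweetText: string, author: string): Promise<string> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      { role: "system", content: styleGuide },
      {
        role: "user",
        content: `@${author} tweeted:\n"${tweetText}"\n\nWrite a reply in Julian's voice. Keep it under 280 characters.`,
      },
    ],
    temperature: 0.8,
    max_tokens: 120,
  });
  return completion.choices[0].message.content?.trim() ?? "";
}
```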
The Architecture
The bot is surprisingly simple:
Every 20 minutes:
1. Fetch new tweets from target accounts
2. For each tweet:
a. Ask LLM: "Is this relevant to Julian's interests?" (yes/no)
b. If yes, ask LLM: "Generate a Julian-style reply"
c. Send reply to Telegram for approval
3. Wait for user approval or timeout
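Translated into code, the loop is roughly this. It's a sketch, not my actual source: the four helpers are declared here as assumptions and fleshed out in the sections below.

```typescript
// Poll loop, runs every 20 minutes. Helper signatures are assumptions;
// sketches of each appear later in the post.
declare function fetchNewTweets(account: string): Promise<{ id: string; text: string }[]>;
declare function isRelevant(text: string): Promise<boolean>;
declare function draftReply(text: string, author: string): Promise<string>;
declare function sendForApproval(tweetId: string, reply: string): Promise<void>;

const POLL_INTERVAL_MS = 20 * 60 * 1000;
const TARGET_ACCOUNTS = ["some_builder", "some_friend"]; // placeholder handles

async function tick(): Promise<void> {
  for (const account of TARGET_ACCOUNTS) {
    const tweets = await fetchNewTweets(account); // Twitter API v2, since the last seen ID
    for (const tweet of tweets) {
      if (!(await isRelevant(tweet.text))) continue; // cheap yes/no LLM filter first
      const reply = await draftReply(tweet.text, account);
      await sendForApproval(tweet.id, reply); // Telegram keeps me in the loop
    }
  }
}

setInterval(() => {
  tick().catch(console.error);
}, POLL_INTERVAL_MS);
```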
Target accounts are people whose content I genuinely find interesting—other builders, tech commentators, a few friends. The bot monitors their tweets specifically rather than my entire timeline.
Relevance filtering is crucial. Not every tweet deserves a reply, even from accounts I follow. The LLM gets context about my interests and decides whether I'd actually engage with this specific tweet.
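The relevance check is a single cheap completion. Something like this, where the interests summary is a stand-in for the longer one I actually use:

```typescript
import OpenAI from "openai";

const openai = new OpenAI();

// Stand-in; the real prompt describes my interests in more detail.
const INTERESTS = "AI, startups, remote work, Germany vs. other countries, Tesla/EVs";

// A strict yes/no gate before spending tokens on a full draft.
async function isRelevant(tweetText: string): Promise<boolean> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      {
        role: "system",
        content: `Julian's interests: ${INTERESTS}. Answer only "yes" or "no".`,
      },
      { role: "user", content: `Would Julian reply to this tweet?\n"${tweetText}"` },
    ],
    temperature: 0,
    max_tokens: 3,
  });
  return completion.choices[0].message.content?.toLowerCase().startsWith("yes") ?? false;
}
```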
Telegram approval is the safety net. Every generated reply comes to my phone for review. I can approve (post it), edit (adjust and post), or reject (skip). This keeps me in the loop without requiring constant attention.
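The Telegram side is a few lines of Telegraf. Here's a trimmed sketch with approve/reject buttons; the edit flow and error handling are left out, and postReply (the actual Twitter API call) is assumed:

```typescript
import { Telegraf, Markup } from "telegraf";

declare function postReply(tweetId: string, text: string): Promise<void>; // Twitter API v2 call

const bot = new Telegraf(process.env.TELEGRAM_BOT_TOKEN!);
const CHAT_ID = process.env.TELEGRAM_CHAT_ID!; // my personal chat with the bot

// Drafts waiting for a decision, keyed by the tweet being replied to.
const pending = new Map<string, string>();

async function sendForApproval(tweetId: string, reply: string): Promise<void> {
  pending.set(tweetId, reply);
  await bot.telegram.sendMessage(
    CHAT_ID,
    `Proposed reply:\n\n${reply}`,
    Markup.inlineKeyboard([
      Markup.button.callback("Post", `approve:${tweetId}`),
      Markup.button.callback("Skip", `reject:${tweetId}`),
    ])
  );
}

bot.action(/approve:(.+)/, async (ctx) => {
  const tweetId = ctx.match[1];
  const reply = pending.get(tweetId);
  if (reply) {
    await postReply(tweetId, reply);
    pending.delete(tweetId);
  }
  await ctx.answerCbQuery("Posted");
});

bot.action(/reject:(.+)/, async (ctx) => {
  pending.delete(ctx.match[1]);
  await ctx.answerCbQuery("Skipped");
});

bot.launch();
```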
The Results
After a month of running, some observations:
Hit rate: About 60% of generated replies I'd approve as-is. Another 25% need minor edits. The remaining 15% are off-base and get rejected.
Voice accuracy: The style guide works. Several times I've approved a reply and forgotten about it, then later saw the tweet and thought "that sounds like me." That's both the goal and slightly unnerving.
Engagement: The approved replies perform roughly the same as my manual replies. No magic boost, but no penalty either. People can't tell the difference (or if they can, they're not calling it out).
Time saved: Hard to quantify. I still read Twitter, still engage manually with things that matter. But the bot handles the "quick take on interesting post" category that used to eat 30+ minutes of my day.
The Ethics
Let's address the elephant in the room: is this deceptive?
I've thought about this a lot. Here's where I landed:
It's still my voice. The style guide is based on how I actually communicate. The opinions expressed are opinions I actually hold. The bot isn't inventing positions—it's articulating mine.
I approve everything. No reply goes out without my explicit approval. The AI drafts, I decide. That's no different from having an assistant draft emails.
The alternative isn't silence. Before the bot, I'd often see a tweet, think "I should reply to that," then get distracted and forget. The bot makes sure I actually engage with content I find interesting, rather than just intending to.
But also: I don't disclose that specific replies are AI-assisted. Is that deceptive? Maybe. The line between "AI-drafted, human-approved" and "human-written" is blurrier than I expected. I'm not sure I have a clean answer.
What Didn't Work
Autopilot mode: I built a fully autonomous mode where the bot would post without approval. Ran it for two days. Never again. The 15% of bad replies aren't just mediocre—they can be genuinely off. Wrong tone for a sensitive topic, misunderstanding context, occasionally just factually wrong. Human review is mandatory.
Generic replies: Early versions would sometimes generate perfectly reasonable replies that just... weren't me. "Great insight!" "Interesting perspective!" Technically fine, completely lacking personality. The detailed style guide fixed this, but it took many iterations.
Controversial topics: The bot now has a hard-coded list of topics it will never engage with: politics (mostly), religion, personal drama. Not because I don't have opinions, but because nuance is hard and the risk of a bad take going viral isn't worth it.
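The filter itself is deliberately dumb. Roughly this, with an illustrative list rather than my real one:

```typescript
// Hard skip-list, checked before the relevance call even runs.
// Crude keyword matching on purpose: a false positive is cheap,
// a bad take on a sensitive topic is not. The list is illustrative.
const BLOCKED_TOPICS = ["election", "politics", "religion"];

function isBlocked(tweetText: string): boolean {
  const lower = tweetText.toLowerCase();
  return BLOCKED_TOPICS.some((topic) => lower.includes(topic));
}
```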
The Technical Details
For the curious:
- Language: TypeScript/Node.js
- Twitter API: v2 endpoints with OAuth 1.0a user-context authentication
- LLM: GPT-4o-mini for both relevance checking and reply generation
- Telegram: Telegraf library for the approval bot
- Hosting: Docker container on my VPS
- Rate limiting: Max 3 replies per hour (Twitter's free tier limits + not wanting to spam); see the sketch below
Total cost: about $5/month in API calls. The Twitter API free tier handles the read/write volume easily.
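The rate limiter is a sliding window in memory. A sketch; the real thing would want to persist across restarts:

```typescript
// Allow at most 3 posted replies in any rolling one-hour window.
const MAX_REPLIES_PER_HOUR = 3;
const WINDOW_MS = 60 * 60 * 1000;
const postTimestamps: number[] = [];

function canPostNow(): boolean {
  const cutoff = Date.now() - WINDOW_MS;
  // Drop timestamps that have aged out of the window.
  while (postTimestamps.length > 0 && postTimestamps[0] < cutoff) {
    postTimestamps.shift();
  }
  return postTimestamps.length < MAX_REPLIES_PER_HOUR;
}

function recordPost(): void {
  postTimestamps.push(Date.now());
}
```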
Where This Goes
I'm genuinely uncertain whether to continue using this.
On one hand, it works. It saves time, maintains engagement, produces content I'm happy with.
On the other hand, there's something uncomfortable about automating self-expression. Twitter is supposed to be spontaneous, raw, real. Is a carefully reviewed AI draft still "real"?
Maybe the distinction doesn't matter as much as I think. Every tweet I write goes through some mental filtering—"is this how I want to be perceived?"—before posting. The AI just externalizes that process.
Or maybe that's just rationalization.
For now, the bot continues running. I continue approving or rejecting its suggestions. And I continue being unsure whether I've built something useful or something I should be uncomfortable about.
The line between augmentation and automation is thinner than I expected.