Project Boo 🤖
Boo is a general-purpose Discord bot with the lofty goal of becoming J.A.R.V.I.S.'s sidekick.
It fully respects privacy and is a free-for-now personal hobby bot.
Once, a friend asked why Boo is free: am I collecting data? I have never collected data to train on, although I hand-feed it a few details about my friends to help it understand nuances. It is freely accessible simply because I think of it as a hobby for tinkering with tech, and it runs on trial credits, student credits, and researcher perks. You may have heard of people who tinker with electronic gadgets or code for the adrenaline rush; Boo is the same for me. When it helps a user with an immediate information need, I feel gratified, because it matches my core value of AI as an augmenter of human intellect. Although it started as just a chat companion that could fetch specific details about books, Wikipedia articles, dramas, and so on, the long-term goal for Boo has always been to reach the level of Jarvis's sidekick. The same product could be integrated into Discord, X, Reddit, WhatsApp, Telegram, SMS, voice, email, speakers, and more, and feel seamless to any user. Consider a few examples:
- I want to turn on the AC/geyser half an hour before I reach home – Boo should use my Uber booking, my geolocation, and Google Maps traffic estimates to work out automatically when to turn it on.
- I want to cook Italian because some Italian friends are coming for dinner – Boo should use kitchen cameras and mini cameras in the fridge to figure out what items are there, plan possible Italian dishes with those items, work out which ingredients are missing, account for my friends' allergies, give me a heads-up, and order the missing items from Instamart/BigBasket/Zepto/Amazon Fresh so that they are ready to pick up by the time I reach home.
- I want to go to Luxembourg – Boo should not only plan itineraries but also peruse dozens of travel blogs, friends' profiles, and the websites I care about or have bookmarked, then summarise all the highs and lows with source links so I can read the longer versions.
- I am cooking a new dish – Boo should not only find the recipe but also guide me at the right intervals with visual feedback, along with historical information about the dish, nutrition, stories, notes, and my past cooking preferences, all while I cook hands-on and add my own creative twists on the fly.
- I want to schedule an orthodontist appointment – given my treatment intent and preferences, Boo should find phone numbers from Google Maps, call them all at once in my cloned voice while speaking each callee's language, update all transcripts in parallel in real time, smoothly work my follow-up questions into the calls, and summarise at the end based on further intents.
- It has to figure things out on its own: if I have given it my bucket list and one item is watching a rare astronomical phenomenon, it should track when the next one occurs from suitable sources and inform me just before it, even if I forget.
- Most notably, to raise my debating skills, it has to seamlessly reformulate, question, and defend multiple creative ideas while exploring an area.
- It watches my porch camera feed and automatically detects true abnormalities.
- It schedules restaurant meetups automatically based on food tastes, location, timings, calendar events, and reminders, then analyses reviews, the menu, and personal tastes to recommend suitable dishes to order during the visit.
None of these are improbable, and all of them already exist in some form. Still, I need a unified product that can seamlessly pull off multiple tasks: a single product personalised to my tasks, my preferences, and my computational setup, with a high degree of customisation.
Towards this goal, Boo Version 2.0, which is currently underway, can do things more accurately by doing RAG on steroids: integrating Google results, Wikipedia, and popular websites like YouTube, Amazon, IMDb, Goodreads, and BookMyShow; using a calculator and code generation to auto-solve complex problems; getting the latest news; auto-generating memes; and fetching images, all while seamlessly understanding however I interact with it, whether by text, voice message, or image.
For example, asking “When is XYZ event?” should get an accurate answer, unlike with current LLMs. If I ask for further details, it should succinctly summarise dozens of news articles; if it's a concert, it should grab the music album links when I ask, check whether I can book on BookMyShow, calculate how much it costs for, say, three friends, answer in the appropriate currency, create a funky meme from an original photo with context to share on social media, ...
All code updates for Boo:
DONE [newer on top]
- url virus scan command ';vt'
- removed guilt tripping food pics and auto-delete morning coffee pics after a few seconds
- memix gif generation
- insta embed links in threads
- auto-generate tldr of long messages or URLs [beta v2]
- google and fact-check [beta v2]
- currency message recognition and converter [beta v2]
- math and metric units recognise and convert [beta v2]
- added auto routing model for commands vs chat [beta v2]
- Gemini Vision integration done to understand images better
- PaLM 2 to Gemini Pro migration
- migrated from DuckDuckGo to Google for image search
- new state-of-the-art model for faster audio transcriptions (nova)
- image model integration for sarcastic image comebacks (resnet and otter)
- memes generated from slickest model (mistral)
- auto-generate more quality memes from popular memes (memegen)
- on-the-fly mashup of classical art with famous artists for creative memes with a realistic vision
- A.I. gifs for boson call with bge embeddings
- safeguards on L.L.M. prompt hacking and sexual filter (lakera)
- inform in first reply if updated/crashed
- asynchronous audio transcriptions and cooldown timer on overload/fails
- helpful error messages and log errors properly
- change the Goodreads link to biblioreads
- better urban dictionary and hidden command
- sync emoji list for msg reacts
- pings back if trigger word in msg; auto dnd when the user is alive
- NYTimes Spelling Bee in “Hyderabad” server – auto check, leaderboard, hints
- simpler ;help command
- plays music from a laptop
- new commands: Wikipedia, news, videos, translate, and maps
- rewrote code for random coffee, tea, food, cat, dog pics in chat. personalise with username
- daily weather alerts; cron on I.B.M. weather API and AccuWeather
- auto personalised join message poem based on username – “Bangalore” server
- auto-delete text messages in gallery – “Bangalore” server
- send auto Insta and Reddit embed fix links (ddinstagram and rxddit)
- delete command
- more finetuned replies to bglr and hyd locality; flirting and trolling :P
- username into context for reply
- safety filter for GIFs
- use the context of previous few replies while replying
- G.P.T.-3 to PaLM 2 migration
- auto transcribe voice messages (Whisper)
- leave server command
- searches movies (omdb and libremdb)
- chat blacklist
- does emoji react
- commands for Google Knowledge Graph and Wolfram Alpha
- commands for quotes, dictionary and dating questions (wiktionary and okcupid)
- command for hugging random users in the server
- searches images and books (duckduckgo and goodreads)
- make memes on the fly from templates (Ada embeddings)
- more personalised replies for friends
- replies with gifs (Giphy)
- boo profile status updates for Boson's friends
- ChatGPT became public two months after Boo
- switching to a better model (G.P.T.-3)
- trained on multiple kdramas subtitles to reply as kdrama character
- boo started chatting
- searches kdramas (mydramalist)
- boo was born on Azure in “kdrama” server
- bored Boson in the hospital, so it's a hobby project :)
TODO [priority on top]
- Pomodoro timer and group notifications – joining.
- auto-roast troll users from the bot after warnings
- auto roast the banned/muted users
- Thanos ban command for server owners for inactive purge or mass bans
- Karaoke option with AI model that gives only music, music + low sound vocals, normal song with music gaps skipped, and lyrics
- more rickrolling and pranks
- auto-reverse user gif
- recognise and inform acrostics
- test whether the cache works properly. – monitor discord API calls
- increase chain length for context if the cache works
- auto-documentation and sanitisation of entire code with A.I.
- auto sketch animation videos as replies – livesketch (Tel-Aviv)
- A.I. auto-dance moves conditioned on music being played – EDGE (Stanford)
- auto-generate A.I. gifs on the fly – animatediff
- finetuned LoRAs on mistral for multiple topics and auto-routing
- auto friendship match for opt-in users? all pairs processing on these users' chat patterns
- if new knn/ann updates in radius R, auto announce new friendship match
- voice cloning of famous personalities in replies
- seamlessly voice chat in voice channels. Integrate Boson's Project Roo