XKCD: Create XKCD comic generator #1404
Use RAG with ChatGPT to store all XKCD comics and display the correct one based on the chat history. There is also the possibility to specify your own XKCD if you wish. Signed-off-by: Chris Sdogkos <work@chris-sdogkos.com>
Zabuzard
left a comment
the chatgptservice class should not deal with xkcd stuff. refactor the code so that anything xkcd is done in the xkcd classes instead. the responsibilities should be correct.
why does it need to do file io? i would like to avoid that if possible.
@Zabuzard I agree with you on that. The
Could you clarify what you mean by that? Are you trying to avoid storing files completely? Or just using |
Signed-off-by: Chris Sdogkos <work@chris-sdogkos.com>
- ChatGptService: Refrain from polluting it with XKCD related calls,
- ChatGptService: provide JavaDocs to the methods that don't have one,
- ChatGptService: remove unused sendWebPrompt method
- XkcdCommand and XkcdRetriever: Refactor code into functions for
readability.
Signed-off-by: Chris Sdogkos <work@chris-sdogkos.com>
@Zabuzard |
private static final Logger logger = LoggerFactory.getLogger(XkcdCommand.class);

public static final String COMMAND_NAME = "xkcd";
These don't need to be public
Has been fixed.
private static final HttpClient CLIENT =
        HttpClient.newBuilder().connectTimeout(Duration.ofSeconds(10)).build();
private static final String XKCD_GET_URL = "https://xkcd.com/%d/info.0.json";
public static final String SAVED_XKCD_PATH = "xkcd.generated.json";
Please can you group public and private separately, so that the public one hiding amongst a bunch of private ones is clear.
Has been fixed.
Semaphore semaphore = new Semaphore(FETCH_XKCD_POSTS_SEMAPHORE_SIZE);

logger.info("Fetching {} XKCD posts...", XKCD_POSTS_AMOUNT);
try (ExecutorService executor = Executors.newFixedThreadPool(FETCH_XCKD_POSTS_POOL_SIZE)) {
What is this method supposed to be doing? I don't understand the context enough to know why we need an executor.
It's supposed to safely retrieve all XKCD comics in groups with the goal of not oversaturating the API endpoint used to get them. Without it, we will get rate limited.
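For readers without the full diff, here is a minimal sketch of the throttling idea described in this reply. It is not the PR's actual code: the class name, the two limits and the `fetcher` parameter are all hypothetical, and the real HTTP call is replaced by a pluggable function. Explicit shutdown is used instead of try-with-resources so the sketch also compiles on older JDKs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.function.IntFunction;

final class ThrottledFetchSketch {
    // Hypothetical limits; the values used in the PR are not shown in this thread.
    static final int MAX_IN_FLIGHT = 4;
    static final int POOL_SIZE = 8;

    /** Fetches ids 1..amount, with at most MAX_IN_FLIGHT fetches running at once. */
    static List<String> fetchAll(int amount, IntFunction<String> fetcher) {
        Semaphore semaphore = new Semaphore(MAX_IN_FLIGHT);
        ExecutorService executor = Executors.newFixedThreadPool(POOL_SIZE);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (int id = 1; id <= amount; id++) {
                final int comicId = id;
                pending.add(executor.submit(() -> {
                    semaphore.acquire(); // wait for a free slot
                    try {
                        return fetcher.apply(comicId);
                    } finally {
                        semaphore.release(); // free the slot even on failure
                    }
                }));
            }
            // Collect results in submission order (comic 1 first).
            List<String> results = new ArrayList<>();
            for (Future<String> future : pending) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while fetching", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A fetch failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Note that a semaphore caps how many requests run concurrently; it does not by itself enforce a requests-per-second limit, so a delay between requests may still be needed if the endpoint rate limits by time window.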
You're downloading all the comics?
Yes, for reasons explained further down this pull request.
private static String getChatgptRelevantPrompt(String discordChat) {
    return """
            <discord-chat>
            %s
A single Discord message could be 2000 characters. This will blow the context way out. Please handle this edge case.
This will blow the context way out.
At what limit would the context be blown way out? 2000 characters is a limit imposed by Discord for chat messages, but I suspect that it's different when using OpenAI.
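One possible way to handle the edge case, whatever the real limit turns out to be, is to cap the chat history at a character budget and keep only the newest messages. The sketch below assumes this approach; the class name, method name and `maxChars` parameter are hypothetical, and a character count is only a rough stand-in for the model's token limit.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class ChatBudgetSketch {
    /**
     * Keeps the newest messages whose combined length fits within maxChars.
     * Messages are expected oldest-first and are returned oldest-first.
     */
    static List<String> keepNewestWithinBudget(List<String> messages, int maxChars) {
        Deque<String> kept = new ArrayDeque<>();
        int used = 0;
        // Walk backwards so the most recent messages are considered first.
        for (int i = messages.size() - 1; i >= 0; i--) {
            String message = messages.get(i);
            if (used + message.length() > maxChars) {
                break; // this message and everything older is dropped
            }
            kept.addFirst(message);
            used += message.length();
        }
        return new ArrayList<>(kept);
    }
}
```

The trimmed list would then be joined and passed into the prompt template in place of the full history.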
Signed-off-by: Chris Sdogkos <work@chris-sdogkos.com>
could u explain quickly why it does need to do file stuff in general? id like to avoid reading/writing files if possible. config, database and in-memory cache should ideally be enough for everything we do. this would be a first-time for this bot, making the setup more complex overall. so id like to avoid it but first id like to understand why we need files for this feature, cheers
@Zabuzard We need to be able to locally reference information about XKCD comics in order to not oversaturate the XKCD API endpoint.
One way we can avoid it would be to use the SQLite database we have and store all of the XKCD comics there. Would that be a solution that could work?
Why do we need to store the comics instead of just posting a URL or downloading the content from the URL ad hoc if really needed? Like, pick the comic u want and then post the URL to it in the embed / download the content from the URL, attach it to the embed and that's it. I don't see why we need to store all comics on our side.
Sure, for just displaying a comic in an embed, it's easy to just put the URL directly from the XKCD API, but OpenAI needs to have a vector store in order to be aware of all the posts so that it knows which one is more relevant when it gets asked. That's the reason this file is made in the first place, because it's uploaded as a vector store for RAG purposes.
(Comment deleted due to duplicate)
What
When a user calls /xkcd relevant [n], our integrated ChatGPT reads the last $n$ messages and tries to post an XKCD comic relevant to the dialogue being held at the moment the command was executed. If the user supplies no $n$, a default of $n = 100$ is used.
For the times when it's needed, with /xkcd custom <id> (where $id$ stands for the XKCD comic number) the bot will send that XKCD comic in the chat. It's as simple as that.
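As a rough illustration of the two inputs described above, the sketch below resolves the optional message count with its default of 100 and builds the per-comic info URL from the format string that appears in the diff. The class and method names are hypothetical and not taken from the PR.

```java
import java.util.Locale;
import java.util.OptionalInt;

final class XkcdOptionsSketch {
    static final int DEFAULT_MESSAGE_COUNT = 100;
    // Same format string as XKCD_GET_URL in the diff under review.
    static final String XKCD_GET_URL = "https://xkcd.com/%d/info.0.json";

    /** Returns the requested n, or the default when the user supplied none. */
    static int resolveMessageCount(OptionalInt requested) {
        return requested.orElse(DEFAULT_MESSAGE_COUNT);
    }

    /** Builds the JSON info URL for a given comic number. */
    static String infoUrl(int comicId) {
        if (comicId < 1) {
            throw new IllegalArgumentException("Comic id must be positive: " + comicId);
        }
        // Locale.ROOT keeps the digits plain ASCII regardless of the host locale.
        return String.format(Locale.ROOT, XKCD_GET_URL, comicId);
    }
}
```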
How
Every time the bot launches, if there's no xkcd.generated.json file found and there's no vector store of the XKCD comics uploaded on OpenAI, the bot spends some time downloading all of the available XKCD comics and creates that file. It then attempts to upload it as a vector store.
If the file is not found but the vector store is already uploaded on OpenAI, the bot will still fetch the comics and recreate the file, because it needs it for retrieving individual XKCDs and because we should not bother the XKCD API every time. And it's a 3MB file anyway.
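The launch-time behaviour can be summarised as a small decision function. This is only a sketch with hypothetical names; in particular, the case where the file exists but no vector store is uploaded is not spelled out in the description, so the `UPLOAD_ONLY` branch is an assumption.

```java
final class XkcdStartupSketch {
    enum Action { DOWNLOAD_AND_UPLOAD, DOWNLOAD_ONLY, UPLOAD_ONLY, NOTHING }

    /** Decides what to do at launch from the two existence checks. */
    static Action decide(boolean fileExists, boolean vectorStoreExists) {
        if (!fileExists) {
            // The local file is always needed for looking up individual comics.
            return vectorStoreExists ? Action.DOWNLOAD_ONLY : Action.DOWNLOAD_AND_UPLOAD;
        }
        // Assumption: a present file with no vector store triggers only the upload.
        return vectorStoreExists ? Action.NOTHING : Action.UPLOAD_ONLY;
    }
}
```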
Why
It's a fun feature, people can get a laugh out of it. It's harmless, and a fun use of ChatGPT. I don't see why it shouldn't be included.
Preview