
Let us know how this works for you!


Also found the blog post describing what's what on https://sci-hub.box

Hear the good news: recent advances in artificial intelligence have enabled Sci-Hub to launch a robot that gives scientifically grounded responses to questions. The robot starts by searching for relevant literature in the Sci-Hub database, then selects and reads the most recent studies, and composes an answer based on this information. The answer includes all the references, and each referenced article can be read on Sci-Hub with one click.

Unlike question-answering robots that were based on the early generation of neural networks, the Sci-Hub bot does not hallucinate, does not make up scientific facts, and does not cite sources that do not exist. To support its statements, Sci-Bot uses articles from the Sci-Hub database. Questions can be asked in any language, and answers can be saved on the server and shared.

The alpha version only supports answering one question at a time; a more advanced version that supports conversation mode is coming soon. The right column displays example questions that have been answered by the robot. Push a question to see the generated answer.

That's useful context.

Also

I forgot she's a bit of a shitcoiner... #981532


Most people are shitcoiners - this shouldn't be surprising

However, the entire sci-hub collection is available through torrents. This can be reproduced, shared, published:

  1. Get a huge drive
  2. Iterate through 100TB of libgen torrents
  3. Normalize everything into markdown -> share this
  4. Index it all -> share this
  5. Train a small model on searching the index (with structured output) -> share this
  6. Inject the found references to a big model and let it reason -> share this
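
To make steps 4 to 6 a bit more concrete, here's a minimal sketch, assuming the papers are already normalized to markdown. Everything named here (`TinyIndex`, `build_prompt`, the toy doc ids) is made up for illustration, and the TF-IDF-style scoring is a stand-in for whatever real index or small search model you'd actually use at 100TB scale:

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Hypothetical in-memory inverted index over markdown documents."""

    def __init__(self):
        self.postings = defaultdict(dict)  # term -> {doc_id: term frequency}
        self.docs = {}                     # doc_id -> markdown text

    def add(self, doc_id, markdown):
        self.docs[doc_id] = markdown
        for term, tf in Counter(tokenize(markdown)).items():
            self.postings[term][doc_id] = tf

    def search(self, query, k=3):
        # Score each document by summed tf * idf over the query terms.
        scores = Counter()
        n = len(self.docs)
        for term in set(tokenize(query)):
            hits = self.postings.get(term, {})
            if not hits:
                continue
            idf = math.log(1 + n / len(hits))
            for doc_id, tf in hits.items():
                scores[doc_id] += tf * idf
        return [doc_id for doc_id, _ in scores.most_common(k)]


def build_prompt(question, index, k=3):
    # Step 6: paste the retrieved references into the prompt for the big model.
    refs = index.search(question, k)
    context = "\n\n".join(
        f"[{i + 1}] ({doc_id})\n{index.docs[doc_id]}"
        for i, doc_id in enumerate(refs)
    )
    return (
        "Answer using only the numbered sources and cite them.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

The string returned by `build_prompt` is what you'd hand to the big model; which model and how you call it is left out on purpose.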

Looks like a fun little (hmm) side project :)


I'm quite sure that this is one of the things you can let an LLM build for you. Run it against a small test corpus, check results, improve, repeat.

Doesn't have to take a lot of your time, mostly wall time, gpu ticks, cpu ticks, network.
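
The "small test corpus, check results, improve, repeat" loop could look roughly like this: a rough sketch, assuming you hand-label a few (query, expected paper) pairs and measure recall@k. `recall_at_k`, `toy_search` and the two-document corpus are hypothetical placeholders; you'd swap in whatever search function you are actually iterating on:

```python
def recall_at_k(search, labeled_queries, k=5):
    """Fraction of queries whose expected doc id shows up in the top-k results."""
    if not labeled_queries:
        return 0.0
    hits = 0
    for query, expected_id in labeled_queries:
        if expected_id in search(query, k):
            hits += 1
    return hits / len(labeled_queries)


# Toy corpus and search function, for illustration only.
TOY_CORPUS = {
    "a": "crispr gene editing in mice",
    "b": "lightning network routing fees",
}


def toy_search(query, k):
    # Rank documents by the number of words they share with the query.
    words = set(query.lower().split())
    ranked = sorted(
        TOY_CORPUS,
        key=lambda doc_id: -len(words & set(TOY_CORPUS[doc_id].split())),
    )
    return ranked[:k]


labeled = [("gene editing", "a"), ("routing fees", "b")]
score = recall_at_k(toy_search, labeled, k=1)
```

Rerun it after every change to the index or the search model and keep the change only if the score doesn't drop.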
