On Thu, Nov 28, 2024, 11:11 AM Edouard Klein <edouardklein(a)gmail.com> wrote:
So, I ran the following experiment:
I downloaded the last month of email from the archive, and split them
into one email per file using a quick and dirty state machine script in
Python looking for an empty line followed by two lines starting with "From"
in order to know where to split.
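
Such a splitter could look roughly like the sketch below. It is not the script that was actually used. It assumes a pipermail-style text archive in which each message opens with a "From " envelope line directly followed by a "From:" header line. The function name split_emails and the state names are made up for illustration.

```python
def split_emails(lines):
    """Split archive lines into messages with a three-state machine.

    States (hypothetical names):
      BLANK - previous line was empty (or we are at the start of the input)
      FROM1 - saw one "From" line right after an empty line; holding it back
      BODY  - anywhere else
    A boundary is an empty line followed by two lines starting with "From".
    """
    messages, current, held = [], [], None
    state = "BLANK"  # the start of the input is treated like an empty line
    for line in lines:
        if state == "FROM1":
            if line.startswith("From"):
                # Second "From" line: boundary confirmed, start a new message.
                if current:
                    messages.append("".join(current))
                current = [held, line]
                held = None
                state = "BODY"
                continue
            # False alarm: the held line was ordinary body text.
            current.append(held)
            held = None
            state = "BODY"
        if state == "BLANK" and line.startswith("From"):
            held = line  # possible boundary, wait for the next line
            state = "FROM1"
            continue
        current.append(line)
        state = "BLANK" if line.strip() == "" else "BODY"
    if held is not None:
        current.append(held)
    if current:
        messages.append("".join(current))
    return messages
```

Called as split_emails(text.splitlines(keepends=True)), it returns one string per message, which can then be written out one file at a time. A body paragraph that merely begins with "From" is not split, because the line after it does not also start with "From".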
I then uploaded the files into the built-in RAG system of
https://openwebui.com/.
The vectorization model is nvidia/NV-Embed-v2, and the LLM is
https://huggingface.co/bartowski/gemma-2-27b-it-GGUF/blob/main/gemma-2-27b-…
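
For readers unfamiliar with RAG, the retrieval step can be illustrated with a toy sketch. This is not how Open WebUI or NV-Embed-v2 work internally: the bag-of-words "embedding" below is a deliberately crude stand-in for a real embedding model, and the names embed, cosine, retrieve and build_prompt are hypothetical. The idea it shows is only the general one: embed the query, rank the stored documents by similarity, and paste the best matches into the prompt given to the LLM.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Assemble the text handed to the LLM: retrieved context, then the question."""
    context = "\n---\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

A real system would compute the document vectors once, store them in a vector database, and use a learned embedding model so that matches do not depend on exact word overlap.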
I then submitted the following query, which is very vague but which I knew
was relevant to last month's discussion:
What has changed in the C language in the last few decades ?
I copied the response at the end of the email. It seems quite good to
me, but it may get worse when the whole archive is uploaded.
I think this proof of concept shows that there could be value in doing
the whole list, with a cleaner email extraction and a few tweaks to the
system prompt.
I can't do it now because I have a few deadlines coming soon (not the
least of which is the International Workshop on Plan 9
https://iwp9.org/ SEND YOUR PAPERS, PEOPLE !), but I would like to
tackle that in the near future.
Hosting this requires significant but not overly expensive hardware (from
1 to a few k€). If a university lab or a generous patron wants to see it
done and made available to the public, please contact me :)
Otherwise, llamafile (https://github.com/Mozilla-Ocho/llamafile; Justine
single-handedly restored my faith in Mozilla) could be used to
distribute the model, the emails and the vector database together in a
single huge file that can run anywhere.
Exciting times !
Here is the LLM answer:
Based on the provided context, it appears that the C language has
undergone significant changes over the last few decades. Here are some
key changes mentioned:
Standardization and Formalization : The C language has become more
formalized and standardized, with the introduction of new standards
(e.g., C89, C99, C23). This has led to a more rigid and legalistic
approach to the language, which some users find less useful.
Changes to realloc function : The behavior of the realloc function
has changed over time. In the first ANSI C standard, realloc(0, ptr)
was well-defined when ptr was not nil. However, this behavior was
changed in C99, and now, in C23, realloc(0, ptr) is undefined
behavior (UB) regardless of the value of ptr.
Strict Aliasing : The introduction of strict aliasing rules has led
to compilers assuming that pointers to incompatible types always
point to different non-overlapping locations. This has caused issues
for some programmers.
Additional Features and Functions : The C language has gained new
features and functions over the years, such as stdint.h (introduced
in C89), which provides integer types with specific widths.
Evolution of Language Principles : The language has evolved with no
appreciation or understanding of the aesthetic and practical
principles of the original language effort, leading to a more
verbose and less concise language.
These changes reflect the C language's growth and adaptation to new
requirements and use cases over the decades. However, some users, like
the ones in the provided context, feel that these changes have made the
language less useful and more complicated.
Now you kids get off my damn lawn!
But it is a fairly accurate summary.
Warner
Warren Toomey via COFF <coff(a)tuhs.org> writes:
I was just trying to find out if there was a way of uploading the TUHS
mailing list as a corpus of text into a LLM so that I could ask questions
based on the knowledge contained therein. I'm not having much luck yet.
Anyway, while messing around with things I asked a simple question and
got this answer:
What is a spin lock?
A spin lock, also known as a spinning lock or spinning lock mechanism,
is a type of mechanical locking device used to secure and protect
sensitive electronic components, such as microprocessors, memory chips,
and other high-value devices.
Here's how it worked: when a CPU instruction was executed and the
necessary data was stored in registers, the CPU would "lock" the
registers by spinning them around to prevent other instructions from
accessing them.
Those poor registers must get dizzy with all the spinning that they
have to endure :-)
Cheers, Warren
P.S. If anybody has ideas on how to usefully import the TUHS mailing list
into a LLM, let me/us know.