Amen

Sent from a handheld expect more typos than usual


On Thu, Nov 28, 2024 at 1:17 PM Warner Losh <imp@bsdimp.com> wrote:


On Thu, Nov 28, 2024, 11:11 AM Edouard Klein <edouardklein@gmail.com> wrote:
So, I ran the following experiment:

I downloaded the last month of email from the archive and split it
into one email per file, using a quick and dirty state machine script in
Python that looks for an empty line followed by two lines starting with
"From" in order to know where to split.
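
For anyone wanting to reproduce the split, here is a minimal sketch of
such a state machine. It is not the original script: the function name,
the state names, and the choice to drop the separating empty line are
assumptions made for illustration. It also treats the start of the file
as if it were preceded by an empty line, so that the first message is
picked up.

```python
from typing import Iterable, Iterator, List


def is_blank(line: str) -> bool:
    return line.strip() == ""


def split_messages(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield each message as a list of lines.

    Assumed boundary rule: an empty line followed by two consecutive
    lines that both start with "From".
    """
    current: List[str] = []  # lines of the message being built
    pending: List[str] = []  # possible boundary lines, not yet confirmed
    state = "BLANK"          # start of file counts as "after an empty line"

    for line in lines:
        if state == "BODY":
            if is_blank(line):
                pending = [line]
                state = "BLANK"
            else:
                current.append(line)
        elif state == "BLANK":
            if line.startswith("From"):
                pending.append(line)
                state = "FROM1"
            elif is_blank(line):
                # Earlier blank line was ordinary body text after all.
                current.extend(pending)
                pending = [line]
            else:
                current.extend(pending)
                current.append(line)
                pending = []
                state = "BODY"
        else:  # state == "FROM1": empty line plus one "From" line seen
            if line.startswith("From"):
                # Boundary confirmed: emit the previous message and start
                # a new one with the two "From" lines.
                if current:
                    yield current
                current = [pending[-1], line]
                pending = []
                state = "BODY"
            elif is_blank(line):
                current.extend(pending)
                pending = [line]
                state = "BLANK"
            else:
                # False alarm: a body line that happened to start with "From".
                current.extend(pending)
                current.append(line)
                pending = []
                state = "BODY"

    current.extend(pending)
    if current:
        yield current
```

Each yielded list can then be written to its own file, for instance with
pathlib.Path(name).write_text("".join(message)), where the file naming
scheme is up to you.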

I then uploaded the files into https://openwebui.com/ built-in RAG
system.
The vectorization model is nvidia/NV-Embed-v2, and the LLM is
https://huggingface.co/bartowski/gemma-2-27b-it-GGUF/blob/main/gemma-2-27b-it-Q8_0.gguf
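
To make the retrieval half of RAG concrete, here is a toy sketch of the
idea: turn each email and the query into a vector, then hand the closest
emails to the LLM as context. This is not how Open WebUI is implemented,
and the word-count "embedding" below is only a stand-in for a real model
such as NV-Embed-v2. All function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: Dict[str, str], k: int = 2) -> List[Tuple[str, float]]:
    """Return the k documents most similar to the query, best first."""
    q = embed(query)
    scored = [(name, cosine(q, embed(body))) for name, body in docs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]
```

The retrieved emails would then be pasted into the prompt ahead of the
question, which is why the answer can only be as good as the split and
the ranking.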

I then submitted the following query, which is very vague but which I
knew was relevant to last month's discussion:

What has changed in the C language in the last few decades ?

I copied the response at the end of the email. It seems quite good to
me, but it may get worse when the whole archive is uploaded.

I think this proof of concept shows that there could be value in doing
the whole list, with a cleaner email extraction and a few tweaks to the
system prompt.

I can't do it now because I have a few deadlines coming soon (not the
least of which is the International Workshop on Plan 9
https://iwp9.org/ SEND YOUR PAPERS, PEOPLE !), but I would like to
tackle that in the near future.

Hosting this requires significant but not overly expensive hardware (from
1 to a few k€). If a university lab or a generous patron wants to see it
done and made available to the public, please contact me :)

Otherwise llamafile (https://github.com/Mozilla-Ocho/llamafile ; Justine
single-handedly restored my faith in Mozilla) could be used to
distribute a single huge file containing the model, the emails, and the
vector database, which can run anywhere.

Exciting times !


Here is the LLM answer:

Based on the provided context, it appears that the C language has
undergone significant changes over the last few decades. Here are some
key changes mentioned:

    Standardization and Formalization : The C language has become more
    formalized and standardized, with the introduction of new standards
    (e.g., C89, C99, C23). This has led to a more rigid and legalistic
    approach to the language, which some users find less useful.

    Changes to realloc function : The behavior of the realloc function
    has changed over time. In the first ANSI C standard, realloc(0, ptr)
    was well-defined when ptr was not nil. However, this behavior was
    changed in C99, and now, in C23, realloc(0, ptr) is undefined
    behavior (UB) regardless of the value of ptr.

    Strict Aliasing : The introduction of strict aliasing rules has led
    to compilers assuming that pointers to incompatible types always
    point to different non-overlapping locations. This has caused issues
    for some programmers.

    Additional Features and Functions : The C language has gained new
    features and functions over the years, such as stdint.h (introduced
    in C89), which provides integer types with specific widths.

    Evolution of Language Principles : The language has evolved with no
    appreciation or understanding of the aesthetic and practical
    principles of the original language effort, leading to a more
    verbose and less concise language.


These changes reflect the C language's growth and adaptation to new
requirements and use cases over the decades. However, some users, like
the ones in the provided context, feel that these changes have made the
language less useful and more complicated.

Now you kids get off my damn lawn!

But it is a fairly accurate summary. 

Warner


Warren Toomey via COFF <coff@tuhs.org> writes:

> I was just trying to find out if there was a way of uploading the TUHS
> mailing list as a corpus of text into a LLM so that I could ask questions
> based on the knowledge contained therein. I'm not having much luck yet.
>
> Anyway, while messing around with things I asked a simple question and got
> this answer:
>
> What is a spin lock?
>
>   A spin lock, also known as a spinning lock or spinning lock mechanism, is
>   a type of mechanical locking device used to secure and protect sensitive
>   electronic components, such as microprocessors, memory chips, and other
>   high-value devices.
>
>   Here's how it worked: when a CPU instruction was executed and the
>   necessary data was stored in registers, the CPU would "lock" the registers
>   by spinning them around to prevent other instructions from accessing them.
>
> Those poor registers must get dizzy with all the spinning that they
> have to endure :-)
>
> Cheers, Warren
>
> P.S. If anybody has ideas on how to usefully import the TUHS mailing list
> into a LLM, let me/us know.