On Fri, Aug 4, 2023 at 12:57 PM Alejandro Colomar
<alx.manpages(a)gmail.com> wrote:
On 2023-08-04 18:06, Dan Cross wrote:
On Thu, Aug 3, 2023 at 7:55 PM Alejandro Colomar
<alx.manpages(a)gmail.com> wrote:
On 2023-08-03 23:29, Dan Cross wrote:
On Thu, Aug 3, 2023 at 2:05 PM Alejandro Colomar
<alx.manpages(a)gmail.com> wrote:
> - It is type-safe, with the right tools.
No it's not, and it really can't be. True, there are linters that can
try to match up types _if_ the format string is a constant and all the
arguments are known at e.g. compile time, but C permits one to
construct the format string at run time (or just select between a
bunch of variants); the language gives you no tools to enforce type
safety in a meaningful way once you do that.
Isn't a variable format string a security vulnerability? Where do you
need it?
It _can_ be a security vulnerability, but it doesn't necessarily
_need_ to be. If one is careful in how one constructs it, such things
can be very safe indeed.
As to where one needs it, there are examples like `vsyslog()`,
I guessed you'd mention v*() formatting functions, as that's the only
case where a variable format string is indeed necessary (or kind of).
I think you are conflating "necessary" with "possible."
I'll simplify your example to vwarnx(3), from the BSDs, which does less
work but has a similar API as far as our discussion is concerned.
I'm not sure if you meant vsyslog() uses or its implementation, but
I'll cover both (but for vwarnx(3)).
Uses:
This function (and all v*() functions) will be used to implement a
wrapper variadic function, like for example warnx(3). It's there, in
the variadic function, where the string /must be/ a literal, and where
No, the format string does not need to be a literal at all: it can be
constructed at runtime. Is that a good idea? Perhaps not. Is it
possible? Yes. Can the compiler type-check it in that case? No, it
cannot (since it hasn't been constructed at compile time). Consider
this program:
: chandra; cat warn.c
#include <err.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int
main(void)
{
        char buf[1024];
        strlcpy(buf, "%s ", sizeof(buf));
        strlcat(buf, "%s ", sizeof(buf));
        strlcat(buf, "%d", sizeof(buf));
        warnx(buf, "Hello", "World", 42);
        return EXIT_SUCCESS;
}
: chandra; cc -Wall -Werror -o warn warn.c
: chandra; ./warn
warn: Hello World 42
: chandra;
That's a perfectly legal C program, even if it is a silly one. "Don't
do that" isn't a statement about the language, it's a statement about
programmer practice, which is the point.
the arguments are checked. There's never a good reason to use a
non-literal there (AFAIK),
I believe that you believe that. You may even be right. However,
that's not how the language works.
and there are compiler warnings and linters
to enforce that. Since those args have been previously checked, you
should just pass the va_list pristine to other formatting functions.
I'm afraid that this reasonable advice misses the point: there's
nothing in the language that says you _have_ to do it this way. Some
tools may _help_, but they cannot cover all (reasonable) situations.
Here again `syslog()` is an interesting example, as it supports the
`%m` formatting verb. _An_ implementation of this may work by
interpreting the format string and constructing a new one,
substituting `strerror(errno)` whenever it hits "%m" and then using
`snprintf` (or equivalent) to create the final string that is sent to
`syslogd`. You may argue that programmers should only pass constant
strings (left deliberately vague since there are reasonable cases
where named string constants may be passed as a format string argument
in lieu of a literal) that can be checked by clang and gcc, but again,
nothing in the language _requires_ that, and the implementation of
`vsyslog` that actually performs that logic has no way of knowing
whether its caller has done this correctly.
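To make the `%m` case concrete, here is a minimal sketch of the kind of
implementation described above. The names `expand_m`, `my_vlog`, and
`my_log` are hypothetical and not taken from any libc, and the sketch
assumes the result is written into a caller-supplied buffer rather than
sent to `syslogd`. It only recognizes a bare "%m" (no flags or widths).

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy fmt into out, replacing each "%m" with
 * strerror(errnum). "%%" is copied through as a pair so that "%%m" is
 * not misread. Returns 0 on success, -1 if out is too small.
 */
int
expand_m(char *out, size_t outsz, const char *fmt, int errnum)
{
	size_t n = 0;

	assert(out != NULL && fmt != NULL);
	if (outsz == 0)
		return -1;
	while (*fmt != '\0') {
		if (fmt[0] == '%' && fmt[1] == 'm') {
			const char *e;

			/* Double any '%' in the message so it stays literal. */
			for (e = strerror(errnum); *e != '\0'; e++) {
				if (*e == '%') {
					if (n + 1 >= outsz)
						return -1;
					out[n++] = '%';
				}
				if (n + 1 >= outsz)
					return -1;
				out[n++] = *e;
			}
			fmt += 2;
		} else if (fmt[0] == '%' && fmt[1] == '%') {
			if (n + 2 >= outsz)
				return -1;
			out[n++] = '%';
			out[n++] = '%';
			fmt += 2;
		} else {
			if (n + 1 >= outsz)
				return -1;
			out[n++] = *fmt++;
		}
	}
	out[n] = '\0';
	return 0;
}

/* Hypothetical vsyslog-like function: formats into dst instead of logging. */
int
my_vlog(char *dst, size_t dstsz, const char *fmt, va_list ap)
{
	char newfmt[1024];
	int saved = errno;	/* capture errno before any call can clobber it */

	if (expand_m(newfmt, sizeof(newfmt), fmt, saved) == -1)
		return -1;
	/* newfmt was built at run time: nothing can type-check it against ap. */
	return vsnprintf(dst, dstsz, newfmt, ap);
}

/* Hypothetical variadic front end, in the role syslog() plays for vsyslog(). */
int
my_log(char *dst, size_t dstsz, const char *fmt, ...)
{
	va_list ap;
	int r;

	va_start(ap, fmt);
	r = my_vlog(dst, dstsz, fmt, ap);
	va_end(ap);
	return r;
}
```

A call such as `my_log(buf, sizeof(buf), "open %s: %m", path)` ends up
handing `vsnprintf` a format string that never appeared in the source,
so whatever checking happened at the call site says nothing about what
`my_vlog` actually passes down.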
Similarly, someone may choose to implement a templating language that
converts a custom format to a new format string, but assumes that the
arguments are in a `va_list` or similar. Bad idea? Probably. Legal in
C? Yes.
Then, as long as libc doesn't have bugs, you're fine.
That's a tall order.
In the implementation of a v*() function:
Do /not/ touch the va_list. Just pass it to the next function. Of
course, in the end, libc will have to iterate over it and do the job,
but that's not the typical programmer's problem. Here's the libbsd
implementation of vwarnx(3), which does exactly that: no messing with
the va_list.
$ grepc vwarnx
./include/bsd/err.h:63:
void vwarnx(const char *format, va_list ap)
        __printflike(1, 0);
./src/err.c:97:
void
vwarnx(const char *format, va_list ap)
{
        fprintf(stderr, "%s: ", getprogname());
        if (format)
                vfprintf(stderr, format, ap);
        fprintf(stderr, "\n");
}
Just put a [[gnu::format(printf)]] in the outermost wrapper, which
should be using a string literal, and you'll be fine.
Using a number of extensions aside here, again, that's just (sadly)
not how the language works.
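For illustration, a minimal sketch of such an annotated wrapper, and of
where the checking stops. `fmt_checked` and `pick_format` are
hypothetical names, and the attribute is spelled in the older GNU
`__attribute__` form, which gcc and clang accept but ISO C does not
define.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical wrapper. The GNU attribute (a compiler extension) asks
 * the compiler to check the variadic arguments against fmt whenever fmt
 * is a string literal at the call site.
 */
__attribute__((format(printf, 3, 4)))
int
fmt_checked(char *dst, size_t dstsz, const char *fmt, ...)
{
	va_list ap;
	int r;

	assert(dst != NULL);
	va_start(ap, fmt);
	r = vsnprintf(dst, dstsz, fmt, ap);	/* pass ap through untouched */
	va_end(ap);
	return r;
}

/* Picks a format at run time; both variants expect exactly one int. */
const char *
pick_format(int verbose)
{
	return verbose ? "value = %d" : "%d";
}
```

With `-Wformat` (part of `-Wall`), a call like
`fmt_checked(buf, sizeof(buf), "%s", 42)` is diagnosed. A call like
`fmt_checked(buf, sizeof(buf), pick_format(v), 42)` is accepted silently
unless `-Wformat-nonliteral` is enabled, and even then the diagnostic
only says the format is not a literal; it does not compare the arguments
against it.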
but that's almost beside the point, which is that given that you _can_ do
things like that, the language can't really save you by type-checking
the arguments to printf; and once varargs are in the mix? Forget about
it.
Not really. You can do that _only_ if you really want.
Yes, that's the point: if we're talking about language-level
guarantees, the language can't help you here. It can try, and it can
hit a lot of really useful cases, but not all. By contrast, formatting
in Go and Rust is type-safe by construction.
If you don't want to be able to, you can "drop privileges" by adding a
few flags to your compiler, such as -Werror=format-security
-Werror=format-nonliteral, and adding a bunch of linters to your build
system for more redundancy, and voila, your project is now safe.
Provided that you use a compiler that provides those options, or that
those linters are viable in your codebase. ;-)
- Dan C.