I noticed a bug in the semaphore handling when using the System V semaphore
backend:
$ LD_PRELOAD=./src/libfaketime.so.1 bash -c "echo foo | sed s/foo/bar/"
libfaketime: In lock_for_stat(), ft_sem_lock failed: Invalid argument
[...exited with error...]
(Beware, the above command line is not 100% deterministic; sometimes it
succeeds.)
Looking at the strace for the above command line, it seems the bash echo
builtin process (or thread?) decides to remove the semaphore upon
exiting, while it's still in use by the sed process. sed then gets an
EINVAL error ("Invalid argument") on its next semop call.
The root cause is a semantic difference between POSIX sem_unlink and
SysV semctl(..., IPC_RMID), the two implementations for ft_sem_unlink:
* sem_unlink only removes the name; the semaphore can still be used
afterwards, as long as a process has a reference to it.
* semctl(..., IPC_RMID) removes the semaphore immediately, and further
use (e.g. semop) results in an EINVAL error.
AFAICT, the simplest fix is to only let the owner of the semaphore (and
shared memory) do the cleanup, which is what this patch does. Both
semaphore backends pass the tests with this change.
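A minimal C sketch of the SysV half of that difference, assuming
Linux-like behaviour. The names errno_after_rmid and union semun_arg are
made up for this illustration and are not part of libfaketime. It creates
a private semaphore set, removes it with semctl(..., IPC_RMID), and then
reports what a later semop call returns:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* On Linux the caller has to declare the semctl argument union itself.
 * A private name is used here to avoid clashing with systems that
 * already provide "union semun". */
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Returns the errno of a semop() issued after IPC_RMID,
 * 0 if that semop() unexpectedly succeeded,
 * or -1 if the semaphore set could not be set up or removed. */
int errno_after_rmid(void)
{
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;

    union semun_arg arg;
    arg.val = 1; /* unlocked */
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    /* Immediate removal: the id is invalid from here on. */
    if (semctl(semid, 0, IPC_RMID) == -1)
        return -1;

    /* Try to take the lock on the removed set, without blocking. */
    struct sembuf lock = { .sem_num = 0, .sem_op = -1, .sem_flg = IPC_NOWAIT };
    if (semop(semid, &lock, 1) == 0)
        return 0;
    return errno;
}
```

On Linux the returned value is expected to be EINVAL, matching the error
sed reports above. A process that was already blocked in semop when the
set was removed would see EIDRM instead.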
ft_sem_create() is called with an argument located on the stack, which
means it's a bad idea to keep a reference to it in the 'name' field of
ft_sem_t -- the pointed-to data goes out of scope, and using it afterwards
results in unpredictable behaviour.
Fix it by making a copy of the semaphore name. Allocate a 256 char
buffer, to match existing code.
Fixes: 2649cdb156 ("Add semaphore abstraction layer")
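The name-copy fix can be sketched roughly as follows. This is only an
illustration: demo_sem_t, demo_sem_set_name, demo_sem_init and
DEMO_SEM_NAME_MAX are hypothetical names, and embedding the 256-byte
buffer in the struct is an assumption, not necessarily how ft_sem_t is
laid out.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_SEM_NAME_MAX 256 /* hypothetical; mirrors the 256 char buffer */

typedef struct {
    /* Owned copy of the name, instead of a pointer into the caller's stack. */
    char name[DEMO_SEM_NAME_MAX];
} demo_sem_t;

/* Copies name into sem->name. Returns 0 on success, -1 if it did not fit. */
int demo_sem_set_name(demo_sem_t *sem, const char *name)
{
    int n = snprintf(sem->name, sizeof sem->name, "%s", name);
    return (n < 0 || (size_t)n >= sizeof sem->name) ? -1 : 0;
}

/* Mimics a caller that builds the name in a stack buffer which
 * goes out of scope as soon as this function returns. */
int demo_sem_init(demo_sem_t *sem, int id)
{
    char tmp[DEMO_SEM_NAME_MAX];
    snprintf(tmp, sizeof tmp, "/demo_sem_%d", id);
    return demo_sem_set_name(sem, tmp);
}
```

After demo_sem_init returns, sem->name is still valid because it no
longer refers to tmp.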
musl defines stat64 as stat, leading to this build error:
gcc -o libfaketime.o -c -std=gnu99 -Wall -Wextra -Werror -DFAKE_PTHREAD -DFAKE_STAT -DFAKE_UTIME -DFAKE_SLEEP -DFAKE_TIMERS -DFAKE_INTERNAL_CALLS -fPIC -DPREFIX='"'/nix/store/qpyvvrcas950da98mssw6ixlw7ckvyrb-libfaketime-0.9.11'"' -DLIBDIRNAME='"'/lib'"' -Wno-nonnull-compare libfaketime.c
In file included from libfaketime.c:55:
libfaketime.c:1276:5: error: redefinition of ‘stat’
1276 | int stat64 (const char *path, struct stat64 *buf)
| ^~~~~~
/nix/store/g9cgi4yyn5vrd1f9axj8gxdvwzv5ssvk-musl-1.2.5-dev/include/sys/stat.h:80:5: note: previous definition of ‘stat’ with type ‘int(const char *, struct stat *)’
80 | int stat(const char *__restrict, struct stat *__restrict);
| ^~~~
make[1]: *** [Makefile:161: libfaketime.o] Error 1
Fix it by only defining stat64 when building against glibc. Detecting
musl is not straightforward, and this is the safest approach, since
other libc implementations might behave like musl.
Fixes: 53ba71e547 ("Handle stat64() call")
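The shape of such a guard, as a hedged sketch: DEMO_HOOK_STAT64 and
demo_hooks_stat64 are invented for illustration and do not appear in
libfaketime. The only real identifier is __GLIBC__, which glibc's headers
define.

```c
#include <sys/stat.h> /* any glibc header defines __GLIBC__ */

#ifdef __GLIBC__
/* glibc exports a separate stat64 symbol, so a dedicated hook is needed.
 * The stat64 wrapper definition would go inside this branch. */
#define DEMO_HOOK_STAT64 1
#else
/* musl (and possibly other libcs) may alias stat64 to stat, so defining
 * stat64 here would collide with the existing stat definition. */
#define DEMO_HOOK_STAT64 0
#endif

/* Reports whether the stat64 hook was compiled in. */
int demo_hooks_stat64(void)
{
    return DEMO_HOOK_STAT64;
}
```

Testing for glibc rather than for musl means any unknown libc takes the
conservative branch.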
Add ft_sem_*() functions that use the POSIX semaphore API.
This is in preparation for adding System V semaphores as an alternative
to POSIX semaphores, because glibc breaks POSIX semaphores when operating
in mixed 32- and 64-bit environments[1].
[1] https://sourceware.org/bugzilla/show_bug.cgi?id=17980
To give more context, stat64 is a child of large-file support (LFS),
introduced around 1996 during the transition from 32-bit to 64-bit.
People wanted 64-bit file offsets and inode numbers on 32-bit systems,
hence stat and stat64. Nowadays, when nearly everything is 64-bit, stat64
is mostly just an alias for stat, as stat is already natively 64-bit. On
modern implementations like musl, stat64 is even dropped entirely as a
sane default. We observe the same in darwin's stat.h:
#if !__DARWIN_ONLY_64_BIT_INO_T
struct stat64 __DARWIN_STRUCT_STAT64;
#endif /* !__DARWIN_ONLY_64_BIT_INO_T */
Because struct stat64 never exists on aarch64-darwin, and we don't have
to worry about people using stat64 calls there, we can safely remove all
stat64 bloat whenever __DARWIN_ONLY_64_BIT_INO_T is set.
I nuked fake_stat64buf because only STAT64_HANDLER uses it, and only the
non-darwin stat64 hooks use that handler. I didn't go further because
people might still call the stat64 family on x86_64 (on glibc) and on
older 32-bit platforms, and we still need to hook those calls.
A loose follow-up to PR #453. Fixes the remaining clang warnings on
aarch64-darwin.
This fixes the recursive pthread_once deadlock on darwin platforms.
It looks something like this:
Trace/BPT trap: 5
BUG IN CLIENT OF LIBPLATFORM: Trying to recursively lock an os_once_t
The macro __APPLEOSX__ is never defined; __APPLE__ should be used instead.
This mistake inadvertently caused system_time_from_system() to always take
the Linux code path on darwin, leading to recursive calls during ftpl_init().
This was exposed by PR #488, which removed the ad-hoc recursion detection
that previously masked this issue.
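Why the misspelled macro went unnoticed can be shown with a small sketch.
The two functions below are hypothetical and only stand in for the
platform switch in system_time_from_system(); they are not the actual
libfaketime code.

```c
#include <string.h>

/* Correct check: __APPLE__ is predefined by compilers targeting darwin. */
const char *demo_time_path(void)
{
#if defined(__APPLE__)
    return "darwin";
#else
    return "generic";
#endif
}

/* Misspelled check: __APPLEOSX__ is never defined, so the preprocessor
 * silently drops the darwin branch on every platform, darwin included. */
const char *demo_time_path_buggy(void)
{
#if defined(__APPLEOSX__)
    return "darwin";
#else
    return "generic";
#endif
}
```

An undefined macro in an #if or #ifdef is not an error, so the build
stays clean and the wrong branch is only visible at run time.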