Right. The kernel has no practical way to learn the user-land LC_CTYPE setting, so it cannot know the encoding of any string it is handed. The best it can do is assume UTF-8. If you want some other encoding, then either libc's syscall stubs have to do codeset conversion, or you have to make sure you use that one codeset everywhere.
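To make the "syscall stubs do codeset conversion" option concrete, here is a rough sketch of what such a stub might look like in user-land, using POSIX iconv(3). The names path_to_utf8 and open_from_codeset, the 4096-byte buffer, and the idea of passing the codeset explicitly are all my own assumptions for illustration. A real libc would presumably take the codeset from the locale (e.g. nl_langinfo(CODESET)), and no libc I know of actually does this.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert a NUL-terminated path from `codeset` to
 * UTF-8, writing the result (with terminator) into out[0..outsize).
 * Returns 0 on success, -1 with errno set on failure. */
int path_to_utf8(const char *codeset, const char *path,
                 char *out, size_t outsize)
{
    assert(codeset != NULL && path != NULL && out != NULL);
    if (outsize == 0) {
        errno = E2BIG;
        return -1;
    }

    iconv_t cd = iconv_open("UTF-8", codeset);
    if (cd == (iconv_t)-1)
        return -1;                  /* e.g. EINVAL: conversion unsupported */

    char *in = (char *)path;        /* iconv() wants char **; input is only read */
    size_t inleft = strlen(path);
    char *dst = out;
    size_t outleft = outsize - 1;   /* reserve one byte for the terminator */

    size_t rc = iconv(cd, &in, &inleft, &dst, &outleft);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &dst, &outleft);  /* flush any shift state */

    int saved = errno;              /* iconv_close() may clobber errno */
    iconv_close(cd);
    if (rc == (size_t)-1) {
        errno = saved;              /* E2BIG, EILSEQ or EINVAL from iconv() */
        return -1;
    }
    *dst = '\0';
    return 0;
}

/* Hypothetical stub: the kernel only ever sees the UTF-8 form of the path.
 * Flags that need a mode argument (O_CREAT) are not handled in this sketch. */
int open_from_codeset(const char *codeset, const char *path, int flags)
{
    char utf8[4096];                /* assumed upper bound on path length */

    if (path_to_utf8(codeset, path, utf8, sizeof utf8) != 0)
        return -1;
    return open(utf8, flags);
}
```

Every path-taking call would need the same treatment, and names coming back out of the kernel (readdir, getcwd and friends) would need the reverse conversion, which is a large part of why using one codeset everywhere is the simpler choice.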