As Zig has evolved, it has become a goal to avoid calling Win32 APIs from `kernel32.dll` etc., instead using lower-level ones in `ntdll.dll`. This has been discussed in [#1840][gh], but this issue serves as its replacement, and a reformulation of why this practice is preferred.
[gh]: https://github.com/ziglang/zig/issues/1840
Firstly: I am not a core Zig developer, and I do not speak for the team. I have made some pull requests in this area, and have helped out core developers to do the same. I am also not an expert on Windows Internals™, but I do read what other experts write and use their tools.
---
On modern Windows, the Win32 APIs are implemented as a “Subsystem”[^ss] of the operating system, with its implementation relying on the “Native APIs” provided by the kernel. User-mode access to the Native API is done by calling functions exported by `ntdll`, which issues system calls to the NT kernel with the same arguments.
[^ss]: Older versions of Windows NT had additional subsystems to support running POSIX and OS/2 applications, but these have since been removed. Aside from the introduction of WSL 1, the separation between the Win32 and Native APIs has narrowed over time.
Comparing the comprehensive [Win32][win32] API reference against the incidentally documented [Native][native] APIs, it’s clear which one Microsoft would prefer you use. The native API is treated as an implementation detail, whilst core parts of Windows’ backwards compatibility strategy are implemented in the Windows subsystem[^backcompat].
[win32]: https://learn.microsoft.com/en-us/windows/win32/api/
[native]: https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/_kernel/
[sxs]: https://en.wikipedia.org/wiki/Side-by-side_assembly
[sb]: https://learn.microsoft.com/en-us/windows/client-management/mdm/policy-csp-admx-appcompat#appcompatturnoffswitchback
[^backcompat]: This includes the [Side-by-side assembly][sxs] and [SwitchBack][sb] compatibility features.
Why then would anyone use the Native API? The best answer I’ve found is in *Windows Native API Programming*[^win-native-pavel]:
> Using the native APIs can have the following benefits:
> - Performance – using the native API bypasses the standard Windows API, thus removing a software layer, speeding things up.
> - Power – some capabilities are not provided by the standard Windows API, but are available with the native API.
> - Dependencies – using the native API removes dependencies on subsystem DLLs, creating potentially smaller, leaner executables.
> - Flexibility – in the early stages of Windows boot, native applications (those dependent on NtDll.dll only) can execute, while others cannot.
>
> That said, there are potential disadvantages to using the native API:
> - It’s mostly undocumented, making it more difficult to use from the get go. This is where this book comes in!
> - Being undocumented also means that Microsoft can make changes to the native API without warning, and no one can complain. That said, removal of capabilities from the native API or modification of existing functionality are rare. This is because some of Microsoft’s own tools and applications leverage the native API.
[^win-native-pavel]: Yosifovich, P. (2024). Windows Native API Programming. [online] Leanpub, pp.7–8. Available at: https://leanpub.com/windowsnativeapiprogramming.
The points on “performance” and “power” are more relevant for Zig for a couple of reasons:
- The backwards compatibility features come at the cost of unexpected (to the programmer) DLL loading, which in turn requires heap allocation, file loading and use of critical sections. These are things that Zig code [should handle by itself][why-zig], but a practical benefit is a reduction of memory/file errors that occur when the system is overloaded. This has been observed on the Windows CI machines.
- Most Win32 APIs return a `BOOL` for success/failure, with the failure code needing to be fetched by calling `GetLastError`. Native APIs return an [`NTSTATUS`][ntstatus] directly, which is a richer set of error/informational codes. Using the native APIs avoids having to call into `kernel32`[^err] and allows for code to handle errors in the same `switch` where a function is called.
- Speaking of types, the ones used for Native API programming are just better than the Win32 ones:
  - Times/timeouts are specified as `LARGE_INTEGER`s, a signed 64-bit time value with a 100ns resolution. This is an upgrade over the `dwMilliseconds` found in some Win32 functions, or `FILETIME`s, which require bit-shifting.
  - `ACCESS_MASK`s, used when setting/querying rights on a kernel object, [map nicely][access-mask] to Zig’s `packed struct`s.
  - Many Native APIs take an `IO_STATUS_BLOCK` (essentially one part of the Win32 `LPOVERLAPPED`), but these are often hidden by the Win32 APIs.
- More features are exposed by the Native APIs than the Win32 APIs that call them. In addition, there are some replacement/extended APIs that are unused or have no equivalent:
  - `ReadDirectoryChangesW` is implemented using `NtNotifyChangeDirectoryFile`. On my Windows 11 machine, this is a thin wrapper around `NtNotifyChangeDirectoryFileEx`, which exposes more information via `FILE_NOTIFY_FULL_INFORMATION`.
  - [`NtFlushBuffersFileEx`][flush] allows for more granular flushing than `FlushFileBuffers`, and is used by e.g. [PostgreSQL][postgres] to provide the equivalent of `fdatasync`.
[ntstatus]: https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-erref/87fba13e-bf06-450e-83b1-9241dc81e781
[access-mask]: https://codeberg.org/ziglang/zig/src/commit/fcef9905ae859601d085576012b81dc05f67c46f/lib/std/os/windows.zig#L1240
[why-zig]: https://ziglang.org/learn/why_zig_rust_d_cpp/
[flush]: https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-ntflushbuffersfileex
[postgres]: https://doxygen.postgresql.org/win32fdatasync_8c_source.html#l00023
[^err]: When a native API returns an error, `kernel32` code may call `RtlNtStatusToDosError` to map `NTSTATUS` to a Win32 `ERROR`, or just select an error. This error is passed to `RtlRestoreLastWin32Error`, which is just a wrapper around setting `LastErrorValue` in the Thread Environment Block.
---
Does this mean that the Zig standard library will replace every Win32 API with the corresponding Native ones? *No, and don’t expect it ever will*. There are certain parts of the Win32 API that are too complicated and/or too poorly documented to be worth the effort. These include:
- ~~The console APIs, where the implementation was completely rewritten in Windows 8+ and remains under-researched.~~ Never mind, [this happened](https://codeberg.org/ziglang/zig/pulls/31126) as I was writing this.
- Bypassing the Winsock API and using `\Device\Afd` directly.
- Anything to do with loading libraries.
- Creating Win32 processes with `NtCreateUserProcess`. Heed this note from *Windows Native API Programming*:
  > There have been several attempts in the Infosec community to utilize `RtlCreateUserProcess` and/or `NtCreateUserProcess` to allow running Windows subsystem applications, with varying degrees of success. Part of the problem is the need to communicate with the Windows Subsystem process (`csrss.exe`) to notify it of the new process and thread. This turns out to be fragile, as different Windows versions may have somewhat different expectations.
Everything else is fair game:
- Some Win32 functions are ABI compatible with the Native ones, thus the implementation is a set of jumps from `kernel32.dll -> kernelbase.dll -> ntdll.dll -> ntoskrnl.exe`. [#25766][#25766] eliminated all of the remaining “forwarders”, and you can see how e.g. `ReleaseSRWLockExclusive` forwards to `RtlReleaseSRWLockExclusive`.
- Others are “thin wrappers” that call the native APIs and convert types/errors:
  - `CopySid` is just `RtlCopySid`, plus `NT_SUCCESS` to check the status code and `BaseSetLastNTError` for error mapping.
  - Some convert types like the Win32 `BOOL` (`i32`) into the Native `BOOLEAN` (`u8`) before calling the Native APIs. See `SetSystemTimeAdjustment` and `AddAuditAccessAce` for examples.
- Some are “combination” wrappers that call several Native APIs, and so can be split into separate parts:
  - `CreateIoCompletionPort` calls `NtCreateIoCompletion` if `ExistingCompletionPort` is `NULL`, then associates the `FileHandle` with the port and `CompletionKey` via `NtSetInformationFile`.
  - `SetHandleInformation` queries the status of `hObject` via `NtQueryObject` before calling `NtSetInformationObject`.
- Finally, some functions set up the aforementioned compatibility features before calling Native APIs:
  - `ReadDirectoryChangesW` may create an [activation context][act-ctx] to load a Side-by-side assembly if `lpCompletionRoutine` is not `NULL`.
  - `GetOverlappedResult` always calls `SbSelectProcedure` if `bWait` is `TRUE`. This is part of the SwitchBack system introduced in Windows Vista.
[#25766]: https://github.com/ziglang/zig/pull/25766
[act-ctx]: https://en.wikipedia.org/wiki/Side-by-side_assembly#Activation_contexts
---
You, dear reader, may have some doubts as to whether this is all worth it. As someone who has done this work before, let me answer some common questions:
> I use the functions in `std.os.windows.kernel32` directly, what will happen if I upgrade Zig and they are removed?
If a re-implementation wasn’t made, then a compile error most likely. The standard library should contain [only the host functions/types needed to support itself][reduceapi]. We recommend copying the necessary function definitions/types into your project, or using something like [zigwin32][zigwin32].
[reduceapi]: https://github.com/ziglang/zig/issues/4426
[zigwin32]: https://github.com/marlersoft/zigwin32
> Doing this breaks compatibility with older versions of Windows, which is important for me.
True! But that is not (currently) a goal for Zig; standard library support is based on the earliest platform release supported by its developers. For Windows that means 10 and 11, and the server variants based on them. If *you* value backwards compatibility, replace calls to the standard library with Win32 ones instead. You can even replace your `main` function with `wWinMain`! This practice is encouraged, as most of the standard library should work on free-standing environments.
> Won’t this get flagged by anti-virus scanners as suspicious?
Unfortunately, [yes](https://github.com/ziglang/zig/issues/23392). We consider this a problem for the anti-virus scanners to solve.
> Microsoft are free to change the Native API at will, and you will be left holding both pieces when things break.
While this *can* happen, we have not (yet) been affected by any changes in the Win32 -> Native layers. Also, both API sets are quite stable and functionality is rarely removed. If anything, we prefer using more modern Native APIs over older ones. That said, if there are issues then we [prefer to handle it in the standard library](https://github.com/ziglang/zig/pull/19738). Calling the Native APIs can bypass Windows-on-Windows support, but we have not seen significant issues by doing so.
> I want to run my Zig programs on Wine. What if Wine’s implementation differs from Windows, or is stubbed out?
We consider this a bug in Wine, and not something to work around unless it significantly impacts us – e.g. CI failures on Linux due to running the test suite with Wine. The cases I know of involve the behaviour of [`NtQueryObject`][ntqueryobject] and the [console API][console] re: using the UTF-8 codepage. The former requires a [small workaround][workaround], the latter led to the standard library continuing to rely on `kernel32`.
[ntqueryobject]: https://github.com/ziglang/zig/issues/26029
[console]: https://github.com/ziglang/zig/pull/14411
[workaround]: https://github.com/ziglang/zig/pull/17541/changes
> Doing this requires observing the same implementation details as Windows/Wine/ReactOS. This is a fool’s errand for a programming language.
True! But that is our cross to bear. The standard library most definitely contains different logic re: path-name validation than Windows. This means that WoW64 path redirection most likely doesn’t work when compiling for `x86-windows`. On the other hand, this allows the standard library to handle issues arising from [BadBatBut][badbatbut].
[badbatbut]: https://github.com/ziglang/zig/pull/19698
> Using undocumented APIs like this will just lead to runtime errors, SEH exceptions, and other nasal daemons.
If some API is more sophisticated than a forwarder/wrapper, we investigate if the replacement is worth pursuing in the first place. We *prefer* the Native API, not *require* it; if you’ve encountered an issue due to this policy: **report it**! We are mostly reasonable people. But arguments like “Microsoft doesn’t want you to do this” or “obviously this will lead to problems later on!” don’t count.
> Doing this means that Microsoft can’t change/improve their internal APIs!
While well-intentioned, this is incorrect for a couple of reasons:
- The Native APIs are designed to allow for future functionality. Instead of tightly coupling a structure to a system call as you often see for the \*Nixen, structures are passed in as a tuple of `{OperationEnum, struct pointer, length of struct}`. For example, `NtQueryInformationFile` takes a `FILE_INFORMATION_CLASS` enum, and there have been [numerous additions][info-class] made over the many releases of Windows. The kernel also validates the structure length and will return an error if it is incorrect/too small. This can be used to allocate enough storage for flexible structures if the API takes a return length pointer.
- If a new API is needed, the old one is never removed, instead:
  - The new API is exported via `ntdll`, and is a superset of the old one with the addition of one-or-more parameters. You can distinguish them by their `Ex` suffix (or `Ex2` if the first time wasn’t enough). The same thing happens in Win32 land where you have three `MapViewOfFile`s (four if you count the Numa variant).
  - The Win32 code is updated to call the new API, or the old API calls the new one. See the comment about `ReadDirectoryChangesW`.
  - Alternatively, the old code isn’t changed at all and multiple Native APIs will live… *side-by-side*.
- The existence of the Wine and ReactOS projects, along with Microsoft’s own Side-by-side/SwitchBack systems, proves that this has *already happened*. There are large swathes of programs/libraries that depend on this functionality, and it is in everybody’s best interest that they are kept available.
[info-class]: https://ntdoc.m417z.com/file_information_class