Jan Kratochvil
Captive: The first free NTFS read/write filesystem for GNU/Linux

 


API Function Implementation Choices

For each function that W32 ntoskrnl.exe exports and that the filesystem driver imports and calls, a decision must be made on how to implement its functionality. Statistics for the currently implemented functions are provided below:

Function Implementation Types Statistics

Function type     Items  Portion
pass                 81      26%
wrap                  2       0%
native-ReactOS      113      36%
native-own          116      38%
Figure: Functions Reuse Ratio

 

Since each function can be implemented in several ways, the sections below list the options in the order in which they are usually investigated and attempted.

Special care must be taken with data-type symbols, since references to them cannot be caught by breakpoints the way code flow can (it would be possible only for some special access cases). Data export symbols of unpatched libraries must already contain prepared content at runtime. Patched libraries pose a problem: the data symbol must also be fully implemented natively, because the native data symbol cannot be passed in place of the original W32 data location, so two instances of the variable will exist. Since the patched library itself will also keep making uncaught references to the original W32 data location, such symbols should usually be only constants (such as KeNumberProcessors).

W32 platform symbols can be exported and imported either by the symbol name itself or by an identification number called an Ordinal. Although Ordinals save some binary file size in the jump tables, W32 binaries no longer use them, and this project does not support the Ordinal symbol reference type at all.
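The name-based resolution described above can be sketched as follows. This is a minimal illustration, not the captivesym-generated code: struct export_entry, resolve_by_name() and thunk_is_ordinal() are hypothetical names. The only outside fact it relies on is that 32-bit PE images mark an import by Ordinal with bit 31 of the import thunk.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical export record: a symbol name and the address it resolves to. */
struct export_entry {
    const char *name;
    void *address;
};

/* In 32-bit PE images, bit 31 of an import thunk marks an import by Ordinal. */
#define THUNK_ORDINAL_FLAG 0x80000000u

/* Returns nonzero when the thunk requests an Ordinal import, which this
 * sketch (like the project) declines to resolve. */
static int thunk_is_ordinal(uint32_t thunk)
{
    return (thunk & THUNK_ORDINAL_FLAG) != 0;
}

/* Linear search by symbol name; returns NULL when the name is unknown. */
static void *resolve_by_name(const struct export_entry *table, size_t count,
                             const char *name)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (strcmp(table[i].name, name) == 0)
            return table[i].address;
    return NULL;
}
```

A loader built along these lines would reject any thunk for which thunk_is_ordinal() is true and resolve everything else through resolve_by_name().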

All the exporting magic is handled by the custom script captivesym, which processes the definition file src/libcaptive/ke/exports.captivesym to produce the intermediate relaying code src/libcaptive/ke/exports.c. For details of the captivesym-specific source file syntax, please see its documentation: src/libcaptive/ke/captivesym.pl

Direct Pass to Original "ntoskrnl.exe"

Simple (standalone) functions such as RtlTimeToSecondsSince1970() can be passed straight to the original implementation in ntoskrnl.exe, as they make no hardware access and do not expect any special internal data structures to have been set up in advance by an earlier library initialization. A common case is the family of data structure utility functions, such as the GenericTable subsystem or LargeMcb handling.

Pass from UNIX Code

Control flow begins in some standard UNIX code. Such code always uses the cdecl call type for all its internal calls. Native functions compiled from ReactOS sources carry their own cdecl/stdcall/fastcall declarations, but these call type modifiers are discarded during compilation for this project by the LIBCAPTIVE symbol.

UNIX code calls the FUNCTIONNAME() relay from the generated UNIX jump table. The relay debug dumps the passed arguments and then passes control to the original W32 function code, using the proper call type (cdecl/stdcall/fastcall) for the given function.

The entry point of the original W32 code is always trapped by a breakpoint, even though the trap is not needed for this specific direct pass from UNIX code to the original W32 implementation. The breakpoint still has to be there to catch other possible calls (such as intra-W32 calls) described later. There are several ways to place a breakpoint in the code. One way is to use the processor's hardware breakpoint support, but the number of such breakpoints is limited. Another way is to patch in the int $3 instruction, but it invokes the SIGTRAP signal handler and thus conflicts with control by a possible debugger (gdb(1)). This project uses the hlt instruction instead: like int $3 it has a single-byte opcode, and it is a privileged instruction that UNIX user space code is forbidden to execute. hlt raises the SIGSEGV signal, which can be resolved by a custom signal handler without any conflict with a possible debugger; gdb(1) needs the following command to pass such SIGSEGV signals through:

handle SIGSEGV nostop noprint pass
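The custom SIGSEGV handling can be illustrated with a minimal ISO C sketch. It is only an illustration under stated assumptions: raise(SIGSEGV) stands in for the fault caused by a real hlt instruction, and the handler merely counts the trap instead of inspecting the faulting address and redirecting control as the project's handler must. All function names here are hypothetical.

```c
#include <signal.h>

/* Number of traps seen so far. sig_atomic_t is the object type ISO C
 * allows a signal handler to write safely. */
static volatile sig_atomic_t trap_count = 0;

static void on_sigsegv(int signo)
{
    /* Re-install first: some platforms reset the disposition to SIG_DFL
     * once the handler has been entered. */
    signal(signo, on_sigsegv);
    trap_count++;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int install_trap_handler(void)
{
    return signal(SIGSEGV, on_sigsegv) == SIG_ERR ? -1 : 0;
}

/* Stand-in for executing hlt: raise() delivers SIGSEGV synchronously
 * and returns 0 once the handler has run. */
static int simulate_breakpoint(void)
{
    return raise(SIGSEGV);
}
```

A real handler would additionally need the faulting instruction address from the signal context, which is platform specific and therefore left out of this sketch.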

When a breakpoint gets caught, we usually need to return to the running code. Unfortunately this is not directly possible, because the breakpoint opcode has overwritten the original instruction. The breakpoint cannot simply be removed upon return, as that would permanently lose control over the point of entry. Even if the return faked the return address in the bottom stack frame so that the breakpoint is patched back at the later function exit, it still would not catch the inner calls of recursive functions. One working possibility would be to patch the original instruction back and perform a single step provided by the ptrace(2) syscall. However, such a single step needs another controlling UNIX process, and it would again conflict with debuggers such as gdb(1).

This project implements the single-step functionality with two consecutive breakpoints (hlt instructions, to be specific). The addresses of the first two instructions of a W32 function are called slot #1 and slot #2; the length of the first instruction has to be analyzed to get the right address of slot #2. When the breakpoint at slot #1 is caught, the original instruction is patched back and another breakpoint is patched in at slot #2. When the slot #2 breakpoint is hit, the operation is reverted: the breakpoint is put back at slot #1 and the instruction at slot #2 is restored so that execution of the function can continue.
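The slot #1 / slot #2 bookkeeping can be sketched as a small state machine over a byte buffer. This is a simulation under stated assumptions, not the project's code: struct bp_slots, bp_arm() and bp_on_trap() are hypothetical names, the first instruction length is supplied by the caller rather than decoded, and nothing is actually executed. The only hardware fact used is that hlt is encoded as the single byte 0xF4.

```c
#include <stddef.h>
#include <stdint.h>

#define OPCODE_HLT 0xF4u /* x86 hlt: one byte, privileged */

/* Hypothetical per-function bookkeeping for the two breakpoint slots. */
struct bp_slots {
    uint8_t *slot1; /* address of the first instruction */
    uint8_t *slot2; /* address of the second instruction */
    uint8_t orig1;  /* original first byte at slot #1 */
    uint8_t orig2;  /* original first byte at slot #2 */
};

/* Arm the function: remember both original bytes and trap the entry point.
 * first_insn_len would come from an instruction length decoder. */
static void bp_arm(struct bp_slots *bp, uint8_t *entry, size_t first_insn_len)
{
    bp->slot1 = entry;
    bp->slot2 = entry + first_insn_len;
    bp->orig1 = *bp->slot1;
    bp->orig2 = *bp->slot2;
    *bp->slot1 = OPCODE_HLT;
}

/* Called with the faulting address. Returns 1 if the trap belonged to
 * this function and the slots were swapped, 0 otherwise. */
static int bp_on_trap(struct bp_slots *bp, const uint8_t *fault_addr)
{
    if (fault_addr == bp->slot1) {
        /* Entry caught: let instruction 1 run, trap right after it. */
        *bp->slot1 = bp->orig1;
        *bp->slot2 = OPCODE_HLT;
        return 1;
    }
    if (fault_addr == bp->slot2) {
        /* Instruction 1 has run: re-arm the entry, free instruction 2. */
        *bp->slot2 = bp->orig2;
        *bp->slot1 = OPCODE_HLT;
        return 1;
    }
    return 0;
}
```

The sketch assumes that execution always falls through from the first instruction to the second; a first instruction that transfers control elsewhere would need separate treatment.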

The W32 function finishes in its specific cdecl/stdcall/fastcall call type; control then returns to the UNIX jump table relay, which debug dumps the return value and finally passes control back to the UNIX caller in the standard UNIX cdecl call type.
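The entry and exit tracing performed by such a relay can be sketched as below. The sketch makes several assumptions: the "original" function is a plain C stand-in (fake_RtlTimeToSecondsSince1970) rather than code inside ntoskrnl.exe, the relay receives it as a function pointer, the call type conversion is omitted entirely, and the log format is invented for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* 100 ns ticks between 1601-01-01 and 1970-01-01. */
#define TICKS_1601_TO_1970 116444736000000000LL

/* Hypothetical pointer type for the original entry point. */
typedef int (*orig_func_t)(const int64_t *time, uint32_t *elapsed_seconds);

/* Stand-in for the original W32 implementation: converts 100 ns ticks
 * since 1601 to seconds since 1970; returns 0 when out of range. */
static int fake_RtlTimeToSecondsSince1970(const int64_t *time,
                                          uint32_t *elapsed_seconds)
{
    int64_t secs = (*time - TICKS_1601_TO_1970) / 10000000LL;

    if (secs < 0 || secs > 0xFFFFFFFFLL)
        return 0;
    *elapsed_seconds = (uint32_t)secs;
    return 1;
}

/* Relay: dump the arguments, call the original, dump the return value. */
static int relay_RtlTimeToSecondsSince1970(orig_func_t orig,
                                           const int64_t *time,
                                           uint32_t *elapsed_seconds,
                                           FILE *log)
{
    int result;

    fprintf(log, "enter: RtlTimeToSecondsSince1970(Time=%lld)\n",
            (long long)*time);
    result = orig(time, elapsed_seconds);
    fprintf(log, "leave: RtlTimeToSecondsSince1970 -> %d\n", result);
    return result;
}
```

In the project the relay is generated per function and also switches between the UNIX cdecl call type and the call type of the W32 function, which a portable sketch cannot show.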

Figure: Function Type: pass from UNIX Code

 

Pass from W32 Code

This function type is similar to the previous one, except for a more complicated entry point. Unfortunately, W32 libraries call their own functions directly, using call instructions without any patchable jump table. Even the call argument itself cannot be patched from the relocation table records, since such a library intra-call instruction has no relocation: its argument is a relative offset on i386. This is where the double-breakpoint mechanism described above comes in handy, since it catches the entry point when the function gets called. The SIGSEGV handler is invoked by the hlt instruction and redirects control to the jump table relay function, which debug dumps the function entry arguments (the relay has no other use in this call type).

When the relay needs to call the original function, it reaches exactly the same breakpoint instruction that triggered the earlier SIGSEGV handling which redirected control to this relay. This time, however, the through_w32_func field of this function's record is set, which prevents a repeated redirection and passes control through the breakpoint patching instead.
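The decision made at the trapped entry point can be sketched as follows. Only the through_w32_func field name comes from the text; the record layout, the counters, the enum and both function names are hypothetical, and clearing the flag on pass-through is an assumption of this sketch.

```c
/* Hypothetical per-function record; only through_w32_func is named
 * in the text above. */
struct func_record {
    int through_w32_func; /* nonzero: the next trap must pass through */
    int redirected;       /* illustration only: traps sent to the relay */
    int passed_through;   /* illustration only: traps let through */
};

enum trap_action { TRAP_REDIRECT_TO_RELAY, TRAP_PASS_THROUGH };

/* Decision taken by the trap handler at the function entry point. */
static enum trap_action on_entry_trap(struct func_record *f)
{
    if (f->through_w32_func) {
        /* The relay announced this call: consume the flag (an assumption
         * of this sketch) and let the original code run. */
        f->through_w32_func = 0;
        f->passed_through++;
        return TRAP_PASS_THROUGH;
    }
    f->redirected++;
    return TRAP_REDIRECT_TO_RELAY;
}

/* Relay side: announce the pass-through, then "call" the original,
 * which hits the very same trap again. */
static enum trap_action relay_call_original(struct func_record *f)
{
    f->through_w32_func = 1;
    return on_entry_trap(f);
}
```

A W32 caller therefore sees one redirect followed by one pass-through per call, and a later call from W32 code is redirected again.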

Returning is not very interesting, since the first SIGSEGV handler performed a straight jump for the redirection and no subsequent handling is needed.

The jump table relay used for callers from W32 code differs from the relay used for callers from UNIX code. UNIX code always uses a relay with the external cdecl call type, whereas in this case a relay with the appropriate cdecl/stdcall/fastcall call type is used.

Figure: Function Type: pass from W32 Code

 

 

Function Type pass Characteristics
captivesym keyword: pass
Native code function name: (no implementation)
W32 traced code from UNIX function name: FUNCNAME
W32 traced code from W32 function name: FUNCNAME_cdecl/_stdcall/_fastcall
Entry/exit debug tracing from UNIX code: yes
Entry/exit debug tracing from W32 code: yes

Wrap of the Original "ntoskrnl.exe" Function

Wrapping of Call from UNIX Code

The control flow has no special features, since it is very similar to the direct pass to a W32 function from UNIX code. All the wrapping is done in the standard UNIX cdecl call type manner. Jump table debug dumping relays are provided twice: the "outer" one traces the parameters from the function caller and the "inner" one traces the call from the wrapper to the original W32 code. The "inner" relay also calls the W32 code with the appropriate cdecl/stdcall/fastcall call type.

Figure: Function Type: wrap from UNIX Code

 

Wrapping of Call from W32 Code

This scheme combines the previous wrap of a call from UNIX code with the direct pass from W32 code. Control is caught and redirected by the SIGSEGV handler from the breakpoint placed at the entry to the original W32 function code. The second entry to the original W32 function, with the through_w32_func field of the function's record already set, is made from the "inner" jump table relay with the appropriate cdecl/stdcall/fastcall call type.

Figure: Function Type: wrap from W32 Code

 

 

Some functions can be passed to the original code but need their parameters to be checked or prepared first. Currently, such wrapping is needed only for the ExAllocateFromPagedLookasideList() function, because the ntoskrnl.exe initialization code, which would otherwise properly initialize some internal data structures, is never executed. In this case the wrapping code detects that an uninitialized parameter was passed and searches through the whole ntoskrnl.exe code body at runtime to find the proper initialization routine containing the correct initialization parameters. The passed addresses of static structures must be told apart, as each of them usually has different initialization parameters. Not hard-coding a fixed array of parameters is a deliberate precaution, since these parameters may differ across ntoskrnl.exe versions.
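The check-and-prepare pattern of a wrapper can be sketched as follows. Everything here is a simplified assumption: struct lookaside is not the real lookaside list layout, the runtime search of the ntoskrnl.exe code body is replaced by a table of already recovered parameters, and the AllocFromList_wrap/AllocFromList_orig names are hypothetical (they only follow the FUNCNAME_wrap and FUNCNAME_orig naming pattern).

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a lookaside list header. */
struct lookaside {
    size_t block_size;    /* 0 is treated as "never initialized" here */
    unsigned allocations; /* bookkeeping for the demonstration */
};

/* Initialization parameters recovered for one static structure. The
 * project finds them by scanning code at runtime; here they are simply
 * supplied in a table keyed by the structure's address. */
struct init_params {
    const struct lookaside *address;
    size_t block_size;
};

/* Stand-in for the original W32 function (FUNCNAME_orig). */
static size_t AllocFromList_orig(struct lookaside *list)
{
    list->allocations++;
    return list->block_size;
}

/* Wrapper (FUNCNAME_wrap): prepare the parameter, then pass the call on.
 * Returns 0 if the structure is uninitialized and no parameters are known. */
static size_t AllocFromList_wrap(struct lookaside *list,
                                 const struct init_params *known,
                                 size_t count)
{
    if (list->block_size == 0) {
        size_t i;

        /* Each static structure has its own parameters, so match by address. */
        for (i = 0; i < count; i++) {
            if (known[i].address == list) {
                list->block_size = known[i].block_size;
                break;
            }
        }
        if (list->block_size == 0)
            return 0;
    }
    return AllocFromList_orig(list);
}
```

Once a structure has been prepared, later calls skip the lookup and go straight to the original function.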

Function Type wrap Characteristics
captivesym keyword: wrap
Native UNIX wrapping code function name: FUNCNAME_wrap
W32 traced wrapping code from UNIX func. name: FUNCNAME
W32 traced wrapping code from W32 func. name: FUNCNAME_cdecl/_stdcall/...
W32 traced original code function name: FUNCNAME_orig
Entry/exit debug tracing from UNIX code: yes
Entry/exit debug tracing from W32 code: yes

Native Implementation

Native Implementation Called from UNIX Code

This is the simplest case of a function call as it is fully handled only by the compiler and/or linker.

In this case, though, no debug dumping call relay is provided: such a relay would require renaming the implementations of the native functions to prevent them from being linked automatically with the caller code. This renaming could not be done by a simple #define, since it would also rename every call to such a function in the same C sources. One possible solution would be the --redefine-sym feature of the objcopy(1) utility. On the other hand, there is not much need to catch or debug such calls, as both the caller and the callee come with full source-level debug information for the debugger. The callee also usually debug dumps its entry/exit parameters through custom debug dumps in the ReactOS implementations.

Figure: Function Type: native from UNIX Code

 

Native Implementation of "unpatched" Library Function Called from W32 Code

Figure: Function Type: native of unpatched from W32 Code

 

Here the project distinguishes between a patched and an unpatched version of the library (a patched library is a loaded W32 binary, while an unpatched library is provided completely by this project without any use of the library's original W32 binary file). Since the project adjusts the exported symbol address during the patching operation, a call into a patched library may in some cases be handled simply as an unpatched library call. Fortunately the distinction is not very important, as the project is prepared to handle both cases properly.

The W32 caller which imported the symbol will be pointed right to the relaying function. The debug dumping relay will be called from W32 code with the appropriate cdecl/stdcall/fastcall call type while the relay will call the implementation of the native function in the standard UNIX cdecl call type manner.

Native Implementation of "patched" Library Function Called from W32 Code

Figure: Function Type: native of patched from W32 Code

 

The calling scheme is similar to the previous call of an unpatched library function from W32 code, but control is redirected from the entry point of the original W32 binary implementation by the breakpoint and its SIGSEGV handler, as in the case of passing control from a W32 call.

The original W32 function implementation located in the original loaded binary file is never executed but its entry point needs to be trapped by the breakpoint to be able to catch the function calls within the library.

 

In all cases the final function implementation is standard UNIX code compiled from C sources, with full debug information available for the debugger. Fortunately, not all such functions need to be coded from scratch for this project, since the free ReactOS and Wine projects already exist and their code can be reused.

The Wine project is listed mostly for completeness: almost none of its code was suitable for reuse, as it implements the W32 user space while this project runs a pure W32 kernel space environment (in GNU/Linux user space!).

Native Implementation – ReactOS

Some functions are already implemented in the ReactOS project and can be used as they are. Although it would be possible to pass some of these calls to the original code, a native implementation is handier: the available debugging symbols give better control over data handling during debugging sessions.

Such functions can be found in the src/libcaptive/reactos/ subdirectory. Some functions had to be adjusted for this project; these modifications are compiled conditionally, depending on whether the LIBCAPTIVE symbol is defined.

In its later stages this project reached a level where ReactOS is still too immature, and the needed functions usually consist of just this sad body:

UNIMPLEMENTED;

Functions that could not be passed were reimplemented by this project and placed in the project's own implementation directories instead of extending the ReactOS code.

Native Implementation – Wine

Even though Wine only implements the Microsoft Windows NT user space, there still are some common functions which could be copied from the Wine project.

Native Implementation – Project Specific

As a last resort, some API functions had to be implemented entirely by this project, such as the PC hardware dependent parts or the memory management functions.

 

Function Type native Characteristics
captivesym keyword: (none; just the symbol name)
Native code function name: FUNCTIONNAME
Native traced code from W32 code func. name: FUNCTIONNAME_cdecl/_std...
Entry/exit debug tracing from UNIX code: no
Entry/exit debug tracing from W32 code: yes

Undefined Function

Functions not defined by any of the previous function types cannot be called by any W32 code, including the code of the library implementing such a function. All functions of patched libraries that are not listed in the captivesym exports file are automatically set to be trapped as fatal program execution errors.

It is not necessary to list symbols as undef as long as you are just loading W32 PE-32 code and the symbols belong to a patched library. On the other hand, if you are loading W32 .so code, or if the symbol is part of an unpatched library (and is thus provided completely by the project), you need to list the symbol as the undef type to prevent an unresolved symbol reference.

Function Type undef Characteristics
captivesym keyword: undef
Native code function name: (no implementation)
Native traced code function name: FUNCTIONNAME_cdecl/_stdcall/_fastcall
Debug tracing message from UNIX code: yes
Debug tracing message from W32 code: yes

 

 

