Friday, April 3, 2015

Linux Loadable Kernel Module in Assembly

Hello everyone! First of all, sorry for being silent for the last two years. There have been certain reasons for this. Anyway, I am back and I am going to share a portion of what I've learnt over this period.

Before I begin, as usual, a note for nerds: the code in this article is for demonstration purposes only and does not contain certain things, like error checking, that would otherwise be inevitable. 

I have recently seen tons of posts on the Internet about writing a kernel module for a pre-compiled kernel. The authors are doing good work, but there is one thing that I personally did not like: they all refer you to the configuration file for such a kernel, which may be obtained one way or another. Well, having the configuration of the running kernel makes it almost no different from building a module for a kernel you compiled yourself (just almost). The bottom line: if you want something done your way, do it yourself.

Tools used

Since building a kernel module written in C for a kernel you have no .config for may become a huge pain in certain parts of your body, I decided to go as low as possible and chose flat assembler (the good old flat assembler that may be found here). This wonderful instrument provides you with everything you may need when it comes to x86/x86_64 development (of course, most of your potential projects may be too complex to be implemented in assembly).


Target system

I was brave enough to perform this experiment on my dev machine running Debian with the 3.2.0-4 kernel. Obviously, I do have the proper kernel sources installed, but I made no use of them in this example.


Loadable Kernel Module

I am not going to dive into the basics of Linux kernel structure and the way LKM support is implemented. It is simply irrelevant at this time. What we are interested in is the structure of a module. To put it simply, the structure of an LKM may be described as:
  1. .init.text section  - contains all the module initialization code.
  2. .exit.text section - contains all the cleanup code executed right before the module is unloaded.
  3. Module information.
  4. All the rest.
While we may keep "all the rest" out of it for now, we do need to take care of proper representation of the init/exit sections and module information. In fact, init/exit sections are not a problem at all - that's just code after all, whereas module information is a bit problematic. But, first things first.

.modinfo section

This section contains zero-terminated strings that let the kernel identify our module as one that may be safely loaded and executed.

The first string tells the kernel about how our module is licensed:

"license=GPL"

You may use another license (e.g. "proprietary"), but that would make some symbols exported by the kernel (those exported with EXPORT_SYMBOL_GPL) invisible to your module.

The next one is

"depends="

Here you should list the modules your module depends on. Since our tiny module has no dependencies, we leave this string empty.

The last and the most important one is:

"vermagic=3.2.0-4-amd64 SMP mod_unload modversions "

This string tells us (and the kernel) which kernel the module was built for and which LKM handling options are enabled. The string above is right for my system, but it may (and almost certainly will) be wrong for yours. Mind the trailing space inside the quotes; it is part of the value. Don't worry, there is a simple way to get this string: run /sbin/modinfo on any *.ko file under your /lib/modules/`uname -r`/ directory and copy the vermagic line.
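To make the layout of this section more tangible, here is a minimal Python sketch (not part of the module itself) that splits a raw .modinfo blob into its key=value strings and breaks the vermagic value into the kernel release and the option flags. The function names parse_modinfo and split_vermagic are made up for this illustration, and the sample bytes are simply the three strings discussed above.

```python
def parse_modinfo(blob: bytes) -> dict:
    """Split a raw .modinfo section into a {key: value} dict.

    Entries are NUL-terminated 'key=value' strings. A key that occurs
    more than once keeps only its last value in this simple sketch.
    """
    info = {}
    for entry in blob.split(b"\0"):
        if not entry:  # skip padding and the empty tail after the last NUL
            continue
        key, _, value = entry.decode("ascii").partition("=")
        info[key] = value
    return info


def split_vermagic(vermagic: str):
    """Return (kernel_release, [option flags]) from a vermagic string."""
    release, *flags = vermagic.split()
    return release, flags


# The three strings from this article, laid out as they sit in the section.
section = (
    b"license=GPL\0"
    b"depends=\0"
    b"vermagic=3.2.0-4-amd64 SMP mod_unload modversions \0"
)

info = parse_modinfo(section)
release, flags = split_vermagic(info["vermagic"])
print(info["license"], repr(info["depends"]), release, flags)
```

Running the same parser over the .modinfo bytes of a module that ships with your kernel is one way to see which flags your own vermagic string has to carry.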

__versions section

You can try to build a module without this section and it may even load and do its job, but you will get some nasty complaints from the kernel about being tainted.

The purpose of this section is to make sure your module and the kernel are speaking the same language, meaning they use identical symbols. Its structure is rather simple: an array of checksum/name pairs, where (on my x86_64 system) each entry is an 8-byte checksum followed by a 56-byte name (names are shorter than that, so they are padded with zeros). Finding the proper values is not as simple if you do not have properly configured kernel sources, though. You would have to look inside existing modules built for the same kernel and find the entries for the specific symbols you need. I would suggest doing so in IDA Pro, but any hex editor would suffice too.
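The entry layout described above can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in the text (x86_64, little-endian, 8-byte checksum plus 56-byte zero-padded name); the helper names pack_versions and unpack_versions are hypothetical, and the two checksums are the ones from this article's listing, so they are valid for my kernel only.

```python
import struct

# One __versions entry on x86_64: 8-byte little-endian checksum,
# then a 56-byte name field padded with zero bytes (64 bytes in total).
ENTRY = struct.Struct("<Q56s")


def pack_versions(symbols):
    """Build a __versions blob from (name, checksum) pairs."""
    return b"".join(
        ENTRY.pack(crc, name.encode("ascii")) for name, crc in symbols
    )


def unpack_versions(blob):
    """Recover (name, checksum) pairs from a raw __versions section."""
    pairs = []
    for crc, raw_name in ENTRY.iter_unpack(blob):
        name = raw_name.split(b"\0", 1)[0].decode("ascii")
        pairs.append((name, crc))
    return pairs


# Checksums taken from the listing in this article (specific to one kernel).
wanted = [("module_layout", 0x2AB9DBA5), ("printk", 0x27E1A049)]
blob = pack_versions(wanted)

for name, crc in unpack_versions(blob):
    print(f"{crc:#010x} {name}")
```

Feeding unpack_versions the raw __versions bytes dumped from a module that came with your kernel should list the checksums you need, which saves some scrolling in a hex editor.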

.gnu.linkonce.this_module section

This section contains just one structure: module. I would not like to dive into the specifics of this structure; after all, you can download the kernel source and check the include/linux/module.h file for the struct module declaration. What is important to know, however, is that this structure contains the name of the module (as it would appear in lsmod's output) and pointers to the module_init() and module_cleanup() functions. Keep in mind that the offsets of these fields depend on the kernel version and configuration.
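As a rough picture of what ends up in this section, the following Python sketch builds a zero-filled image with the module name and two pointer slots at given offsets. The offsets used in the example call (18h, 150h and 248h) are the ones from this article's listing and are specific to my kernel; build_this_module and the two marker values are invented for the illustration. In a real object file the pointer slots hold zeros and are filled in through relocations.

```python
import struct

# Recognizable stand-ins for the two pointers (real files use relocations).
INIT_MARK = 0x1111111111111111
EXIT_MARK = 0x2222222222222222


def build_this_module(name, name_off, init_off, exit_off):
    """Lay out a zero-filled struct module image with three fields set."""
    raw_name = name.encode("ascii") + b"\0"
    if name_off + len(raw_name) > init_off or init_off + 8 > exit_off:
        raise ValueError("fields overlap or are out of order")
    # Exit pointer plus one trailing zero quadword, as in the listing.
    image = bytearray(exit_off + 16)
    image[name_off:name_off + len(raw_name)] = raw_name
    struct.pack_into("<Q", image, init_off, INIT_MARK)
    struct.pack_into("<Q", image, exit_off, EXIT_MARK)
    return bytes(image)


image = build_this_module("simple_module", 0x18, 0x150, 0x248)
print(hex(len(image)))
```

Comparing such an image with the .gnu.linkonce.this_module bytes of a module built for your kernel is a quick sanity check that the name sits where you expect it.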

Implementation

Well, seems like we've covered all the most important aspects. Let's get to the implementation itself. The following code may be compiled with flat assembler.

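format ELF64

extrn printk

section '.init.text' executable
module_init:
    push rdi                        ; preserve rdi
    mov  rdi, str1                  ; first argument: the message string
    xor  eax, eax                   ; variadic call, no vector registers used
    call printk
    xor  eax, eax                   ; return 0
    pop  rdi
    ret

section '.exit.text' executable
module_cleanup:
    xor  eax, eax
    ret

section '.rodata.str1.1'
str1 db '<0> Here I am, gentlemen!', 0x0a, 0   ; '<0>' is the KERN_EMERG prefix in 3.2-era kernels

section '.modinfo' align 10h
    db 'license=GPL', 0
    db 'depends=', 0
    ; string used before the update, kept for reference:
    ;db 'vermagic=3.2.0-4-amd64 SMP mod_unload modversions ', 0
    db 'vermagic=3.16.0-4-amd64 SMP mod_unload modversions ', 0

section '.gnu.linkonce.this_module' writable
this_module:
    rb 18h                          ; skip to the name field
    db 'simple_module', 0           ; module name as shown by lsmod
    ;rb 148h - ($ - this_module)    ; padding used before the update
    rb 150h - ($ - this_module)     ; pad up to the init pointer
    dq module_init
    ;rb 238h - ($ - this_module)    ; padding used before the update
    rb 248h - ($ - this_module)     ; pad up to the cleanup pointer
    dq module_cleanup
    dq 0

section '__versions'
    ;dq 0x568fba06                  ; checksum used before the update
    dq 0x2ab9dba5                   ; checksum for module_layout
@@:
    db 'module_layout', 0
    rb 56 - ($ - @b)                ; pad the name to 56 bytes
    dq 0x27e1a049                   ; checksum for printk
@@:
    db 'printk', 0
    rb 56 - ($ - @b)                ; pad the name to 56 bytes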

Hope this article is helpful in some way. Thanks for reading and see you with the next post!

P.S. Updated the source to fit the latest kernel version.