Non-BSM to BSM Conversion Tools
Student: MateuszPiotrowski (0mp@)
Mentors: KonradWitaszczyk (def@), PawelJakubDawidek (pjd@)
Ideas
Grep the Linux kernel for audit_log_format() to get an idea of what the log format looks like. Example:
Source: linux-4.9.13:drivers/tty/tty_audit.c:tty_audit_log()
Code
News
14.04.2017
I've started to develop a plugin for Linux audispd: https://github.com/0mp/audisp-auditdistd
02.12.2016
It's been a few months since the end of the GSoC. Recently, on 02.12.2016 I spoke with KonradWitaszczyk about the future of this project. We've exchanged thoughts and ideas and here's a short wrap-up:
Tasks
Write a plugin for Linux audispd capable of communicating with auditdistd.
This should be pretty easy and would allow us to securely stream Linux Audit logs to FreeBSD. The non-BSM to BSM conversion could be applied afterwards, as auditdistd is not designed to interpret data on its own.
We took Rsyslog under consideration as a replacement for audispd but for the time being an audispd plugin is preferred as it does not add unnecessary dependencies.
- It might be a good idea to write the converter in a higher-level language like Python or Perl to make prototyping easier while the final design is being improved. Eventually, the converter would be rewritten in C so that it could be included in the base system.
Notes & Questions
- How much work would it take to introduce OpenBSM format to Linux Audit?
- Does it make sense to compress Linux Audit logs before sending them to FreeBSD?
There is an interesting project called aushape. It is a library and a tool for converting Linux Audit logs to XML and JSON. It might help a lot with the conversion in the future.
There is a C library called LibXo which is a library for generating text, XML, JSON, and HTML output.
- Other projects worth mentioning are:
Project Description
Let's imagine a FreeBSD server which collects audit records from machines that do not necessarily use BSM as the format of their audit records. The idea is to create a tool which is able to read audit records in a non-BSM format and output them in the BSM format. The aim is to lose as little information as possible, although some loss is very likely unavoidable due to the differences between the standards.
The deliverables would make it easier to handle the different audit log files collected from your servers and to examine them using the default FreeBSD administration tools, which support the BSM format.
Approach to Solving the Problem
I will focus on the Linux Audit and Windows formats mainly.
Linux Audit
When it comes to Linux Audit, I will study the full list of the data fields and map the Linux audit fields to the BSM fields. Subsequently, I will be able to design a suitable parser from Linux Audit to the BSM format.
Windows Audit Format
Windows to BSM conversion is in demand according to this. The documentation of Windows fields can be found here and here.
Microsoft offers a 180-day long trial period for Windows Server 2012 Essentials (see here (link)).
Deliverables
- A library which converts the Linux and Windows audit formats to the BSM format;
- An extension of auditdistd(8) which allows it to receive the non-BSM formats covered by the project;
- A shell tool which takes non-BSM logs as an input and outputs the provided logs in the BSM format using the library;
- A document about the drawbacks of the current BSM format. I will sum up the obstacles encountered during the project. Such a document will be handy when the community discusses the possibility of extending BSM to make it a truly standard format that all other audit trail formats could be converted into. At the moment it is possible that not all fields of the non-BSM audit formats can be represented in the BSM format. If the number of missing fields turns out to be significant, I will contact the BSM community to encourage them to extend the standard in order to support cross-platform conversion;
- A test suite for the library, the parser/converter tool and the extension of auditdistd(8);
- Updated man pages for auditdistd(8), the library and the tool.
Milestones
| Weeks | Days | Description | Status |
| Start of Coding |
| 1 | May 23 | Start of coding. | |
| Linux Audit to BSM conversion |
| 1 | May 23 - May 29 | Learn the details of Linux Audit and BSM. | |
| 2 | May 30 - June 5 | | |
| 3 | June 6 - June 12 | | |
| 4 | June 13 - June 19 | Design the structure and the interface of the library. | |
| 5 | June 20 - June 26 | | |
| Mid-term Evaluations |
| 6 | June 27 | Mid-term evaluation. | |
| Implement the Conversion from Linux Audit to BSM / Shell Conversion Tool |
| 6 | June 27 - July 3 | Implement the Linux Audit parser. | |
| 7 | July 4 - July 10 | Implement the Linux Audit conversion. | |
| 8 | July 11 - July 17 | Improve the API of the conversion of the Linux Audit format. | |
| 9 | July 18 - July 24 | Improve the conversion/mapping. | |
| 10 | July 25 - July 31 | | |
| 11 | August 1 - August 7 | | |
| 12 | August 8 - August 14 | Convert Linux syscalls. | |
| | | Convert Linux execs. | |
| | | Bring au_to_attr(5) to the userland. | |
| Extend auditdistd(8) with the Ability to Receive Linux Audit Logs |
| 13 | August 15 - August 21 | Configure CentOS to send logs to FreeBSD. | |
| End of Coding |
| 14 | August 22 | Soft deadline. | |
| 14 | August 23 | Hard deadline: submit the code until 19:00 UTC. | |
| 15 | August 30 | Successful student projects announced. | |
Deferred & Uncompleted Milestones
- Windows audit format to BSM conversion.
- Document the obstacles encountered during the project.
- Write manual pages.
- Extend auditdistd(8) with the ability to receive Linux Audit logs.
- Create a test suite.
Weekly Reports
Summary
This is the summary for the final evaluation.
The code is available in this PR: https://github.com/0mp/freebsd/pull/9
The directory with the library and the shell tool is in contrib/openbsm/bin/bsmconv.
I kept some of my notes on the Wiki on GitHub: https://github.com/0mp/freebsd/wiki
Library
The interface of the library is available in linau.h.
A significant part of the library is now completed. The library is capable of parsing Linux Audit records and converting them to the BSM format. The conversion is not perfect but it handles the most common types of Linux Audit records and fields.
The library converts Linux Audit logs only; I did not have enough time to cover both the Linux and Windows formats.
Perhaps the most interesting part of the library is the one responsible for the conversion. Here is a quick overview of the conversion framework I created.
Improve Conversion
Say that you want to improve the conversion. For example there is a new record type NEWTYPE and a typical record of this type looks like this:
type=NEWTYPE msg=audit(1464612294.816:1234): pid=400 newfield="text"
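To make the anatomy of such a record concrete, here is a minimal sketch of how it can be taken apart. It is independent of the actual parser behind linau.h: the names record_head, parse_head and next_field are hypothetical, and the sketch assumes that the event ID has the form seconds.milliseconds:serial, which is what the Linux kernel emits.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fixed prefix of a Linux Audit record. */
struct record_head {
	char type[32];		/* record type, e.g. "NEWTYPE" */
	long long sec;		/* seconds since the Epoch */
	int msec;		/* milliseconds */
	unsigned long serial;	/* event serial number */
};

/*
 * Parse "type=T msg=audit(SEC.MSEC:SERIAL): " and return a pointer to the
 * first field after the prefix, or NULL if the prefix does not match.
 */
static const char *
parse_head(const char *rec, struct record_head *head)
{
	int consumed = 0;

	/* %n is only stored if everything before it matched. */
	if (sscanf(rec, "type=%31s msg=audit(%lld.%d:%lu): %n", head->type,
	    &head->sec, &head->msec, &head->serial, &consumed) != 4 ||
	    consumed == 0)
		return (NULL);
	return (rec + consumed);
}

/*
 * Copy the next "name=value" pair into name and value and return a pointer
 * past it, or NULL when no well-formed pair is left. Quoted values may
 * contain spaces and keep their quotation marks; overly long names and
 * values are truncated in the copy but consumed in full.
 */
static const char *
next_field(const char *p, char *name, size_t nsize, char *value, size_t vsize)
{
	size_t n = 0, v = 0;
	int quoted = 0;

	while (*p == ' ')
		p++;
	if (*p == '\0')
		return (NULL);
	for (; *p != '\0' && *p != '=' && *p != ' '; p++)
		if (n + 1 < nsize)
			name[n++] = *p;
	name[n] = '\0';
	if (*p != '=')
		return (NULL);
	for (p++; *p != '\0' && (quoted || *p != ' '); p++) {
		if (*p == '"')
			quoted = !quoted;
		if (v + 1 < vsize)
			value[v++] = *p;
	}
	value[v] = '\0';
	return (p);
}
```

Calling parse_head() on the example record and then next_field() in a loop yields the type NEWTYPE, the timestamp and serial number, and the pairs pid=400 and newfield="text". Keeping the quotation marks in the value is a deliberate choice here, so that a later validation step can still tell quoted values from bare ones.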
This is a list of steps required to introduce this record type to the library:
Add LINAU_TYPE_NEWTYPE and LINAU_TYPE_NEWTYPE_STR to linau_conv_impl.h.
Update linau_conv_get_type_number() in linau_conv.c.
Update linau_conv_to_au() in linau_conv.c.
Add an lcrectype structure:
to linau_conv.c.
&lctoken_process32 is added because of the pid field in this record. Because the Linux Audit framework has yet to be standardized, I had to decide how to convert records. The policy is to create BSM tokens even if there is only one field we can put into the token (pid is that lonely field in the example).
For the sake of this tutorial I will introduce a new BSM token which stores information from newfield.
Create all the missing lctokens from our example. Add
to linau_conv.c.
Add LINAU_FIELD_NAME_NEWFIELD and LINAU_FIELD_NAME_NEWFIELD_STR to linau_conv_impl.c.
Add an lcfield:
to linau_conv.c.
linau_conv_is_encoded() is used here because we assume that the value of the newfield field is always stored inside a pair of quotation marks ("...").
.lcf_validate = is used because the name of the field is predefined (it is simply newfield); we would use .lcf_match = if the names of a group of fields were defined by a regex (see lcfield_a_execve_syscall; more details are available in this thread on the Linux Audit mailing list).
Add a function which writes the token to the audit record descriptor (aurd). The function should be static void and should not attempt to write a token to the descriptor if there are no valid fields to create a token from. This means that such functions have to check both that at least one field required by a function from au_token(3) exists and that the value of the field is reasonable (most of the time the check should simply return 1 when lcf_validate is called).
That's it. NEWTYPE has been introduced to the system together with the newfield field.
Architecture of the Conversion Framework
The whole conversion is based on three structures: linau_conv_field (lcfield), linau_conv_token (lctoken) and linau_conv_record_type (lcrectype). lcrectypes know the lctokens which can possibly be generated using the fields they have, and lctokens know the lcfields they might require to generate and write a BSM token.
The flow of the conversion procedure looks like this in general (a parsed Linux Audit record is an input here):
Get the lcrectype related to the type of the Linux Audit record.
Iterate over every lctoken of the lcrectype and try to generate a BSM token with the rules defined in the lctoken.
- Find out which fields were not included in any BSM token and write them as BSM text tokens.
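The relationship between the three structures and the flow above can be sketched as follows. This is an illustration under stated assumptions, not the code from linau_conv.c: apart from lcf_validate, every member name is hypothetical, is_quoted() is only a stand-in for linau_conv_is_encoded(), the process32 token is reduced to its pid field, and the "tokens" are written as readable text instead of real BSM tokens. Step 1 of the flow is represented by passing the lcrectype directly.

```c
#include <stdio.h>
#include <string.h>

#define MAX_REFS 8	/* arbitrary limit for this sketch */

/* One parsed "name=value" pair of a Linux Audit record. */
struct field {
	const char *name;
	const char *value;
	int used;	/* set once a token has consumed the field */
};

struct lcfield {	/* a field name and a validator for its value */
	const char *lcf_name;
	int (*lcf_validate)(const char *);
};

struct lctoken {	/* the lcfields a token may use, NULL-terminated */
	const char *lct_name;
	const struct lcfield *lct_fields[MAX_REFS];
};

struct lcrectype {	/* the lctokens a record type may yield */
	const char *lcr_type;
	const struct lctoken *lcr_tokens[MAX_REFS];
};

static int
is_number(const char *v)
{
	return (*v != '\0' && v[strspn(v, "0123456789")] == '\0');
}

/* Stand-in for linau_conv_is_encoded(): the value sits inside "...". */
static int
is_quoted(const char *v)
{
	size_t len = strlen(v);

	return (len >= 2 && v[0] == '"' && v[len - 1] == '"');
}

static const struct lcfield lcfield_pid = { "pid", is_number };
static const struct lcfield lcfield_newfield = { "newfield", is_quoted };
static const struct lctoken lctoken_process32 =
    { "process32", { &lcfield_pid, NULL } };
static const struct lctoken lctoken_newtoken =
    { "newtoken", { &lcfield_newfield, NULL } };
static const struct lcrectype lcrectype_newtype =
    { "NEWTYPE", { &lctoken_process32, &lctoken_newtoken, NULL } };

/* Append a followed by b to the NUL-terminated buffer out. */
static void
append(char *out, size_t outsize, const char *a, const char *b)
{
	size_t len = strlen(out);

	snprintf(out + len, outsize - len, "%s%s", a, b);
}

/*
 * Emit a token for every lctoken that finds at least one valid field, then
 * emit every field no token has consumed as a text token.
 */
static void
convert(const struct lcrectype *rt, struct field *fields, size_t nfields,
    char *out, size_t outsize)
{
	const struct lctoken *tok;
	const struct lcfield *lcf;
	size_t t, f, i;
	int found;

	out[0] = '\0';
	for (t = 0; (tok = rt->lcr_tokens[t]) != NULL; t++) {
		found = 0;
		for (f = 0; (lcf = tok->lct_fields[f]) != NULL; f++) {
			for (i = 0; i < nfields; i++) {
				if (fields[i].used ||
				    strcmp(fields[i].name, lcf->lcf_name) != 0 ||
				    !lcf->lcf_validate(fields[i].value))
					continue;
				if (!found)
					append(out, outsize, tok->lct_name, "");
				append(out, outsize, ",", fields[i].value);
				fields[i].used = found = 1;
			}
		}
		if (found)
			append(out, outsize, ";", "");
	}
	for (i = 0; i < nfields; i++) {
		if (fields[i].used)
			continue;
		append(out, outsize, "text,", fields[i].name);
		append(out, outsize, "=", fields[i].value);
		append(out, outsize, ";", "");
	}
}
```

Note that a field whose value fails validation is not dropped in this sketch: it simply stays unused and ends up in a text token, which matches the goal of losing as little information as possible.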
Todo List
There are quite a few TODO, STYLE and XXX tags scattered around the source files. Apart from that there is a TODO file in the project's directory with a list of tasks.
This is a list of the most vital parts of the library that are missing:
The fields are not well validated at the moment. The linau_conv_is_* functions are mostly not implemented yet.
The function au_event_type_from_linau_event is not implemented.
As system calls in FreeBSD and Linux differ significantly we should not use the FreeBSD system call numbers from /etc/security/audit_event as mapping values for Linux Audit events. Instead, we should add new identifiers.
Another idea is to ignore the /etc/security/audit_event file entirely and just map every Linux Audit event to 0 (AUE_NULL). The event's type would be passed as an extra text token instead. This approach is less aggressive towards FreeBSD.
Actually, /usr/include/bsm/audit_kevents.h and /usr/include/bsm/audit_uevents.h do contain some mappings specifically for Linux. It looks like it is just a matter of mapping the Linux Audit record types to the numbers found in those files. Additionally, those files might require a little refreshment since there were some changes in the Linux Audit standard.
The library does not support the ENRICHED format.
Currently, the conversion scheme assumes that a Linux Audit record maps to one or more BSM tokens forming one BSM record. This is not entirely accurate, however. There are Linux Audit records (like those of type PATH) which should be joined with the Linux Audit record preceding them. This could be achieved by introducing another layer of conversion. The idea is to run a second pass over the already converted BSM records and merge some of them into single records.
There is still a lot of information stored inside the subj and msg fields. In fact, the msg field stores the payload of the audit record. The problem is that the msg field itself contains further fields, which does not fit the current architecture of the library directly.
When the logs are sent by audisp-remote from a Linux machine, every record has a 16-byte prefix. This is because audisp-remote adds a header to every log it sends over pure TCP (I don't know if this holds true when Kerberos is used). The first 4 bytes are a magic number fe0000ff. Then the version is added (which is always 0). Then again 0 for mver. Then 6 bytes for the type, 10 bytes for the length and 12 bytes for the sequence number (see audit-userspace/lib/private.h:AUDIT_RMW_PACK_HEADER). Additionally, audisp-remote(8) prefixes the actual record with fields like node, which indicates the machine the audit record came from.
The library assumes that a valid record starts with type=.
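The last two items suggest a small pre-processing step before a record reaches the library. The following is a minimal sketch of that idea, not code from the project: the function name find_record_start and its has_header flag are hypothetical, the 16-byte size is taken from the description above, and the header is skipped without being interpreted. The sketch also assumes that the first type= which starts the buffer or follows a space is the beginning of the record proper.

```c
#include <stddef.h>
#include <string.h>

#define REMOTE_HEADER_SIZE 16	/* size of the prefix, as stated above */

/*
 * Return a pointer to the "type=" that starts the record proper, or NULL if
 * the buffer is too short or contains no such marker. The buffer need not be
 * NUL-terminated and may contain zero bytes; len is its length in bytes.
 * Set has_header when the buffer still carries the audisp-remote header.
 */
static const char *
find_record_start(const char *buf, size_t len, int has_header)
{
	static const char marker[] = "type=";
	const size_t mlen = sizeof(marker) - 1;
	size_t i;

	if (has_header) {
		if (len < REMOTE_HEADER_SIZE)
			return (NULL);
		buf += REMOTE_HEADER_SIZE;
		len -= REMOTE_HEADER_SIZE;
	}
	for (i = 0; i + mlen <= len; i++) {
		/*
		 * Accept the marker only at the very start or after a space,
		 * so that a field such as "obj_type=" is not mistaken for it.
		 */
		if ((i == 0 || buf[i - 1] == ' ') &&
		    memcmp(buf + i, marker, mlen) == 0)
			return (buf + i);
	}
	return (NULL);
}
```

With this in place, a buffer consisting of the 16-byte header followed by "node=linux1 type=SYSCALL ..." yields a pointer to "type=SYSCALL ...". The skipped node field would have to be kept separately if the origin of the record matters for the converted trail.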
Linux Audit Framework
The Linux Audit framework is a little bit hard to understand because the users of this framework do not follow the standard. Additionally, there are no documents describing the standard in depth; in fact, it is the source code and a few short documents which define the standard. This situation is going to change in the near future, as a test suite is being developed to help developers unify their log messages.
If you want to understand how the Linux Audit format works you have to:
- Look at the Linux kernel source code;
- Look at the audit-userspace source code;
- Generate some logs on your own;
- Read the documentation and the linux-audit at redhat mailing list archives;
- Ask your own questions on the mailing list.
There is no easy way. Sorry.
BTW, it is worth updating the kernel and the audit framework to the latest versions if you plan to generate logs on a Linux machine. The audit framework can be updated like this:
VERSION=2.6.6

curl -O http://people.redhat.com/sgrubb/audit/audit-${VERSION}.tar.gz
gzip -d audit-${VERSION}.tar.gz
tar xf audit-${VERSION}.tar
cd audit-${VERSION}

sudo yum update
sudo yum install \
    libcap-ng-devel \
    libtool \
    openldap-devel \
    python-devel \
    swig \
    tcp_wrappers-devel
./configure --sbindir=/sbin --with-python=yes --with-libwrap --with-libcap-ng=yes \
    --enable-gssapi-krb5=yes
make
sudo make install
Further Reading & References
There are some notes in the docs/ directory as well.
auditdistd Extension
The main purpose of this part of the project is to give auditdistd(8) the ability to receive audit trails from Linux Audit auditd.
The extension has neither been implemented nor designed.
Nevertheless, I had enough time to configure CentOS 7 to send logs directly to a netcat instance on FreeBSD. It is certainly not a solution, but configuring CentOS was crucial; it was not possible to design the extension of auditdistd without knowledge of the Linux Audit tools: auditd(8), audispd(8) and audisp-remote(8).
The configuration files are available here (link).
Further Reading & References
auditdistd.conf(5)
Shell Tool
A simple shell tool has been implemented. It is possible to convert Linux Audit logs to the BSM format in a command line.
It is possible to compile the tool (at the moment it is called 'bsmconv'):
The usage is fairly simple. Just run the tool and stream a Linux Audit trail into it:
./bsmconv < audit.log
If no command line options are used, the tool prints the logs in the BSM format. It might be handy to pipe the output straight to praudit.
The only command line option available is -v, which increases the verbosity level of the debug messages. If it is specified, the tool does not print the output in the BSM format; instead, it prints a dump of the constructed structures.
Alternatively, you can use the script I used to automate my workflow during GSoC. This will compile the tool:
./fu m
Report about the Drawbacks of the BSM Format
It turned out that I spent much more time understanding the Linux Audit format than the BSM format. As a result, my knowledge of the BSM format is not deep enough to prepare such a report.
Nevertheless, I had a discussion with RobertWatson about bringing au_to_attr functions to the userland (see this discussion (link)). At the moment a temporary solution is present in the code, although a much better solution should be implemented (see the discussion linked above).
Additionally, I found that there are a couple of outdated man pages which need refreshing, for example au_token(3).
When it comes to the modification of libbsm, I added a function au_close_buffer_tm() to the interface of libbsm. The reason is that there was no other way to create a BSM token with an arbitrary timestamp in it.
Test Suite
I did not manage to create a test suite. Throughout the project I used my own shell script (it is called fu) to test the shell tool with the Linux Audit records I've gathered. There are both real life examples and some edge case tests written by me.
The Linux Audit logs are stored inside tests/. It is possible to run the tests like this:
Manual Pages
I did not create any manual pages during Google Summer of Code.