[ Pobierz całość w formacie PDF ]
VIRUS BULLETIN
www.virusbtn.com
TECHNICAL FEATURE
FLIBI: EVOLUTION
Peter Ferrie
Microsoft, USA
gradual refi nements in image accuracy. Something similar
can occur here, where the sequence of amino acids does
not need to work completely (or even at all) in order to be
useful (or just retained). As unlikely as these things are,
millions of computer years from now, we might see some of
the following transformations.
The aim of this article is to demonstrate how some
instructions from the original set might be removed by
replacing them with functionally equivalent code sequences
using the remaining instructions. One advantage of a
smaller instruction set is that it allows an increase in the
number of codons that can map to a single amino acid,
thus making the body more resilient to corruption. Further,
a sequence of instructions has a smaller risk of lethal
mutation than a single instruction, because the risk is spread
over a wider area.
We begin with a brief overview of the language itself. There
are 45 commands in the release version (there were only
43 in the preview version). There are three general-purpose
registers (‘A’, ‘B’ and ‘D’, which correspond to the ‘eax’,
‘ebp’ and ‘edx’ CPU registers); one temporary register, upon
which all operations are performed (which corresponds to
the ‘ebx’ CPU register); one ‘operator’ register, which holds
the value for any operation that requires a parameter (which
corresponds to the ‘ecx’ CPU register); and two buffer
registers, one of which holds the destination for branching
instructions (which corresponds to the ‘esi’ CPU register),
and the other holds the destination for write instructions
(which corresponds to the ‘edi’ CPU register).
The language supports the following commands:
• _nopsA, _nopsB, _nopsD, _nopdA, _nopdB, _nopdD
• _saveWrtOff, _saveJmpOff
• _writeByte, _writeDWord
• _save, _addsaved, _subsaved
• _getDO, _getdata, _getEIP
• _push, _pop, _pushall, _popall
• _zer0
• _mul, _div, _shl, _shr, _and, _xor
• _add0001, _add0004, _add0010, _add0040, _add0100,
_add0400, _add1000, _add4000, _sub0001
• _nopREAL
• _JnzUp, _JzDown, _JnzDown
• _CallAPILoadLibrary, _CallAPIMessageBox,
_CallAPISleep (release version), _call
• _null (release version, it has no actual name)
This set can be reduced in several ways. The most obvious
candidates for removal are the three API calls (two in the
The Flibi virus demonstrated that a virus can carry its own
‘genetic code
and if the codons
1
(the p-code form of the virus), the tRNA
2
(the translator
function), or the corresponding amino acids
3
(the native
code) are mutated in some way, then interesting behaviours
can arise.
Each codon is used as a relative offset into a table of amino
acids. There is a single pointer to the table. Mutation of a
codon might cause a new amino acid to be produced, since
it might now point to a different entry in the table. Mutation
of the pointer would almost certainly be fatal since many
codons would not be translated into the correct amino
acids. Mutation of the amino acid itself might produce new
behaviour, depending on the change. For example, a shift
could become a rotate.
The virus has the ability to move a sequence of codons to
a later position in the stream
4
, and then fi ll the gap with
no-operation instructions. In most cases, this simply results
in the replacement of the codons at the destination
5
. Of
course, if the selected sequence appears at the end of the
defi ned stream (there is a lot of slack space after the last
meaningful codon), then the size of the defi ned stream
will increase slightly each time that condition occurs.
However, the size of the buffer remains fi xed. Therefore,
new sequences can only appear when the translator code
is modifi ed to increase the number of codons that are
translated, thus ‘translating’ garbage beyond the original
end of the stream. That garbage could potentially be
modifi ed over time to eventually produce meaningful
functionality. Its location in the virus body would change
over time as a result of the codon deletion, allowing the
new amino acids to ‘migrate’ to a fi nal position where they
become truly useful. The human eye did not spring fully
formed from the dust of the earth but was the result of
1
A codon is a trinucleotide sequence of DNA or RNA (the nucleic
functioning of living organisms) that corresponds to a specifi c amino
acid. See http://www.genome.gov/Glossary/index.cfm?id=36.
2
Transfer RNA, or tRNA, is a small RNA molecule that is
boyer/0470003790/structure/tRNA/trna_intro.htm.
3
Amino acids are the building blocks of proteins. See
http://en.wikipedia.org/w/index.php?title=Amino_
acid&oldid=412676887.
4
There is a bug in this code, which can result in attempting to copy
more bytes than exist in the source.
5
There is an additional case where the destination is the same as the
source, in which case the codons are deleted.
6
MAY 2011
VIRUS BULLETIN
www.virusbtn.com
preview version
6
). The APIs can be called using the ‘_call’
command if the API addresses are placed in the data section
in this way:
_getDO ;get data offset
_addnnnn ;adjust ebx as appropriate to reach the
;required offset
_call ;call the API
This leaves 42 commands remaining (41 in the preview
version).
The ‘_zer0’ command can be removed by using this code:
_save ;ecx = ebx
_xor ;ebx = 0
41 (40) commands now remain.
The ‘_subsaved’ command (which performs the action
‘ebx = ebx – ecx’) can be removed, and the ‘_addsaved’
command (which performs the action ‘ebx = ebx + ecx’)
can be used instead, with a slight change. Specifi cally, the
new value of the ‘ecx’ register is ‘-ecx’ (such that ‘ebx =
ebx + -ecx’). However, there is no negate command, so an
equivalent result must be achieved using the combination
of operations that perform a ‘not’ and an ‘add 1’. The
problem is that a ‘not’ operation uses the value ‘0xffffffff’,
which requires many steps to construct. Given the existing
instruction set, it would be simplest to place the value
‘0xffffffff’ in the data section
7
. It must be placed at the start
of the data section, because the ‘_addnnnn’ commands can
be removed, leaving no way to select another offset. This
algorithm can then be used:
xor ebx, 0xffffffff
inc ebx
which we translate into this code:
_push
_getDO
_getdata
;ebx = 0xffffffff
_save
;ecx = 0xffffffff
_pop
_addsaved
;ebx = ebx - 1
39 (38) commands remain.
The ‘_addnnnn’ commands exist for convenience, but all of
the commands apart from ‘_add0001’ can be constructed
using the ‘_add0001’ command. Thus, the ‘_add0004’, ‘_
add0010’, ‘_add0040’, ‘_add0100’, ‘_add0400’, ‘_add1000’
and ‘_add4000’ commands can be removed.
32 (31) commands remain.
The ‘_add0001’ command can also be removed, because the
number ‘1’ can be recovered from the value ‘0xffffffff’ by
using this code:
_getDO ;get data offset
_getdata ;ebx = 0xffffffff
_save ;ecx = 0xffffffff
;here is a horrible trick:
;modern CPUs limit the shift-count to 0x1f by taking
;the low fi ve bits for the count and simply discarding
;the rest of the value internally, this performs a
;cl & 0x1f and it’s exactly what we need
_shr ;ebx = ebx >> cl
_save ;ecx = 1
From then on, the ‘_addsaved’ command can be used to
increment the ‘ebx’ register as needed.
31 (30) commands remain.
Of course, it would require very many uses of the
‘_addsaved’ command in order to construct large values,
but value construction can be accelerated by using the ‘_shl’
and ‘_xor’ commands.
For example, constructing the value ‘2’ is a matter of the
following:
_shl ;ebx = ebx << cl (ebx and ecx are ‘1’
;from above)
Constructing the value ‘3’, beginning with the ‘ebx’ and
‘ecx’ registers holding the value ‘1’, as above, is a matter of
the following:
_shl ;ebx = ebx << cl
_xor ;ebx = ebx ^ ecx
Constructing the value ‘4’, beginning with the ‘ebx’ and
‘ecx’ registers holding the value ‘1’, as above, is a matter of
the following:
_shl
;get data offset
_getdata
;fetch 0xffffffff
_xor
;logically ‘not’ ebx
_add0001
;increment result to complete negate
_save
;replace ecx
_pop
_addsaved ;ebx = ebx + ecx
40 (39) commands remain.
In the same way, the ‘_sub0001’ command can be removed
by using this code:
_push
_getDO
;get data offset
;ebx = ebx << cl
_shl
;ebx = ebx << cl
6
The ‘_CallAPISleep’ command was added to the release version
because the API resolver code could not resolve the Sleep() API on
certain platforms. The reason is described in detail in the previous
article
7
It would be even simpler to introduce an instruction which performs a
‘mov ebx, 0xffffffff’.
And so on. Given this algorithm, we can see that the value
‘0xffffffff’ is not the only possible ‘base constant’. The
value ‘1’ could be used instead, since the value ‘0xffffffff’
could be produced from it in the following way:
MAY 2011
7
VIRUS BULLETIN
www.virusbtn.com
Whereas, to construct the value ‘0xf0000000’, beginning
with the ‘ebx’ and ‘ecx’ registers holding the value
‘0xffffffff’, as above, the following can be used:
_push
_shr
_getDO
;get data offset
_getdata
;ebx = 1
_save
;ecx = 1
_shl
;ebx = 2
_xor
;ebx = 3
;ebx = ebx >> cl
_shl
;ebx = 6
_save
;ecx = 1
_xor
;ebx = 7
_shl
;ebx = 2
_shl
;ebx = 0x0e
_xor
;ebx = 3
_xor
;ebx = 0x0f
_shl
;ebx = 6
... [54 steps]
_shl ;ebx = 0xfffffffe
_xor ;ebx = 0xffffffff
_save ;ecx = 0xffffffff
Clearly, it is far simpler to go from ‘0xffffffff’ to ‘1’ than the
other way around. Note that values can also be constructed
using the ‘reverse’ of this technique, to reduce the number
of shifts required. For example, constructing the value
‘0x80000000’ is a matter of the following:
_getDO ;get data offset
_getdata ;ebx = 0xffffffff
_save ;ecx = 0xffffffff
_shl ;ebx = 0x80000000
Constructing the value ‘0x40000000’ is a matter of the
following:
_push
_shr
_xor
;ebx = 7
_shl
;ebx = 0x0e
_shl
;ebx = 0x1c
_save
;ecx = 0x1c
_pop
_shl ;ebx = 0xf0000000
Thus, depending on the value, the ‘_shl’ method is the
simplest.
Astute readers will have noticed that none of the value
constructions above use the ‘_addsaved’ command. This
shows that constants can be constructed without using any
form of ‘add’. However, it is also possible to perform the
addition of arbitrary values without using any form of ‘add’,
resulting in the removal of the ‘_addsaved’ command by
using this algorithm (edx and ebp holding the values to add
together):
eax = edx ^ ebp
do
{
;ebx = ebx >> cl (ebx = 0x80000000,
;ecx = 0xffffffff from above)
_save
;ecx = 1
_pop
_shr ;ebx = 0x40000000
However, setting additional bits in the upper region requires
more than just the ‘_xor’ command. Here are two examples
that set the same value, one using the ‘_shl’ command and
one using the ‘_shr’ command. To construct a value such as
‘0xf0000000’, beginning with the ‘ebx’ register holding the
value ‘0x80000000’ and the ‘ecx’ register holding the value
‘0xffffffff’, as above, the following can be used:
_push
_shr
ebp = (ebp & edx) << 1
edx = eax
eax = edx ^ ebp
}
while (edx & ebp)
ebx = eax
which we translate into this code:
[construct here the fi rst value to add, not shown]
_nopdD ;edx = ebx
[construct here the second value to add, not shown]
_nopdB
;ebx = ebx >> cl
;ebp = ebx
_save
;ecx = 1
_save
;ecx = ebx
_pop
;ebx = 0x80000000 again
_nopsD
;ebx = edx
_push
_shr
;optional, depending on the order of the
;ebx = 0x40000000
;constructions above
_push
_shr
_xor
;ebx = edx ^ ebp
;ebx = 0x20000000
_nopdA
;eax = edx ^ ebp
_push
_shr
_getEIP
_push ;top of do-while loop
;ebx points to a hidden ‘pop ebx’
;instruction as part of _getEIP so there
;is no explicit ‘pop’ instruction inside
;the loop that corresponds to this ‘push’
;instruction
_saveJmpOff ;esi = ebx
_nopsD;ebx = edx
_save
;ebx = 0x10000000
_save
;ecx = 0x10000000
_pop
_xor
;ebx = 0x30000000
_pop
_xor
;ebx = 0x70000000
_pop
_xor
;ebx = 0xf0000000
;ecx = edx
8
MAY 2011
VIRUS BULLETIN
www.virusbtn.com
to two bytes past the start of the slot. The result is that the
‘_nopREAL’ command must be used to fi ll that destination
slot, otherwise a crash could occur because the branch
might land in the middle of a command. However, the
‘_JnzDown’ command can be removed by using alternative
code, and any non-stack and non-memory instruction can be
used for tail padding. Thus it appears that, given its current
uses, the ‘_nopREAL’ command can be removed.
29 (28) commands remain.
In the release version a ‘_null’ command exists, which emits
a single zero into the stream, followed by the ‘nop’ padding.
Its existence is the result of a bug. The execution of such an
instruction is likely to cause an exception. It is possible on
Windows XP
and later to register a vectored exception handle
using the existing language, and that could intercept the
exception, but this is quite outside the ‘style’ of the language.
The command can be removed without any problem.
28 commands remain.
The ‘_JnzDown’ command could be removed by using
a careful implementation of ‘_JnzUp’ (given that the
meaning is reversed), but perhaps not without the loss of
some functionality. It requires knowledge of the location
of a forward branch destination. This interferes with
command reordering if the buffer size is fi xed, because
there might not be enough slots available to construct the
required ‘add’ value (unless the maximum number of slots
was reserved each time in order to construct any possible
number). It does, however, extend the functionality in a
different way, since the ‘_JnzDown’ command can skip
only three commands at a time, requiring its use multiple
times in order to execute larger conditional blocks. The
‘_JnzDown’ command also places severe restrictions on
what can appear in those conditional blocks, since an
arithmetic operation might clear the Z fl ag, causing the
branch to be taken instead of skipped. In contrast, the use
of the ‘_JnzUp’ command can skip an arbitrary number
of commands without restriction. The difference can be
demonstrated easily. We begin with some code that calls the
GetTickCount() API to fetch a ‘random’ number (for ease
of demonstration, the offset of the GetTickCount() API is
set arbitrarily to the value ‘0x0c’), using the ‘_JnzDown’
command:
;construct pointer to GetTickCount()
;construct the value “0x0c”
_getDO
_nopsB
;ebx = ebp
_and
;ebx = ebp & edx
_push
_getDO ;get data offset
_getdata ;ebx = 0xffffffff
_save
;ecx = 0xffffffff
_shr
;ebx = 1
_save
;ecx = 1
_pop
_shl
;ebx = (edx & ebp) << 1
_nopdB
;ebp = (edx & ebp) << 1
_save
;ecx = ebx
_nopsA
;ebx = eax
_nopdD
;edx = eax
_push
_xor
;ebx = edx ^ ebp
_nopdA
;eax = edx ^ ebp
_pop
_and ;ebx = edx & ebp
_JnzUp ;loop while ((edx & ebp) != 0)
_pop ;discard loop address
_nopsA ;ebx = eax
30 (29) commands remain.
The replacement code for the ‘_addsaved’ command
requires the use of the base constant from the data section
(and here, the value ‘1’ would result in shorter code).
The value ‘1’ can be constructed dynamically instead, in the
following way:
_getEIP
_getdata
;ebx=0xxxxxxx5b
_save
;ecx=0xxxxxxx5b
_shl
_shr ;ebx=0x1b
_save ;ecx=0x1b
_addsaved ;ebx=0x36
_addsaved ;ebx=0x51
_addsaved ;ebx=0x6c
_addsaved ;ebx=0x87
_save ;ecx=0x87
_shr ;ebx=1
However, that algorithm prevents the removal of the
‘_addsaved’ command. The two concepts seem to be
mutually exclusive.
It is unclear whether the ‘_nopREAL’ command could be
removed, since there is no other single-byte command that
might take its place in the event that a true ‘no-operation’
command were required. Its current purposes are to pad
the unused slots following codon deletion and to fi ll the
unused slot(s) that follow the ‘_JnzDown’ command (since
the ‘_JnzDown’ command skips three slots). Note that
the current implementation of the ‘_JnzDown’ command
contains a bug, which is that the destination of the branch
is not the start of a slot. Instead, the command branches
;get data offset
_getdata
;ebx = 0xffffffff
_save
;ecx = 0xffffffff
_shr
;ebx = 1
_save
;ecx = 1
_shl
;ebx = 2
_xor
;ebx = 3
MAY 2011
9
VIRUS BULLETIN
www.virusbtn.com
_shl ;ebx = 6
_shl ;ebx = 0x0c
_nopdB ;ebp = 0x0c
_save ;ecx = 0x0c
;add to data offset
_shl
;ebx = 6
_xor
;ebx = 7
_save
;ecx = 7
;’and’ with result from GetTickCount()
_nopsA
_and
;ebx = ebx & 7
_getDO
;get data offset
_JnzDown
[conditional command 1]
[conditional command 2]
[conditional command 3]
_nopREAL ;work around ‘_JnzDown’ bug
The replacement code might look something like this,
beginning immediately after the call to the GetTickCount()
API:
;save result from GetTickCount()
_nopsA
_push
_nopdD
;edx = data offset
_xor
;ebx = edx ^ ebp
_nopdA
;eax = edx ^ ebp
_getEIP
_push
;top of do-while loop
;ebx points to a hidden ‘pop ebx’
;instruction as part of _getEIP
;so there is no explicit ‘pop’
;instruction inside the loop
;that corresponds to this ‘push’
;instruction
_saveJmpOff ;esi = ebx
_nopsD
;ebx = edx
_save
;ecx = edx
;construct pointer to l2
_nopsB
;ebx = ebp
_and
;ebx = ebp & edx
_getDO
;get data offset
_getdata ;ebx = 0xffffffff
_push
_getDO
_save
;ecx = 0xffffffff
;get data offset
_getdata
;ebx = 0xffffffff
_shr
;ebx = 1
_save
;ecx = 0xffffffff
_save
;ecx = 1
_shr
;ebx = 1
... [‘_shl’ and ‘_xor’ as needed to produce the
value ((lines(l1...l2) * 8) + 3)]
_save
;ecx = 1
_nopdB
;ebp = offset of l2
_pop
_shl
_save
;ecx = offset of l2
;ebx = (edx & ebp) << 1
_getEIP
l1: _nopdD
_nopdB
;ebp = (edx & ebp) << 1
;edx = eip
_save
;ecx = ebx
_xor
;ebx = edx ^ ebp
_nopsA
;ebx = eax
_nopdA
;eax = edx ^ ebp
_nopdD
;edx = eax
_getEIP
_push
_xor
_push
;top of do-while loop
;ebx = edx ^ ebp
;ebx points to a hidden ‘pop ebx’
_nopdA
;eax = edx ^ ebp
;instruction as part of _getEIP
_pop
_and
;so there is no explicit ‘pop’
;instruction inside the loop
;ebx = edx & ebp
;that corresponds to this ‘push’
;instruction
_saveJmpOff ;esi = ebx
_JnzUp
;loop while ((edx & ebp) != 0)
_pop
;discard loop address
_nopsA
;ebx = eax
_nopsD
;ebx = edx
;call GetTickCount()
_save
;ecx = edx
_nopsB
;ebx = ebp
_call
_and
;ebx = ebp & edx
Then the choice is made, and the branch might be taken
(seven in eight chances to take it):
_push
_getDO
;get data offset
_getdata ;ebx = 0xffffffff
;construct the value ‘7’
_save
;ecx = 0xffffffff
_getDO
;get data offset
_shr
;ebx = 1
_getdata
;ebx = 0xffffffff
_save
;ecx = 1
_save
;ecx = 0xffffffff
_pop
_shl
_shr
;ebx = 1
;ebx = (edx & ebp) << 1
_save
;ecx = 1
_nopdB
;ebp = (edx & ebp) << 1
_shl
;ebx = 2
_save
;ecx = ebx
_xor
;ebx = 3
_nopsA
;ebx = eax
10
MAY 2011
[ Pobierz całość w formacie PDF ]