CCSDS_study project

2026-05-05 21:54:35 +08:00
commit 9be41f9270
585 changed files with 91275 additions and 0 deletions


@@ -0,0 +1,613 @@
.. currentmodule:: netzob
.. _discover_features:
Discover features of Netzob
~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. warning::
This tutorial for Netzob 1.x is currently slightly obsolete and should be updated to the Netzob API 2.x.
This tutorial presents the main features of Netzob regarding the
inference of message formats and grammar of a simple toy protocol. The
described features cover the following capabilities:
- Import of a PCAP file
- Message format inference

  - Partitioning of messages following a specific delimiter
  - Regrouping of messages following a specific key field
  - Partitioning of a subset of each message following a sequence alignment
  - Search for relationships in each group of messages
  - Modification of the message format to apply found relationships

- Grammar inference

  - Generation of an automaton with one main state according to a captured sequence of messages
  - Generation of an automaton with a sequence of states according to a captured sequence of messages
  - Generation of a Prefix Tree Acceptor (PTA) automaton according to a captured sequence of messages

- Traffic generation and fuzzing

  - Generation of messages following the inferred message format of each group and through visiting the inferred automata
  - Fuzzing of an implementation by generating altered message formats
Retrieve Netzob and resources.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
First, retrieve the source code of Netzob::
$ git clone https://dev.netzob.org/git/netzob
Then, you can retrieve the source code of the toy protocol implementation used in this tutorial, as well as some PCAP files of sequences of messages.
- `Toy protocol implementation <https://dev.netzob.org/attachments/download/179/tutorial_netzob_v1.tar.gz>`_
- `PCAP of sequence 1 <https://dev.netzob.org/attachments/download/182/target_src_v1_session1.pcap>`_
- `PCAP of sequence 2 <https://dev.netzob.org/attachments/download/181/target_src_v1_session2.pcap>`_
- `PCAP of sequence 3 <https://dev.netzob.org/attachments/download/180/target_src_v1_session3.pcap>`_
Import messages from a PCAP file.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Reading packets from a PCAP file is done through the ``PCAPImporter.readFile()`` static method. It optionally takes additional parameters to specify a BPF filter, the import layer, or the number of packets to capture::
from netzob.all import *
messages_session1 = PCAPImporter.readFile("target_src_v1_session1.pcap").values()
messages_session2 = PCAPImporter.readFile("target_src_v1_session2.pcap").values()
messages = messages_session1 + messages_session2
for message in messages:
    print(message)
The output is::
[1388154953.32 127.0.0.1:57831->127.0.0.1:4242] 'CMDidentify#\x07\x00\x00\x00Roberto'
[1388154953.32 127.0.0.1:4242->127.0.0.1:57831] 'RESidentify#\x00\x00\x00\x00\x00\x00\x00\x00'
[1388154953.32 127.0.0.1:57831->127.0.0.1:4242] 'CMDinfo#\x00\x00\x00\x00'
[1388154953.32 127.0.0.1:4242->127.0.0.1:57831] 'RESinfo#\x00\x00\x00\x00\x04\x00\x00\x00info'
[1388154953.32 127.0.0.1:57831->127.0.0.1:4242] 'CMDstats#\x00\x00\x00\x00'
[1388154953.32 127.0.0.1:4242->127.0.0.1:57831] 'RESstats#\x00\x00\x00\x00\x05\x00\x00\x00stats'
[1388154953.32 127.0.0.1:57831->127.0.0.1:4242] 'CMDauthentify#\n\x00\x00\x00aStrongPwd'
[1388154953.32 127.0.0.1:4242->127.0.0.1:57831] 'RESauthentify#\x00\x00\x00\x00\x00\x00\x00\x00'
[1388154953.32 127.0.0.1:57831->127.0.0.1:4242] 'CMDencrypt#\x06\x00\x00\x00abcdef'
[1388154953.32 127.0.0.1:4242->127.0.0.1:57831] "RESencrypt#\x00\x00\x00\x00\x06\x00\x00\x00$ !&'$"
[1388154953.32 127.0.0.1:57831->127.0.0.1:4242] "CMDdecrypt#\x06\x00\x00\x00$ !&'$"
[1388154953.32 127.0.0.1:4242->127.0.0.1:57831] 'RESdecrypt#\x00\x00\x00\x00\x06\x00\x00\x00abcdef'
[1388154953.33 127.0.0.1:57831->127.0.0.1:4242] 'CMDbye#\x00\x00\x00\x00'
[1388154953.33 127.0.0.1:4242->127.0.0.1:57831] 'RESbye#\x00\x00\x00\x00\x00\x00\x00\x00'
[1388154953.31 127.0.0.1:57831->127.0.0.1:4242] 'CMDidentify#\x04\x00\x00\x00fred'
[1388154953.31 127.0.0.1:4242->127.0.0.1:57831] 'RESidentify#\x00\x00\x00\x00\x00\x00\x00\x00'
[1388154953.31 127.0.0.1:57831->127.0.0.1:4242] 'CMDinfo#\x00\x00\x00\x00'
(...)
Regroup messages in a symbol and do a format partitionment with a delimiter
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
A quick review of the displayed messages shows that the character '#' looks interesting, as it appears in the middle of each message. So let's use it as a delimiter::
symbol = Symbol(messages=messages)
Format.splitDelimiter(symbol, ASCII("#"))
print("[+] Symbol structure:")
print(symbol._str_debug())
print("[+] Partitionned messages:")
print(symbol)
We now obtain the following symbol (i.e. our group of messages) structure::
[+] Symbol structure:
Symbol
|-- Field-0
|-- Alt
|-- Data (Raw='RESstats' ((0, 64)))
|-- Data (Raw='RESauthentify' ((0, 104)))
|-- Data (Raw='RESidentify' ((0, 88)))
|-- Data (Raw='CMDstats' ((0, 64)))
|-- Data (Raw='CMDdecrypt' ((0, 80)))
|-- Data (Raw='CMDauthentify' ((0, 104)))
|-- Data (Raw='RESdecrypt' ((0, 80)))
|-- Data (Raw='RESinfo' ((0, 56)))
|-- Data (Raw='CMDinfo' ((0, 56)))
|-- Data (Raw='RESauthentify' ((0, 104)))
|-- Data (Raw='CMDencrypt' ((0, 80)))
|-- Data (Raw='CMDauthentify' ((0, 104)))
|-- Data (Raw='CMDstats' ((0, 64)))
|-- Data (Raw='RESbye' ((0, 48)))
|-- Data (Raw='RESdecrypt' ((0, 80)))
|-- Data (Raw='RESencrypt' ((0, 80)))
|-- Data (Raw='CMDidentify' ((0, 88)))
|-- Data (Raw='CMDbye' ((0, 48)))
|-- Data (Raw='RESinfo' ((0, 56)))
|-- Data (Raw='RESencrypt' ((0, 80)))
|-- Data (Raw='RESidentify' ((0, 88)))
|-- Data (Raw='CMDidentify' ((0, 88)))
|-- Data (Raw='CMDencrypt' ((0, 80)))
|-- Data (Raw='RESbye' ((0, 48)))
|-- Data (Raw='CMDinfo' ((0, 56)))
|-- Data (Raw='CMDbye' ((0, 48)))
|-- Data (Raw='CMDdecrypt' ((0, 80)))
|-- Data (Raw='RESstats' ((0, 64)))
|-- Field-sep-23
|-- Data (ASCII=# ((0, 8)))
|-- Field-2
|-- Alt
|-- Data (Raw='\x04\x00\x00\x00fred' ((0, 64)))
|-- Data (Raw='\x00\x00\x00\x00\x00\x00\x00\x00' ((0, 64)))
|-- Data (Raw='\x00\x00\x00\x00\x05\x00\x00\x00stats' ((0, 104)))
|-- Data (Raw='\n\x00\x00\x00aStrongPwd' ((0, 112)))
|-- Data (Raw='\x00\x00\x00\x00\x00\x00\x00\x00' ((0, 64)))
|-- Data (Raw='\x00\x00\x00\x00' ((0, 32)))
|-- Data (Raw='\x00\x00\x00\x00\x00\x00\x00\x00' ((0, 64)))
|-- Data (Raw='\x00\x00\x00\x00\x00\x00\x00\x00' ((0, 64)))
|-- Data (Raw='\x00\x00\x00\x00' ((0, 32)))
|-- Data (Raw='\x06\x00\x00\x00abcdef' ((0, 80)))
|-- Data (Raw='\x00\x00\x00\x00\x04\x00\x00\x00info' ((0, 96)))
|-- Data (Raw='\n\x00\x00\x00123456test' ((0, 112)))
|-- Data (Raw='\x00\x00\x00\x00\x00\x00\x00\x00' ((0, 64)))
|-- Data (Raw='\x00\x00\x00\x00\n\x00\x00\x00123456test' ((0, 144)))
|-- Data (Raw='\x07\x00\x00\x00Roberto' ((0, 88)))
|-- Data (Raw="\x00\x00\x00\x00\x06\x00\x00\x00$ !&'$" ((0, 112)))
|-- Data (Raw="\x00\x00\x00\x00\n\x00\x00\x00spqvwt6'16" ((0, 144)))
|-- Data (Raw="\x06\x00\x00\x00$ !&'$" ((0, 80)))
|-- Data (Raw='\x00\x00\x00\x00\x05\x00\x00\x00stats' ((0, 104)))
|-- Data (Raw='\x00\x00\x00\x00' ((0, 32)))
|-- Data (Raw="\n\x00\x00\x00spqvwt6'16" ((0, 112)))
|-- Data (Raw='\t\x00\x00\x00myPasswd!' ((0, 104)))
|-- Data (Raw='\x00\x00\x00\x00' ((0, 32)))
|-- Data (Raw='\x00\x00\x00\x00\x04\x00\x00\x00info' ((0, 96)))
|-- Data (Raw='\x00\x00\x00\x00\x06\x00\x00\x00abcdef' ((0, 112)))
|-- Data (Raw='\x00\x00\x00\x00' ((0, 32)))
|-- Data (Raw='\x00\x00\x00\x00' ((0, 32)))
|-- Data (Raw='\x00\x00\x00\x00\x00\x00\x00\x00' ((0, 64)))
The partitioned messages now look like this::
'CMDidentify' | '#' | '\x07\x00\x00\x00Roberto'
'RESidentify' | '#' | '\x00\x00\x00\x00\x00\x00\x00\x00'
'CMDinfo' | '#' | '\x00\x00\x00\x00'
'RESinfo' | '#' | '\x00\x00\x00\x00\x04\x00\x00\x00info'
'CMDstats' | '#' | '\x00\x00\x00\x00'
'RESstats' | '#' | '\x00\x00\x00\x00\x05\x00\x00\x00stats'
'CMDauthentify' | '#' | '\n\x00\x00\x00aStrongPwd'
'RESauthentify' | '#' | '\x00\x00\x00\x00\x00\x00\x00\x00'
'CMDencrypt' | '#' | '\x06\x00\x00\x00abcdef'
'RESencrypt' | '#' | "\x00\x00\x00\x00\x06\x00\x00\x00$ !&'$"
'CMDdecrypt' | '#' | "\x06\x00\x00\x00$ !&'$"
'RESdecrypt' | '#' | '\x00\x00\x00\x00\x06\x00\x00\x00abcdef'
'CMDbye' | '#' | '\x00\x00\x00\x00'
'RESbye' | '#' | '\x00\x00\x00\x00\x00\x00\x00\x00'
'CMDidentify' | '#' | '\x04\x00\x00\x00fred'
(...)
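For intuition, the delimiter-based split can be approximated without Netzob. The sketch below is plain Python and not the library's implementation: ``split_on_delimiter`` is a hypothetical helper name, and it assumes that only the first ``#`` acts as a separator.

```python
def split_on_delimiter(payload: bytes, delimiter: bytes = b"#"):
    """Split a payload into (before, delimiter, after) at the first delimiter."""
    # bytes.partition always returns three parts; the last two are empty
    # when the delimiter is absent from the payload.
    return payload.partition(delimiter)


samples = [
    b"CMDidentify#\x07\x00\x00\x00Roberto",
    b"RESinfo#\x00\x00\x00\x00\x04\x00\x00\x00info",
]
for sample in samples:
    print(" | ".join(repr(part) for part in split_on_delimiter(sample)))
```

Splitting at the first occurrence only keeps a ``#`` that happens to appear inside a binary buffer from creating extra fields.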
Cluster according to a key field
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The first field seems interesting, as it contains some kind of commands ('CMDencrypt', 'CMDidentify', etc.). Let's thus cluster the symbol according to the first field::
symbols = Format.clusterByKeyField(symbol, symbol.fields[0])
print("[+] Number of symbols after clustering: {0}".format(len(symbols)))
print("[+] Symbol list:")
for keyFieldName, s in symbols.items():
    print(" * {0}".format(keyFieldName))
The clustering algorithm produces 14 different symbols, each with a unique value in the first field::
[+] Number of symbols after clustering: 14
[+] Symbol list:
* RESdecrypt
* RESbye
* RESidentify
* CMDbye
* RESencrypt
* CMDidentify
* RESstats
* CMDencrypt
* RESauthentify
* CMDdecrypt
* CMDinfo
* CMDauthentify
* RESinfo
* CMDstats
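The idea behind key-field clustering can be sketched with a dictionary keyed on the bytes that precede the delimiter. This is an illustration only: ``cluster_by_key_field`` is a hypothetical name, and the three payloads are taken from the capture shown earlier.

```python
from collections import defaultdict


def cluster_by_key_field(payloads, delimiter=b"#"):
    """Group payloads by the bytes preceding the first delimiter."""
    clusters = defaultdict(list)
    for payload in payloads:
        key, _, _ = payload.partition(delimiter)
        clusters[key.decode("ascii")].append(payload)
    return dict(clusters)


payloads = [
    b"CMDidentify#\x07\x00\x00\x00Roberto",
    b"CMDidentify#\x04\x00\x00\x00fred",
    b"CMDinfo#\x00\x00\x00\x00",
]
clusters = cluster_by_key_field(payloads)
for name, members in clusters.items():
    print(" * {0} ({1} messages)".format(name, len(members)))
```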
Apply a format partitionment with a sequence alignment on the third field of each symbol
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
As the last field seems to have a dynamic size, let's see what a sequence alignment (i.e. a means of aligning static and dynamic sub-fields) would produce::
for symbol in symbols.values():
    Format.splitAligned(symbol.fields[2], doInternalSlick=True)
    print("[+] Partitionned messages:")
    print(symbol)
For the symbol 'CMDencrypt', the sequence alignment of the last field produces the following format, where we can observe a static field of '\x00\x00\x00' surrounded by two variable fields. The last field seems to be the buffer we want to encrypt, as the key field name suggests (i.e. 'CMDencrypt')::
(...)
[+] Partitionned messages:
'CMDencrypt' | '#' | '\n' | '\x00\x00\x00' | '123456test'
'CMDencrypt' | '#' | '\x06' | '\x00\x00\x00' | 'abcdef'
(...)
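Netzob's alignment is considerably more elaborate than what fits here. As a rough stand-in, the sketch below uses ``difflib`` from the Python standard library to find the longest run of bytes shared by two values of the last field. It is not a sequence alignment algorithm and not what ``Format.splitAligned`` does internally; ``longest_common_run`` is a hypothetical helper.

```python
from difflib import SequenceMatcher


def longest_common_run(a: bytes, b: bytes) -> bytes:
    """Return the longest run of bytes that appears in both inputs."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


first = b"\n\x00\x00\x00123456test"
second = b"\x06\x00\x00\x00abcdef"
static = longest_common_run(first, second)
print(repr(static))
```

On these two samples the shared run is the three zero bytes, which is the static sub-field that separates the two variable ones.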
Find field relations in each symbol
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Let's now find any relationships in those messages::
for symbol in symbols.values():
    rels = RelationFinder.findOnSymbol(symbol)
    print("[+] Relations found: ")
    for rel in rels:
        print(" " + rel["relation_type"] + ", between '" + rel["x_attribute"] + "' of:")
        print(" " + str('-'.join([f.name for f in rel["x_fields"]])))
        p = [v.getValues()[:] for v in rel["x_fields"]]
        print(" " + str(p))
        print(" " + "and '" + rel["y_attribute"] + "' of:")
        print(" " + str('-'.join([f.name for f in rel["y_fields"]])))
        p = [v.getValues()[:] for v in rel["y_fields"]]
        print(" " + str(p))
In the symbol 'CMDencrypt', we have found a relationship between the content of a field (the third one) and the length of another field (the last one, which presumably contains the buffer we want to encrypt)::
(...)
[+] Relations found:
SizeRelation, between 'value' of:
Field
[['\n', '\x06']]
and 'size' of:
Field
[['123456test', 'abcdef']]
(...)
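The relation reported above can be checked by hand on the two observed pairs. The sketch assumes the size byte is an unsigned integer counting bytes; ``is_size_relation`` is a hypothetical helper, not part of Netzob.

```python
def is_size_relation(size_field: bytes, payload: bytes) -> bool:
    """Check whether size_field, read as an unsigned integer, equals len(payload)."""
    return int.from_bytes(size_field, byteorder="little") == len(payload)


# Pairs taken from the relation output: '\n' is 10 and '\x06' is 6.
observed = [(b"\n", b"123456test"), (b"\x06", b"abcdef")]
print(all(is_size_relation(size, data) for size, data in observed))
```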
Find relations and apply them in the symbol structure
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
We then modify the message format to apply the relationship we have just found, by creating a Size field whose value depends on the content of a targeted field. We also specify a factor stating that the value of the size field should be one eighth of the size of the buffer field (as every field size is expressed in bits by default)::
for symbol in symbols.values():
    rels = RelationFinder.findOnSymbol(symbol)
    if rels:
        # Apply only the first relationship found
        rel = rels[0]
        rel["x_fields"][0].domain = Size(rel["y_fields"], factor=1/8.0)
    print("[+] Symbol structure:")
    print(symbol._str_debug())
The 'CMDencrypt' symbol structure now looks like this::
(...)
[+] Symbol structure:
Symbol_CMDencrypt
|-- Field-0
|-- Data (ASCII=CMDencrypt ((0, 80)))
|-- Field-sep-23
|-- Data (ASCII=# ((0, 8)))
|-- Field-2
|-- Data (Raw=None ((0, None)))
|-- |-- Field
|-- Size(['Field']) - Type:Raw=None ((8, 8))
|-- |-- Field
|-- Data (Raw='\x00\x00\x00' ((0, 24)))
|-- |-- Field
|-- Data (Raw=None ((0, 80)))
(...)
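The role of the ``1/8.0`` factor is easier to see as arithmetic: a 6-byte buffer is 48 bits, and 48 × 1/8 gives the value 6 carried by the size field. The sketch below rebuilds a 'CMDencrypt' message under the layout inferred above (1-byte size, three static zero bytes, buffer); both function names are hypothetical.

```python
def size_field_value(payload: bytes, factor: float = 1 / 8.0) -> int:
    """Value of the size field: payload size in bits scaled by factor."""
    size_in_bits = len(payload) * 8
    return int(size_in_bits * factor)


def build_command(command: bytes, payload: bytes) -> bytes:
    """Assemble command, '#', 1-byte size, 3 static zero bytes, payload."""
    size = size_field_value(payload)
    if size > 0xFF:
        raise ValueError("payload too large for a 1-byte size field")
    return command + b"#" + bytes([size]) + b"\x00\x00\x00" + payload


print(repr(build_command(b"CMDencrypt", b"abcdef")))
```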
That is all for the message format inference. Let's now look at the state machine of this toy protocol.
Generate a chained states automaton
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
We will generate a basic automaton that illustrates the sequence of commands and responses extracted from a PCAP file. Each message sent creates a new transition to a new state, hence the name *chained states automaton*::
# Create a session of messages
session = Session(messages_session1)
# Abstract this session according to the inferred symbols
abstractSession = session.abstract(list(symbols.values()))
# Generate an automata according to the observed sequence of messages/symbols
automata = Automata.generateChainedStatesAutomata(abstractSession, list(symbols.values()))
# Print the dot representation of the automata
dotcode = automata.generateDotCode()
print(dotcode)
The obtained automaton is finally converted into Dot code in order to render a graphical version of it::
digraph G {
"Start state" [shape=doubleoctagon, style=filled, fillcolor=white, URL="f8d33b83-d6b0-4180-832c-7cce9d6b3fea"];
"State 1" [shape=ellipse, style=filled, fillcolor=white, URL="a332ed56-e2d8-4c8c-9ec2-99c5f942e9a3"];
"State 2" [shape=ellipse, style=filled, fillcolor=white, URL="8f45bd4e-fe03-4a26-bf9a-1adec60f597d"];
"State 3" [shape=ellipse, style=filled, fillcolor=white, URL="01999e79-de00-467d-987a-e9411d57be99"];
"State 4" [shape=ellipse, style=filled, fillcolor=white, URL="9b20ed29-77e5-43c1-bb8b-cf3a84674941"];
"State 5" [shape=ellipse, style=filled, fillcolor=white, URL="52ec3815-656b-421b-bb1f-c4f7746be534"];
"State 6" [shape=ellipse, style=filled, fillcolor=white, URL="1cbbd123-32d5-4cd8-bd01-4fd3bcd8ae38"];
"State 7" [shape=ellipse, style=filled, fillcolor=white, URL="8a8ab662-db23-4206-ba35-28396ee31115"];
"State 8" [shape=ellipse, style=filled, fillcolor=white, URL="ee9e0d5d-bb4e-4d2e-8c97-1553afa1cc68"];
"End state" [shape=ellipse, style=filled, fillcolor=white, URL="3874e4e9-af5d-428e-92b8-e1fda38b6ef9"];
"Start state" -> "State 1" [fontsize=5, label="OpenChannelTransition", URL="4beecca4-0d48-4ca9-8d83-ffd8766b64c7"];
"State 1" -> "State 2" [fontsize=5, label="Transition (Symbol_CMDidentify;{Symbol_RESidentify})", URL="c4e5451c-6a53-41f3-9748-7179774eb7de"];
"State 2" -> "State 3" [fontsize=5, label="Transition (Symbol_CMDinfo;{Symbol_RESinfo})", URL="c4e5451c-6a53-41f3-9748-7179774eb7de"];
"State 3" -> "State 4" [fontsize=5, label="Transition (Symbol_CMDstats;{Symbol_RESstats})", URL="c4e5451c-6a53-41f3-9748-7179774eb7de"];
"State 4" -> "State 5" [fontsize=5, label="Transition (Symbol_CMDauthentify;{Symbol_RESauthentify})", URL="c4e5451c-6a53-41f3-9748-7179774eb7de"];
"State 5" -> "State 6" [fontsize=5, label="Transition (Symbol_CMDencrypt;{Symbol_RESencrypt})", URL="c4e5451c-6a53-41f3-9748-7179774eb7de"];
"State 6" -> "State 7" [fontsize=5, label="Transition (Symbol_CMDdecrypt;{Symbol_RESdecrypt})", URL="c4e5451c-6a53-41f3-9748-7179774eb7de"];
"State 7" -> "State 8" [fontsize=5, label="Transition (Symbol_CMDbye;{Symbol_RESbye})", URL="c4e5451c-6a53-41f3-9748-7179774eb7de"];
"State 8" -> "End state" [fontsize=5, label="CloseChannelTransition", URL="c6ac87b7-5de1-401a-8b75-5d2a73d81264"];
}
.. figure:: https://dev.netzob.org/attachments/download/172/automata_target_v1_chained.svg
:align: center
:target: https://dev.netzob.org/attachments/download/172/automata_target_v1_chained.svg
:alt:
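The construction itself is simple enough to sketch in a few lines. This is an illustration of the chained-states idea rather than Netzob's code: ``chained_states_transitions`` is a hypothetical function, and the state naming merely mirrors the Dot output above.

```python
def chained_states_transitions(exchanges):
    """Build one new state per (command, response) exchange."""
    transitions = [("Start state", "State 1", "OpenChannelTransition")]
    for index, (command, response) in enumerate(exchanges, start=1):
        label = "{0};{{{1}}}".format(command, response)
        source = "State {0}".format(index)
        target = "State {0}".format(index + 1)
        transitions.append((source, target, label))
    last = "State {0}".format(len(exchanges) + 1)
    transitions.append((last, "End state", "CloseChannelTransition"))
    return transitions


exchanges = [("CMDidentify", "RESidentify"), ("CMDinfo", "RESinfo"), ("CMDbye", "RESbye")]
for source, target, label in chained_states_transitions(exchanges):
    print('"{0}" -> "{1}" [label="{2}"];'.format(source, target, label))
```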
Generate a one state automaton
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This time, instead of converting a PCAP into a sequence of states for each message observed, we generate a unique state that accepts any of the observed sent messages to trigger a new transition. In response to each sent message (for example 'CMDencrypt'), we expect a specific response (for example 'RESencrypt')::
# Create a session of messages
session = Session(messages_session1)
# Abstract this session according to the inferred symbols
abstractSession = session.abstract(list(symbols.values()))
# Generate an automata according to the observed sequence of messages/symbols
automata = Automata.generateOneStateAutomata(abstractSession, list(symbols.values()))
# Print the dot representation of the automata
dotcode = automata.generateDotCode()
print(dotcode)
The obtained automaton is finally converted into Dot code in order to render a graphical version of it::
digraph G {
"Start state" [shape=doubleoctagon, style=filled, fillcolor=white, URL="0659071e-1849-4616-a11a-e98edfe86e24"];
"Main state" [shape=ellipse, style=filled, fillcolor=white, URL="424e0a69-da0b-4030-816a-8368e30a00a9"];
"End state" [shape=ellipse, style=filled, fillcolor=white, URL="9de3d54b-f0eb-45f8-809a-86a60d22812f"];
"Start state" -> "Main state" [fontsize=5, label="OpenChannelTransition", URL="3818118b-97db-474f-b9c3-f38c04152a74"];
"Main state" -> "Main state" [fontsize=5, label="Transition (Symbol_CMDidentify;{Symbol_RESidentify})", URL="f6000e04-10a8-41de-a1a0-29021440684a"];
"Main state" -> "Main state" [fontsize=5, label="Transition (Symbol_CMDinfo;{Symbol_RESinfo})", URL="f6000e04-10a8-41de-a1a0-29021440684a"];
"Main state" -> "Main state" [fontsize=5, label="Transition (Symbol_CMDstats;{Symbol_RESstats})", URL="f6000e04-10a8-41de-a1a0-29021440684a"];
"Main state" -> "Main state" [fontsize=5, label="Transition (Symbol_CMDauthentify;{Symbol_RESauthentify})", URL="f6000e04-10a8-41de-a1a0-29021440684a"];
"Main state" -> "Main state" [fontsize=5, label="Transition (Symbol_CMDencrypt;{Symbol_RESencrypt})", URL="f6000e04-10a8-41de-a1a0-29021440684a"];
"Main state" -> "Main state" [fontsize=5, label="Transition (Symbol_CMDdecrypt;{Symbol_RESdecrypt})", URL="f6000e04-10a8-41de-a1a0-29021440684a"];
"Main state" -> "Main state" [fontsize=5, label="Transition (Symbol_CMDbye;{Symbol_RESbye})", URL="f6000e04-10a8-41de-a1a0-29021440684a"];
"Main state" -> "End state" [fontsize=5, label="CloseChannelTransition", URL="75a4cc3a-72a4-42a3-af2c-aa3939f899aa"];
}
.. figure:: https://dev.netzob.org/attachments/download/173/automata_target_v1_onestate.svg
:align: center
:target: https://dev.netzob.org/attachments/download/173/automata_target_v1_onestate.svg
:alt:
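By contrast with the chained variant, every exchange here becomes a self-loop on a single state. The sketch below is a simplified illustration; ``one_state_transitions`` is a hypothetical function, and dropping repeated exchanges is an assumption of the sketch, not a documented Netzob behaviour.

```python
def one_state_transitions(exchanges):
    """Attach every distinct exchange as a self-loop on a single main state."""
    transitions = [("Start state", "Main state", "OpenChannelTransition")]
    seen = set()
    for command, response in exchanges:
        if (command, response) in seen:
            continue  # an exchange observed twice adds no new transition
        seen.add((command, response))
        label = "{0};{{{1}}}".format(command, response)
        transitions.append(("Main state", "Main state", label))
    transitions.append(("Main state", "End state", "CloseChannelTransition"))
    return transitions


for source, target, label in one_state_transitions(
        [("CMDinfo", "RESinfo"), ("CMDinfo", "RESinfo"), ("CMDbye", "RESbye")]):
    print('"{0}" -> "{1}" [label="{2}"];'.format(source, target, label))
```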
Generate a PTA-based automaton
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Finally, we convert multiple sequences of messages taken from different PCAP files to generate an automaton in which identical paths have been merged. The result of this merging strategy is called a Prefix Tree Acceptor (PTA)::
# Create sessions of messages
messages_session1 = PCAPImporter.readFile("target_src_v1_session1.pcap").values()
messages_session3 = PCAPImporter.readFile("target_src_v1_session3.pcap").values()
session1 = Session(messages_session1)
session3 = Session(messages_session3)
# Abstract this session according to the inferred symbols
abstractSession1 = session1.abstract(list(symbols.values()))
abstractSession3 = session3.abstract(list(symbols.values()))
# Generate an automata according to the observed sequence of messages/symbols
automata = Automata.generatePTAAutomata([abstractSession1, abstractSession3], list(symbols.values()))
# Print the dot representation of the automata
dotcode = automata.generateDotCode()
print(dotcode)
The obtained automaton is finally converted into Dot code in order to render a graphical version of it::
digraph G {
"Start state" [shape=doubleoctagon, style=filled, fillcolor=white, URL="e46d8a67-2a96-479a-9234-c1b38c75b847"];
"State 0" [shape=ellipse, style=filled, fillcolor=white, URL="0cd8a2c9-4410-45a0-9950-6456546f49dc"];
"State 1" [shape=ellipse, style=filled, fillcolor=white, URL="bbc10d50-f197-40f6-a674-5f80790ef954"];
"State 2" [shape=ellipse, style=filled, fillcolor=white, URL="739801b7-9e0d-4fba-a4f5-cf130e6b7fbf"];
"State 3" [shape=ellipse, style=filled, fillcolor=white, URL="c2075b80-16b9-4bd7-b290-6eb333f94e43"];
"State 4" [shape=ellipse, style=filled, fillcolor=white, URL="715ede75-d81e-46ea-a7c1-f537e5dba892"];
"State 9" [shape=ellipse, style=filled, fillcolor=white, URL="ad5873af-c26a-482f-94d9-0cf47c69376b"];
"State 10" [shape=ellipse, style=filled, fillcolor=white, URL="01859f7d-6b43-45af-8c17-9decb10dea9b"];
"End state 11" [shape=ellipse, style=filled, fillcolor=white, URL="7f4bd693-a35f-479b-8e86-128dc46c71cf"];
"State 5" [shape=ellipse, style=filled, fillcolor=white, URL="ee9da65c-b072-4344-bf71-2d67a3b73880"];
"State 6" [shape=ellipse, style=filled, fillcolor=white, URL="902e76e4-6a9a-45a2-95ba-ae9484f1084f"];
"State 7" [shape=ellipse, style=filled, fillcolor=white, URL="f7e9b27a-6879-4b4f-bb51-00530f07addf"];
"End state 8" [shape=ellipse, style=filled, fillcolor=white, URL="fe710eed-287f-4abf-93bf-6878e487d8a9"];
"Start state" -> "State 0" [fontsize=5, label="OpenChannelTransition", URL="5d6139d0-9b1c-49b2-b19d-91ae8c56f299"];
"State 0" -> "State 1" [fontsize=5, label="Transition (Symbol_CMDidentify;{Symbol_RESidentify})", URL="a1d2d03d-8c58-4c83-afa1-c40433fbd833"];
"State 1" -> "State 2" [fontsize=5, label="Transition (Symbol_CMDinfo;{Symbol_RESinfo})", URL="a1d2d03d-8c58-4c83-afa1-c40433fbd833"];
"State 2" -> "State 3" [fontsize=5, label="Transition (Symbol_CMDstats;{Symbol_RESstats})", URL="a1d2d03d-8c58-4c83-afa1-c40433fbd833"];
"State 3" -> "State 4" [fontsize=5, label="Transition (Symbol_CMDauthentify;{Symbol_RESauthentify})", URL="a1d2d03d-8c58-4c83-afa1-c40433fbd833"];
"State 4" -> "State 5" [fontsize=5, label="Transition (Symbol_CMDencrypt;{Symbol_RESencrypt})", URL="a1d2d03d-8c58-4c83-afa1-c40433fbd833"];
"State 4" -> "State 9" [fontsize=5, label="Transition (Symbol_CMDdecrypt;{Symbol_RESdecrypt})", URL="a1d2d03d-8c58-4c83-afa1-c40433fbd833"];
"State 9" -> "State 10" [fontsize=5, label="Transition (Symbol_CMDbye;{Symbol_RESbye})", URL="a1d2d03d-8c58-4c83-afa1-c40433fbd833"];
"State 10" -> "End state 11" [fontsize=5, label="CloseChannelTransition", URL="f7ddbccf-93b6-4496-a153-5b2306d95dac"];
"State 5" -> "State 6" [fontsize=5, label="Transition (Symbol_CMDdecrypt;{Symbol_RESdecrypt})", URL="a1d2d03d-8c58-4c83-afa1-c40433fbd833"];
"State 6" -> "State 7" [fontsize=5, label="Transition (Symbol_CMDbye;{Symbol_RESbye})", URL="a1d2d03d-8c58-4c83-afa1-c40433fbd833"];
"State 7" -> "End state 8" [fontsize=5, label="CloseChannelTransition", URL="f7ddbccf-93b6-4496-a153-5b2306d95dac"];
}
.. figure:: https://dev.netzob.org/attachments/download/174/automata_target_v1_pta.svg
:align: center
:target: https://dev.netzob.org/attachments/download/174/automata_target_v1_pta.svg
:alt:
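Merging identical prefixes amounts to building a trie over the observed sequences. The following sketch shows that idea on two shortened, made-up sessions; ``build_prefix_tree`` and its integer state numbering are hypothetical and do not reproduce Netzob's state names.

```python
def build_prefix_tree(sequences):
    """Merge sequences of exchanges that share a common prefix.

    Returns a dict mapping (state, exchange) to the next state, with
    integer state identifiers assigned in order of creation.
    """
    transitions = {}
    next_state = 1  # state 0 is the root
    for sequence in sequences:
        state = 0
        for exchange in sequence:
            key = (state, exchange)
            if key not in transitions:
                transitions[key] = next_state
                next_state += 1
            state = transitions[key]
    return transitions


session_a = ["identify", "info", "encrypt", "decrypt", "bye"]
session_b = ["identify", "info", "decrypt", "bye"]
tree = build_prefix_tree([session_a, session_b])
for (state, exchange), target in sorted(tree.items()):
    print("State {0} -> State {1} on {2}".format(state, target, exchange))
```

The two sessions share their first two exchanges, so the tree only branches at state 2, much like the automaton above branches at "State 4".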
Generate messages according to the inferred model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
We now have a pretty good knowledge of the message format and grammar of the targeted protocol. Let's thus play with this model by trying to communicate with a real server implementation.
First, let's start the server so that we can communicate with it::
$ cd src_v1/
$ ./server
Ready to read incoming messages
(...)
Then, we create a UDP client that will communicate with the server (on 127.0.0.1:4242) by exchanging messages generated from the inferred symbols::
# Create a UDP client instance
channelOut = UDPClient(remoteIP="127.0.0.1", remotePort=4242)
abstractionLayerOut = AbstractionLayer(channelOut, list(symbols.values()))
abstractionLayerOut.openChannel()
# Visit the automaton for n iterations
state = automata.initialState
for n in range(8):
    state = state.executeAsInitiator(abstractionLayerOut)
We go through 8 iterations in the automaton::
1454: [INFO] AbstractionLayer:openChannel: Going to open the communication channel...
1454: [INFO] AbstractionLayer:openChannel: Communication channel opened.
1454: [INFO] State:executeAsInitiator: Next transition: Open.
1454: [INFO] AbstractionLayer:openChannel: Going to open the communication channel...
1454: [INFO] AbstractionLayer:openChannel: Communication channel opened.
1454: [INFO] State:executeAsInitiator: Transition 'Open' leads to state: State 1.
1455: [INFO] State:executeAsInitiator: Next transition: Transition.
1455: [INFO] AbstractionLayer:writeSymbol: Going to specialize symbol: 'Symbol_CMDidentify' (id=dbea29b9-7e9f-4c2b-be14-625f675569f3).
1455: [INFO] AbstractionLayer:writeSymbol: Data generated from symbol 'Symbol_CMDidentify': 'CMDidentify#\x03\x00\x00\x00\xfc{\xdb'.
1456: [INFO] AbstractionLayer:writeSymbol: Going to write to communication channel...
1456: [INFO] AbstractionLayer:writeSymbol: Writing to commnunication channel donne..
1456: [INFO] AbstractionLayer:readSymbol: Going to read from communication channel...
1456: [INFO] AbstractionLayer:readSymbol: Received data: ''RESidentify#\x00\x00\x00\x00\x00\x00\x00\x00''
1457: [INFO] AbstractionLayer:readSymbol: Received symbol on communication channel: 'Symbol_RESidentify'
1457: [INFO] Transition:executeAsInitiator: Possible output symbol: 'Symbol_RESidentify' (id=49c24e1c-3751-412e-9f6a-f006a7de7492).
1457: [INFO] State:executeAsInitiator: Transition 'Transition' leads to state: State 2.
1457: [INFO] State:executeAsInitiator: Next transition: Transition.
1457: [INFO] AbstractionLayer:writeSymbol: Going to specialize symbol: 'Symbol_CMDinfo' (id=5eb47a57-eccf-4d06-8231-0b1ae87f96a7).
1458: [INFO] AbstractionLayer:writeSymbol: Data generated from symbol 'Symbol_CMDinfo': 'CMDinfo#\x00\x00\x00\x00'.
1458: [INFO] AbstractionLayer:writeSymbol: Going to write to communication channel...
1458: [INFO] AbstractionLayer:writeSymbol: Writing to commnunication channel donne..
1458: [INFO] AbstractionLayer:readSymbol: Going to read from communication channel...
1458: [INFO] AbstractionLayer:readSymbol: Received data: ''RESinfo#\x00\x00\x00\x00\x04\x00\x00\x00info''
1462: [INFO] AbstractionLayer:readSymbol: Received symbol on communication channel: 'Symbol_RESinfo'
1462: [INFO] Transition:executeAsInitiator: Possible output symbol: 'Symbol_RESinfo' (id=b41502e3-21ea-4cb9-9c1e-dc171f715685).
1462: [INFO] State:executeAsInitiator: Transition 'Transition' leads to state: State 3.
1462: [INFO] State:executeAsInitiator: Next transition: Transition.
(...)
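To cross-check a single exchange outside Netzob, a bare UDP client from the Python standard library is enough. The sketch below is independent of ``UDPClient`` and ``AbstractionLayer``; ``exchange`` is a hypothetical helper, and the default address is the one used in this tutorial.

```python
import socket


def exchange(payload: bytes, host: str = "127.0.0.1", port: int = 4242,
             timeout: float = 2.0) -> bytes:
    """Send one UDP datagram and return the first datagram received in reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)  # avoid blocking forever if nothing answers
        sock.sendto(payload, (host, port))
        data, _ = sock.recvfrom(4096)
        return data
```

With the toy server running, ``exchange(b"CMDinfo#\x00\x00\x00\x00")`` would be expected to return a 'RESinfo' datagram similar to the one in the log above; without a server, the call raises a timeout error.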
On the server side, we can see that the received messages are well formed, as the server is able to parse them and send correct responses::
$ ./server
Ready to read incoming messages
-> Read: CMDidentify#.
Command: CMDidentify
Arg size: 2
Arg content: ..
<- Send:
Return value: 0
Size of data buffer: 0
Data buffer:
""
-> Read: CMDinfo#
Command: CMDinfo
Arg size: 0
<- Send:
Return value: 0
Size of data buffer: 4
Data buffer:
DATA: 69 6e 66 6f "info"
-> Read: CMDstats#
Command: CMDstats
Arg size: 0
<- Send:
Return value: 0
Size of data buffer: 5
Data buffer:
DATA: 73 74 61 74 73 "stats"
-> Read: CMDauthentify#.
Command: CMDauthentify
Arg size: 6
Arg content: ......
<- Send:
Return value: 0
Size of data buffer: 0
Data buffer:
""
-> Read: CMDencrypt#.
Command: CMDencrypt
Arg size: 2
Arg content: ..
<- Send:
(...)
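The server log suggests that a response carries a return value followed by a sized data buffer. The sketch below decodes a response under that reading; treating both numbers as little-endian unsigned 32-bit integers is an assumption drawn from the captured bytes, and ``parse_response`` is a hypothetical helper.

```python
import struct


def parse_response(payload: bytes):
    """Split a response into (command, return_value, data)."""
    command, _, body = payload.partition(b"#")
    # Assumption: two little-endian unsigned 32-bit integers follow the '#'.
    return_value, size = struct.unpack_from("<II", body, 0)
    data = body[8:8 + size]
    if len(data) != size:
        raise ValueError("declared size exceeds the available data")
    return command.decode("ascii"), return_value, data


print(parse_response(b"RESinfo#\x00\x00\x00\x00\x04\x00\x00\x00info"))
```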
Do some fuzzing on a specific symbol
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Finally, we deliberately alter the message format of the 'CMDencrypt' symbol in order to try some fuzzing. The modification extends the size of the buffer field (i.e. the one that receives the data to encrypt)::
def send_and_receive_symbol(symbol):
    data = symbol.specialize()
    print("[+] Sending: {0}".format(repr(data)))
    channelOut.write(data)
    data = channelOut.read()
    print("[+] Receiving: {0}".format(repr(data)))

# Update symbol definition to allow a broader payload size
symbols["CMDencrypt"].fields[2].fields[2].domain = Raw(nbBytes=(10, 120))
for i in range(10):
    send_and_receive_symbol(symbols["CMDencrypt"])
We can see that Netzob is only sending CMDencrypt messages with a potentially long last field::
[+] Sending: 'CMDencrypt#6\x00\x00\x00&\xe0*\xb3\xa8A(\x0b\xd2yA\xb5\xb8\rw\x0fGi\xee\xb3\xd6\xb0<\xfc\xc0\xa7m\xbd\xbc\xde2~\xceE\xe5\xda@\xd4\xed\xed\xf2\xb4\xe7\t\xfbC\xbf\x05\xc6\xce\xfb\x83\xf2\x00'
(...)
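What such a fuzzed message looks like on the wire can be reproduced without Netzob. The sketch below draws a random buffer of 10 to 120 bytes and prefixes it with a matching size byte, following the layout inferred earlier; ``fuzzed_encrypt_message`` is a hypothetical helper and does not reflect how ``specialize()`` generates data.

```python
import os
import random


def fuzzed_encrypt_message(min_size: int = 10, max_size: int = 120) -> bytes:
    """Build a CMDencrypt message with a random payload of random length."""
    size = random.randint(min_size, max_size)  # must stay below 256
    payload = os.urandom(size)
    # Same layout as the inferred format: 1-byte size, 3 static zero bytes.
    return b"CMDencrypt#" + bytes([size]) + b"\x00\x00\x00" + payload


message = fuzzed_encrypt_message()
print(len(message), repr(message[:16]))
```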
On the server side, we quickly get a segmentation fault, due to a bug in the parsing of the last field::
$ gdb ./server
(gdb) run
Starting program: /home/fgy/travaux/netzob/git/netzob-resources/experimentations/tutorial_target/src_v1/server
Ready to read incoming messages
(...)
-> Read: CMDencrypt#6
Command: CMDencrypt
Arg size: 54
Arg content: &?*??A(
wGi???<???m???2~?E??@???????? ?C??
Program received signal SIGSEGV, Segmentation fault.
0x08048bc0 in api_encrypt (in=0x45ce7e32 <Address 0x45ce7e32 out of bounds>, len=3561020133, out=0xb4f2eded <Address 0xb4f2eded out of bounds>) at amo_api.c:80
80 tmpData[i] = (in[i] ^ key) % 0xff;
That's all for this introductory tutorial. You can get the entire `source code <https://dev.netzob.org/attachments/download/183/inference_target_src_v1.py>`_ of the script used to infer and play with the protocol.
We invite you to read the API documentation or talk with us on IRC (#netzob on Freenode) if you have any questions.


@@ -0,0 +1,308 @@
.. currentmodule:: netzob
.. _tutorial_get_started:
Getting started with Netzob
~~~~~~~~~~~~~~~~~~~~~~~~~~~
The goal of this tutorial is to present the usage of each main component
of Netzob (inference of message format, construction of the state
machine and generation of traffic) through an undocumented protocol.
You can download the protocol material here:
- `Protocol PCAP <https://dev.netzob.org/attachments/132/target_protocol.pcap>`_: contains messages of the targeted protocol;
- `Protocol implementation <https://dev.netzob.org/attachments/127/target_protocol.tar.gz>`_: provides the server and client implementation of the protocol.
You can follow the tutorial with only the PCAP file, but you will need
the implementation if you want to generate traffic and let Netzob
communicate with a real implementation.
Setting the Workspace
^^^^^^^^^^^^^^^^^^^^^
Just after installing Netzob, when you start it, you have to set the
workspace directory (as in Eclipse).
.. figure:: https://dev.netzob.org/attachments/119/tuto_workspace.png
:align: center
:alt:
**Side note:** in Netzob, a workspace can be defined as a collection
of projects and configuration properties. The directory that
hosts the workspace contains directories and files, including
configuration files (workspace.xml), the set of projects (directory
projects) and other configuration resources (logging, traces, ...).
When creating a new workspace, Netzob generates the necessary
workspace files based on templates. The directory "projects"
includes a directory for each created project. You can specify the
workspace on the command line (using the option "-w <path to the
workspace>") when executing Netzob. Otherwise, it will read the user
file located at "~/.netzob" to find out which workspace was last
used. If there is none, Netzob will ask you at startup where the
workspace is.
Your first project
^^^^^^^^^^^^^^^^^^
To create a project, navigate to the menu ``File`` > ``New project``.
Here, you can choose a project name which should be unique in the
workspace.
**Side note:** by default, Netzob chooses a location inside a
dedicated directory located in the "projects" directory of your
current workspace. The newly created project is automatically
selected, which allows you to start working on it.
You can switch to another project at any time through the menu
``File`` > ``Open project from workspace``. Do not forget to save your
project before!
Capture traces
^^^^^^^^^^^^^^
The first step in the inference process of a protocol in Netzob is to
capture and import messages as samples. There are different methods
to retrieve messages depending on the communication channel used (files,
network, IPC, USB, etc.) and the format (PCAP, hex, raw binary flows,
etc.).
For this tutorial, you can import network messages with the provided
PCAP file. But we recommend using the provided implementation to
generate samples of traffic and capture them with Netzob. You can do
this with the Network Capturer plugin, which is accessible in the menu
``File`` > ``Capture messages`` > ``Capture network traffic``.
.. figure:: https://dev.netzob.org/attachments/113/tuto_capture-small.png
:align: center
:target: https://dev.netzob.org/attachments/106/tuto_capture.png
:alt:
As shown in the picture, you have to launch the capture at Layer 4
on the localhost ``lo`` interface. As the targeted protocol works over
UDP, you will capture only the UDP payloads. Then launch the
server of the targeted protocol and then the client. The client will send
different commands to the server and wait for the responses.
Once you have captured one session, you have to select the messages you
want to import (you should import everything) and click the Import
messages button. A popup will ask you if you want to allow duplicate
messages. It's better not to, in order to avoid unnecessary messages. We
recommend repeating this import process 4 times, in order to have enough
variation between messages.
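Refusing duplicates amounts to keeping only the first occurrence of each payload. The following is a minimal sketch of that idea in plain Python; it is not Netzob's import code, and the sample payloads are made up for illustration.

```python
def drop_duplicates(messages):
    """Return messages without repeated payloads, keeping first-seen order."""
    # dict keys keep insertion order, so the first occurrence of each payload wins.
    return list(dict.fromkeys(messages))


# Hypothetical payloads, as if captured over two sessions.
captured = [b"api_identify#fred", b"api_bye#", b"api_identify#fred"]
unique = drop_duplicates(captured)
print(unique)
```

Importing several sessions still helps after deduplication, because each session tends to contribute payloads with different user data.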
Infer vocabulary
^^^^^^^^^^^^^^^^
Let's now start the inference of the message format (vocabulary).
The next picture shows the whole vocabulary inference interface and the
intended meaning of each component.
.. figure:: https://dev.netzob.org/attachments/120/tuto_voca_ui_small.png
:align: center
:target: https://dev.netzob.org/attachments/123/tuto_voca_ui.png
:alt:
The main window shows each message in raw hexadecimal format. You can
play with visualization attributes: right click on the symbol, then
select Visualization and the attribute you want to change (hex, decimal
or even string format, the unit size and potentially the sign and
endianness).
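To see why these attributes matter, the same bytes can be rendered in several ways. The sketch below uses only the Python standard library and a made-up payload; it illustrates the concepts (format, unit size, sign, endianness), not Netzob's rendering code.

```python
# Hypothetical payload: four ASCII characters followed by four binary bytes.
payload = b"api_\x00\x0a\xff\xf6"

# Same bytes, rendered as hexadecimal and as a string.
as_hex = payload.hex()
as_text = payload[:4].decode("ascii")

# Unit size, endianness and sign change how the trailing bytes are read.
big_u16 = int.from_bytes(payload[4:6], "big")
little_u16 = int.from_bytes(payload[4:6], "little")
signed_i8 = int.from_bytes(payload[7:8], "big", signed=True)
unsigned_u8 = payload[7]

print(as_hex, as_text, big_u16, little_u16, signed_i8, unsigned_u8)
```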
The following picture shows the rendering of the messages in hex format
(on the left) and string format (on the right). You can then see that
messages contain some interesting strings (``api_identify``,
``api_encrypt``, ``api_decrypt``, etc.).
.. figure:: https://dev.netzob.org/attachments/128/tuto_messages-small.png
:align: center
:target: https://dev.netzob.org/attachments/129/tuto_messages.png
:alt:
You can use the filter functionality to display messages that contain a
specific pattern. Here, we filter with the ``api_identify`` pattern.
.. figure:: https://dev.netzob.org/attachments/107/tuto_messages3-small.png
:align: center
:target: https://dev.netzob.org/attachments/101/tuto_messages3.png
:alt:
This filter makes it easy to retrieve the messages associated with a
potential identification command.
You can see that a '``#``\ ' character is present in each message. You
can try to split the messages by forcing their partitioning with a
specific delimiter. To do so, use the Force partitioning functionality
available in the symbol list (either with a right click on a symbol, or
by selecting a symbol with its checkbox and then clicking on the Force
partitioning button right above).
.. figure:: https://dev.netzob.org/attachments/117/tuto_force_partitioning.png
:align: center
:alt:
Using the '``#``\ ' string delimiter, you'll have the following result:
.. figure:: https://dev.netzob.org/attachments/130/tuto_force_part_result_small.png
:align: center
:target: https://dev.netzob.org/attachments/131/tuto_force_part_result.png
:alt:
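Conceptually, forcing the partitioning on a delimiter cuts every message at that delimiter so that the pieces line up as fields. Here is a minimal sketch in plain Python, assuming the cut happens at the first occurrence of the delimiter; the function name and the sample messages are illustrative and not part of Netzob.

```python
def force_partitioning(messages, delimiter=b"#"):
    """Split every message into [before, delimiter, after] fields."""
    rows = []
    for message in messages:
        # partition() cuts at the first occurrence of the delimiter only.
        head, sep, tail = message.partition(delimiter)
        rows.append([head, sep, tail])
    return rows


messages = [b"api_identify#fred", b"api_encrypt#123456test"]
rows = force_partitioning(messages)
for row in rows:
    print(row)
```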
You may also want to play with Sequence alignment. This partitioning
enables message alignment according to their common patterns.
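The idea behind alignment can be illustrated with the standard library. The sketch below uses ``difflib.SequenceMatcher`` as a stand-in to find the chunks two messages have in common; Netzob relies on its own alignment algorithm, so this only shows the principle, and the two sample messages are illustrative.

```python
from difflib import SequenceMatcher


def common_chunks(first, second, min_size=3):
    """Return the byte chunks shared by both messages, in order of appearance."""
    matcher = SequenceMatcher(None, first, second, autojunk=False)
    return [
        first[block.a:block.a + block.size]
        for block in matcher.get_matching_blocks()
        if block.size >= min_size  # ignore accidental one- or two-byte matches
    ]


chunks = common_chunks(b"api_encrypt#123456test", b"api_decrypt#spqvwt6'16")
print(chunks)
```

The shared chunks hint at static fields (a common prefix, the delimiter), while the parts in between are candidates for dynamic fields.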
After playing with the different partitioning methods available, you are able
to retrieve the different commands associated with the targeted
protocol, as shown in the following picture.
.. figure:: https://dev.netzob.org/attachments/109/tuto_symboles-small.png
:align: center
:target: https://dev.netzob.org/attachments/104/tuto_symboles.png
:alt:
According to the names of the commands, you can see that an
``api_encrypt`` command is available. Let's have a look at its message
format, which looks like:
::
[command]#[dataToEncrypt][padding]
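This layout can be checked by hand with a small parser. The sketch below is plain Python, not Netzob code. The sample bytes are an ``api_encrypt`` value that appears in the exported Peach pit file of the Peach tutorial; treating the padding as trailing null bytes is an assumption drawn from that sample.

```python
def parse_message(raw, delimiter=b"#", pad=b"\x00"):
    """Split a message into (command, data, padding)."""
    command, _, rest = raw.partition(delimiter)
    # Assumption: padding is made of trailing null bytes only, so data that
    # genuinely ends with null bytes would be cut short by this sketch.
    data = rest.rstrip(pad)
    padding = rest[len(data):]
    return command, data, padding


sample = bytes.fromhex(
    "6170695f656e6372797074233132333435367465737400000000000000000000"
)
command, data, padding = parse_message(sample)
print(command, data, len(padding))
```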
Netzob enables you to indicate that a specific field has a mutable
content, which means its data is neither fixed (such as the '#' delimiter)
nor part of a set of fixed elements (such as the command string). To
specify the structure of a field and its attributes, right click on a
field and select Edit Variable. A popup dialog displays a rooted tree
that corresponds to the inferred structure of the field. For example,
you should have all the observed values of the field (materialized
through DataVariable leaves) under an AlternateVariable node.
Regarding the targeted protocol, as we want to allow any data for the
current field, we first have to delete the ``AlternateVariableNode`` and
change the root node to a ``DataVariable`` that has a mutable behavior,
as shown in the following picture.
.. figure:: https://dev.netzob.org/attachments/115/tuto_variable-small.png
:align: center
:target: https://dev.netzob.org/attachments/105/tuto_variable.png
:alt:
You can visualize the associated message format in the bottom-left corner.
It should display something like this:
.. figure:: https://dev.netzob.org/attachments/110/tuto_variable2-small.png
:align: center
:target: https://dev.netzob.org/attachments/97/tuto_variable2.png
:alt:
Now that we have refined the ``api_encrypt`` command message, we have to
do the same for the other commands that also take user data as a parameter:
``api_identify``, ``api_authentify`` and ``api_decrypt``, and also for
some response messages such as ``resp_decrypt`` and ``resp_encrypt``.
At this point, you have a satisfactory approximation of the vocabulary. You
can now start to construct the state machine of the protocol.
Infer Grammar
^^^^^^^^^^^^^
In this tutorial, we won't explain the automatic inference (learning) of
the state machine. As the targeted protocol has a basic state machine,
we will simply show how to model it in Netzob.
A basic state machine contains states and transitions. In Netzob, we use
a more complex structure to model the grammar of a protocol. This model
lets you specify information such as the response time between an input
symbol and an output symbol, or the probability of each possible
output message for a given input message. This model is called an
SMMDT (Stochastic Mealy Machine with Deterministic Transitions).
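One way to picture such a model is a table in which the next state is fixed by the current state and the input symbol, while the output symbol is drawn from a weighted list. The sketch below is a toy reading of that description in plain Python, not Netzob's data structure; the ``resp_error`` symbol, the probabilities and the response times are hypothetical values.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Output:
    symbol: str
    probability: float
    response_time: float  # seconds between input and output


# (state, input symbol) -> (next state, possible outputs)
TRANSITIONS = {
    ("init", "api_identify"): (
        "identified",
        [Output("resp_identify", 0.9, 0.01), Output("resp_error", 0.1, 0.05)],
    ),
}


def step(state, input_symbol, rng):
    """Follow the deterministic transition and draw one output at random."""
    next_state, outputs = TRANSITIONS[(state, input_symbol)]
    weights = [output.probability for output in outputs]
    chosen = rng.choices(outputs, weights=weights, k=1)[0]
    return next_state, chosen


next_state, output = step("init", "api_identify", random.Random(0))
print(next_state, output.symbol, output.response_time)
```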
The grammar perspective interface of Netzob enables the creation of:
- states (initial or not);
- semi-stochastic transitions (i.e. "normal" transitions);
- open channel transitions;
- close channel transitions.
.. figure:: https://dev.netzob.org/attachments/118/tuto_grammar_buttons.png
:align: center
:alt:
Regarding our targeted protocol, we construct the associated model with
the following information:
- 1 open channel transition and an initial state;
- 1 close channel transition and a final state;
- 4 main states: init, identified, authenticated, closed;
- depending on the current state, certain commands may or may not be
allowed;
- some commands will trigger transitions (``api_identify``,
``api_authentify`` and ``api_bye``).
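The list above can be sketched as a transition table. The four state names and the three state-changing commands come from the list; which commands are accepted in which state is an assumption made for this illustration (``api_encrypt`` and ``api_decrypt`` only once authenticated, ``api_bye`` from any open state), so adapt it to what you observe in your captures.

```python
# (current state, command) -> next state. Hypothetical permissions, see above.
TRANSITIONS = {
    ("init", "api_identify"): "identified",
    ("identified", "api_authentify"): "authenticated",
    ("authenticated", "api_encrypt"): "authenticated",
    ("authenticated", "api_decrypt"): "authenticated",
    ("init", "api_bye"): "closed",
    ("identified", "api_bye"): "closed",
    ("authenticated", "api_bye"): "closed",
}


def run(commands, state="init"):
    """Replay a command sequence and return the final state."""
    for command in commands:
        try:
            state = TRANSITIONS[(state, command)]
        except KeyError:
            raise ValueError(f"{command!r} is not allowed in state {state!r}") from None
    return state


final_state = run(["api_identify", "api_authentify", "api_encrypt", "api_bye"])
print(final_state)
```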
Once modeled, this looks like:
.. figure:: https://dev.netzob.org/attachments/114/tuto_grammar-small.png
:align: center
:target: https://dev.netzob.org/attachments/116/tuto_grammar.png
:alt:
Now that Netzob knows both the vocabulary and the grammar of the
targeted protocol, we are able to generate traffic that respects the
protocol model.
Generate traffic
^^^^^^^^^^^^^^^^
Let's go to the Simulator perspective of Netzob.
The simulator can create a client, a server, or both.
You can tell Netzob to talk with a real client or server implementation,
or you can just launch a client and a server inside Netzob and let them
talk together.
.. figure:: https://dev.netzob.org/attachments/121/tuto_simu_ui_small.png
:align: center
:target: https://dev.netzob.org/attachments/122/tuto_simu_ui.png
:alt:
Let's now create a client. We have to specify the following information:
- **client name**;
- **initiator** or not (i.e. who opens the communication channel?): it
will usually be yes for a client;
- **client or server side**: client;
- **protocol**: UDP for the targeted protocol;
- **bind IP**: nothing here, as the client finds its own interface;
- **bind port**: nothing here, as the client finds its own port;
- **target IP**: 127.0.0.1;
- **target port**: 4242.
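For comparison, the same settings map directly onto a plain UDP socket. The sketch below is independent of Netzob's simulator. The 32-byte, null-padded message layout is an assumption based on the sample values shown in the Peach tutorial, and both function names are illustrative.

```python
import socket


def build_message(command, argument=b"", size=32):
    """Build 'command#argument' padded with null bytes (assumed layout)."""
    payload = command + b"#" + argument
    if len(payload) > size:
        raise ValueError("message does not fit in the assumed fixed size")
    return payload.ljust(size, b"\x00")


def exchange(payload, target=("127.0.0.1", 4242), timeout=2.0):
    """Send one datagram and wait for a single response."""
    # No bind() call: the OS picks the local interface and port, which
    # corresponds to leaving "bind IP" and "bind port" empty.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(payload, target)
        response, _ = sock.recvfrom(4096)
        return response


request = build_message(b"api_identify", b"fred")
print(request)
```

Once the toy server is listening on port 4242, ``exchange(request)`` would send the request and return the raw response.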
Now start the real server implementation, select the client in Netzob
and click the Start button in the top-right corner. This will generate
and send commands to the real server, and you'll be able to see the
exchanged messages in the interface, as shown in the following picture.
.. figure:: https://dev.netzob.org/attachments/108/tuto_simu-small.png
:align: center
:target: https://dev.netzob.org/attachments/99/tuto_simu.png
:alt:
After this introductory tutorial, we'll be glad to receive feedback and
to `help you <http://www.netzob.org/community>`_ (see our mailing list
`user@lists.netzob.org <mailto:user@lists.netzob.org>`_ or our IRC
channel #netzob on Freenode).
If you want to go further and `start
contributing <http://www.netzob.org/development>`_ to Netzob, that is
perfect. There are many simple or complex tasks everyone can do:
translation, documentation, bug fix, feature proposal or implementation.

View File

@@ -0,0 +1,255 @@
.. currentmodule:: netzob
.. _tutorial_modeling_protocol:
Modeling your Protocol with Netzob
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This tutorial details the main features of Netzob's protocol modeling
aspects. It shows how your protocol fields can be described with Netzob's
language.
The first thing to know is that a Netzob protocol model is entirely made of Python code. Naturally, this code relies on Netzob's classes and methods. Thus, following this tutorial requires an installed version of ``Netzob (>=1.0)`` and your favorite Python editor.
Initial Settings
^^^^^^^^^^^^^^^^
The first step is to create a directory that will hold our Python source file.
For example, create the temporary ``/tmp/netzob`` directory and create the executable Python file ``/tmp/netzob/tutorial.py``::
/$ mkdir /tmp/netzob
/$ cd /tmp/netzob
/tmp/netzob$ touch tutorial.py
/tmp/netzob$ chmod +x tutorial.py
Along with the traditional Python shebang, import the Netzob library::
#!/usr/bin/env python
from netzob.all import *
Executing this file should return the following::
/tmp/netzob$ ./tutorial.py
Warning: FastBinaryTree not available, using Python version BinaryTree.
Warning: FastAVLTree not available, using Python version AVLTree.
Warning: FastRBTree not available, using Python version RBTree.
If an error related to the netzob import is returned, check the steps you followed to install Netzob.
Modeling the Protocol Vocabulary
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In Netzob, the vocabulary of a protocol consists of a list of symbols.
A symbol represents all the messages that share a similar objective from a protocol perspective. For example, the HTTP_GET symbol describes any HTTP request with the method GET being set.
A symbol is made of a succession of fields and an optional name::
>>> s = Symbol(name="MySymbol", fields = [field1, field2])
A symbol can be **specialized** into a context-valid message and a message can be **abstracted** into a symbol.
A field describes a chunk of the symbol and is defined by a definition domain::
>>> field1 = Field(name="MyField1", domain=domainOfField1)
>>> field2 = Field(name="MyField2", domain=domainOfField2)
A definition domain describes the set of values its field accepts. To support complex domains, a definition domain is represented by a tree where each vertex is a variable. There are two kinds of variables: *Leaf variables*, which accept no children, and *Node variables*, which accept one or more child variables.
**Leaf variables** are the simplest variables. There are four kinds of leaf variables.
A *Data Variable* describes data whose value is of a given type. Various types are provided with Netzob:
* *ASCII* : an ASCII string (see class :class:`ASCII <netzob.Common.Models.Types.ASCII>`)
Example of a field that only accepts the "netzob" ASCII string::
>>> field = Field(ASCII("netzob"))
>>> field.specialize()
"netzob"
Example of a field that only accepts ASCII strings of five characters::
>>> field = Field(ASCII(nbChars=5))
>>> field.specialize()
zorjf
Example of a field that only accepts ASCII strings made of 5 to 10 characters::
>>> field = Field(ASCII(nbChars=(5, 10)))
>>> field.specialize()
jfozkp
>>> field.specialize()
nckrphjj
* *Decimal* : a decimal number (see class :class:`Decimal <netzob.Common.Models.Types.Decimal>`)
Similarly to the ASCII type, a Decimal data can be constrained by a specific value::
>>> field = Field(Decimal(20))
>>> field.specialize()
'\x14'
A decimal variable also accepts a range of valid values::
>>> field = Field(Decimal(interval=(10, 100)))
>>> field.specialize()
'\xda\x82'
>>> field.specialize()
'\xd6\xca'
* *Raw* : a sequence of bytes (see class :class:`Raw <netzob.Common.Models.Types.Raw>`)
Example of a field that accepts a specific sequence of bytes::
>>> field = Field(Raw('\x00\x01\x02\x03'))
>>> repr(field.specialize())
"'\\x00\\x01\\x02\\x03'"
Example of a field that accepts any sequence of ten bytes::
>>> field = Field(Raw(nbBytes=10))
>>> field.specialize()
't)\x99\x8a\x02>\xd1\x91y\x9b'
* *BitArray* : a sequence of bits (see class :class:`BitArray <netzob.Common.Models.Types.BitArray>`)
Example of a field that accepts 3 to 10 bits::
>>> field = Field(BitArray(nbBits=(3, 10)))
>>> field.specialize()
'\xbe@'
* *IPv4* : an IPv4 raw address (see class :class:`IPv4 <netzob.Common.Models.Types.IPv4>`)
Example of a field that only accepts an IPv4 address::
>>> field = Field(IPv4())
>>> field.specialize()
'\x86\x89\\\xac'
Example of a field that only accepts an IPv4 address that belongs to the network 192.168.0.0/24::
>>> field = Field(IPv4(network='192.168.0.0/24'))
>>> field.specialize()
'\xc0\xa8\x00\x0b'
Along with Data variables, the definition domain of a field can embed the definition of relationships. Two kinds of relationships are supported in Netzob: intra-symbol relationships and inter-symbol relationships. The former denotes a relationship between the size or the value of a variable and another field in the same symbol. The latter denotes a relationship with a field of another symbol. Currently, three kinds of relationships are supported.
* A *Size Relationship* that describes a data whose value is the size of another field.
The size field can be declared before the targeted field in the same symbol::
>>> payloadField = Field(Raw(nbBytes=(5, 10)))
>>> sizeField = Field(Size(payloadField))
>>> s = Symbol([sizeField, payloadField])
>>> s.specialize()
'\x08\xac\xa4\xb8\x93\x8d\x83\x95%' # size = 8
>>> s.specialize()
'\x05\xff\xef\x93\x07\xd7' # size = 5
The size field can also be declared after the targeted field in the same symbol::
>>> payloadField = Field(Raw(nbBytes=(5, 10)))
>>> sizeField = Field(Size(payloadField))
>>> s = Symbol([payloadField, sizeField])
>>> s.specialize()
'n\\\x82\x84`\x00\x13\x9f\x08' # size = 8
>>> s.specialize()
'\xe7\xc4\xde\xbd\x18\x05' # size = 5
An optional "factor" and "offset" can be applied to the value of the computed size::
>>> payloadField = Field(Raw(nbBytes=(5, 10)))
>>> sizeField = Field(Size(payloadField, offset=1))
>>> s = Symbol([sizeField, payloadField])
>>> s.specialize()
'\x07\xfb+K\xf4N\x99' # size = 6 + 1 (offset)
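The arithmetic behind the size field can be reproduced by hand. The sketch below is plain Python, not the ``Size`` class itself. It assumes the size is counted in bytes and that the factor is applied before the offset is added; check the API doc for the exact semantics. The payload is the one from the example just above.

```python
def with_size_prefix(payload, factor=1, offset=0, width=1):
    """Prepend a big-endian size field computed from the payload length."""
    # Assumed formula: size = len(payload) * factor + offset
    size = len(payload) * factor + offset
    return size.to_bytes(width, "big") + payload


payload = b"\xfb+K\xf4N\x99"  # 6 bytes
message = with_size_prefix(payload, offset=1)
print(repr(message))
```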
More details and examples of Size relationships can be found in its API doc :class:`Size <netzob.Common.Models.Vocabulary.Domain.Variables.Leafs.Size>`.
* A *Value Relationship* is very similar to the size relationship except that the relationship applies on the value of the targeted field.
For example, a symbol can be made of three fields, the first being a random sequence of 5 bytes, the second a simple ASCII delimiter (':') and the last sharing the same value as the first field::
>>> f1 = Field(Raw(nbBytes=5))
>>> f2 = Field(ASCII(':'))
>>> f3 = Field(Value(f1))
>>> s = Symbol(fields=[f1, f2, f3])
>>> s.specialize()
'\x0f\x01ShS:\x0f\x01ShS'
>>> s.specialize()
'6H\xf9\x84\xc4:6H\xf9\x84\xc4'
More details and examples of Value relationships can be found in its API doc :class:`Value <netzob.Common.Models.Vocabulary.Domain.Variables.Leafs.Value>`.
* A *Checksum Variable* describes a data whose value is the IP checksum of one or more other fields.
The following example illustrates the creation of an ICMP Echo request packet with a valid checksum represented on two bytes computed on-the-fly::
>>> typeField = Field(name="Type", domain=Raw('\x08'))
>>> codeField = Field(name="Code", domain=Raw('\x00'))
>>> chksumField = Field(name="Checksum")
>>> identField = Field(name="Identifier", domain=Raw('\x1d\x22'))
>>> seqField = Field(name="Sequence Number", domain=Raw('\x00\x07'))
>>> timeField = Field(name="Timestamp", domain=Raw('\xa8\xf3\xf6\x53\x00\x00\x00\x00'))
>>> headerField = Field(name="header")
>>> headerField.fields = [typeField, codeField, chksumField, identField, seqField, timeField]
>>> dataField = Field(name="Payload", domain=Raw(nbBytes=(5, 10)))
>>> chksumField.domain = Checksum([headerField, dataField], "InternetChecksum", dataType=Raw(nbBytes=2))
>>> s = Symbol(fields = [headerField, dataField])
>>> s.specialize()
'\x08\x00\x9d\xda\x1d\x22\x00\x07\xa8\xf3\xf6\x53\x00\x00\x00\x00\xec6\xf4\x98\xee' # checksum = \x9d\xda
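For reference, the Internet checksum itself (RFC 1071) is short enough to write by hand. The sketch below is a standalone Python version, not Netzob's implementation; ``icmp_echo_request`` is an illustrative helper that builds a minimal 8-byte ICMP header without the timestamp field used above.

```python
def internet_checksum(data):
    """One's-complement sum of 16-bit big-endian words, complemented (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for index in range(0, len(data), 2):
        total += (data[index] << 8) | data[index + 1]
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def icmp_echo_request(identifier, sequence, payload):
    """Build type=8, code=0, checksum, identifier, sequence, payload."""
    header = bytes([8, 0]) + b"\x00\x00"
    header += identifier.to_bytes(2, "big") + sequence.to_bytes(2, "big")
    checksum = internet_checksum(header + payload)
    return header[:2] + checksum.to_bytes(2, "big") + header[4:] + payload


packet = icmp_echo_request(0x1D22, 7, b"hello")
print(packet.hex())
```

A receiver validates the packet by running the same function over the whole packet, checksum included: the result is zero when the packet is intact.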
**Leaf Variables** can be combined into a tree model to produce much more complex definition domains. To achieve this, **Node Variables** can be used to construct complex definition domains made of a succession of variables, an alternative of variables or a repetition of variables.
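The tree idea can be sketched in a few lines of plain Python. The classes below are toy stand-ins written for this illustration (they are not Netzob's ``Agg``, ``Alt`` and ``Repeat`` classes, and their names are made up): each node produces a value by asking its children for theirs.

```python
import random


class LeafNode:
    """Leaf: always yields its fixed value."""
    def __init__(self, value):
        self.value = value

    def specialize(self, rng):
        return self.value


class AggNode:
    """Succession: concatenates the values of all children."""
    def __init__(self, children):
        self.children = children

    def specialize(self, rng):
        return b"".join(child.specialize(rng) for child in self.children)


class AltNode:
    """Alternative: picks exactly one child."""
    def __init__(self, children):
        self.children = children

    def specialize(self, rng):
        return rng.choice(self.children).specialize(rng)


class RepeatNode:
    """Repetition: repeats its child between low and high times."""
    def __init__(self, child, nb_repeat):
        self.child = child
        self.low, self.high = nb_repeat

    def specialize(self, rng):
        count = rng.randint(self.low, self.high)
        return b"".join(self.child.specialize(rng) for _ in range(count))


domain = AggNode([
    LeafNode(b"n"),
    AltNode([LeafNode(b"etzob"), LeafNode(b"etwork")]),
    RepeatNode(LeafNode(b"!"), nb_repeat=(1, 3)),
])
value = domain.specialize(random.Random(0))
print(value)
```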
* The *Aggregation Node Variable* can be used to model a succession of variables.
For example, a field that accepts an ASCII string of 10 characters followed by 2 bytes (see :class:`Agg <netzob.Common.Models.Vocabulary.Domain.Variables.Nodes.Agg>`)::
>>> domainOfField = Agg([ ASCII(nbChars=10), Raw(nbBytes=2) ])
>>> field = Field(domainOfField)
>>> repr(field.specialize())
"'VLAuxPd0A0\\x86M'"
* The *Alternate Node Variable* can be used to model an alternative of multiple variables (OR).
For example, the following models a field that accepts either the ASCII value "hello" or any ASCII string of 10 to 15 characters (see :class:`Alt <netzob.Common.Models.Vocabulary.Domain.Variables.Nodes.Alt>`) ::
>>> field = Field(Alt([ ASCII("hello"), ASCII(nbChars=(10, 15)) ]))
>>> repr(field.specialize())
"'hello'"
>>> repr(field.specialize())
"'Zm7D3Ade9K'"
* The *Repeat Node Variable* can be used to model a repetition of a variable.
For example, the following models a field that accepts between 1 and 4 repetitions of the ASCII string "netzob" (see :class:`Repeat <netzob.Common.Models.Vocabulary.Domain.Variables.Nodes.Repeat>`) ::
>>> field = Field(Repeat(ASCII("netzob"), nbRepeat=(1, 4)))
>>> repr(field.specialize())
"'netzob'"
>>> repr(field.specialize())
"'netzobnetzobnetzob'"
Node variables can be combined to produce complex definition domains. For example, the following models a field that accepts either an ASCII string that starts with the letter "n" or a random IPv4 address::
>>> field = Field( Alt([ Agg([ASCII('n'), ASCII()]), Agg([ IPv4() ])]) )
>>> repr(field.specialize())
"'nlPj66'"
>>> repr(field.specialize())
"'aI\\xe4\\xc5'"

View File

@@ -0,0 +1,329 @@
.. currentmodule:: netzob
.. _tutorial_peach:
Auto generation of Peach pit files/fuzzers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Principle
^^^^^^^^^
`Peach <http://peachfuzzer.com>`_ is an open-source fuzzing
framework. It provides an API to create smart fuzzers adapted to the tester's
needs through XML configuration files called `*Peach pit
files* <http://peachfuzzer.com/PeachPit>`_.
Writing such files requires knowledge of the message format and state
machine of the targeted protocol, as well as of the actor Peach has to fuzz.
Fortunately, Netzob provides means for reverse engineering undocumented and
proprietary protocols from provided traces in a semi-automatic way.
Netzob provides an exporter plugin for Peach that can transform
the inferred data model and state machine of a targeted protocol into a
Peach pit file automatically.
This tutorial shows how to take advantage of the Peach exporter plugin
provided in Netzob to automatically construct Peach pit configuration
files.
Prerequisite
^^^^^^^^^^^^
You need Netzob in version 0.4.1 or above.
This tutorial assumes that the user has previously followed the
`Getting started with
Netzob <http://www.netzob.org/resources/tutorial_get_started>`_ tutorial
and has a complete Netzob project (or at least some message formats).
The protocol implementation contains several vulnerabilities that should
be detected during fuzzing.
Moreover, it assumes that the user has Peach 2.3.8 installed.
Export
^^^^^^
To export the project, go to ``File`` > ``Export the project`` >
``Peach pit file``. The window below should appear:
.. figure:: https://dev.netzob.org/attachments/download/134
:align: center
:alt:
The window is composed of three panels. The left one lists all available
fuzzers. They differ in their state representation. There are three
kinds of fuzzers available:
- "Randomized state order fuzzer": one state is created for each
symbol of Netzob and, at each step, the fuzzer switches to a
randomly chosen state.
- "Randomized transitions stateful fuzzer": one state is created for
each symbol of Netzob and the transitions between these states are
based on those Netzob allows, weighted by their probability.
- "One-state fuzzer": one state is created corresponding to the chosen
symbol.
When the fuzzer is in a particular state, it sends fuzzed data that
corresponds to the associated symbol to the target. Choose one of them.
The right panel shows the fuzzer. It gives the user a rough idea of what
they are doing and what changes between two configurations.
The bottom panel has two options:
- The first option, ``Fuzzing based on``, tells on which Netzob data
model the fuzzing is based:
- "Variable": use the Netzob variables to make Peach data models. It
produces a fuzzer that is fuzzier but less smart.
- "Regex": use the Netzob Regex (which is displayed at the top of
the symbol visualization). It is the simplest solution.
- The second option, ``Mutate static fields``, tells whether the static
fields in the Netzob data model are fuzzed or not.
The ``Export`` button exports the fuzzer into a user-defined file.
Use this fuzzer into Peach\ <#Use-this-fuzzer-into-Peach>`_
Export this fuzzer directly through the ``Export`` button to a file
named "test.xml" in the Peach directory. The export should also create a
PeachzobAddons.py file, which is essential for Peach to leverage Netzob
capabilities as "fixup".
The "test.xml" file should look like this. Pay close attention to the few XML
comments.
::
<?xml version="1.0" encoding="utf-8"?>
<Peach xmlns="http://phed.org/2008/Peach" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://phed.org/2008/Peach /peach/peach.xsd">
<Include ns="default" src="file:defaults.xml"/>
<Import import="PeachzobAddons"/>
<DataModel name="dataModel 1">
<Blob name="Field 0_0" valueType="hex" value="6170695f"/>
<Blob name="Field 1_0" valueType="hex">
<Fixup class="PeachzobAddons.Or">
<Param name="values" value="Blob,696e666f; Blob,7374617473"/>
</Fixup>
</Blob>
<Blob name="Field 2_0" valueType="hex" value="2300000000000000000000000000000000000000000000"/>
<Blob name="Field 3_0" valueType="hex" value="00"/>
</DataModel>
<DataModel name="dataModel 2">
<Blob name="Field 0_0" valueType="hex" value="6170695f62796523000000000000000000000000000000000000000000000000"/>
</DataModel>
<DataModel name="dataModel 3">
<Blob name="Field 0_0" valueType="hex" value="6170695f6964656e746966792366726564000000000000000000000000000000"/>
</DataModel>
<DataModel name="dataModel 4">
<Blob name="Field 0_0" valueType="hex" value="6170695f61757468656e74696679236d79506173737764210000000000000000"/>
</DataModel>
<DataModel name="dataModel 5">
<Blob name="Field 0_0" valueType="hex" value="6170695f656e6372797074233132333435367465737400000000000000000000"/>
</DataModel>
<DataModel name="dataModel 6">
<Blob name="Field 0_0" valueType="hex" value="6170695f64656372797074237370717677743627313600000000000000000000"/>
</DataModel>
<DataModel name="dataModel 7">
<Blob name="Default-1_0" valueType="hex" value="00000000"/>
<Blob name="Default-2-1_0" valueType="hex" value="23"/>
<Blob name="Default-2-2-1-1_0" valueType="hex">
<Fixup class="PeachzobAddons.Or">
<Param name="values" value="Blob,00000000000000; Blob,00000004000000; Blob,00000005000000; Blob,0000000a000000; Blob,64000000000000; Blob,8b04080a000000"/>
</Fixup>
</Blob>
<Blob name="Default-2-2-1-2_0" valueType="hex">
<Fixup class="PeachzobAddons.Or">
<Param name="values" value="Blob,00000000000000000000; Blob,31323334353674657374; Blob,696e666f000000000000; Blob,73707176777436273136; Blob,73746174730000000000"/>
</Fixup>
</Blob>
<Blob name="Default-2-2-2_0" valueType="hex" value="00000000000000000000"/>
</DataModel>
<DataModel name="dataModel 9">
<Blob name="Field 0">
<Fixup class="PeachzobAddons.RandomField">
<Param name="minlen" value="0"/>
<Param name="maxlen" value="1024"/>
<Param name="type" value="Blob"/>
</Fixup>
</Blob>
</DataModel>
<StateModel initialState="state 0" name="stateModel">
<State name="state 0">
<Action ref="state 1" type="changeState" when="random.randint(1,8)==1"/>
<Action ref="state 2" type="changeState" when="random.randint(1,7)==1"/>
<Action ref="state 3" type="changeState" when="random.randint(1,6)==1"/>
<Action ref="state 4" type="changeState" when="random.randint(1,5)==1"/>
<Action ref="state 5" type="changeState" when="random.randint(1,4)==1"/>
<Action ref="state 6" type="changeState" when="random.randint(1,3)==1"/>
<Action ref="state 7" type="changeState" when="random.randint(1,2)==1"/>
<Action ref="state 9" type="changeState"/>
</State>
<State name="state 1">
<Action type="output">
<DataModel ref="dataModel 1"/>
<Data name="data"/>
</Action>
</State>
<State name="state 2">
<Action type="output">
<DataModel ref="dataModel 2"/>
<Data name="data"/>
</Action>
</State>
<State name="state 3">
<Action type="output">
<DataModel ref="dataModel 3"/>
<Data name="data"/>
</Action>
</State>
<State name="state 4">
<Action type="output">
<DataModel ref="dataModel 4"/>
<Data name="data"/>
</Action>
</State>
<State name="state 5">
<Action type="output">
<DataModel ref="dataModel 5"/>
<Data name="data"/>
</Action>
</State>
<State name="state 6">
<Action type="output">
<DataModel ref="dataModel 6"/>
<Data name="data"/>
</Action>
</State>
<State name="state 7">
<Action type="output">
<DataModel ref="dataModel 7"/>
<Data name="data"/>
</Action>
</State>
<State name="state 9">
<Action type="output">
<DataModel ref="dataModel 9"/>
<Data name="data"/>
</Action>
</State>
</StateModel>
<Agent name="DefaultAgent">
<!--Todo: Configure the Agents.-->
</Agent>
<Test name="DefaultTest">
<!--Todo: Enable Agent <Agent ref="TheAgent"/> -->
<StateModel ref="stateModel"/>
<Publisher class="udp.Udp">
<Param name="host" value="127.0.0.1"/>
<Param name="port" value="4242"/>
</Publisher>
<Publisher class="udp.Udp">
<Param name="host" value="127.0.0.1"/>
<Param name="port" value="10000"/>
</Publisher>
<!--The Netzob project has several simulator actors, so this file have several publishers. Choose one of them and remove the others.-->
</Test>
<Run name="DefaultRun">
<!--Todo: Configure the run.-->
<Logger class="logger.Filesystem">
<Param name="path" value="logs"/>
</Logger>
<Test ref="DefaultTest"/>
</Run>
</Peach>
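The ``when`` conditions of ``state 0`` deserve a word. Assuming Peach evaluates the actions in order and leaves the state as soon as one ``changeState`` fires, the decreasing ``random.randint`` bounds make every target state equally likely. The following sketch checks that arithmetic with exact fractions; it is plain Python and does not call Peach.

```python
from fractions import Fraction


def cascade_probabilities(upper_bounds):
    """Probability of each action in a cascade of randint(1, n) == 1 tests."""
    remaining = Fraction(1)  # probability that no earlier action fired
    probabilities = []
    for upper in upper_bounds:
        fire = Fraction(1, upper)
        probabilities.append(remaining * fire)
        remaining *= 1 - fire
    probabilities.append(remaining)  # the final, unconditional action
    return probabilities


# Bounds 8, 7, ..., 2 as in the exported state model, then the last action.
probabilities = cascade_probabilities(range(8, 1, -1))
print(probabilities)
```

For instance, the second action fires with probability 7/8 × 1/7 = 1/8, the same as the first one.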
This tutorial will not cover Peach agents, but configuring one of
them could be useful. In the Test block, there are as many publishers as
the Netzob simulator has actors. Only one publisher is needed, so remove the
others. If there is no publisher, create one according to the model
above. In this example, the tester removes the second publisher.
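Removing the extra publishers can be done by hand in a text editor, or scripted. The sketch below uses Python's standard ``xml.etree.ElementTree`` on a reduced sample of the Test block; the helper name is illustrative. Note that the default parser drops XML comments, so a file rewritten this way loses the Todo notes.

```python
import xml.etree.ElementTree as ET

PEACH_NS = "http://phed.org/2008/Peach"


def keep_one_publisher(pit_xml, keep_index=0):
    """Return the pit XML with a single Publisher left in each Test block."""
    # Serialize with the Peach namespace as the default one (no "ns0:" prefixes).
    ET.register_namespace("", PEACH_NS)
    root = ET.fromstring(pit_xml)
    for test in root.iter(f"{{{PEACH_NS}}}Test"):
        publishers = test.findall(f"{{{PEACH_NS}}}Publisher")
        for index, publisher in enumerate(publishers):
            if index != keep_index:
                test.remove(publisher)
    return ET.tostring(root, encoding="unicode")


SAMPLE = """<Peach xmlns="http://phed.org/2008/Peach">
  <Test name="DefaultTest">
    <Publisher class="udp.Udp">
      <Param name="host" value="127.0.0.1"/>
      <Param name="port" value="4242"/>
    </Publisher>
    <Publisher class="udp.Udp">
      <Param name="host" value="127.0.0.1"/>
      <Param name="port" value="10000"/>
    </Publisher>
  </Test>
</Peach>"""

cleaned = keep_one_publisher(SAMPLE)
print(cleaned)
```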
Launch the fuzzing
^^^^^^^^^^^^^^^^^^
You first have to start the targeted server:
::
./server
Assuming that you exported the "test.xml" file into the Peach
directory, you can now start the fuzzer:
::
python peach.py test.xml
After a few seconds, you should trigger a segfault or a stack smashing
detection.
::
-> Read: api_identify#fred
Command: api_identify
Arg: fred
<- Send:
Return value: 0
Size of data buffer: 13
Data buffer:
DATA: 72 65 73 70 5f 69 64 65 6e 74 69 66 79 "resp_identify"
-> Read: api_identify#f
Command: api_identify
Arg: f
*** stack smashing detected ***: ./server terminated
======= Backtrace: =========
/lib/i386-linux-gnu/libc.so.6(__fortify_fail+0x45)[0xcec045]
/lib/i386-linux-gnu/libc.so.6(+0x103ffa)[0xcebffa]
./server[0x8048a3c]
./server[0x8048eb4]
./server[0x8048985]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xc014d3]
./server[0x8048831]
======= Memory map: ========
00289000-0028a000 r-xp 00000000 00:00 0 [vdso]
002fb000-00317000 r-xp 00000000 08:03 2605207 /lib/i386-linux-gnu/libgcc_s.so.1
00317000-00318000 r--p 0001b000 08:03 2605207 /lib/i386-linux-gnu/libgcc_s.so.1
00318000-00319000 rw-p 0001c000 08:03 2605207 /lib/i386-linux-gnu/libgcc_s.so.1
00bb4000-00bd4000 r-xp 00000000 08:03 673152 /lib/i386-linux-gnu/ld-2.15.so
00bd4000-00bd5000 r--p 0001f000 08:03 673152 /lib/i386-linux-gnu/ld-2.15.so
00bd5000-00bd6000 rw-p 00020000 08:03 673152 /lib/i386-linux-gnu/ld-2.15.so
00be8000-00d8b000 r-xp 00000000 08:03 672879 /lib/i386-linux-gnu/libc-2.15.so
00d8b000-00d8c000 ---p 001a3000 08:03 672879 /lib/i386-linux-gnu/libc-2.15.so
00d8c000-00d8e000 r--p 001a3000 08:03 672879 /lib/i386-linux-gnu/libc-2.15.so
00d8e000-00d8f000 rw-p 001a5000 08:03 672879 /lib/i386-linux-gnu/libc-2.15.so
00d8f000-00d92000 rw-p 00000000 00:00 0
08048000-0804a000 r-xp 00000000 08:03 6488874 /home/sygus/travaux/netzob/target_protocol/server
0804a000-0804b000 r--p 00001000 08:03 6488874 /home/sygus/travaux/netzob/target_protocol/server
0804b000-0804c000 rw-p 00002000 08:03 6488874 /home/sygus/travaux/netzob/target_protocol/server
09e0d000-09e2e000 rw-p 00000000 00:00 0 [heap]
b778b000-b778c000 rw-p 00000000 00:00 0
b77a8000-b77ac000 rw-p 00000000 00:00 0
bf90f000-bf930000 rw-p 00000000 00:00 0 [stack]
Abandon (core dumped)

View File

@@ -0,0 +1,139 @@
.. currentmodule:: netzob
.. _tutorial_wireshark:
Export Wireshark dissectors
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Principle
^^^^^^^^^
`Wireshark <http://www.wireshark.org>`_ is an open-source packet
analyzer able to identify protocols and to highlight fields from the
data stream. Its main drawback is that it is only useful for
documented/standard protocols. Within Netzob, which achieves
semi-automatic reverse engineering of protocols, we have developed an
exporter plugin that provides automatic generation of Wireshark dissectors
for proprietary or undocumented protocols. Dissectors are written in the
`Lua <http://wiki.wireshark.org/Lua>`_ programming language.
Netzob provides a powerful datamodel in which fields are described with
the following information:
- Regular expression (fixed or dynamic size)
- Name (textual representation)
- Format
- Size
- Endianness
- Signing
All this information is gathered to generate a script including a
dissector used by Wireshark.
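To give an idea of what such a script contains, the sketch below generates a minimal Lua dissector from a list of fixed-size fields. It is a hypothetical illustration in Python, not the code of Netzob's exporter: the protocol name, the field layout and the ``generate_dissector`` helper are assumptions, and only standard Wireshark Lua calls (``Proto``, ``ProtoField.bytes``, ``DissectorTable.get``) appear in the output.

```python
def generate_dissector(proto, description, udp_port, fields):
    """Return Lua source for a dissector with fixed-size (name, offset, size) fields."""
    lines = [f'local {proto} = Proto("{proto}", "{description}")']
    for name, _, _ in fields:
        lines.append(f'local f_{name} = ProtoField.bytes("{proto}.{name}", "{name}")')
    field_vars = ", ".join(f"f_{name}" for name, _, _ in fields)
    lines.append(f"{proto}.fields = {{ {field_vars} }}")
    lines.append(f"function {proto}.dissector(buffer, pinfo, tree)")
    lines.append(f'  pinfo.cols.protocol = "{proto.upper()}"')
    lines.append(f"  local subtree = tree:add({proto}, buffer())")
    for name, offset, size in fields:
        lines.append(f"  subtree:add(f_{name}, buffer({offset}, {size}))")
    lines.append("end")
    lines.append(f'DissectorTable.get("udp.port"):add({udp_port}, {proto})')
    return "\n".join(lines) + "\n"


# Hypothetical fixed layout for a 32-byte message of the toy protocol.
FIELDS = [("command", 0, 12), ("delimiter", 12, 1), ("data", 13, 19)]
script = generate_dissector("toy", "Toy protocol", 4242, FIELDS)
print(script)
```

Every field needs a known offset and size here, which is why fields of unknown variable size are a problem for this kind of export.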
Language
^^^^^^^^
Wireshark can be statically extended with C modules similar to core
dissectors. Optionally, Wireshark can be configured to embed a Lua
interpreter. For modularity purposes, the Lua engine has been chosen
to extend Wireshark with Netzob-generated dissectors.
Prerequisite
^^^^^^^^^^^^
You need Netzob in version 0.4.1 or above. The Wireshark exporter
functionality is provided as a Netzob core plugin (which is included in
the 0.4.1 version).
This tutorial assumes that the user has previously inferred the
specification of the targeted protocol. An example of protocol inference
is available in the `Getting started with
Netzob <http://www.netzob.org/resources/tutorial_get_started>`_
tutorial.
Usage
^^^^^
#. Check that Wireshark supports Lua
.. figure:: http://wiki.wireshark.org/Lua?action=AttachFile&do=get&target=lua-about.png
:align: center
:alt:
#. Select a project
Given a partitioned symbol in a project, you can generate a Wireshark
dissector by selecting the Export project menu item, then
Wireshark.
.. figure:: https://dev.netzob.org/attachments/158/2012-10-25-173314_1595x647_scrot_small.png
:align: center
:target: https://dev.netzob.org/attachments/82/2012-10-25-173314_1595x647_scrot.png
:alt:
.. figure:: https://dev.netzob.org/attachments/159/2012-10-25-180841_1552x731_scrot_small.png
:align: center
:target: https://dev.netzob.org/attachments/83/2012-10-25-180841_1552x731_scrot.png
:alt:
You should get a popup with the automatically generated Lua script:
.. figure:: https://dev.netzob.org/attachments/161/2012-10-30-180554_987x807_scrot_small.png
:align: center
:target: https://dev.netzob.org/attachments/94/2012-10-30-180554_987x807_scrot.png
:alt:
#. Import into Wireshark
Two methods are available:
- Evaluate the Lua script in a Wireshark instance.
In Wireshark, select ``Tools > Lua > Evaluate`` and paste the
generated code.
- Start Wireshark with a specific Lua script.
Start Wireshark with the following parameters:
``wireshark -X lua_script:PATH_OF_LUA_SCRIPT``
This will automatically import the Lua script on start.
#. Dissect data packets
Within the lower panel of Wireshark, you should get the dissected packets:
.. figure:: https://dev.netzob.org/attachments/160/2012-10-25-182017_956x1041_scrot_small.png
:align: center
:target: https://dev.netzob.org/attachments/85/2012-10-25-182017_956x1041_scrot.png
:alt:
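The second import method described above (starting Wireshark with the ``-X lua_script:`` option) can be scripted. The following Python sketch only builds the argument list; the ``wireshark_command`` helper and both file paths are hypothetical, and the optional ``-r`` argument is Wireshark's standard option for opening a capture file at startup.

```python
import shlex
from typing import List, Optional


def wireshark_command(lua_script: str, capture: Optional[str] = None) -> List[str]:
    """Build the argument list that loads a Lua script when Wireshark starts."""
    argv = ["wireshark", "-X", f"lua_script:{lua_script}"]
    if capture is not None:
        argv += ["-r", capture]  # -r opens a capture file at startup
    return argv


# Hypothetical paths, used for illustration only.
argv = wireshark_command("/tmp/toy_dissector.lua", "session1.pcap")
print(" ".join(shlex.quote(a) for a in argv))
# To actually launch Wireshark: subprocess.run(argv, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when a path contains spaces.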
Limitations
^^^^^^^^^^^
Variable-size fields cannot be easily exported to the data model used by
Wireshark when their expected size is unknown. In this case, an error
message pops up and Netzob does not generate the dissector. If
this happens, you have to complete the protocol model in order to find
the expected size of the dynamic field.
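The kind of check involved can be sketched in a few lines of Python. This is not Netzob's implementation: the ``fields_missing_size`` helper and the dictionary describing the symbol are hypothetical, and a size of ``None`` stands for a field whose size has not been determined yet.

```python
from typing import Dict, List, Optional


def fields_missing_size(sizes: Dict[str, Optional[int]]) -> List[str]:
    """Return the names of fields whose size in bytes is still unknown."""
    return [name for name, size in sizes.items() if size is None]


# Hypothetical symbol: the payload size has not been inferred yet.
symbol = {"command": 1, "length": 2, "payload": None}
missing = fields_missing_size(symbol)
if missing:
    print("Resolve the size of these fields before exporting:", ", ".join(missing))
```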
Improvements
^^^^^^^^^^^^
These ideas could be used to enhance dissection:
- Use relations (field / size, repeat ...)
- Look at future bitfield core implementation
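To make the first idea concrete, here is a minimal Python sketch of how a size relation can resolve a variable-size field: the value of a length field gives the size of the payload that follows it. The message layout (1-byte command, 2-byte big-endian length, then the payload) and the ``parse_message`` function are assumptions chosen for illustration; they do not describe an existing Netzob feature.

```python
import struct
from typing import Tuple


def parse_message(data: bytes) -> Tuple[int, bytes]:
    """Parse <command: 1 byte><length: 2 bytes, big-endian><payload: length bytes>."""
    command, length = struct.unpack_from(">BH", data, 0)
    payload = data[3:3 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return command, payload


command, payload = parse_message(b"\x01\x00\x05hello")
print(command, payload)
```

A generated dissector could apply the same logic by reading the length field first and using its value as the size of the next field.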
What next?
^^^^^^^^^^
After this tutorial, we'll be glad to receive feedback and to help you
(see our mailing list
`user@lists.netzob.org <mailto:user@lists.netzob.org>`_ or our IRC
channel #netzob on Freenode).
If you want to go further and `start contributing to
Netzob <http://www.netzob.org/development#becomecontributor>`_, that's
perfect. There are many simple or complex tasks anyone can do:
translation, documentation, bug fixes, feature proposals or implementations.