[PATCH] Traffic replay emulator tool (load generator)

Thu Jun 29 05:10:13 UTC 2017

Hi,

The folks at Catalyst have been working on a tool to replay the
anonymized summary files generated by an earlier tool we wrote
(traffic_summary.pl). We've now written a number of packet drivers,
along with generation code that can either replay the captured traffic
or generate new traffic following a similar profile. From a network
summary we can create a (Markov) model file which estimates the
probability of a packet occurring given the previous two packets, and
then use that to generate semi-randomized traffic. Traffic generated
from the model can be saved as a new summary, so it can be replayed
more deterministically, e.g. when diagnosing an issue.
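
As a rough illustration of the idea (not the code in these patches), a
second-order Markov model over packet types can be sketched as below.
The "protocol:opcode" labels, the START padding token and the function
names are assumptions made up for this example.

```python
import random
from collections import Counter, defaultdict

START = "-"  # hypothetical padding token for the start of a conversation


def build_model(packet_types):
    """Count how often each packet type follows each pair of predecessors."""
    model = defaultdict(Counter)
    prev2, prev1 = START, START
    for p in packet_types:
        model[(prev2, prev1)][p] += 1
        prev2, prev1 = prev1, p
    return model


def weighted_choice(counter, rng):
    """Pick a key with probability proportional to its count."""
    r = rng.randrange(sum(counter.values()))
    for item in sorted(counter):
        r -= counter[item]
        if r < 0:
            return item


def generate(model, length, rng=None):
    """Walk the model to produce `length` semi-random packet types."""
    rng = rng or random.Random()
    prev2, prev1 = START, START
    out = []
    for _ in range(length):
        followers = model.get((prev2, prev1))
        if not followers:
            # Dead end: restart as if a new conversation had begun.
            prev2, prev1 = START, START
            followers = model[(START, START)]
        p = weighted_choice(followers, rng)
        out.append(p)
        prev2, prev1 = prev1, p
    return out


summary = ["dns:0", "ldap:0", "ldap:3", "ldap:3", "ldap:2",
           "dns:0", "ldap:0", "ldap:3", "ldap:2"]
model = build_model(summary)
traffic = generate(model, 20, random.Random(1))
```

Seeding the random number generator, as in the last line, is one way to
get a repeatable sequence; saving the generated list as a summary is
another.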

Using it, I've already found a number of regressions introduced over
the past 6 months, in both performance and correctness. Patches
addressing these are either in the works or have already been
submitted. By the time I've finished my initial once-over, LDAP binds
should be roughly 20% faster than they were originally, LDAP searches
should also improve by about 20%, and more improvement is likely to
come. I've also measured throughput with the LDAP multi-process
patches, and they show significant improvements even on a 2 CPU, 4 GB
RAM instance. The fixes made to locking don't appear to have caused any
significant slowdown either.

In the traffic summaries we've obtained so far, a very large proportion
of the traffic tends to be DNS (50% at minimum). Much of this traffic
appears to be a byproduct of everything else, so DNS is handled
separately, with a parameter to tune the number of additional DNS
queries in case more are required.

The following protocols are not currently handled:
      smb
      smb2
      browser
      smb_netlogon

The following DRSUAPI replication packets are currently ignored
(although they will still be triggered in the background as a side
effect of running the tool, e.g. by password changes):
      DsReplicaSync
      DsGetNCChanges
      DsReplicaUpdateRefs

We currently report standard statistics such as max response time, 95th
percentile, mean, median, and failure and success rates. We (see:
Douglas) are working on producing more comprehensive and visual data
representations.
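
For anyone wanting to check the numbers by hand, the statistics above
can be computed along these lines. This is only a sketch: the
(duration, succeeded) tuple layout and the function names are
assumptions for the example, not the tool's actual output format, and
the percentile uses the simple nearest-rank definition.

```python
import math


def percentile(sorted_values, fraction):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = int(math.ceil(fraction * len(sorted_values)))
    return sorted_values[max(rank, 1) - 1]


def summarise(results):
    """results: non-empty list of (duration_in_seconds, succeeded)."""
    durations = sorted(d for d, ok in results)
    n = len(durations)
    failures = sum(1 for d, ok in results if not ok)
    mid = n // 2
    if n % 2:
        median = durations[mid]
    else:
        median = (durations[mid - 1] + durations[mid]) / 2.0
    return {
        "count": n,
        "max": durations[-1],
        "mean": sum(durations) / float(n),
        "median": median,
        "95%": percentile(durations, 0.95),
        "failure_rate": failures / float(n),
    }


stats = summarise([(0.1, True), (0.2, True), (0.3, False), (0.4, True)])
```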

Feedback appreciated! The tool is fairly standalone and most of the code 
is either setup or plain boilerplate packet creation code. In the 
future, we'd probably like to improve the range and realism of the 
packets sent, but so far it's already been quite a useful tool. Running 
the tool and seeing how our throughput evolves over time will be 
interesting, as well as running against Windows and comparing the two.


Cheers,

Garming





-------------- next part --------------
From a9cf6d74fcee645f5a2ecf57bd2afd0a4ea088cb Mon Sep 17 00:00:00 2001
From: Garming Sam <garming at catalyst.net.nz>
Date: Thu, 29 Jun 2017 15:04:11 +1200
Subject: [PATCH 1/7] getopt: Make sure to pass down workgroup and realm

This is important because, without it, we cannot tell whether the option
was explicitly set on the command line or simply left at its default,
e.g. when a site really does have a workgroup called WORKGROUP.

Signed-off-by: Garming Sam <garming at catalyst.net.nz>
---
 python/samba/getopt.py | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/python/samba/getopt.py b/python/samba/getopt.py
index 9e1fb83..9a7f50a 100644
--- a/python/samba/getopt.py
+++ b/python/samba/getopt.py
@@ -46,7 +46,7 @@ class SambaOptions(optparse.OptionGroup):
                         type=str, metavar="OPTION",
                         help="set smb.conf option from command line",
                         callback=self._set_option)
-        self.add_option("--realm", action="callback",
+        self.add_option("--realm", action="callback", dest="realm",
                         type=str, metavar="REALM", help="set the realm name",
                         callback=self._set_realm)
         self._configfile = None
@@ -70,6 +70,7 @@ class SambaOptions(optparse.OptionGroup):
     def _set_realm(self, option, opt_str, arg, parser):
         self._lp.set('realm', arg)
         self.realm = arg
+        setattr(parser.values, option.dest, arg)
 
     def _set_option(self, option, opt_str, arg, parser):
         if arg.find('=') == -1:
@@ -142,7 +143,7 @@ class CredentialsOptions(optparse.OptionGroup):
                         action="callback", type=str,
                         help="Username", callback=self._parse_username)
         self._add_option("-W", "--workgroup", metavar="WORKGROUP",
-                        action="callback", type=str,
+                        action="callback", type=str, dest="workgroup",
                         help="Workgroup", callback=self._parse_workgroup)
         self._add_option("-N", "--no-pass", action="callback",
                         help="Don't ask for a password",
@@ -177,6 +178,7 @@ class CredentialsOptions(optparse.OptionGroup):
 
     def _parse_workgroup(self, option, opt_str, arg, parser):
         self.creds.set_domain(arg)
+        setattr(parser.values, option.dest, arg)
 
     def _set_password(self, option, opt_str, arg, parser):
         self.creds.set_password(arg)
-- 
1.9.1


From 9bfe47dc01437aaa0b246f23b816ac58ec3b2599 Mon Sep 17 00:00:00 2001
From: Douglas Bagnall <douglas.bagnall at catalyst.net.nz>
Date: Fri, 23 Jun 2017 14:16:53 +1200
Subject: [PATCH 2/7] traffic_summary: avoid uninitialised variable warning

Signed-off-by: Douglas Bagnall <douglas.bagnall at catalyst.net.nz>
---
 script/traffic_summary.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/script/traffic_summary.pl b/script/traffic_summary.pl
index 5c69ca1..05c1cf0 100755
--- a/script/traffic_summary.pl
+++ b/script/traffic_summary.pl
@@ -518,7 +518,7 @@ sub format_ldap
 #------------------------------------------------------------------------------
 sub format_kerberos
 {
-    my $msg_type    = $proto_data{'kerberos.msg_type.show'};
+    my $msg_type    = $proto_data{'kerberos.msg_type.show'} || '';
     my $cname_type  = $proto_data{'kerberos.cname.type'} || '';
     my $description = $proto_data{'kerberos.msg_type.showname'} || '';
 
-- 
1.9.1


From ccd95ebed96ba44405f3520c8c1bc1cf25413bee Mon Sep 17 00:00:00 2001
From: Gary Lockyer <gary at catalyst.net.nz>
Date: Thu, 29 Jun 2017 11:08:37 +1200
Subject: [PATCH 3/7] traffic_packets: Drivers for emulating and sending
 traffic

This is to be used for a new performance testing tool.

Signed-off-by: Gary Lockyer <gary at catalyst.net.nz>

Pair-programmed-with: Garming Sam <garming at catalyst.net.nz>
Pair-programmed-with: Douglas Bagnall <douglas.bagnall at catalyst.net.nz>
---
 python/samba/emulate/traffic_packets.py | 867 ++++++++++++++++++++++++++++++++
 1 file changed, 867 insertions(+)
 create mode 100644 python/samba/emulate/traffic_packets.py

diff --git a/python/samba/emulate/traffic_packets.py b/python/samba/emulate/traffic_packets.py
new file mode 100644
index 0000000..fd92cf3
--- /dev/null
+++ b/python/samba/emulate/traffic_packets.py
@@ -0,0 +1,867 @@
+# Dispatch for various request types.
+#
+# Copyright (C) Catalyst IT Ltd. 2017
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+import os
+import ctypes
+import dns.resolver
+import random
+
+from samba.net import Net
+from samba.dcerpc import security, drsuapi, nbt, lsa, netlogon, ntlmssp
+from samba.dcerpc.netlogon import netr_WorkstationInformation
+from samba.dcerpc.security import dom_sid
+from samba.netbios import Node
+from samba.ndr import ndr_pack
+from samba.credentials import (
+    CLI_CRED_NTLMv2_AUTH,
+    MUST_USE_KERBEROS,
+    DONT_USE_KERBEROS
+)
+from samba import NTSTATUSError
+from samba.ntstatus import NT_STATUS_OBJECT_NAME_NOT_FOUND
+from samba.dcerpc.misc import SEC_CHAN_WKSTA
+
+
+def uint32(v):
+    return ctypes.c_uint32(v).value
+
+
+def check_runtime_error(runtime, val):
+    if runtime is None:
+        return False
+
+    err32 = uint32(runtime[0])
+    if err32 == val:
+        return True
+
+    return False
+
+name_formats = [
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_FQDN_1779,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_NT4_ACCOUNT,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_DISPLAY,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_GUID,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_CANONICAL,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_USER_PRINCIPAL,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_CANONICAL_EX,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_SERVICE_PRINCIPAL,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_SID_OR_SID_HISTORY,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_DNS_DOMAIN,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_UPN_AND_ALTSECID,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_NT4_ACCOUNT_NAME_SANS_DOMAIN_EX,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_LIST_GLOBAL_CATALOG_SERVERS,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_UPN_FOR_LOGON,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_LIST_SERVERS_WITH_DCS_IN_SITE,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_STRING_SID_NAME,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_ALT_SECURITY_IDENTITIES_NAME,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_LIST_NCS,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_LIST_DOMAINS,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_MAP_SCHEMA_GUID,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_NT4_ACCOUNT_NAME_SANS_DOMAIN,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_LIST_ROLES,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_LIST_INFO_FOR_SERVER,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_LIST_SERVERS_FOR_DOMAIN_IN_SITE,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_LIST_DOMAINS_IN_SITE,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_LIST_SERVERS_IN_SITE,
+    drsuapi.DRSUAPI_DS_NAME_FORMAT_LIST_SITES,
+]
+
+
+def warning(message):
+    print "\033[37;41;1m" "Warning: %s" "\033[00m" % (message)
+
+###############################################################################
+#
+# Packet generation functions:
+#
+# All the packet generation functions have the following form:
+#  packet_${protocol}_${opcode}(packet, conversation, context)
+#
+#  The functions return True if statistics should be collected for the packet,
+#  and False if the packet has been ignored.
+#
+# Where:
+#   protocol is the protocol, i.e. cldap, dcerpc, ...
+#   opcode   is the protocol op code i.e. type of the packet to be
+#            generated.
+#
+#   packet contains data about the captured/generated packet
+#          provides any extra data needed to generate the packet
+#
+#   conversation Details of the current client/server interaction
+#
+#   context state data for the current interaction
+#
+#
+#
+# The following protocols are not currently handled:
+#     smb
+#     smb2
+#     browser
+#     smb_netlogon
+#
+# The following drsuapi replication packets are currently ignored:
+#     DsReplicaSync
+#     DsGetNCChanges
+#     DsReplicaUpdateRefs
+
+
+# Packet generators that do NOTHING are assigned to the null_packet
+# function which allows the conversation generators to notice this and
+# avoid a whole lot of pointless work.
+def null_packet(packet, conversation, context):
+    return False
+
+
+def packet_cldap_3(packet, conversation, context):
+    # searchRequest
+    net = Net(creds=context.creds, lp=context.lp)
+    net.finddc(domain=context.lp.get('realm'),
+               flags=(nbt.NBT_SERVER_LDAP |
+                      nbt.NBT_SERVER_DS |
+                      nbt.NBT_SERVER_WRITABLE))
+    return True
+
+
+packet_cldap_5 = null_packet
+# searchResDone
+
+packet_dcerpc_0 = null_packet
+# Request
+# Can be ignored, it's the continuation of an existing conversation
+
+packet_dcerpc_2 = null_packet
+# Response
+# Server response, so should be ignored
+
+
+packet_dcerpc_11 = null_packet
+# Bind
+# creation of the rpc dcerpc connection is managed by the higher level
+# protocol drivers. So we ignore it when generating traffic
+
+
+packet_dcerpc_12 = null_packet
+# Bind_ack
+# Server response, so should be ignored
+
+
+packet_dcerpc_13 = null_packet
+# Bind_nak
+# Server response, so should be ignored
+
+
+packet_dcerpc_14 = null_packet
+# Alter_context
+# Generated as part of the connect process
+
+
+def packet_dcerpc_15(packet, conversation, context):
+    # Alter_context_resp
+    # This means it was GSSAPI/krb5 (probably)
+    # Check the kerberos_state and issue a diagnostic if kerberos not enabled
+    if context.user_creds.get_kerberos_state() == DONT_USE_KERBEROS:
+        warning("Kerberos disabled but have dcerpc Alter_context_resp "
+                "indicating Kerberos was used")
+    return False
+
+
+def packet_dcerpc_16(packet, conversation, context):
+    # AUTH3
+    # This means it was NTLMSSP
+    # Check the kerberos_state and issue a diagnostic if kerberos enabled
+    if context.user_creds.get_kerberos_state() == MUST_USE_KERBEROS:
+        warning("Kerberos enabled but have dcerpc AUTH3 "
+                "indicating NTLMSSP was used")
+    return False
+
+
+def packet_dns_0(packet, conversation, context):
+    # query
+    name, rtype = context.guess_a_dns_lookup()
+    resolver = dns.resolver.Resolver()
+    resolver.timeout  = 1
+    resolver.lifetime = 1
+    resolver.query(name, rtype)
+    return True
+
+packet_dns_1 = null_packet
+# response
+# Server response, so should be ignored
+
+
+def packet_drsuapi_0(packet, conversation, context):
+    # DsBind
+    context.get_drsuapi_connection_pair(True)
+    return True
+
+
+NAME_FORMATS = [getattr(drsuapi, _x) for _x in dir(drsuapi)
+                if 'NAME_FORMAT' in _x]
+
+
+def packet_drsuapi_12(packet, conversation, context):
+    # DsCrackNames
+    drs, handle = context.get_drsuapi_connection_pair()
+
+    names = drsuapi.DsNameString()
+    names.str = context.server
+
+    req = drsuapi.DsNameRequest1()
+    req.format_flags = 0
+    req.format_offered = 7
+    req.format_desired = random.choice(name_formats)
+    req.codepage = 1252
+    req.language = 1033  # en-US (LCID 0x0409)
+    req.format_flags = 0
+    req.count = 1
+    req.names = [names]
+
+    (result, ctr) = drs.DsCrackNames(handle, 1, req)
+    return True
+
+
+def packet_drsuapi_13(packet, conversation, context):
+    # DsWriteAccountSpn
+    req = drsuapi.DsWriteAccountSpnRequest1()
+    req.operation = drsuapi.DRSUAPI_DS_SPN_OPERATION_ADD
+    (drs, handle) = context.get_drsuapi_connection_pair()
+    (level, res) = drs.DsWriteAccountSpn(handle, 1, req)
+    return True
+
+
+def packet_drsuapi_1(packet, conversation, context):
+    # DsUnbind
+    (drs, handle) = context.get_drsuapi_connection_pair()
+    drs.DsUnbind(handle)
+    del context.drsuapi_connections[-1]
+    return True
+
+
+packet_drsuapi_2 = null_packet
+# DsReplicaSync
+# This is between DCs, triggered on a DB change
+# Ignoring for now
+
+
+packet_drsuapi_3 = null_packet
+# DsGetNCChanges
+# This is between DCs, trigger with DB operation,
+# or DsReplicaSync between DCs.
+# Ignoring for now
+
+
+packet_drsuapi_4 = null_packet
+# DsReplicaUpdateRefs
+# Ignoring for now
+
+
+packet_epm_3 = null_packet
+# Map
+# Will be generated by higher level protocol calls
+
+
+def packet_kerberos_(packet, conversation, context):
+    # Use the presence of kerberos packets as a hint to enable kerberos
+    # for the rest of the conversation.
+    # i.e. kerberos packets are not explicitly generated.
+    context.user_creds.set_kerberos_state(MUST_USE_KERBEROS)
+    context.user_creds_bad.set_kerberos_state(MUST_USE_KERBEROS)
+    context.machine_creds.set_kerberos_state(MUST_USE_KERBEROS)
+    context.machine_creds_bad.set_kerberos_state(MUST_USE_KERBEROS)
+    context.creds.set_kerberos_state(MUST_USE_KERBEROS)
+    return False
+
+
+packet_ldap_ = null_packet
+# Unknown
+# The ldap payload was probably encrypted so just ignore it.
+
+
+def packet_ldap_0(packet, conversation, context):
+    # bindRequest
+    if packet.extra[5] == "simple":
+        # Perform a simple bind.
+        context.get_ldap_connection(new=True, simple=True)
+    else:
+        # Perform a sasl bind.
+        context.get_ldap_connection(new=True, simple=False)
+    return True
+
+packet_ldap_1 = null_packet
+# bindResponse
+# Server response ignored for traffic generation
+
+
+def packet_ldap_2(packet, conversation, context):
+    # unbindRequest
+    # pop the last one off -- most likely we're in a bind/unbind ping.
+    del context.ldap_connections[-1:]
+    return False
+
+
+def packet_ldap_3(packet, conversation, context):
+    # searchRequest
+
+    (scope, dn_sig, filter, attrs, extra, desc, oid) = packet.extra
+    if not scope:
+        scope = 0
+
+    samdb = context.get_ldap_connection()
+    dn = context.get_matching_dn(dn_sig)
+
+    samdb.search(dn, scope=int(scope), attrs=attrs.split(','))
+    return True
+
+packet_ldap_4 = null_packet
+# searchResEntry
+# Server response ignored for traffic generation
+
+
+packet_ldap_5 = null_packet
+# Server response ignored for traffic generation
+
+
+packet_lsarpc_6 = null_packet
+# lsa_OpenPolicy
+# We ignore this, but take it as a hint that the lsarpc handle should
+# be over a named pipe.
+#
+
+
+def packet_lsarpc_14(packet, conversation, context):
+    # lsa_LookupNames
+    c = context.get_lsarpc_named_pipe_connection()
+
+    objectAttr = lsa.ObjectAttribute()
+    pol_handle = c.OpenPolicy2(u'', objectAttr,
+                               security.SEC_FLAG_MAXIMUM_ALLOWED)
+
+    sids  = lsa.TransSidArray()
+    names = [lsa.String("This Organization"),
+             lsa.String("Digest Authentication")]
+    level = 5
+    count = 0
+    c.LookupNames(pol_handle, names, sids, level, count)
+    return True
+
+
+def packet_lsarpc_15(packet, conversation, context):
+    # lsa_LookupSids
+    c = context.get_lsarpc_named_pipe_connection()
+
+    objectAttr = lsa.ObjectAttribute()
+    pol_handle = c.OpenPolicy2(u'', objectAttr,
+                               security.SEC_FLAG_MAXIMUM_ALLOWED)
+
+    sids = lsa.SidArray()
+    sid = lsa.SidPtr()
+
+    x = dom_sid("S-1-5-7")
+    sid.sid = x
+    sids.sids = [sid]
+    sids.num_sids = 1
+    names = lsa.TransNameArray()
+    level = 5
+    count = 0
+
+    c.LookupSids(pol_handle, sids, names, level, count)
+    return True
+
+
+def packet_lsarpc_39(packet, conversation, context):
+    # lsa_QueryTrustedDomainInfoBySid
+    # Samba does not support trusted domains, so this call is expected to fail
+    #
+    c = context.get_lsarpc_named_pipe_connection()
+
+    objectAttr = lsa.ObjectAttribute()
+
+    pol_handle = c.OpenPolicy2(u'', objectAttr,
+                               security.SEC_FLAG_MAXIMUM_ALLOWED)
+
+    domsid = security.dom_sid(context.domain_sid)
+    level = 1
+    try:
+        c.QueryTrustedDomainInfoBySid(pol_handle, domsid, level)
+    except NTSTATUSError as error:
+        # Object Not found is the expected result, anything else is a
+        # failure.
+        if not check_runtime_error(error, NT_STATUS_OBJECT_NAME_NOT_FOUND):
+            raise
+    return True
+
+packet_lsarpc_40 = null_packet
+# lsa_SetTrustedDomainInfo
+# Not currently supported
+
+
+def packet_lsarpc_76(packet, conversation, context):
+    # lsa_LookupSids3
+    c = context.get_lsarpc_connection()
+    sids = lsa.SidArray()
+    sid = lsa.SidPtr()
+    # Need a set
+    x = dom_sid("S-1-5-7")
+    sid.sid = x
+    sids.sids = [sid]
+    sids.num_sids = 1
+    names = lsa.TransNameArray2()
+    level = 5
+    count = 0
+    lookup_options = 0
+    client_revision = 2
+    c.LookupSids3(sids, names, level, count, lookup_options, client_revision)
+    return True
+
+
+def packet_lsarpc_77(packet, conversation, context):
+    # lsa_LookupNames4
+    c = context.get_lsarpc_connection()
+    sids  = lsa.TransSidArray3()
+    names = [lsa.String("This Organization"),
+             lsa.String("Digest Authentication")]
+    level = 5
+    count = 0
+    lookup_options = 0
+    client_revision = 2
+    c.LookupNames4(names, sids, level, count, lookup_options, client_revision)
+    return True
+
+
+def packet_nbns_0(packet, conversation, context):
+    # query
+    n = Node()
+    try:
+        n.query_name("ANAME", context.server, timeout=4, broadcast=False)
+    except Exception:
+        pass
+    return True
+
+packet_nbns_1 = null_packet
+# response
+# Server response, not generated by the client
+
+
+packet_rpc_netlogon_4 = null_packet
+# NetrServerReqChallenge
+# generated by higher level protocol drivers
+# ignored for traffic generation
+
+
+packet_rpc_netlogon_21 = null_packet
+# NetrLogonDummyRoutine1
+# Used to determine security settings. Triggered from schannel setup
+# So no need for an explicit generator
+
+
+packet_rpc_netlogon_26 = null_packet
+# NetrServerAuthenticate3
+# Triggered from schannel set up, no need for an explicit generator
+
+
+def packet_rpc_netlogon_29(packet, conversation, context):
+    # NetrLogonGetDomainInfo [531]
+    c = context.get_netlogon_connection()
+    (auth, succ) = context.get_authenticator()
+    query = netr_WorkstationInformation()
+
+    c.netr_LogonGetDomainInfo(context.server,
+                              context.netbios_name,
+                              auth,
+                              succ,
+                              2,      # TODO are there other values?
+                              query)
+    return True
+
+
+def packet_rpc_netlogon_30(packet, conversation, context):
+    # NetrServerPasswordSet2
+    c = context.get_netlogon_connection()
+    (auth, succ) = context.get_authenticator()
+    DATA_LEN = 512
+    # Set the new password to the existing password, this generates the same
+    # work load as a new value, and leaves the account password intact for
+    # subsequent runs
+    newpass = context.machine_creds.get_password().encode('utf-16-le')
+    pwd_len = len(newpass)
+    filler  = [ord(x) for x in os.urandom(DATA_LEN - pwd_len)]
+    pwd = netlogon.netr_CryptPassword()
+    pwd.length = pwd_len
+    pwd.data = filler + [ord(x) for x in newpass]
+    context.machine_creds.encrypt_netr_crypt_password(pwd)
+    c.netr_ServerPasswordSet2(context.server,
+                              context.machine_creds.get_workstation(),
+                              SEC_CHAN_WKSTA,
+                              context.netbios_name,
+                              auth,
+                              pwd)
+    return True
+
+
+def packet_rpc_netlogon_39(packet, conversation, context):
+    # NetrLogonSamLogonEx [4331]
+    def connect(creds):
+        c = context.get_netlogon_connection()
+
+        # Disable Kerberos in cli creds to extract NTLM response
+        old_state = creds.get_kerberos_state()
+        creds.set_kerberos_state(DONT_USE_KERBEROS)
+
+        logon = samlogon_logon_info(context.domain,
+                                    context.netbios_name,
+                                    creds)
+        logon_level = netlogon.NetlogonNetworkTransitiveInformation
+        validation_level = netlogon.NetlogonValidationSamInfo4
+        netr_flags = 0
+        c.netr_LogonSamLogonEx(context.server,
+                               context.machine_creds.get_workstation(),
+                               logon_level,
+                               logon,
+                               validation_level,
+                               netr_flags)
+
+        creds.set_kerberos_state(old_state)
+
+    context.last_samlogon_bad =\
+        context.with_random_bad_credentials(connect,
+                                            context.user_creds,
+                                            context.user_creds_bad,
+                                            context.last_samlogon_bad)
+    return True
+
+
+def samlogon_target(domain_name, computer_name):
+    target_info = ntlmssp.AV_PAIR_LIST()
+    target_info.count = 3
+    computername = ntlmssp.AV_PAIR()
+    computername.AvId = ntlmssp.MsvAvNbComputerName
+    computername.Value = computer_name
+
+    domainname = ntlmssp.AV_PAIR()
+    domainname.AvId = ntlmssp.MsvAvNbDomainName
+    domainname.Value = domain_name
+
+    eol = ntlmssp.AV_PAIR()
+    eol.AvId = ntlmssp.MsvAvEOL
+    target_info.pair = [domainname, computername, eol]
+
+    return ndr_pack(target_info)
+
+
+def samlogon_logon_info(domain_name, computer_name, creds):
+
+    target_info_blob = samlogon_target(domain_name, computer_name)
+
+    challenge = b"abcdefgh"
+    # User account under test
+    response = creds.get_ntlm_response(flags=CLI_CRED_NTLMv2_AUTH,
+                                       challenge=challenge,
+                                       target_info=target_info_blob)
+
+    logon = netlogon.netr_NetworkInfo()
+
+    logon.challenge     = [ord(x) for x in challenge]
+    logon.nt            = netlogon.netr_ChallengeResponse()
+    logon.nt.length     = len(response["nt_response"])
+    logon.nt.data       = [ord(x) for x in response["nt_response"]]
+    logon.identity_info = netlogon.netr_IdentityInfo()
+
+    (username, domain)  = creds.get_ntlm_username_domain()
+    logon.identity_info.domain_name.string  = domain
+    logon.identity_info.account_name.string = username
+    logon.identity_info.workstation.string  = creds.get_workstation()
+
+    return logon
+
+
+def packet_rpc_netlogon_40(packet, conversation, context):
+    # DsrEnumerateDomainTrusts
+    c = context.get_netlogon_connection()
+    c.netr_DsrEnumerateDomainTrusts(
+        context.server,
+        netlogon.NETR_TRUST_FLAG_IN_FOREST |
+        netlogon.NETR_TRUST_FLAG_OUTBOUND  |
+        netlogon.NETR_TRUST_FLAG_INBOUND)
+    return True
+
+
+def packet_rpc_netlogon_45(packet, conversation, context):
+    # NetrLogonSamLogonWithFlags [7]
+    def connect(creds):
+        c = context.get_netlogon_connection()
+        (auth, succ) = context.get_authenticator()
+
+        # Disable Kerberos in cli creds to extract NTLM response
+        old_state = creds.get_kerberos_state()
+        creds.set_kerberos_state(DONT_USE_KERBEROS)
+
+        logon = samlogon_logon_info(context.domain,
+                                    context.netbios_name,
+                                    creds)
+        logon_level = netlogon.NetlogonNetworkTransitiveInformation
+        validation_level = netlogon.NetlogonValidationSamInfo4
+        netr_flags = 0
+        c.netr_LogonSamLogonWithFlags(context.server,
+                                      context.machine_creds.get_workstation(),
+                                      auth,
+                                      succ,
+                                      logon_level,
+                                      logon,
+                                      validation_level,
+                                      netr_flags)
+
+        creds.set_kerberos_state(old_state)
+
+    context.last_samlogon_bad =\
+        context.with_random_bad_credentials(connect,
+                                            context.user_creds,
+                                            context.user_creds_bad,
+                                            context.last_samlogon_bad)
+    return True
+
+
+def packet_samr_0(packet, conversation, context):
+    # Open
+    c = context.get_samr_context()
+    c.get_handle()
+    return True
+
+
+def packet_samr_1(packet, conversation, context):
+    # Close
+    c = context.get_samr_context()
+    s = c.get_connection()
+    # close the last opened handle, may not always be accurate
+    # but will do for load simulation
+    if c.user_handle is not None:
+        s.Close(c.user_handle)
+        c.user_handle = None
+    elif c.group_handle is not None:
+        s.Close(c.group_handle)
+        c.group_handle = None
+    elif c.domain_handle is not None:
+        s.Close(c.domain_handle)
+        c.domain_handle = None
+        c.rids          = None
+    elif c.handle is not None:
+        s.Close(c.handle)
+        c.handle     = None
+        c.domain_sid = None
+    return True
+
+
+def packet_samr_3(packet, conversation, context):
+    # QuerySecurity
+    c = context.get_samr_context()
+    s = c.get_connection()
+    if c.user_handle is None:
+        packet_samr_34(packet, conversation, context)
+    s.QuerySecurity(c.user_handle, 1)
+    return True
+
+
+def packet_samr_5(packet, conversation, context):
+    # LookupDomain
+    c = context.get_samr_context()
+    s = c.get_connection()
+    h = c.get_handle()
+    d = lsa.String()
+    d.string = context.domain
+    c.domain_sid = s.LookupDomain(h, d)
+    return True
+
+
+def packet_samr_6(packet, conversation, context):
+    # EnumDomains
+    c = context.get_samr_context()
+    s = c.get_connection()
+    h = c.get_handle()
+    s.EnumDomains(h, 0, 0)
+    return True
+
+
+def packet_samr_7(packet, conversation, context):
+    # OpenDomain
+    c = context.get_samr_context()
+    s = c.get_connection()
+    h = c.get_handle()
+    if c.domain_sid is None:
+        packet_samr_5(packet, conversation, context)
+
+    c.domain_handle = s.OpenDomain(h,
+                                   security.SEC_FLAG_MAXIMUM_ALLOWED,
+                                   c.domain_sid)
+    return True
+
+SAMR_QUERY_DOMAIN_INFO_LEVELS = [8, 12]
+
+
+def packet_samr_8(packet, conversation, context):
+    # QueryDomainInfo [228]
+    c = context.get_samr_context()
+    s = c.get_connection()
+    if c.domain_handle is None:
+        packet_samr_7(packet, conversation, context)
+    level = random.choice(SAMR_QUERY_DOMAIN_INFO_LEVELS)
+    s.QueryDomainInfo(c.domain_handle, level)
+    return True
+
+packet_samr_14 = null_packet
+# CreateDomainAlias
+# Ignore these for now.
+
+
+def packet_samr_15(packet, conversation, context):
+    # EnumDomainAliases
+    c = context.get_samr_context()
+    s = c.get_connection()
+    if c.domain_handle is None:
+        packet_samr_7(packet, conversation, context)
+
+    s.EnumDomainAliases(c.domain_handle, 100, 0)
+    return True
+
+
+def packet_samr_16(packet, conversation, context):
+    # GetAliasMembership
+    c = context.get_samr_context()
+    s = c.get_connection()
+    if c.domain_handle is None:
+        packet_samr_7(packet, conversation, context)
+
+    sids = lsa.SidArray()
+    sid  = lsa.SidPtr()
+    sid.sid = c.domain_sid
+    sids.sids = [sid]
+    s.GetAliasMembership(c.domain_handle, sids)
+    return True
+
+
+def packet_samr_17(packet, conversation, context):
+    # LookupNames
+    c = context.get_samr_context()
+    s = c.get_connection()
+    if c.domain_handle is None:
+        packet_samr_7(packet, conversation, context)
+
+    name = lsa.String(context.username)
+    c.rids = s.LookupNames(c.domain_handle, [name])
+    return True
+
+
+def packet_samr_18(packet, conversation, context):
+    # LookupRids
+    c = context.get_samr_context()
+    s = c.get_connection()
+    if c.rids is None:
+        packet_samr_17(packet, conversation, context)
+    rids = []
+    for r in c.rids:
+        for i in r.ids:
+            rids.append(i)
+    s.LookupRids(c.domain_handle, rids)
+    return True
+
+
+def packet_samr_19(packet, conversation, context):
+    # OpenGroup
+    c = context.get_samr_context()
+    s = c.get_connection()
+    if c.domain_handle is None:
+        packet_samr_7(packet, conversation, context)
+
+    rid = 0x202  # 514, the Domain Guests group (Domain Users is 0x201)
+    c.group_handle = s.OpenGroup(c.domain_handle,
+                                 security.SEC_FLAG_MAXIMUM_ALLOWED,
+                                 rid)
+    return True
+
+
+def packet_samr_25(packet, conversation, context):
+    # QueryGroupMember
+    c = context.get_samr_context()
+    s = c.get_connection()
+    if c.group_handle is None:
+        packet_samr_19(packet, conversation, context)
+    s.QueryGroupMember(c.group_handle)
+    return True
+
+
+def packet_samr_34(packet, conversation, context):
+    # OpenUser
+    c = context.get_samr_context()
+    s = c.get_connection()
+    if c.rids is None:
+        packet_samr_17(packet, conversation, context)
+    c.user_handle = s.OpenUser(c.domain_handle,
+                               security.SEC_FLAG_MAXIMUM_ALLOWED,
+                               c.rids[0].ids[0])
+    return True
+
+
+def packet_samr_36(packet, conversation, context):
+    # QueryUserInfo
+    c = context.get_samr_context()
+    s = c.get_connection()
+    if c.user_handle is None:
+        packet_samr_34(packet, conversation, context)
+    level = 1
+    s.QueryUserInfo(c.user_handle, level)
+    return True
+
+
+def packet_samr_39(packet, conversation, context):
+    # GetGroupsForUser
+    c = context.get_samr_context()
+    s = c.get_connection()
+    if c.user_handle is None:
+        packet_samr_34(packet, conversation, context)
+    s.GetGroupsForUser(c.user_handle)
+    return True
+
+
+def packet_samr_57(packet, conversation, context):
+    # Connect2
+    c = context.get_samr_context()
+    c.get_handle()
+
+
+def packet_samr_64(packet, conversation, context):
+    # Connect5
+    c = context.get_samr_context()
+    c.get_handle()
+    return True
+
+
+def packet_srvsvc_16(packet, conversation, context):
+    # NetShareGetInfo
+    s = context.get_srvsvc_connection()
+    server_unc = "\\\\" + context.server
+    share_name = "netlogon"
+    level = 1
+    s.NetShareGetInfo(server_unc, share_name, level)
+    return True
+
+
+def packet_srvsvc_21(packet, conversation, context):
+    # NetSrvGetInfo
+    srvsvc = context.get_srvsvc_connection()
+    server_unc = "\\\\" + context.server
+    level = 102
+    srvsvc.NetSrvGetInfo(server_unc, level)
+    return True
-- 
1.9.1


From cb39bedd5be0d92bce9d47fbb1ff140cb2f5b07d Mon Sep 17 00:00:00 2001
From: Douglas Bagnall <douglas.bagnall at catalyst.net.nz>
Date: Thu, 2 Mar 2017 15:01:25 +1300
Subject: [PATCH 4/7] traffic_emulator: Introduce a traffic simulator based on
 traffic_summary.pl

Using a summarized network trace, we learn a Markov model estimating the
probability of a packet occurring given the last two packets. This
results in a JSON file containing our ngrams (x, y) -> { a: 2, b: 3 }
and our conversation rates (with a separate DNS rate).

[Packet x]  \t  [Packet y] : {

    [Packet a] : [count of packet],

    [Packet b] : [count of packet],

    "-" : [count of wait]
}

"-" represents a wait.

The counts act as relative weights: the next packet is chosen with a
probability proportional to its count.
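
As a rough illustration of the counting and the resulting ratios (a
minimal sketch, not the code in this patch; learn_ngrams and ratios are
hypothetical names, and waits are ignored apart from the initial state):

```python
from collections import Counter, defaultdict

NON_PACKET = '-'  # wait marker, as in the model file


def learn_ngrams(packet_types):
    """Count which packet type follows each pair of preceding types."""
    ngrams = defaultdict(Counter)
    prev = (NON_PACKET, NON_PACKET)
    for p in packet_types:
        ngrams[prev][p] += 1
        prev = (prev[1], p)
    return ngrams


def ratios(counts):
    """Convert raw counts into the probability of each successor."""
    total = float(sum(counts.values()))
    if total == 0:
        return {}
    return {k: v / total for k, v in counts.items()}
```

Here ratios({'a': 2, 'b': 3, '-': 5}) gives a 20% chance of packet a,
30% of packet b, and 50% of a wait.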

Once a model file has been created, we can generate semi-randomized
traffic based on it. We can also save the generated traffic as new
traffic summary files, which can then be replayed relatively
deterministically (useful for debugging and benchmarking).
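
Generation can be pictured as a weighted random walk over the n-gram
table. The sketch below is only an illustration under assumptions: it
takes the n-gram table as a flat JSON object with tab-joined keys (the
real model file also carries conversation rates and may be laid out
differently), and load_ngrams, weighted_pick and generate are
hypothetical names, not functions from this patch.

```python
import json
import random

NON_PACKET = '-'  # wait marker, as in the model file


def load_ngrams(json_text):
    """Turn tab-joined string keys back into (x, y) tuples."""
    raw = json.loads(json_text)
    return {tuple(key.split('\t')): counts for key, counts in raw.items()}


def weighted_pick(counts, rng):
    """Choose a key with probability proportional to its count."""
    target = rng.random() * sum(counts.values())
    for item in sorted(counts):
        target -= counts[item]
        if target < 0:
            return item
    return NON_PACKET  # fallback, e.g. if all counts are zero


def generate(ngrams, max_steps, seed=None):
    """Walk the model from the (wait, wait) state, collecting each step."""
    rng = random.Random(seed)
    prev = (NON_PACKET, NON_PACKET)
    steps = []
    for _ in range(max_steps):
        counts = ngrams.get(prev)
        if not counts:
            break  # state never seen while learning
        nxt = weighted_pick(counts, rng)
        steps.append(nxt)
        prev = (prev[1], nxt)
    return steps
```

Passing a fixed seed makes the walk repeatable, which is one way to get
the same generated sequence twice without first saving it as a summary.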

Signed-off-by: Douglas Bagnall <douglas.bagnall at catalyst.net.nz>
Signed-off-by: Gary Lockyer <gary at catalyst.net.nz>
Signed-off-by: Tim Beale <timbeale at catalyst.net.nz>
Signed-off-by: Garming Sam <garming at catalyst.net.nz>
Pair-programmed-with: Gary Lockyer <gary at catalyst.net.nz>
Pair-programmed-with: Tim Beale <timbeale at catalyst.net.nz>
Pair-programmed-with: Garming Sam <garming at catalyst.net.nz>
---
 python/samba/emulate/__init__.py         |    2 +
 python/samba/emulate/traffic.py          | 1846 ++++++++++++++++++++++++++++++
 python/samba/tests/emulate/__init__.py   |    1 +
 python/samba/tests/emulate/traffic.py    |  196 ++++
 script/traffic_learner                   |   66 ++
 script/traffic_replay                    |  359 ++++++
 selftest/tests.py                        |    1 +
 testdata/traffic-sample-very-short.model |  115 ++
 testdata/traffic-sample-very-short.txt   |   50 +
 9 files changed, 2636 insertions(+)
 create mode 100644 python/samba/emulate/__init__.py
 create mode 100644 python/samba/emulate/traffic.py
 create mode 100644 python/samba/tests/emulate/__init__.py
 create mode 100644 python/samba/tests/emulate/traffic.py
 create mode 100755 script/traffic_learner
 create mode 100755 script/traffic_replay
 create mode 100644 testdata/traffic-sample-very-short.model
 create mode 100644 testdata/traffic-sample-very-short.txt

diff --git a/python/samba/emulate/__init__.py b/python/samba/emulate/__init__.py
new file mode 100644
index 0000000..931627d
--- /dev/null
+++ b/python/samba/emulate/__init__.py
@@ -0,0 +1,2 @@
+# Yes, we need an __init__.py
+
diff --git a/python/samba/emulate/traffic.py b/python/samba/emulate/traffic.py
new file mode 100644
index 0000000..2e2d59a
--- /dev/null
+++ b/python/samba/emulate/traffic.py
@@ -0,0 +1,1846 @@
+# -*- encoding: utf-8 -*-
+# Samba traffic replay and learning
+#
+# Copyright (C) Catalyst IT Ltd. 2017
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+import time
+import os
+import random
+import json
+import math
+import sys
+import signal
+import itertools
+
+from collections import OrderedDict, Counter, defaultdict
+from samba.emulate import traffic_packets
+from samba.samdb import SamDB
+import ldb
+from ldb import LdbError
+from samba.dcerpc import ClientConnection
+from samba.dcerpc import security, drsuapi, lsa
+from samba.dcerpc import netlogon
+from samba.dcerpc.netlogon import netr_Authenticator
+from samba.dcerpc import srvsvc
+from samba.dcerpc import samr
+from samba.drs_utils import drs_DsBind
+import traceback
+from samba.credentials import Credentials, DONT_USE_KERBEROS, MUST_USE_KERBEROS
+from samba.auth import system_session
+from samba.dsdb import UF_WORKSTATION_TRUST_ACCOUNT, UF_PASSWD_NOTREQD
+from samba.dsdb import UF_NORMAL_ACCOUNT
+from samba.dcerpc.misc import SEC_CHAN_WKSTA
+from samba import gensec
+
+SLEEP_OVERHEAD = 3e-4
+
+# we don't use None, because it complicates [de]serialisation
+NON_PACKET = '-'
+
+CLIENT_CLUES = {
+    ('dns', '0'): 1.0,      # query
+    ('smb', '0x72'): 1.0,   # Negotiate protocol
+    ('ldap', '0'): 1.0,     # bind
+    ('ldap', '3'): 1.0,     # searchRequest
+    ('ldap', '2'): 1.0,     # unbindRequest
+    ('cldap', '3'): 1.0,
+    ('dcerpc', '11'): 1.0,  # bind
+    ('dcerpc', '14'): 1.0,  # Alter_context
+    ('nbns', '0'): 1.0,     # query
+}
+
+SERVER_CLUES = {
+    ('dns', '1'): 1.0,      # response
+    ('ldap', '1'): 1.0,     # bind response
+    ('ldap', '4'): 1.0,     # search result
+    ('ldap', '5'): 1.0,     # search done
+    ('cldap', '5'): 1.0,
+    ('dcerpc', '12'): 1.0,  # bind_ack
+    ('dcerpc', '13'): 1.0,  # bind_nak
+    ('dcerpc', '15'): 1.0,  # Alter_context response
+}
+
+SKIPPED_PROTOCOLS = {"smb", "smb2", "browser", "smb_netlogon"}
+
+WAIT_SCALE = 10.0
+WAIT_THRESHOLD = (1.0 / WAIT_SCALE)
+NO_WAIT_LOG_TIME_RANGE = (-10, -3)
+
+# DEBUG_LEVEL can be changed by scripts with -d
+DEBUG_LEVEL = 0
+
+
+def debug(level, msg, *args):
+    if level <= DEBUG_LEVEL:
+        if not args:
+            print >> sys.stderr, msg
+        else:
+            print >> sys.stderr, msg % tuple(args)
+
+
+def debug_lineno(*args):
+    tb = traceback.extract_stack(limit=2)
+    print >> sys.stderr, (" %s:" "\033[01;33m"
+                          "%s " "\033[00m" % (tb[0][2], tb[0][1])),
+    for a in args:
+        print >> sys.stderr, a
+    print >> sys.stderr
+    sys.stderr.flush()
+
+
+def random_colour_print():
+    n = 18 + random.randrange(214)
+    prefix = "\033[38;5;%dm" % n
+
+    def p(*args):
+        for a in args:
+            print >>sys.stderr, "%s%s\033[00m" % (prefix, a)
+
+    return p
+
+
+class FakePacketError(Exception):
+    pass
+
+
+class Packet(object):
+    def __init__(self, fields):
+        if isinstance(fields, str):
+            fields = fields.rstrip('\n').split('\t')
+
+        (timestamp,
+         ip_protocol,
+         stream_number,
+         src,
+         dest,
+         protocol,
+         opcode,
+         desc) = fields[:8]
+        extra = fields[8:]
+
+        self.timestamp = float(timestamp)
+        self.ip_protocol = ip_protocol
+        try:
+            self.stream_number = int(stream_number)
+        except (ValueError, TypeError):
+            self.stream_number = None
+        self.src = int(src)
+        self.dest = int(dest)
+        self.protocol = protocol
+        self.opcode = opcode
+        self.desc = desc
+        self.extra = extra
+
+        if self.src < self.dest:
+            self.endpoints = (self.src, self.dest)
+        else:
+            self.endpoints = (self.dest, self.src)
+
+    def as_summary(self, time_offset=0.0):
+        extra = '\t'.join(self.extra)
+        t = self.timestamp + time_offset
+        return (t, '%f\t%s\t%s\t%d\t%d\t%s\t%s\t%s\t%s' %
+                (t,
+                 self.ip_protocol,
+                 '' if self.stream_number is None else self.stream_number,
+                 self.src,
+                 self.dest,
+                 self.protocol,
+                 self.opcode,
+                 self.desc,
+                 extra))
+
+    def __str__(self):
+        return ("%.3f: %d -> %d; ip %s; strm %s; prot %s; op %s; desc %s %s" %
+                (self.timestamp, self.src, self.dest, self.ip_protocol or '-',
+                 self.stream_number, self.protocol, self.opcode, self.desc,
+                 ('«' + ' '.join(self.extra) + '»' if self.extra else '')))
+
+    def __repr__(self):
+        return "<Packet @%s>" % self
+
+    def copy(self):
+        return self.__class__([self.timestamp,
+                               self.ip_protocol,
+                               self.stream_number,
+                               self.src,
+                               self.dest,
+                               self.protocol,
+                               self.opcode,
+                               self.desc] + self.extra)
+
+    def as_packet_type(self):
+        t = '%s:%s' % (self.protocol, self.opcode)
+        return t
+
+    def client_score(self):
+        """A positive number means we think it is a client; a negative number
+        means we think it is a server. Zero means no idea. range: -1 to 1.
+        """
+        key = (self.protocol, self.opcode)
+        if key in CLIENT_CLUES:
+            return CLIENT_CLUES[key]
+        if key in SERVER_CLUES:
+            return -SERVER_CLUES[key]
+        return 0.0
+
+    def play(self, conversation, context):
+        fn_name = 'packet_%s_%s' % (self.protocol, self.opcode)
+        try:
+            fn = getattr(traffic_packets, fn_name)
+
+        except AttributeError as e:
+            print >>sys.stderr, "Conversation(%s) Missing handler %s" % \
+                (conversation.conversation_id, fn_name)
+            return
+
+        # Don't display a message for kerberos packets: they are not
+        # generated directly, they only indicate kerberos should be used
+        if self.protocol != "kerberos":
+            debug(2, "Conversation(%s) Calling handler %s" %
+                     (conversation.conversation_id, fn_name))
+
+        start = time.time()
+        try:
+            if fn(self, conversation, context):
+                # Only collect timing data for functions that generate
+                # network traffic, or fail
+                end = time.time()
+                duration = end - start
+                print("%f\t%s\t%s\t%s\t%f\tTrue\t" %
+                      (end, conversation.conversation_id, self.protocol,
+                       self.opcode, duration))
+        except Exception as e:
+            end = time.time()
+            duration = end - start
+            print("%f\t%s\t%s\t%s\t%f\tFalse\t%s" %
+                  (end, conversation.conversation_id, self.protocol,
+                   self.opcode, duration, e))
+            raise
+
+    def __cmp__(self, other):
+        return cmp(self.timestamp, other.timestamp)
+
+    def is_really_a_packet(self, missing_packet_stats=None):
+        if self.protocol in SKIPPED_PROTOCOLS:
+            # Ignore any packets for the protocols we're not interested in.
+            return False
+        if self.protocol == "ldap" and self.opcode == '':
+            # skip ldap continuation packets
+            return False
+
+        fn_name = 'packet_%s_%s' % (self.protocol, self.opcode)
+        try:
+            fn = getattr(traffic_packets, fn_name)
+            if fn is traffic_packets.null_packet:
+                return False
+        except AttributeError:
+            print >>sys.stderr, "missing packet %s" % fn_name
+            return False
+        return True
+
+
+class ReplayContext(object):
+    def __init__(self,
+                 server=None,
+                 lp=None,
+                 creds=None,
+                 badpassword_frequency=None,
+                 prefer_kerberos=None,
+                 tempdir=None,
+                 statsdir=None,
+                 ou=None,
+                 base_dn=None,
+                 domain=None,
+                 domain_sid=None):
+
+        self.server                   = server
+        self.ldap_connections         = []
+        self.dcerpc_connections       = []
+        self.lsarpc_connections       = []
+        self.lsarpc_connections_named = []
+        self.drsuapi_connections      = []
+        self.srvsvc_connections       = []
+        self.samr_contexts            = []
+        self.netlogon_connection      = None
+        self.creds                    = creds
+        self.lp                       = lp
+        self.prefer_kerberos          = prefer_kerberos
+        self.ou                       = ou
+        self.base_dn                  = base_dn
+        self.domain                   = domain
+        self.statsdir                 = statsdir
+        self.global_tempdir           = tempdir
+        self.domain_sid               = domain_sid
+
+        # Bad password attempt controls
+        self.badpassword_frequency    = badpassword_frequency
+        self.last_lsarpc_bad          = False
+        self.last_lsarpc_named_bad    = False
+        self.last_simple_bind_bad     = False
+        self.last_bind_bad            = False
+        self.last_srvsvc_bad          = False
+        self.last_drsuapi_bad         = False
+        self.last_netlogon_bad        = False
+        self.last_samlogon_bad        = False
+        self.generate_ldap_search_tables()
+        self.next_conversation_id = itertools.count().next
+
+    def generate_ldap_search_tables(self):
+        session = system_session()
+
+        db = SamDB(url="ldap://%s" % self.server,
+                   session_info=session,
+                   credentials=self.creds,
+                   lp=self.lp)
+
+        res = db.search(db.domain_dn(),
+                        scope=ldb.SCOPE_SUBTREE,
+                        attrs=['dn'])
+
+        # find a list of DNs for each pattern
+        # e.g. CN,CN,CN,DC,DC
+        dn_map = {}
+        attribute_clue_map = {
+            'invocationId': []
+        }
+
+        for r in res:
+            dn = str(r.dn)
+            pattern = ','.join(x.lstrip()[:2] for x in dn.split(',')).upper()
+            dns = dn_map.setdefault(pattern, [])
+            dns.append(dn)
+            if dn.startswith('CN=NTDS Settings,'):
+                attribute_clue_map['invocationId'].append(dn)
+
+        # extend the map in case we are working with a different
+        # number of DC components.
+        # for k, v in self.dn_map.items():
+        #     print >>sys.stderr, k, len(v)
+
+        for k, v in dn_map.items():
+            if k[-3:] != ',DC':
+                continue
+            p = k[:-3]
+            while p[-3:] == ',DC':
+                p = p[:-3]
+            for i in range(5):
+                p += ',DC'
+                if p != k and p in dn_map:
+                    print >> sys.stderr, 'dn_map collision %s %s' % (k, p)
+                    continue
+                dn_map[p] = dn_map[k]
+
+        self.dn_map = dn_map
+        self.attribute_clue_map = attribute_clue_map
+
+    def generate_process_local_config(self, account, conversation):
+        if account is None:
+            return
+        self.netbios_name             = account.netbios_name
+        self.machinepass              = account.machinepass
+        self.username                 = account.username
+        self.userpass                 = account.userpass
+
+        self.tempdir = mk_masked_dir(self.global_tempdir,
+                                     'conversation-%d' %
+                                     conversation.conversation_id)
+
+        self.lp.set("private dir",     self.tempdir)
+        self.lp.set("lock dir",        self.tempdir)
+        self.lp.set("state directory", self.tempdir)
+        self.lp.set("tls verify peer", "no_check")
+
+        # If the domain was not specified, check for the environment
+        # variable.
+        if self.domain is None:
+            self.domain = os.environ["DOMAIN"]
+
+        self.remoteAddress = "/root/ncalrpc_as_system"
+        self.samlogon_dn   = ("cn=%s,%s" %
+                              (self.netbios_name, self.ou))
+        self.user_dn       = ("cn=%s,%s" %
+                              (self.username, self.ou))
+
+        self.generate_machine_creds()
+        self.generate_user_creds()
+
+    # Execute the supplied function.
+    # Based on the frequency in badpassword_frequency, randomly call the
+    # function with the supplied bad credentials first.
+    # Whether or not the bad credentials were tried, the function is then
+    # run with the good credentials and that result is returned.
+    # failed_last_time is used to prevent consecutive bad credential
+    # attempts, so the overall bad credential frequency will be lower than
+    # that requested, but not significantly.
+    def with_random_bad_credentials(self, f, good, bad, failed_last_time):
+        if not failed_last_time:
+            if (self.badpassword_frequency > 0 and
+               random.random() < self.badpassword_frequency):
+                try:
+                    f(bad)
+                except Exception:
+                    # Ignore any exceptions as the operation may fail
+                    # as it's being performed with bad credentials
+                    pass
+                failed_last_time = True
+        else:
+            failed_last_time = False
+
+        result = f(good)
+        return (result, failed_last_time)
+
+    def generate_user_creds(self):
+        self.user_creds = Credentials()
+        self.user_creds.guess(self.lp)
+        self.user_creds.set_workstation(self.netbios_name)
+        self.user_creds.set_password(self.userpass)
+        self.user_creds.set_username(self.username)
+        if self.prefer_kerberos:
+            self.user_creds.set_kerberos_state(MUST_USE_KERBEROS)
+        else:
+            self.user_creds.set_kerberos_state(DONT_USE_KERBEROS)
+
+        self.user_creds_bad = Credentials()
+        self.user_creds_bad.guess(self.lp)
+        self.user_creds_bad.set_workstation(self.netbios_name)
+        self.user_creds_bad.set_password(self.userpass[:-4])
+        self.user_creds_bad.set_username(self.username)
+        if self.prefer_kerberos:
+            self.user_creds_bad.set_kerberos_state(MUST_USE_KERBEROS)
+        else:
+            self.user_creds_bad.set_kerberos_state(DONT_USE_KERBEROS)
+
+        # Credentials for ldap simple bind.
+        self.simple_bind_creds = Credentials()
+        self.simple_bind_creds.guess(self.lp)
+        self.simple_bind_creds.set_workstation(self.netbios_name)
+        self.simple_bind_creds.set_password(self.userpass)
+        self.simple_bind_creds.set_username(self.username)
+        self.simple_bind_creds.set_gensec_features(
+            self.simple_bind_creds.get_gensec_features() | gensec.FEATURE_SEAL)
+        if self.prefer_kerberos:
+            self.simple_bind_creds.set_kerberos_state(MUST_USE_KERBEROS)
+        else:
+            self.simple_bind_creds.set_kerberos_state(DONT_USE_KERBEROS)
+        self.simple_bind_creds.set_bind_dn(self.user_dn)
+
+        self.simple_bind_creds_bad = Credentials()
+        self.simple_bind_creds_bad.guess(self.lp)
+        self.simple_bind_creds_bad.set_workstation(self.netbios_name)
+        self.simple_bind_creds_bad.set_password(self.userpass[:-4])
+        self.simple_bind_creds_bad.set_username(self.username)
+        self.simple_bind_creds_bad.set_gensec_features(
+            self.simple_bind_creds_bad.get_gensec_features() |
+            gensec.FEATURE_SEAL)
+        if self.prefer_kerberos:
+            self.simple_bind_creds_bad.set_kerberos_state(MUST_USE_KERBEROS)
+        else:
+            self.simple_bind_creds_bad.set_kerberos_state(DONT_USE_KERBEROS)
+        self.simple_bind_creds_bad.set_bind_dn(self.user_dn)
+
+    def generate_machine_creds(self):
+        self.machine_creds = Credentials()
+        self.machine_creds.guess(self.lp)
+        self.machine_creds.set_workstation(self.netbios_name)
+        self.machine_creds.set_secure_channel_type(SEC_CHAN_WKSTA)
+        self.machine_creds.set_password(self.machinepass)
+        self.machine_creds.set_username(self.netbios_name + "$")
+        if self.prefer_kerberos:
+            self.machine_creds.set_kerberos_state(MUST_USE_KERBEROS)
+        else:
+            self.machine_creds.set_kerberos_state(DONT_USE_KERBEROS)
+
+        self.machine_creds_bad = Credentials()
+        self.machine_creds_bad.guess(self.lp)
+        self.machine_creds_bad.set_workstation(self.netbios_name)
+        self.machine_creds_bad.set_secure_channel_type(SEC_CHAN_WKSTA)
+        self.machine_creds_bad.set_password(self.machinepass[:-4])
+        self.machine_creds_bad.set_username(self.netbios_name + "$")
+        if self.prefer_kerberos:
+            self.machine_creds_bad.set_kerberos_state(MUST_USE_KERBEROS)
+        else:
+            self.machine_creds_bad.set_kerberos_state(DONT_USE_KERBEROS)
+
+    def get_matching_dn(self, pattern, attributes=None):
+        # If the pattern is an empty string, we assume ROOTDSE,
+        # Otherwise we try adding or removing DC suffixes, then
+        # shorter leading patterns until we hit one.
+        # e.g. if there is no CN,CN,CN,CN,DC,DC
+        # we first try       CN,CN,CN,CN,DC
+        # and                CN,CN,CN,CN,DC,DC,DC
+        # then change to        CN,CN,CN,DC,DC
+        # and as last resort we use the base_dn
+        attr_clue = self.attribute_clue_map.get(attributes)
+        if attr_clue:
+            return random.choice(attr_clue)
+
+        pattern = pattern.upper()
+        while pattern:
+            if pattern in self.dn_map:
+                return random.choice(self.dn_map[pattern])
+            # chop one off the front and try it all again.
+            pattern = pattern[3:]
+
+        return self.base_dn
+
+    def get_dcerpc_connection(self, new=False):
+        guid = '12345678-1234-abcd-ef00-01234567cffb'  # RPC_NETLOGON UUID
+        if self.dcerpc_connections and not new:
+            return self.dcerpc_connections[-1]
+        c = ClientConnection("ncacn_ip_tcp:%s" % self.server,
+                             (guid, 1), self.lp)
+        self.dcerpc_connections.append(c)
+        return c
+
+    def get_srvsvc_connection(self, new=False):
+        if self.srvsvc_connections and not new:
+            return self.srvsvc_connections[-1]
+
+        def connect(creds):
+            return srvsvc.srvsvc("ncacn_np:%s" % (self.server),
+                                 self.lp,
+                                 creds)
+
+        (c, self.last_srvsvc_bad) = \
+            self.with_random_bad_credentials(connect,
+                                             self.user_creds,
+                                             self.user_creds_bad,
+                                             self.last_srvsvc_bad)
+
+        self.srvsvc_connections.append(c)
+        return c
+
+    def get_lsarpc_connection(self, new=False):
+        if self.lsarpc_connections and not new:
+            return self.lsarpc_connections[-1]
+
+        def connect(creds):
+            binding_options = 'schannel,seal,sign'
+            return lsa.lsarpc("ncacn_ip_tcp:%s[%s]" %
+                              (self.server, binding_options),
+                              self.lp,
+                              creds)
+
+        (c, self.last_lsarpc_bad) = \
+            self.with_random_bad_credentials(connect,
+                                             self.machine_creds,
+                                             self.machine_creds_bad,
+                                             self.last_lsarpc_bad)
+
+        self.lsarpc_connections.append(c)
+        return c
+
+    def get_lsarpc_named_pipe_connection(self, new=False):
+        if self.lsarpc_connections_named and not new:
+            return self.lsarpc_connections_named[-1]
+
+        def connect(creds):
+            return lsa.lsarpc("ncacn_np:%s" % (self.server),
+                              self.lp,
+                              creds)
+
+        (c, self.last_lsarpc_named_bad) = \
+            self.with_random_bad_credentials(connect,
+                                             self.machine_creds,
+                                             self.machine_creds_bad,
+                                             self.last_lsarpc_named_bad)
+
+        self.lsarpc_connections_named.append(c)
+        return c
+
+    def get_drsuapi_connection_pair(self, new=False, unbind=False):
+        """get a (drs, drs_handle) tuple"""
+        if self.drsuapi_connections and not new:
+            c = self.drsuapi_connections[-1]
+            return c
+
+        def connect(creds):
+            binding_options = 'seal'
+            binding_string = "ncacn_ip_tcp:%s[%s]" %\
+                             (self.server, binding_options)
+            return drsuapi.drsuapi(binding_string, self.lp, creds)
+
+        (drs, self.last_drsuapi_bad) = \
+            self.with_random_bad_credentials(connect,
+                                             self.user_creds,
+                                             self.user_creds_bad,
+                                             self.last_drsuapi_bad)
+
+        (drs_handle, supported_extensions) = drs_DsBind(drs)
+        c = (drs, drs_handle)
+        self.drsuapi_connections.append(c)
+        return c
+
+    def get_ldap_connection(self, new=False, simple=False):
+        if self.ldap_connections and not new:
+            return self.ldap_connections[-1]
+
+        def simple_bind(creds):
+            return SamDB('ldaps://%s' % self.server,
+                         credentials=creds,
+                         lp=self.lp)
+
+        def sasl_bind(creds):
+            return SamDB('ldap://%s' % self.server,
+                         credentials=creds,
+                         lp=self.lp)
+        if simple:
+            (samdb, self.last_simple_bind_bad) = \
+                self.with_random_bad_credentials(simple_bind,
+                                                 self.simple_bind_creds,
+                                                 self.simple_bind_creds_bad,
+                                                 self.last_simple_bind_bad)
+        else:
+            (samdb, self.last_bind_bad) = \
+                self.with_random_bad_credentials(sasl_bind,
+                                                 self.user_creds,
+                                                 self.user_creds_bad,
+                                                 self.last_bind_bad)
+
+        self.ldap_connections.append(samdb)
+        return samdb
+
+    def get_samr_context(self, new=False):
+        if not self.samr_contexts or new:
+            self.samr_contexts.append(SamrContext(self.server))
+        return self.samr_contexts[-1]
+
+    def get_netlogon_connection(self):
+
+        if self.netlogon_connection:
+            return self.netlogon_connection
+
+        def connect(creds):
+            return netlogon.netlogon("ncacn_ip_tcp:%s[schannel,seal]" %
+                                     (self.server),
+                                     self.lp,
+                                     creds)
+        (c, self.last_netlogon_bad) = \
+            self.with_random_bad_credentials(connect,
+                                             self.machine_creds,
+                                             self.machine_creds_bad,
+                                             self.last_netlogon_bad)
+        self.netlogon_connection = c
+        return c
+
+    def guess_a_dns_lookup(self):
+        # XXX at some point do something sensible
+        return ('example.com', 'A')
+
+    def get_authenticator(self):
+        auth = self.machine_creds.new_client_authenticator()
+        current  = netr_Authenticator()
+        current.cred.data = [ord(x) for x in auth["credential"]]
+        current.timestamp = auth["timestamp"]
+
+        subsequent = netr_Authenticator()
+        return (current, subsequent)
+
+
+class SamrContext(object):
+    def __init__(self, server):
+        self.connection    = None
+        self.handle        = None
+        self.domain_handle = None
+        self.domain_sid    = None
+        self.group_handle  = None
+        self.user_handle   = None
+        self.rids          = None
+        self.server        = server
+
+    def get_connection(self):
+        if not self.connection:
+            self.connection = samr.samr("ncacn_ip_tcp:%s" % (self.server))
+        return self.connection
+
+    def get_handle(self):
+        if not self.handle:
+            c = self.get_connection()
+            self.handle = c.Connect2(None, security.SEC_FLAG_MAXIMUM_ALLOWED)
+        return self.handle
+
+
+class Conversation(object):
+    conversation_id = None
+
+    def __init__(self, start_time=None, endpoints=None):
+        self.start_time = start_time
+        self.endpoints = endpoints
+        self.packets = []
+        self.msg = random_colour_print()
+        self.client_balance = 0.0
+
+    def add_packet(self, packet):
+        """Add a packet object to this conversation, making a local copy with
+        a conversation-relative timestamp."""
+        p = packet.copy()
+
+        if self.start_time is None:
+            self.start_time = p.timestamp
+
+        if self.endpoints is None:
+            self.endpoints = p.endpoints
+
+        if p.endpoints != self.endpoints:
+            raise FakePacketError("Conversation endpoints %s don't match "
+                                  "packet endpoints %s" %
+                                  (self.endpoints, p.endpoints))
+
+        p.timestamp -= self.start_time
+
+        if p.src == p.endpoints[0]:
+            self.client_balance -= p.client_score()
+        else:
+            self.client_balance += p.client_score()
+
+        if p.is_really_a_packet():
+            self.packets.append(p)
+
+    def add_short_packet(self, timestamp, p, extra, client=True):
+        """Create a packet from a timestamp, and 'protocol:opcode' pair, and a
+        (possibly empty) list of extra data. If client is True, assume
+        this packet is from the client to the server.
+        """
+        protocol, opcode = p.split(':', 1)
+        src, dest = self.guess_client_server()
+        if not client:
+            src, dest = dest, src
+
+        desc = OP_DESCRIPTIONS.get((protocol, opcode), '')
+        ip_protocol = IP_PROTOCOLS.get(protocol, '06')
+        fields = [timestamp - self.start_time, ip_protocol,
+                  '', src, dest,
+                  protocol, opcode, desc]
+        fields.extend(extra)
+        packet = Packet(fields)
+        # XXX we're assuming the timestamp is already adjusted for
+        # this conversation?
+        # XXX should we adjust client balance for guessed packets?
+        if packet.src == packet.endpoints[0]:
+            self.client_balance -= packet.client_score()
+        else:
+            self.client_balance += packet.client_score()
+        if packet.is_really_a_packet():
+            self.packets.append(packet)
+
+    def __str__(self):
+        return ("<Conversation %s %s starting %.3f %d packets>" %
+                (self.conversation_id, self.endpoints, self.start_time,
+                 len(self.packets)))
+
+    __repr__ = __str__
+
+    def __iter__(self):
+        return iter(self.packets)
+
+    def __len__(self):
+        return len(self.packets)
+
+    def get_duration(self):
+        if len(self.packets) < 2:
+            return 0
+        return self.packets[-1].timestamp - self.packets[0].timestamp
+
+    def replay_as_summary_lines(self):
+        lines = []
+        for p in self.packets:
+            lines.append(p.as_summary(self.start_time))
+        return lines
+
+    def replay_in_fork_with_delay(self, start, context=None, account=None):
+        def signal_handler(signal, frame):
+            sys.stderr.close()
+            sys.stdout.close()
+            os._exit(0)
+
+        t = self.start_time
+        now = time.time() - start
+        gap = t - now
+        # we are replaying strictly in order, so it is safe to sleep
+        # in the main process if the gap is big enough. This reduces
+        # the number of concurrent processes, which allows us to make
+        # larger loads.
+        if gap > 0.15 and False:
+            print >>sys.stderr, "sleeping for %f in main process" % (gap - 0.1)
+            time.sleep(gap - 0.1)
+            now = time.time() - start
+            gap = t - now
+            print >>sys.stderr, "gap is now %f" % gap
+
+        self.conversation_id = context.next_conversation_id()
+        pid = os.fork()
+        if pid != 0:
+            return pid
+        pid = os.getpid()
+        signal.signal(signal.SIGTERM, signal_handler)
+        # we must never return, or we'll end up running parts of the
+        # parent's clean-up code. So we work in a try...finally, and
+        # try to print any exceptions.
+
+        try:
+            context.generate_process_local_config(account, self)
+            sys.stdin.close()
+            os.close(0)
+            filename = os.path.join(context.statsdir, 'stats-conversation-%d' %
+                                    self.conversation_id)
+            sys.stdout.close()
+            sys.stdout = open(filename, 'w')
+
+            sleep_time = gap - SLEEP_OVERHEAD
+            if sleep_time > 0:
+                time.sleep(sleep_time)
+
+            miss = t - (time.time() - start)
+            self.msg("starting %s [miss %.3f pid %d]" % (self, miss, pid))
+            self.replay(context)
+        except Exception:
+            print >>sys.stderr,\
+                ("EXCEPTION in child PID %d, conversation %s" % (pid, self))
+            traceback.print_exc(sys.stderr)
+        finally:
+            sys.stderr.close()
+            sys.stdout.close()
+            os._exit(0)
+
+    def replay(self, context=None):
+        start = time.time()
+
+        for p in self.packets:
+            now = time.time() - start
+            gap = p.timestamp - now
+            sleep_time = gap - SLEEP_OVERHEAD
+            if sleep_time > 0:
+                time.sleep(sleep_time)
+
+            miss = p.timestamp - (time.time() - start)
+            if context is None:
+                self.msg("packet %s [miss %.3f pid %d]" % (p, miss,
+                                                           os.getpid()))
+                continue
+            p.play(self, context)
+
+    def guess_client_server(self, server_clue=None):
+        """Have a go at deciding who is the server and who is the client.
+        returns (client, server)
+        """
+        a, b = self.endpoints
+
+        if self.client_balance < 0:
+            return (a, b)
+
+        # in the absence of a clue, we will fall through to assuming
+        # the lowest number is the server (which is usually true).
+
+        if self.client_balance == 0 and server_clue == b:
+            return (a, b)
+
+        return (b, a)
+
+
+class DnsHammer(Conversation):
+    """A lightweight conversation that generates a lot of dns:0 packets on
+    the fly"""
+
+    def __init__(self, dns_rate, duration):
+        n = int(dns_rate * duration)
+        self.times = [random.uniform(0, duration) for i in range(n)]
+        self.times.sort()
+        self.rate = dns_rate
+        self.duration = duration
+        self.start_time = 0
+        self.msg = random_colour_print()
+
+    def __str__(self):
+        return ("<DnsHammer %d packets over %.1fs (rate %.2f)>" %
+                (len(self.times), self.duration, self.rate))
+
+    def replay_in_fork_with_delay(self, start, context=None, account=None):
+        # hand the child's pid back to the caller, which tracks it
+        return Conversation.replay_in_fork_with_delay(self, start, context,
+                                                      account)
+
+    def replay(self, context=None):
+        start = time.time()
+        fn = traffic_packets.packet_dns_0
+        for t in self.times:
+            now = time.time() - start
+            gap = t - now
+            sleep_time = gap - SLEEP_OVERHEAD
+            if sleep_time > 0:
+                time.sleep(sleep_time)
+
+            if context is None:
+                miss = t - (time.time() - start)
+                self.msg("packet %s [miss %.3f pid %d]" % (t, miss,
+                                                           os.getpid()))
+                continue
+
+            packet_start = time.time()
+            try:
+                fn(self, self, context)
+                end = time.time()
+                duration = end - packet_start
+                print("%f\tDNS\tdns\t0\t%f\tTrue\t" % (end, duration))
+            except Exception as e:
+                end = time.time()
+                duration = end - packet_start
+                print("%f\tDNS\tdns\t0\t%f\tFalse\t%s" % (end, duration, e))
+                raise
+
+
+def ingest_summaries(files, dns_mode='count'):
+    dns_counts = defaultdict(int)
+    packets = []
+    for f in files:
+        if isinstance(f, str):
+            f = open(f)
+        print >>sys.stderr, "Ingesting %s" % (f.name,)
+        for line in f:
+            p = Packet(line)
+            if p.protocol == 'dns' and dns_mode != 'include':
+                dns_counts[p.opcode] += 1
+            else:
+                packets.append(p)
+
+        f.close()
+
+    if not packets:
+        return [], 0, 0, dns_counts
+
+    start_time = min(p.timestamp for p in packets)
+    last_packet = max(p.timestamp for p in packets)
+
+    print >>sys.stderr, "gathering packets into conversations"
+    conversations = OrderedDict()
+    for p in packets:
+        p.timestamp -= start_time
+        c = conversations.get(p.endpoints)
+        if c is None:
+            c = Conversation()
+            conversations[p.endpoints] = c
+        c.add_packet(p)
+
+    # We only care about conversations with actual traffic, so we
+    # filter out conversations with nothing to say. We do that here,
+    # rather than earlier, because those empty packets contain useful
+    # hints as to which end of the conversation was the client.
+    conversation_list = []
+    for c in conversations.values():
+        if len(c) != 0:
+            conversation_list.append(c)
+
+    # This is obviously not correct, as many conversations will appear
+    # to start roughly simultaneously at the beginning of the snapshot.
+    # To which we say: oh well, so be it.
+    duration = float(last_packet - start_time)
+    mean_interval = len(conversations) / duration
+
+    return conversation_list, mean_interval, duration, dns_counts
+
+
+def guess_server_address(conversations):
+    # we guess the most common address.
+    addresses = Counter()
+    for c in conversations:
+        addresses.update(c.endpoints)
+    if addresses:
+        return addresses.most_common(1)[0][0]
+
+
+def stringify_keys(x):
+    y = {}
+    for k, v in x.iteritems():
+        k2 = '\t'.join(k)
+        y[k2] = v
+    return y
+
+
+def unstringify_keys(x):
+    y = {}
+    for k, v in x.iteritems():
+        t = tuple(str(k).split('\t'))
+        y[t] = v
+    return y
+
+
+class TrafficModel(object):
+    def __init__(self, n=3):
+        self.ngrams = {}
+        self.query_details = {}
+        self.n = n
+        self.dns_opcounts = defaultdict(int)
+        self.cumulative_duration = 0.0
+        self.conversation_rate = [0, 1]
+
+    def learn(self, conversations, dns_opcounts={}):
+        prev = 0.0
+        cum_duration = 0.0
+        key = (NON_PACKET,) * (self.n - 1)
+
+        server = guess_server_address(conversations)
+
+        for k, v in dns_opcounts.items():
+            self.dns_opcounts[k] += v
+
+        if len(conversations) > 1:
+            elapsed =\
+                conversations[-1].start_time - conversations[0].start_time
+            self.conversation_rate[0] = len(conversations)
+            self.conversation_rate[1] = elapsed
+
+        for c in conversations:
+            client, server = c.guess_client_server(server)
+            cum_duration += c.get_duration()
+            key = (NON_PACKET,) * (self.n - 1)
+            for p in c:
+                if p.src != client:
+                    continue
+
+                elapsed = p.timestamp - prev
+                prev = p.timestamp
+                if elapsed > WAIT_THRESHOLD:
+                    # add the wait as an extra state
+                    wait = 'wait:%d' % (math.log(max(1.0,
+                                                     elapsed * WAIT_SCALE)))
+                    self.ngrams.setdefault(key, []).append(wait)
+                    key = key[1:] + (wait,)
+
+                short_p = p.as_packet_type()
+                self.query_details.setdefault(short_p,
+                                              []).append(tuple(p.extra))
+                self.ngrams.setdefault(key, []).append(short_p)
+                key = key[1:] + (short_p,)
+
+        self.cumulative_duration += cum_duration
+        # add in the end
+        self.ngrams.setdefault(key, []).append(NON_PACKET)
+
+    def save(self, f):
+        ngrams = {}
+        for k, v in self.ngrams.iteritems():
+            k = '\t'.join(k)
+            ngrams[k] = dict(Counter(v))
+
+        query_details = {}
+        for k, v in self.query_details.iteritems():
+            query_details[k] = dict(Counter('\t'.join(x) if x else '-'
+                                            for x in v))
+
+        d = {
+            'ngrams': ngrams,
+            'query_details': query_details,
+            'cumulative_duration': self.cumulative_duration,
+            'conversation_rate': self.conversation_rate,
+        }
+        d['dns'] = self.dns_opcounts
+
+        if isinstance(f, str):
+            f = open(f, 'w')
+
+        json.dump(d, f, indent=2)
+
+    def load(self, f):
+        if isinstance(f, str):
+            f = open(f)
+
+        d = json.load(f)
+
+        for k, v in d['ngrams'].iteritems():
+            k = tuple(str(k).split('\t'))
+            values = self.ngrams.setdefault(k, [])
+            for p, count in v.iteritems():
+                values.extend([str(p)] * count)
+
+        for k, v in d['query_details'].iteritems():
+            values = self.query_details.setdefault(str(k), [])
+            for p, count in v.iteritems():
+                if p == '-':
+                    values.extend([()] * count)
+                else:
+                    values.extend([tuple(str(p).split('\t'))] * count)
+
+        if 'dns' in d:
+            for k, v in d['dns'].items():
+                self.dns_opcounts[k] += v
+
+        self.cumulative_duration = d['cumulative_duration']
+        self.conversation_rate = d['conversation_rate']
+
+    def construct_conversation(self, timestamp=0.0, client=2, server=1,
+                               hard_stop=None, packet_rate=1):
+        c = Conversation(timestamp, (server, client))
+
+        key = (NON_PACKET,) * (self.n - 1)
+
+        while key in self.ngrams:
+            p = random.choice(self.ngrams.get(key, NON_PACKET))
+            if p == NON_PACKET:
+                break
+            if p in self.query_details:
+                extra = random.choice(self.query_details[p])
+            else:
+                extra = []
+
+            protocol, opcode = p.split(':', 1)
+            if protocol == 'wait':
+                log_wait_time = int(opcode) + random.random()
+                wait = math.exp(log_wait_time) / (WAIT_SCALE * packet_rate)
+                timestamp += wait
+            else:
+                log_wait = random.uniform(*NO_WAIT_LOG_TIME_RANGE)
+                wait = math.exp(log_wait) / packet_rate
+                timestamp += wait
+                if hard_stop is not None and timestamp > hard_stop:
+                    break
+                c.add_short_packet(timestamp, p, extra)
+
+            key = key[1:] + (p,)
+
+        return c
+
+    def generate_conversations(self, rate, duration, packet_rate=1):
+        n = 1 + int(rate * self.conversation_rate[0] * duration /
+                    self.conversation_rate[1])
+        server = 1
+        client = 2
+
+        conversations = []
+        while client < n + 2:
+            start = random.uniform(0, duration - 0.5)
+            c = self.construct_conversation(start,
+                                            client,
+                                            server,
+                                            hard_stop=duration,
+                                            packet_rate=packet_rate)
+            if len(c) == 0:
+                continue
+            conversations.append(c)
+            client += 1
+        conversations.sort(key=lambda c: c.start_time)
+        return conversations
+
+IP_PROTOCOLS = {
+    'dns': '11',
+    'rpc_netlogon': '06',
+    'kerberos': '06',      # ratio 16248:258
+    'smb': '06',
+    'smb2': '06',
+    'ldap': '06',
+    'cldap': '11',
+    'lsarpc': '06',
+    'samr': '06',
+    'dcerpc': '06',
+    'epm': '06',
+    'drsuapi': '06',
+    'browser': '11',
+    'smb_netlogon': '11',
+    'srvsvc': '06',
+    'nbns': '11',
+}
+
+OP_DESCRIPTIONS = {
+    ('browser', '0x01'): 'Host Announcement (0x01)',
+    ('browser', '0x02'): 'Request Announcement (0x02)',
+    ('browser', '0x08'): 'Browser Election Request (0x08)',
+    ('browser', '0x09'): 'Get Backup List Request (0x09)',
+    ('browser', '0x0c'): 'Domain/Workgroup Announcement (0x0c)',
+    ('browser', '0x0f'): 'Local Master Announcement (0x0f)',
+    ('cldap', '3'): 'searchRequest',
+    ('cldap', '5'): 'searchResDone',
+    ('dcerpc', '0'): 'Request',
+    ('dcerpc', '11'): 'Bind',
+    ('dcerpc', '12'): 'Bind_ack',
+    ('dcerpc', '13'): 'Bind_nak',
+    ('dcerpc', '14'): 'Alter_context',
+    ('dcerpc', '15'): 'Alter_context_resp',
+    ('dcerpc', '16'): 'AUTH3',
+    ('dcerpc', '2'): 'Response',
+    ('dns', '0'): 'query',
+    ('dns', '1'): 'response',
+    ('drsuapi', '0'): 'DsBind',
+    ('drsuapi', '12'): 'DsCrackNames',
+    ('drsuapi', '13'): 'DsWriteAccountSpn',
+    ('drsuapi', '1'): 'DsUnbind',
+    ('drsuapi', '2'): 'DsReplicaSync',
+    ('drsuapi', '3'): 'DsGetNCChanges',
+    ('drsuapi', '4'): 'DsReplicaUpdateRefs',
+    ('epm', '3'): 'Map',
+    ('kerberos', ''): '',
+    ('ldap', '0'): 'bindRequest',
+    ('ldap', '1'): 'bindResponse',
+    ('ldap', '2'): 'unbindRequest',
+    ('ldap', '3'): 'searchRequest',
+    ('ldap', '4'): 'searchResEntry',
+    ('ldap', '5'): 'searchResDone',
+    ('ldap', ''): '*** Unknown ***',
+    ('lsarpc', '14'): 'lsa_LookupNames',
+    ('lsarpc', '15'): 'lsa_LookupSids',
+    ('lsarpc', '39'): 'lsa_QueryTrustedDomainInfoBySid',
+    ('lsarpc', '40'): 'lsa_SetTrustedDomainInfo',
+    ('lsarpc', '6'): 'lsa_OpenPolicy',
+    ('lsarpc', '76'): 'lsa_LookupSids3',
+    ('lsarpc', '77'): 'lsa_LookupNames4',
+    ('nbns', '0'): 'query',
+    ('nbns', '1'): 'response',
+    ('rpc_netlogon', '21'): 'NetrLogonDummyRoutine1',
+    ('rpc_netlogon', '26'): 'NetrServerAuthenticate3',
+    ('rpc_netlogon', '29'): 'NetrLogonGetDomainInfo',
+    ('rpc_netlogon', '30'): 'NetrServerPasswordSet2',
+    ('rpc_netlogon', '39'): 'NetrLogonSamLogonEx',
+    ('rpc_netlogon', '40'): 'DsrEnumerateDomainTrusts',
+    ('rpc_netlogon', '45'): 'NetrLogonSamLogonWithFlags',
+    ('rpc_netlogon', '4'): 'NetrServerReqChallenge',
+    ('samr', '0'): 'Connect',
+    ('samr', '16'): 'GetAliasMembership',
+    ('samr', '17'): 'LookupNames',
+    ('samr', '18'): 'LookupRids',
+    ('samr', '19'): 'OpenGroup',
+    ('samr', '1'): 'Close',
+    ('samr', '25'): 'QueryGroupMember',
+    ('samr', '34'): 'OpenUser',
+    ('samr', '36'): 'QueryUserInfo',
+    ('samr', '39'): 'GetGroupsForUser',
+    ('samr', '3'): 'QuerySecurity',
+    ('samr', '5'): 'LookupDomain',
+    ('samr', '64'): 'Connect5',
+    ('samr', '6'): 'EnumDomains',
+    ('samr', '7'): 'OpenDomain',
+    ('samr', '8'): 'QueryDomainInfo',
+    ('smb', '0x04'): 'Close (0x04)',
+    ('smb', '0x24'): 'Locking AndX (0x24)',
+    ('smb', '0x2e'): 'Read AndX (0x2e)',
+    ('smb', '0x32'): 'Trans2 (0x32)',
+    ('smb', '0x71'): 'Tree Disconnect (0x71)',
+    ('smb', '0x72'): 'Negotiate Protocol (0x72)',
+    ('smb', '0x73'): 'Session Setup AndX (0x73)',
+    ('smb', '0x74'): 'Logoff AndX (0x74)',
+    ('smb', '0x75'): 'Tree Connect AndX (0x75)',
+    ('smb', '0xa2'): 'NT Create AndX (0xa2)',
+    ('smb2', '0'): 'NegotiateProtocol',
+    ('smb2', '11'): 'Ioctl',
+    ('smb2', '14'): 'Find',
+    ('smb2', '16'): 'GetInfo',
+    ('smb2', '18'): 'Break',
+    ('smb2', '1'): 'SessionSetup',
+    ('smb2', '2'): 'SessionLogoff',
+    ('smb2', '3'): 'TreeConnect',
+    ('smb2', '4'): 'TreeDisconnect',
+    ('smb2', '5'): 'Create',
+    ('smb2', '6'): 'Close',
+    ('smb2', '8'): 'Read',
+    ('smb_netlogon', '0x12'): 'SAM LOGON request from client (0x12)',
+    ('smb_netlogon', '0x17'): ('SAM Active Directory Response - '
+                               'user unknown (0x17)'),
+    ('srvsvc', '16'): 'NetShareGetInfo',
+    ('srvsvc', '21'): 'NetSrvGetInfo',
+}
+
+
+def expand_short_packet(p, timestamp, src, dest, extra):
+    protocol, opcode = p.split(':', 1)
+    desc = OP_DESCRIPTIONS.get((protocol, opcode), '')
+    ip_protocol = IP_PROTOCOLS.get(protocol, '06')
+
+    line = [timestamp, ip_protocol, '', src, dest, protocol, opcode, desc]
+    line.extend(extra)
+    return '\t'.join(line)
+
+
+def replay(conversations,
+           host=None,
+           creds=None,
+           lp=None,
+           accounts=None,
+           dns_rate=0,
+           duration=None,
+           **kwargs):
+
+    context = ReplayContext(server=host,
+                            creds=creds,
+                            lp=lp,
+                            **kwargs)
+
+    if len(accounts) < len(conversations):
+        print >> sys.stderr, ("we have %d accounts but %d conversations" %
+                              (len(accounts), len(conversations)))
+
+    cstack = zip(sorted(conversations,
+                        key=lambda x: x.start_time, reverse=True),
+                 accounts)
+
+    start = time.time()
+
+    if duration is None:
+        # end 1 second after the last packet of the last conversation
+        # to start. Conversations other than the last could still be
+        # going, but we don't care.
+        duration = cstack[0][0].packets[-1].timestamp + 1.0
+        print >>sys.stderr, "We will stop after %.1f seconds" % duration
+
+    end = start + duration
+
+    print("Replaying traffic for %u conversations over %d seconds"
+          % (len(conversations), duration))
+
+    children = {}
+    if dns_rate:
+        dns_hammer = DnsHammer(dns_rate, duration)
+        cstack.append((dns_hammer, None))
+
+    try:
+        while True:
+            # we spawn a batch, wait for finishers, then spawn another
+            now = time.time()
+            batch_end = min(now + 2.0, end)
+            fork_time = 0.0
+            fork_n = 0
+            while cstack:
+                c, account = cstack.pop()
+                if c.start_time + start > batch_end:
+                    cstack.append((c, account))
+                    break
+
+                st = time.time()
+                pid = c.replay_in_fork_with_delay(start, context, account)
+                children[pid] = c
+                t = time.time()
+                elapsed = t - st
+                fork_time += elapsed
+                fork_n += 1
+                print >>sys.stderr, "forked %s in pid %s (in %fs)" % (c, pid,
+                                                                      elapsed)
+
+            if fork_n:
+                print >>sys.stderr, ("forked %d times in %f seconds (avg %f)" %
+                                     (fork_n, fork_time, fork_time / fork_n))
+            elif cstack:
+                debug(2, "no forks in batch ending %f" % batch_end)
+
+            while time.time() < batch_end - 1.0:
+                time.sleep(0.01)
+                try:
+                    pid, status = os.waitpid(-1, os.WNOHANG)
+                except OSError as e:
+                    if e.errno != 10:  # no child processes
+                        raise
+                    break
+                if pid:
+                    c = children.pop(pid, None)
+                    print >>sys.stderr, ("process %d finished conversation %s;"
+                                         " %d to go" %
+                                         (pid, c, len(children)))
+
+            if time.time() >= end:
+                print >>sys.stderr, "time to stop"
+                break
+
+    except Exception:
+        print >>sys.stderr, "EXCEPTION in parent"
+        traceback.print_exc()
+    finally:
+        for s in (15, 15, 9):
+            print >>sys.stderr, ("killing %d children with -%d" %
+                                 (len(children), s))
+            for pid in children:
+                try:
+                    os.kill(pid, s)
+                except OSError as e:
+                    if e.errno != 3:  # don't fail if it has already died
+                        raise
+            time.sleep(0.5)
+            end = time.time() + 1
+            while children:
+                try:
+                    pid, status = os.waitpid(-1, os.WNOHANG)
+                except OSError as e:
+                    if e.errno != 10:
+                        raise
+                if pid != 0:
+                    c = children.pop(pid, None)
+                    print >>sys.stderr, ("kill -%d %d KILLED conversation %s; "
+                                         "%d to go" %
+                                         (s, pid, c, len(children)))
+                if time.time() >= end:
+                    break
+
+            if not children:
+                break
+            time.sleep(1)
+
+        if children:
+            print >>sys.stderr, "%d children are missing" % len(children)
+
+        # there may be stragglers that were forked just as ^C was hit
+        # and don't appear in the list of children. We can get them
+        # with killpg, but that will also kill us, so this is^H^H would be
+        # goodbye, except we cheat and pretend to use ^C (SIGINT),
+        # so as not to have to fuss around writing signal handlers.
+        try:
+            os.killpg(0, 2)
+        except KeyboardInterrupt:
+            print >>sys.stderr, "ignoring fake ^C"
+
+
+def openLdb(host, creds, lp):
+    session = system_session()
+    ldb = SamDB(url="ldap://%s" % host,
+                session_info=session,
+                credentials=creds,
+                lp=lp)
+    return ldb
+
+
+def ou_name(ldb, instance_id):
+    return "ou=instance-%d,ou=traffic_replay,%s" % (instance_id,
+                                                    ldb.domain_dn())
+
+
+def create_ou(ldb, instance_id):
+    ou = ou_name(ldb, instance_id)
+    try:
+        ldb.add({"dn":          ou.split(',', 1)[1],
+                 "objectclass": "organizationalunit"})
+    except LdbError as e:
+        (status, _) = e
+        # ignore already exists
+        if status != 68:
+            raise
+    try:
+        ldb.add({"dn":          ou,
+                 "objectclass": "organizationalunit"})
+    except LdbError as e:
+        (status, _) = e
+        # ignore already exists
+        if status != 68:
+            raise
+    return ou
+
+
+class ConversationAccounts(object):
+    def __init__(self, netbios_name, machinepass, username, userpass):
+        self.netbios_name = netbios_name
+        self.machinepass  = machinepass
+        self.username     = username
+        self.userpass     = userpass
+
+
+def generate_replay_accounts(ldb, instance_id, number, password):
+
+    generate_traffic_accounts(ldb, instance_id, number, password)
+    accounts = []
+    for i in range(1, number + 1):
+        netbios_name = "STGM-%d-%d" % (instance_id, i)
+        username     = "STGU-%d-%d" % (instance_id, i)
+
+        account = ConversationAccounts(netbios_name, password, username,
+                                       password)
+        accounts.append(account)
+    return accounts
+
+
+def generate_traffic_accounts(ldb, instance_id, number, password):
+    print >>sys.stderr, ("Generating machine and conversation accounts, "
+                         "as required for %d conversations" % number)
+    added = 0
+    for i in range(number, 0, -1):
+        try:
+            netbios_name = "STGM-%d-%d" % (instance_id, i)
+            create_machine_account(ldb, instance_id, netbios_name, password)
+            added += 1
+        except LdbError as e:
+            (status, _) = e
+            if status == 68:
+                break
+            else:
+                raise
+    if added > 0:
+        print >>sys.stderr, "Added %d new machine accounts" % added
+
+    added = 0
+    for i in range(number, 0, -1):
+        try:
+            username = "STGU-%d-%d" % (instance_id, i)
+            create_user_account(ldb, instance_id, username, password)
+            added += 1
+        except LdbError as e:
+            (status, _) = e
+            if status == 68:
+                break
+            else:
+                raise
+
+    if added > 0:
+        print >>sys.stderr, "Added %d new user accounts" % added
+
+
+def create_machine_account(ldb, instance_id, netbios_name, machinepass):
+    ou = ou_name(ldb, instance_id)
+    dn = "cn=%s,%s" % (netbios_name, ou)
+    utf16pw = unicode(
+        '"' + machinepass.encode('utf-8') + '"', 'utf-8'
+    ).encode('utf-16-le')
+    start = time.time()
+    ldb.add({
+        "dn": dn,
+        "objectclass": "computer",
+        "sAMAccountName": "%s$" % netbios_name,
+        "userAccountControl":
+        str(UF_WORKSTATION_TRUST_ACCOUNT | UF_PASSWD_NOTREQD),
+        "unicodePwd": utf16pw})
+    end = time.time()
+    duration = end - start
+    print("%f\t0\tcreate\tmachine\t%f\tTrue\t" % (end, duration))
+
+
+def create_user_account(ldb, instance_id, username, userpass):
+    ou = ou_name(ldb, instance_id)
+    user_dn = "cn=%s,%s" % (username, ou)
+    utf16pw = unicode(
+        '"' + userpass.encode('utf-8') + '"', 'utf-8'
+    ).encode('utf-16-le')
+    start = time.time()
+    ldb.add({
+        "dn": user_dn,
+        "objectclass": "user",
+        "sAMAccountName": username,
+        "userAccountControl": str(UF_NORMAL_ACCOUNT),
+        "unicodePwd": utf16pw
+    })
+    end = time.time()
+    duration = end - start
+    print("%f\t0\tcreate\tuser\t%f\tTrue\t" % (end, duration))
+
+
+def create_group(ldb, instance_id, name):
+    ou = ou_name(ldb, instance_id)
+    dn = "cn=%s,%s" % (name, ou)
+    start = time.time()
+    ldb.add({
+        "dn": dn,
+        "objectclass": "group",
+    })
+    end = time.time()
+    duration = end - start
+    print("%f\t0\tcreate\tgroup\t%f\tTrue\t" % (end, duration))
+
+
+def user_name(instance_id, i):
+    return "STGU-%d-%d" % (instance_id, i)
+
+
+def generate_users(ldb, instance_id, number, password):
+    users = 0
+    for i in range(number, 0, -1):
+        try:
+            username = user_name(instance_id, i)
+            create_user_account(ldb, instance_id, username, password)
+            users += 1
+        except LdbError as e:
+            (status, _) = e
+            # Stop if entry exists
+            if status == 68:
+                break
+            else:
+                raise
+
+    return users
+
+
+def group_name(instance_id, i):
+    return "STGG-%d-%d" % (instance_id, i)
+
+
+def generate_groups(ldb, instance_id, number):
+    groups = 0
+    for i in range(number, 0, -1):
+        try:
+            name = group_name(instance_id, i)
+            create_group(ldb, instance_id, name)
+            groups += 1
+        except LdbError as e:
+            (status, _) = e
+            # Stop if entry exists
+            if status == 68:
+                break
+            else:
+                raise
+    return groups
+
+
+def clean_up_accounts(ldb, instance_id):
+    ou = ou_name(ldb, instance_id)
+    try:
+        ldb.delete(ou, ["tree_delete:1"])
+    except LdbError as e:
+        (status, _) = e
+        # ignore does not exist
+        if status != 32:
+            raise
+
+
+def generate_users_and_groups(ldb, instance_id, password,
+                              number_of_users, number_of_groups,
+                              group_memberships):
+    assignments = []
+    groups_added  = 0
+
+    create_ou(ldb, instance_id)
+
+    print >>sys.stderr, "Generating dummy user accounts"
+    users_added = generate_users(ldb, instance_id, number_of_users, password)
+
+    if number_of_groups > 0:
+        print >>sys.stderr, "Generating dummy groups"
+        groups_added = generate_groups(ldb, instance_id, number_of_groups)
+
+    if group_memberships > 0:
+        print >>sys.stderr, "Assigning users to groups"
+        assignments = assign_groups(number_of_groups,
+                                    groups_added,
+                                    number_of_users,
+                                    users_added,
+                                    group_memberships)
+        print >>sys.stderr, "Adding users to groups"
+        add_users_to_groups(ldb, instance_id, assignments)
+
+    if (groups_added > 0 and users_added == 0 and
+       number_of_groups != groups_added):
+        print >>sys.stderr, "Warning: the added groups will contain no members"
+
+    print >>sys.stderr, ("Added %d users, %d groups and %d group memberships" %
+                         (users_added, groups_added, len(assignments)))
+
+
+def assign_groups(number_of_groups,
+                  groups_added,
+                  number_of_users,
+                  users_added,
+                  group_memberships):
+
+    def generate_user_distribution(n):
+        dist = []
+        for x in range(1, n + 1):
+            p = 1 / (x + 0.001)
+            dist.append(p)
+        return dist
+
+    def generate_group_distribution(n):
+        dist = []
+        for x in range(1, n + 1):
+            p = 1 / (x**1.3)
+            dist.append(p)
+        return dist
+
+    assignments = set()
+    if group_memberships <= 0:
+        return assignments
+
+    group_dist = generate_group_distribution(number_of_groups)
+    user_dist  = generate_user_distribution(number_of_users)
+
+    group_memberships = math.ceil(
+        float(group_memberships) *
+        (float(users_added) / float(number_of_users)))
+    existing_users  = number_of_users  - users_added  - 1
+    existing_groups = number_of_groups - groups_added - 1
+    while len(assignments) < group_memberships:
+        user        = random.randint(0, number_of_users - 1)
+        group       = random.randint(0, number_of_groups - 1)
+        probability = group_dist[group] * user_dist[user]
+
+        if ((random.random() < probability * 10000) and
+           (group > existing_groups or user > existing_users)):
+            # the + 1 converts the array index to the corresponding
+            # group or user number
+            assignments.add(((user + 1), (group + 1)))
+
+    return assignments
+
+
+def add_users_to_groups(db, instance_id, assignments):
+    ou = ou_name(db, instance_id)
+
+    def build_dn(name):
+        return("cn=%s,%s" % (name, ou))
+
+    for (user, group) in assignments:
+        user_dn  = build_dn(user_name(instance_id, user))
+        group_dn = build_dn(group_name(instance_id, group))
+
+        m = ldb.Message()
+        m.dn = ldb.Dn(db, group_dn)
+        m["member"] = ldb.MessageElement(user_dn, ldb.FLAG_MOD_ADD, "member")
+        start = time.time()
+        db.modify(m)
+        end = time.time()
+        duration = end - start
+        print("%f\t0\tadd\tuser\t%f\tTrue\t" % (end, duration))
+
+
+def generate_stats(statsdir, timing_file):
+    first      = sys.float_info.max
+    last       = 0
+    successful = 0
+    failed     = 0
+    latencies  = {}
+    failures   = {}
+    unique_converations = set()
+    conversations = 0
+
+    if timing_file is not None:
+        tw = timing_file.write
+    else:
+        def tw(x):
+            pass
+
+    tw("time\tconv\tprotocol\ttype\tduration\tsuccessful\terror\n")
+
+    for filename in os.listdir(statsdir):
+        path = os.path.join(statsdir, filename)
+        with open(path, 'r') as f:
+            for line in f:
+                tw(line)
+                try:
+                    fields       = line.rstrip('\n').split('\t')
+                    conversation = fields[1]
+                    protocol     = fields[2]
+                    packet_type  = fields[3]
+                    latency      = float(fields[4])
+                    first        = min(float(fields[0]) - latency, first)
+                    last         = max(float(fields[0]), last)
+
+                    if protocol not in latencies:
+                        latencies[protocol] = {}
+                    if packet_type not in latencies[protocol]:
+                        latencies[protocol][packet_type] = []
+
+                    latencies[protocol][packet_type].append(latency)
+
+                    if protocol not in failures:
+                        failures[protocol] = {}
+                    if packet_type not in failures[protocol]:
+                        failures[protocol][packet_type] = 0
+
+                    if fields[5] == 'True':
+                        successful += 1
+                    else:
+                        failed += 1
+                        failures[protocol][packet_type] += 1
+
+                    if conversation not in unique_converations:
+                        unique_converations.add(conversation)
+                        conversations += 1
+
+                except (ValueError, IndexError):
+                    # not a valid line: print it and ignore it
+                    print >>sys.stderr, line
+                    pass
+    duration = last - first
+    if successful == 0:
+        success_rate = 0
+    else:
+        success_rate = successful / duration
+    if failed == 0:
+        failure_rate = 0
+    else:
+        failure_rate = failed / duration
+
+    # print the stats in more human-readable format when stdout is going to the
+    # console (as opposed to being redirected to a file)
+    if sys.stdout.isatty():
+        print("Total conversations:   %10d" % conversations)
+        print("Successful operations: %10d (%.3f per second)"
+              % (successful, success_rate))
+        print("Failed operations:     %10d (%.3f per second)"
+              % (failed, failure_rate))
+    else:
+        print("(%d, %d, %d, %.3f, %.3f)" %
+              (conversations, successful, failed, success_rate, failure_rate))
+
+    if sys.stdout.isatty():
+        print("Protocol    Op Code  Description                               "
+              " Count       Failed         Mean       Median          "
+              "95%        Range          Max")
+    else:
+        print("proto\top_code\tdesc\tcount\tfailed\tmean\tmedian\t95%\trange"
+              "\tmax")
+    protocols = sorted(latencies.keys())
+    for protocol in protocols:
+        packet_types = sorted(latencies[protocol])
+        for packet_type in packet_types:
+            values     = latencies[protocol][packet_type]
+            values     = sorted(values)
+            count      = len(values)
+            failed     = failures[protocol][packet_type]
+            mean       = sum(values) / count
+            median     = calc_percentile(values, 0.50)
+            percentile = calc_percentile(values, 0.95)
+            rng        = values[-1] - values[0]
+            maxv       = values[-1]
+            desc       = OP_DESCRIPTIONS.get((protocol, packet_type), '')
+            if sys.stdout.isatty():
+                print("%-12s   %4s  %-35s %12d %12d %12.6f "
+                      "%12.6f %12.6f %12.6f %12.6f"
+                      % (protocol,
+                         packet_type,
+                         desc,
+                         count,
+                         failed,
+                         mean,
+                         median,
+                         percentile,
+                         rng,
+                         maxv))
+            else:
+                print("%s\t%s\t%s\t%d\t%d\t%f\t%f\t%f\t%f\t%f"
+                      % (protocol,
+                         packet_type,
+                         desc,
+                         count,
+                         failed,
+                         mean,
+                         median,
+                         percentile,
+                         rng,
+                         maxv))
+
+
+def calc_percentile(values, percentile):
+    if not values:
+        return 0
+    k = (len(values) - 1) * percentile
+    f = math.floor(k)
+    c = math.ceil(k)
+    if f == c:
+        return values[int(k)]
+    d0 = values[int(f)] * (c - k)
+    d1 = values[int(c)] * (k - f)
+    return d0 + d1
+
+
+def mk_masked_dir(*path):
+    """In a testenv we end up with 0777 directories that look an alarming
+    green colour with ls. Use umask to avoid that."""
+    d = os.path.join(*path)
+    mask = os.umask(0o077)
+    os.mkdir(d)
+    os.umask(mask)
+    return d
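
Reviewer note on calc_percentile() above: the median and 95% columns come from linear interpolation between the two nearest ranks of the sorted latency list. The following is a minimal standalone sketch of that calculation, not part of the patch. The name interpolated_percentile and the sample latencies are made up for illustration, and the sketch is written to run on either Python 2 or Python 3.

```python
import math


def interpolated_percentile(sorted_values, fraction):
    """Percentile by linear interpolation between the nearest ranks.

    Hypothetical stand-in for calc_percentile(); sorted_values must
    already be sorted in ascending order.
    """
    if not sorted_values:
        return 0
    # fractional index into the sorted list
    k = (len(sorted_values) - 1) * fraction
    lower = int(math.floor(k))
    upper = int(math.ceil(k))
    if lower == upper:
        # k landed exactly on an element
        return sorted_values[lower]
    # weight each neighbour by its distance from k
    return (sorted_values[lower] * (upper - k) +
            sorted_values[upper] * (k - lower))


# made-up latencies, in seconds
latencies = sorted([0.004, 0.001, 0.003, 0.002])
median = interpolated_percentile(latencies, 0.50)
p95 = interpolated_percentile(latencies, 0.95)
```

With four samples the median falls halfway between the second and third values, and the 95th percentile sits 85% of the way from the third value to the fourth.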
diff --git a/python/samba/tests/emulate/__init__.py b/python/samba/tests/emulate/__init__.py
new file mode 100644
index 0000000..67c6ff4
--- /dev/null
+++ b/python/samba/tests/emulate/__init__.py
@@ -0,0 +1 @@
+#nothing here yet
diff --git a/python/samba/tests/emulate/traffic.py b/python/samba/tests/emulate/traffic.py
new file mode 100644
index 0000000..7f29737
--- /dev/null
+++ b/python/samba/tests/emulate/traffic.py
@@ -0,0 +1,196 @@
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+from pprint import pprint
+from cStringIO import StringIO
+
+import samba.tests
+
+from samba.emulate import traffic
+
+
+TEST_FILE = 'testdata/traffic-sample-very-short.txt'
+
+class TrafficEmulatorTests(samba.tests.TestCase):
+    def setUp(self):
+        self.model = traffic.TrafficModel()
+
+    def tearDown(self):
+        del self.model
+
+    def test_parse_ngrams_dns_included(self):
+        model = traffic.TrafficModel()
+        f = open(TEST_FILE)
+        (conversations,
+         interval,
+         duration,
+         dns_counts) = traffic.ingest_summaries([f], dns_mode='include')
+        f.close()
+        model.learn(conversations)
+        expected_ngrams = {
+            ('-', '-'): ['dns:0'],
+            ('-', 'dns:0'): ['dns:0'],
+            ('cldap:3', 'cldap:3'): ['cldap:3', 'wait:0'],
+            ('cldap:3', 'wait:0'): ['dcerpc:11'],
+            ('dcerpc:11', 'epm:3'): ['dcerpc:11'],
+            ('dcerpc:11', 'rpc_netlogon:21'): ['epm:3'],
+            ('dcerpc:11', 'rpc_netlogon:4'): ['rpc_netlogon:26'],
+            ('dns:0', 'dns:0'): ['dns:0',
+                                 'dns:0',
+                                 'dns:0',
+                                 'dns:0',
+                                 'dns:0',
+                                 'ldap:3',
+                                 'wait:0'],
+            ('dns:0', 'ldap:3'): ['wait:1'],
+            ('dns:0', 'wait:0'): ['cldap:3'],
+            ('epm:3', 'dcerpc:11'): ['rpc_netlogon:4'],
+            ('epm:3', 'rpc_netlogon:29'): ['kerberos:'],
+            ('kerberos:', 'ldap:3'): ['smb:0x72'],
+            ('ldap:2', 'dns:0'): ['dns:0'],
+            ('ldap:3', 'smb:0x72'): ['-'],
+            ('ldap:3', 'wait:1'): ['ldap:2'],
+            ('rpc_netlogon:21', 'epm:3'): ['rpc_netlogon:29'],
+            ('rpc_netlogon:26', 'dcerpc:11'): ['rpc_netlogon:21'],
+            ('rpc_netlogon:29', 'kerberos:'): ['ldap:3'],
+            ('rpc_netlogon:4', 'rpc_netlogon:26'): ['dcerpc:11'],
+            ('wait:0', 'cldap:3'): ['cldap:3'],
+            ('wait:0', 'dcerpc:11'): ['epm:3'],
+            ('wait:1', 'ldap:2'): ['dns:0']
+        }
+        expected_query_details = {
+            'cldap:3': [('', '', '', 'Netlogon', '', '', ''),
+                        ('', '', '', 'Netlogon', '', '', ''),
+                        ('', '', '', 'Netlogon', '', '', '')],
+            'dcerpc:11': [(), (), ()],
+            'dns:0': [(), (), (), (), (), (), (), (), ()],
+            'epm:3': [(), ()],
+            'kerberos:': [('',)],
+            'ldap:2': [('', '', '', '', '', '', '')],
+            'ldap:3': [('',
+                        '',
+                        '',
+                        ('subschemaSubentry,dsServiceName,namingContexts,'
+                         'defaultNamingContext,schemaNamingContext,'
+                         'configurationNamingContext,rootDomainNamingContext,'
+                         'supportedControl,supportedLDAPVersion,'
+                         'supportedLDAPPolicies,supportedSASLMechanisms,'
+                         'dnsHostName,ldapServiceName,serverName,'
+                         'supportedCapabilities'),
+                        '',
+                        '',
+                        ''),
+                       ('2', 'DC,DC', '', 'cn', '', '', '')],
+            'rpc_netlogon:21': [()],
+            'rpc_netlogon:26': [()],
+            'rpc_netlogon:29': [()],
+            'rpc_netlogon:4': [()],
+            'smb:0x72': [()]
+        }
+        self.maxDiff = 5000
+        ngrams = {k: sorted(v) for k, v in model.ngrams.items()}
+        details = {k: sorted(v) for k, v in model.query_details.items()}
+
+        self.assertEqual(expected_ngrams, ngrams)
+        self.assertEqual(expected_query_details, details)
+        # We use a StringIO instead of a temporary file
+        f = StringIO()
+        model.save(f)
+
+        model2 = traffic.TrafficModel()
+        f.seek(0)
+        model2.load(f)
+
+        self.assertEqual(expected_ngrams, model2.ngrams)
+        self.assertEqual(expected_query_details, model2.query_details)
+
+
+    def test_parse_ngrams(self):
+        f = open(TEST_FILE)
+        (conversations,
+         interval,
+         duration,
+         dns_counts) = traffic.ingest_summaries([f])
+        f.close()
+        self.model.learn(conversations, dns_counts)
+        #print 'ngrams'
+        #pprint(self.model.ngrams, width=50)
+        #print 'query_details'
+        #pprint(self.model.query_details, width=55)
+        expected_ngrams = {
+            ('rpc_netlogon:4', 'rpc_netlogon:26'): ['dcerpc:11'],
+            ('epm:3', 'dcerpc:11'): ['rpc_netlogon:4'],
+            ('-', '-'): ['ldap:3'],
+            ('dcerpc:11', 'epm:3'): ['dcerpc:11'],
+            ('rpc_netlogon:29', 'kerberos:'): ['ldap:3'],
+            ('cldap:3', 'cldap:3'): ['cldap:3', 'wait:0'],
+            ('ldap:3', 'smb:0x72'): ['-'],
+            ('epm:3', 'rpc_netlogon:29'): ['kerberos:'],
+            ('ldap:2', 'cldap:3'): ['cldap:3'],
+            ('kerberos:', 'ldap:3'): ['smb:0x72'],
+            ('ldap:3', 'wait:1'): ['ldap:2'],
+            ('wait:0', 'dcerpc:11'): ['epm:3'],
+            ('dcerpc:11', 'rpc_netlogon:4'): ['rpc_netlogon:26'],
+            ('wait:1', 'ldap:2'): ['cldap:3'],
+            ('cldap:3', 'wait:0'): ['dcerpc:11'],
+            ('rpc_netlogon:26', 'dcerpc:11'): ['rpc_netlogon:21'],
+            ('-', 'ldap:3'): ['wait:1'],
+            ('rpc_netlogon:21', 'epm:3'): ['rpc_netlogon:29'],
+            ('dcerpc:11', 'rpc_netlogon:21'): ['epm:3']
+        }
+
+        expected_query_details = {
+            'cldap:3': [('', '', '', 'Netlogon', '', '', ''),
+                        ('', '', '', 'Netlogon', '', '', ''),
+                        ('', '', '', 'Netlogon', '', '', '')],
+            'dcerpc:11': [(), (), ()],
+            'epm:3': [(), ()],
+            'kerberos:': [('',)],
+            'ldap:2': [('', '', '', '', '', '', '')],
+            'ldap:3': [('',
+                        '',
+                        '',
+                        ('subschemaSubentry,dsServiceName,namingContexts,'
+                         'defaultNamingContext,schemaNamingContext,'
+                         'configurationNamingContext,rootDomainNamingContext,'
+                         'supportedControl,supportedLDAPVersion,'
+                         'supportedLDAPPolicies,supportedSASLMechanisms,'
+                         'dnsHostName,ldapServiceName,serverName,'
+                         'supportedCapabilities'),
+                        '',
+                        '',
+                        ''),
+                       ('2', 'DC,DC', '', 'cn', '', '', '')],
+            'rpc_netlogon:21': [()],
+            'rpc_netlogon:26': [()],
+            'rpc_netlogon:29': [()],
+            'rpc_netlogon:4': [()],
+            'smb:0x72': [()]
+        }
+        self.maxDiff = 5000
+        ngrams = {k: sorted(v) for k, v in self.model.ngrams.items()}
+        details = {k: sorted(v) for k, v in self.model.query_details.items()}
+
+        self.assertEqual(expected_ngrams, ngrams)
+        self.assertEqual(expected_query_details, details)
+        # We use a StringIO instead of a temporary file
+        f = StringIO()
+        self.model.save(f)
+
+        model2 = traffic.TrafficModel()
+        f.seek(0)
+        model2.load(f)
+
+        self.assertEqual(expected_ngrams, model2.ngrams)
+        self.assertEqual(expected_query_details, model2.query_details)
+       
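
Reviewer note on the expected_ngrams tables in the tests above: each key is the pair of packets seen most recently and each value lists the packets observed to follow that pair, with '-' marking the start and end of a conversation. The sketch below shows how such a table could be built. It is an illustration inferred from the expected values in the tests, not the TrafficModel.learn() implementation; learn_trigrams is a hypothetical name, and the real code also derives the wait:N entries and handles DNS, which this sketch leaves out.

```python
from collections import defaultdict


def learn_trigrams(conversations):
    """Map each pair of consecutive packets to the packets seen next.

    Hypothetical helper: conversations is a list of lists of
    'protocol:opcode' strings.
    """
    ngrams = defaultdict(list)
    for packets in conversations:
        # every conversation starts from the ('-', '-') state
        prev2, prev1 = '-', '-'
        # a trailing '-' records where the conversation ended
        for packet in list(packets) + ['-']:
            ngrams[(prev2, prev1)].append(packet)
            prev2, prev1 = prev1, packet
    return dict(ngrams)


example = learn_trigrams([['ldap:3', 'wait:1', 'ldap:2']])
```

Generating traffic from such a table amounts to starting at ('-', '-') and repeatedly picking a random follower of the current pair until '-' comes up.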
diff --git a/script/traffic_learner b/script/traffic_learner
new file mode 100755
index 0000000..6033454
--- /dev/null
+++ b/script/traffic_learner
@@ -0,0 +1,66 @@
+#!/usr/bin/env python
+# Generate a traffic model from a traffic summary file
+#
+# Copyright (C) Catalyst IT Ltd. 2017
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import sys
+import argparse
+from collections import Counter
+import json
+import random
+import math
+
+sys.path.insert(0, "bin/python")
+from samba.emulate import traffic
+
+
+def main():
+    parser = argparse.ArgumentParser(description=__doc__,
+                        formatter_class=argparse.RawDescriptionHelpFormatter)
+    parser.add_argument('-o', '--out', type=argparse.FileType('w'),
+                        help="write model here")
+    parser.add_argument('--dns-mode', choices=['inline', 'count'],
+                        help='how to deal with DNS', default='count')
+    parser.add_argument('SUMMARY_FILE', nargs='*', type=argparse.FileType('r'),
+                        default=[sys.stdin],
+                        help="read from this file (default STDIN)")
+    args = parser.parse_args()
+
+    if not args.out:
+        print >> sys.stderr, "No output file was specified to write the model to."
+        print >> sys.stderr, "Please specify a filename using the --out option."
+        sys.exit(1)
+
+    if args.SUMMARY_FILE == [sys.stdin]:
+        print >> sys.stderr, "reading from STDIN..."
+
+    (conversations,
+     interval,
+     duration,
+     dns_counts) = traffic.ingest_summaries(args.SUMMARY_FILE,
+                                            dns_mode=args.dns_mode)
+
+    model = traffic.TrafficModel()
+    print >> sys.stderr, "learning model"
+    if args.dns_mode == 'count':
+        model.learn(conversations, dns_counts)
+    else:
+        model.learn(conversations)
+
+    model.save(args.out)
+
+main()
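
Reviewer note on the model file that traffic_learner writes: judging only from the start of testdata/traffic-sample-very-short.model later in this patch, the "ngrams" section appears to store each packet pair as a tab-joined key mapping to a count per follower. The sketch below reproduces that shape. It is an assumption based on the sample file, not the TrafficModel.save() code; ngrams_to_json is a hypothetical name, and the real file may well carry further sections that are not shown here.

```python
import json
from collections import Counter


def ngrams_to_json(ngrams):
    """Serialise {(a, b): [follower, ...]} as tab-joined keys with counts.

    Hypothetical helper mirroring the layout seen in the sample model.
    """
    table = {}
    for (prev2, prev1), followers in ngrams.items():
        # JSON object keys must be strings, so join the pair with a tab
        table['\t'.join((prev2, prev1))] = dict(Counter(followers))
    return json.dumps({'ngrams': table}, indent=2, sort_keys=True)


text = ngrams_to_json({
    ('wait:1', 'ldap:2'): ['dns:0'],
    ('dns:0', 'dns:0'): ['dns:0', 'dns:0', 'wait:0'],
})
```

Storing counts rather than repeated list entries keeps the file small when one follower dominates, as DNS does in the captures described in the cover letter.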
diff --git a/script/traffic_replay b/script/traffic_replay
new file mode 100755
index 0000000..883dadc
--- /dev/null
+++ b/script/traffic_replay
@@ -0,0 +1,359 @@
+#!/usr/bin/env python
+# Generates samba network traffic
+#
+# Copyright (C) Catalyst IT Ltd. 2017
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+import sys
+import time
+import os
+import random
+import optparse
+import tempfile
+import shutil
+
+sys.path.insert(0, "bin/python")
+
+from samba.emulate import traffic
+import samba.getopt as options
+
+def main():
+
+    desc = ("Generates network traffic 'conversations' based on <summary-file>"
+            " (which should be the output file produced by either traffic_learner"
+            " or traffic_summary.pl). This traffic is sent to <dns-hostname>, which is"
+            " the full DNS hostname of the DC being tested.")
+
+    parser = optparse.OptionParser("%prog [--help|options] <summary-file> <dns-hostname>",
+                                   description=desc)
+
+    parser.add_option('--dns-rate', type='float', default=0,
+                      help='fire extra DNS packets at this rate')
+    parser.add_option('-B', '--badpassword-frequency',
+                      type='float', default=0.0,
+                      help='frequency of connections with bad passwords')
+    parser.add_option('-K', '--prefer-kerberos',
+                      action="store_true",
+                      help='prefer kerberos when authenticating test users')
+    parser.add_option('-I', '--instance-id', type='int', default=0,
+                      help='Instance number, when running multiple instances')
+    parser.add_option('-t', '--timing-data',
+                      help=('write individual message timing data here '
+                            '(- for stdout)'))
+    parser.add_option('--preserve-tempdir', default=False, action="store_true",
+                      help='do not delete temporary files')
+    parser.add_option('-F', '--fixed-password',
+                      type='string', default=None,
+                      help='Password used for the test users created. Required')
+    parser.add_option('-c', '--clean-up',
+                      action="store_true",
+                      help='Clean up the generated groups and user accounts')
+
+    model_group = optparse.OptionGroup(parser, 'Traffic Model Options',
+                                       'These options alter the traffic '
+                                       'generated when the summary-file is a '
+                                       'traffic-model (produced by '
+                                       'traffic_learner)')
+    model_group.add_option('-S', '--scale-traffic', type='float', default=1.0,
+                           help='Increase the number of conversations by '
+                           'this factor')
+    model_group.add_option('-D', '--duration', type='float', default=None,
+                           help='Run model for this long (approx). Default 60s')
+    model_group.add_option('-r', '--replay-rate', type='float', default=1.0,
+                           help='Replay the traffic faster by this factor')
+    model_group.add_option('--traffic-summary',
+                           help=('Generate a traffic summary file and write '
+                           'it here (- for stdout)'))
+    parser.add_option_group(model_group)
+
+    user_gen_group = optparse.OptionGroup(parser, 'Generate User Options',
+                                         "Add extra user/groups on the DC to "
+                                         "increase the DB size. These extra "
+                                         "users aren't used for traffic "
+                                         "generation.")
+    user_gen_group.add_option('-G', '--generate-users-only',
+                              action="store_true",
+                              help='Generate the users, but do not replay '
+                              'the traffic')
+    user_gen_group.add_option('-n', '--number-of-users', type='int', default=0,
+                              help='Total number of test users to create')
+    user_gen_group.add_option('--number-of-groups', type='int', default=0,
+                              help='Create this many groups')
+    user_gen_group.add_option('--average-groups-per-user', type='int', default=0,
+                              help='Assign the test users to this '
+                              'many groups on average')
+    user_gen_group.add_option('--group-memberships', type='int', default=0,
+                              help='Total memberships to assign across all '
+                              'test users and all groups')
+    parser.add_option_group(user_gen_group)
+
+    sambaopts = options.SambaOptions(parser)
+    parser.add_option_group(sambaopts)
+    parser.add_option_group(options.VersionOptions(parser))
+    credopts = options.CredentialsOptions(parser)
+    parser.add_option_group(credopts)
+
+    # the --no-password credential doesn't make sense for this tool
+    if parser.has_option('-N'):
+        parser.remove_option('-N')
+
+    opts, args = parser.parse_args()
+
+    # First ensure we have reasonable arguments
+
+    if len(args) == 1:
+        summary = None
+        host    = args[0]
+    elif len(args) == 2:
+        summary, host = args
+    else:
+        parser.print_usage()
+        return
+
+    if opts.clean_up:
+        print >>sys.stderr, "Removing user and machine accounts"
+        lp    = sambaopts.get_loadparm()
+        creds = credopts.get_credentials(lp)
+        ldb   = traffic.openLdb(host, creds, lp)
+        traffic.clean_up_accounts(ldb, opts.instance_id)
+        sys.exit(0)
+
+    if summary:
+        if not os.path.exists(summary):
+            print >>sys.stderr, "Summary file %s doesn't exist" % summary
+            sys.exit(1)
+    # the summary-file can be omitted for --generate-users-only and
+    # --clean-up, but it should be specified in all other cases
+    elif not opts.generate_users_only:
+        print >>sys.stderr, ("No summary-file specified to replay traffic from")
+        sys.exit(1)
+
+    if not opts.fixed_password:
+        print >>sys.stderr, ("Please use --fixed-password to specify a password"
+                             " for the users created as part of this test")
+        sys.exit(1)
+
+    lp = sambaopts.get_loadparm()
+    creds = credopts.get_credentials(lp)
+
+    domain = opts.workgroup
+    if domain:
+        lp.set("workgroup", domain)
+    else:
+        domain = lp.get("workgroup")
+        if domain == "WORKGROUP":
+            print >>sys.stderr, ("NETBIOS domain does not appear to be specified, "
+                                 "use the --workgroup option")
+            sys.exit(1)
+
+    if not opts.realm and not lp.get('realm'):
+        print >>sys.stderr, "Realm not specified, use the --realm option"
+        sys.exit(1)
+
+    if opts.generate_users_only and not (opts.number_of_users or
+                                         opts.number_of_groups):
+        print >>sys.stderr, ("Please specify the number of users and/or groups "
+                             "to generate.")
+        sys.exit(1)
+
+    if opts.group_memberships and opts.average_groups_per_user:
+        print >>sys.stderr, ("--group-memberships and --average-groups-per-user"
+                             " are incompatible options - use one or the other")
+        sys.exit(1)
+
+    if not opts.number_of_groups and opts.average_groups_per_user:
+        print >>sys.stderr, ("--average-groups-per-user requires "
+                             "--number-of-groups")
+        sys.exit(1)
+
+    if not opts.number_of_groups and opts.group_memberships:
+        print >>sys.stderr, "--group-memberships requires --number-of-groups"
+        sys.exit(1)
+
+    if opts.timing_data not in ('-', None):
+        try:
+            open(opts.timing_data, 'w').close()
+        except IOError as e:
+            print >> sys.stderr, ("the supplied timing data destination "
+                                  "(%s) is not writable" % opts.timing_data)
+            print >> sys.stderr, e
+            sys.exit(1)
+
+    if opts.traffic_summary not in ('-', None):
+        try:
+            open(opts.traffic_summary, 'w').close()
+        except IOError as e:
+            print >> sys.stderr, ("the supplied traffic summary destination "
+                                  "(%s) is not writable" % opts.traffic_summary)
+            print >> sys.stderr, e
+            sys.exit(1)
+
+    traffic.DEBUG_LEVEL = opts.debuglevel
+
+    duration = opts.duration
+    if duration is None:
+        duration = 60.0
+
+    # ingest the model or traffic summary
+    if summary:
+        try:
+            conversations, interval, duration, dns_counts = \
+                                            traffic.ingest_summaries([summary])
+
+            print >>sys.stderr, ("Using conversations from the traffic summary "
+                                 "file specified")
+
+            # honour the specified duration if it's different to the capture duration
+            if opts.duration is not None:
+                duration = opts.duration
+
+        except ValueError as e:
+            if not e.message.startswith('need more than'):
+                raise
+
+            model = traffic.TrafficModel()
+
+            try:
+                model.load(summary)
+            except ValueError:
+                print >>sys.stderr, ("Could not parse %s. The summary file should "
+                                     "be the output from either the "
+                                     "traffic_summary.pl or traffic_learner script"
+                                     % summary)
+                sys.exit(1)
+
+            print >>sys.stderr, ("Using the specified model file to "
+                                 "generate conversations")
+
+            conversations = model.generate_conversations(opts.scale_traffic,
+                                                         duration,
+                                                         opts.replay_rate)
+
+    else:
+        conversations = []
+
+    if opts.debuglevel > 5:
+        for c in conversations:
+            for p in c.packets:
+                print "    ", p
+
+        print '=' * 72
+
+    if opts.number_of_users and opts.number_of_users < len(conversations):
+        print >>sys.stderr, ("--number-of-users (%d) is less than the "
+                             "number of conversations to replay (%d)"
+                             %(opts.number_of_users, len(conversations)))
+        sys.exit(1)
+
+    number_of_users = max(opts.number_of_users, len(conversations))
+    max_memberships = number_of_users * opts.number_of_groups
+
+    if not opts.group_memberships and opts.average_groups_per_user:
+        opts.group_memberships = opts.average_groups_per_user * number_of_users
+        print >>sys.stderr, ("Using %d group-memberships based on %u average "
+                             "memberships for %d users"
+                             %(opts.group_memberships,
+                               opts.average_groups_per_user, number_of_users))
+
+    if opts.group_memberships > max_memberships:
+        print >>sys.stderr, ("The group memberships specified (%d) exceed "
+                             "the total users (%d) * total groups (%d)"
+                             %(opts.group_memberships, number_of_users,
+                              opts.number_of_groups))
+        sys.exit(1)
+
+    try:
+        ldb = traffic.openLdb(host, creds, lp)
+    except Exception:
+        print >>sys.stderr, ("\nInitial LDAP connection failed! Did you supply "
+                             "a DNS host name and the correct credentials?")
+        sys.exit(1)
+
+    if opts.generate_users_only:
+        traffic.generate_users_and_groups(ldb,
+                                          opts.instance_id,
+                                          opts.fixed_password,
+                                          opts.number_of_users,
+                                          opts.number_of_groups,
+                                          opts.group_memberships)
+        sys.exit()
+
+    tempdir = tempfile.mkdtemp(prefix="samba_tg_")
+    print >>sys.stderr, "Using temp dir %s" % tempdir
+
+    traffic.generate_users_and_groups(ldb,
+                                      opts.instance_id,
+                                      opts.fixed_password,
+                                      number_of_users,
+                                      opts.number_of_groups,
+                                      opts.group_memberships)
+
+    accounts = traffic.generate_replay_accounts(ldb,
+                                                opts.instance_id,
+                                                len(conversations),
+                                                opts.fixed_password)
+
+    statsdir = traffic.mk_masked_dir(tempdir, 'stats')
+
+
+    if opts.traffic_summary:
+        if opts.traffic_summary == '-':
+            summary_dest = sys.stdout
+        else:
+            summary_dest = open(opts.traffic_summary, 'w')
+
+        print >>sys.stderr, "Writing traffic summary"
+        summaries = []
+        for c in conversations:
+            summaries += c.replay_as_summary_lines()
+
+        summaries.sort()
+        for (_timestamp, line) in summaries:
+            print >>summary_dest, line
+
+        sys.exit(0)
+
+    traffic.replay(conversations, host,
+                   lp=lp,
+                   creds=creds,
+                   accounts=accounts,
+                   dns_rate=opts.dns_rate,
+                   duration=duration,
+                   badpassword_frequency=opts.badpassword_frequency,
+                   prefer_kerberos=opts.prefer_kerberos,
+                   statsdir=statsdir,
+                   domain=domain,
+                   base_dn=ldb.domain_dn(),
+                   ou=traffic.ou_name(ldb, opts.instance_id),
+                   tempdir=tempdir,
+                   domain_sid=ldb.get_domain_sid())
+
+
+
+    if opts.timing_data == '-':
+        timing_dest = sys.stdout
+    elif opts.timing_data is None:
+        timing_dest = None
+    else:
+        timing_dest = open(opts.timing_data, 'w')
+
+    print >>sys.stderr, "Generating statistics"
+    traffic.generate_stats(statsdir, timing_dest)
+
+    if not opts.preserve_tempdir:
+        print >>sys.stderr, "Removing temporary directory"
+        shutil.rmtree(tempdir)
+
+main()
diff --git a/selftest/tests.py b/selftest/tests.py
index 175b56c..882797f 100644
--- a/selftest/tests.py
+++ b/selftest/tests.py
@@ -131,6 +131,7 @@ planpythontestsuite("none", "samba.tests.kcc.graph")
 planpythontestsuite("none", "samba.tests.kcc.graph_utils")
 planpythontestsuite("none", "samba.tests.kcc.kcc_utils")
 planpythontestsuite("none", "samba.tests.kcc.ldif_import_export")
+planpythontestsuite("none", "samba.tests.emulate.traffic")
 plantestsuite("wafsamba.duplicate_symbols", "none", [os.path.join(srcdir(), "buildtools/wafsamba/test_duplicate_symbol.sh")])
 plantestsuite(
     "script.traffic_summary", "none",
diff --git a/testdata/traffic-sample-very-short.model b/testdata/traffic-sample-very-short.model
new file mode 100644
index 0000000..ff6a380
--- /dev/null
+++ b/testdata/traffic-sample-very-short.model
@@ -0,0 +1,115 @@
+{
+  "ngrams": {
+    "rpc_netlogon:21\tepm:3": {
+      "rpc_netlogon:29": 1
+    }, 
+    "wait:1\tldap:2": {
+      "dns:0": 1
+    }, 
+    "dns:0\tdns:0": {
+      "dns:0": 5, 
+      "ldap:3": 1, 
+      "wait:0": 1
+    }, 
+    "dns:0\tldap:3": {
+      "wait:1": 1
+    }, 
+    "kerberos:\tldap:3": {
+      "smb:0x72": 1
+    }, 
+    "dcerpc:11\tepm:3": {
+      "dcerpc:11": 1
+    }, 
+    "wait:0\tcldap:3": {
+      "cldap:3": 1
+    }, 
+    "-\tdns:0": {
+      "dns:0": 1
+    }, 
+    "wait:0\tdcerpc:11": {
+      "epm:3": 1
+    }, 
+    "ldap:3\tsmb:0x72": {
+      "-": 1
+    }, 
+    "rpc_netlogon:29\tkerberos:": {
+      "ldap:3": 1
+    }, 
+    "cldap:3\twait:0": {
+      "dcerpc:11": 1
+    }, 
+    "epm:3\trpc_netlogon:29": {
+      "kerberos:": 1
+    }, 
+    "-\t-": {
+      "dns:0": 1
+    }, 
+    "epm:3\tdcerpc:11": {
+      "rpc_netlogon:4": 1
+    }, 
+    "ldap:3\twait:1": {
+      "ldap:2": 1
+    }, 
+    "cldap:3\tcldap:3": {
+      "cldap:3": 1, 
+      "wait:0": 1
+    }, 
+    "rpc_netlogon:4\trpc_netlogon:26": {
+      "dcerpc:11": 1
+    }, 
+    "dcerpc:11\trpc_netlogon:4": {
+      "rpc_netlogon:26": 1
+    }, 
+    "rpc_netlogon:26\tdcerpc:11": {
+      "rpc_netlogon:21": 1
+    }, 
+    "ldap:2\tdns:0": {
+      "dns:0": 1
+    }, 
+    "dns:0\twait:0": {
+      "cldap:3": 1
+    }, 
+    "dcerpc:11\trpc_netlogon:21": {
+      "epm:3": 1
+    }
+  }, 
+  "query_details": {
+    "rpc_netlogon:29": {
+      "-": 1
+    }, 
+    "epm:3": {
+      "-": 2
+    }, 
+    "cldap:3": {
+      "\t\t\tNetlogon\t\t\t": 3
+    }, 
+    "dcerpc:11": {
+      "-": 3
+    }, 
+    "rpc_netlogon:26": {
+      "-": 1
+    }, 
+    "dns:0": {
+      "-": 9
+    }, 
+    "rpc_netlogon:21": {
+      "-": 1
+    }, 
+    "ldap:2": {
+      "\t\t\t\t\t\t": 1
+    }, 
+    "smb:0x72": {
+      "-": 1
+    }, 
+    "kerberos:": {
+      "": 1
+    }, 
+    "ldap:3": {
+      "\t\t\tsubschemaSubentry,dsServiceName,namingContexts,defaultNamingContext,schemaNamingContext,configurationNamingContext,rootDomainNamingContext,supportedControl,supportedLDAPVersion,supportedLDAPPolicies,supportedSASLMechanisms,dnsHostName,ldapServiceName,serverName,supportedCapabilities\t\t\t": 1, 
+      "2\tDC,DC\t\tcn\t\t\t": 1
+    }, 
+    "rpc_netlogon:4": {
+      "-": 1
+    }
+  }
+}
\ No newline at end of file
diff --git a/testdata/traffic-sample-very-short.txt b/testdata/traffic-sample-very-short.txt
new file mode 100644
index 0000000..ae766f1
--- /dev/null
+++ b/testdata/traffic-sample-very-short.txt
@@ -0,0 +1,50 @@
+1487921562.592126000	11		3	1	dns	0	query
+1487921562.592285000	11		1	4	dns	0	query
+1487921562.592636000	11		4	1	dns	1	response
+1487921562.592911000	11		1	3	dns	1	response
+1487921562.593315000	06	3	5	1	ldap	3	searchRequest	2	DC,DC		cn			
+1487921562.596247000	11		3	1	dns	0	query
+1487921562.596362000	11		1	4	dns	0	query
+1487921562.596697000	11		4	1	dns	1	response
+1487921562.596921000	11		1	3	dns	1	response
+1487921562.598308000	11		3	1	dns	0	query
+1487921562.598414000	11		1	4	dns	0	query
+1487921562.598729000	11		4	1	dns	1	response
+1487921562.598963000	11		1	3	dns	1	response
+1487921562.607624000	11		6	1	dns	0	query
+1487921562.607956000	11		6	1	dns	0	query
+1487921562.608009000	11		1	6	dns	1	response
+1487921562.608232000	11		1	6	dns	1	response
+1487921562.612424000	11		6	1	dns	0	query
+1487921562.612648000	11		1	6	dns	1	response
+1487921562.720442000	11		6	1	cldap	3	searchRequest				Netlogon			
+1487921562.720706000	11		6	1	cldap	3	searchRequest				Netlogon			
+1487921562.721004000	11		6	1	cldap	3	searchRequest				Netlogon			
+1487921562.724801000	11		1	6	cldap	5	searchResDone							
+1487921562.728632000	11		1	6	cldap	5	searchResDone							
+1487921562.732508000	11		1	6	cldap	5	searchResDone							
+1487921562.748004000	06	3	1	5	ldap	5	searchResDone							
+1487921562.820387000	06	3	5	1	ldap	2	unbindRequest							
+1487921562.831445000	06	14	6	1	dcerpc	11	Bind
+1487921562.831565000	06	14	1	6	dcerpc	12	Bind_ack
+1487921562.831776000	06	14	6	1	epm	3	Map
+1487921562.832483000	06	14	1	6	epm	3	Map
+1487921562.833521000	06	15	6	1	dcerpc	11	Bind
+1487921562.833775000	06	15	1	6	dcerpc	12	Bind_ack
+1487921562.833955000	06	15	6	1	rpc_netlogon	4	NetrServerReqChallenge
+1487921562.834039000	06	15	1	6	rpc_netlogon	4	NetrServerReqChallenge
+1487921562.834325000	06	15	6	1	rpc_netlogon	26	NetrServerAuthenticate3
+1487921562.834895000	06	15	1	6	rpc_netlogon	26	NetrServerAuthenticate3
+1487921562.835515000	06	16	6	1	dcerpc	11	Bind
+1487921562.836417000	06	16	1	6	dcerpc	12	Bind_ack
+1487921562.836694000	06	16	6	1	rpc_netlogon	21	NetrLogonDummyRoutine1
+1487921562.836917000	06	16	1	6	rpc_netlogon	21	NetrLogonDummyRoutine1
+1487921562.852041000	06	14	6	1	epm	3	Map
+1487921562.852687000	06	14	1	6	epm	3	Map
+1487921562.876310000	06	16	6	1	rpc_netlogon	29	NetrLogonGetDomainInfo
+1487921562.880868000	06	18	6	1	kerberos			
+1487921562.881074000	06	16	1	6	rpc_netlogon	29	NetrLogonGetDomainInfo
+1487921562.884476000	06	19	6	1	ldap	3	searchRequest				subschemaSubentry,dsServiceName,namingContexts,defaultNamingContext,schemaNamingContext,configurationNamingContext,rootDomainNamingContext,supportedControl,supportedLDAPVersion,supportedLDAPPolicies,supportedSASLMechanisms,dnsHostName,ldapServiceName,serverName,supportedCapabilities			
+1487921562.885803000	06	18	1	6	kerberos			
+1487921562.892086000	06	19	1	6	ldap	5	searchResDone							
+1487921562.916946000	06	20	6	1	smb	0x72	Negotiate Protocol (0x72)
-- 
1.9.1
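
A note on the model file added above, for reviewers: each key under "ngrams" is the previous two packet labels joined by a tab, and the value maps each possible next label to the number of times it was seen. The sketch below shows how such a table can drive a weighted random walk. It is only an illustration, not the code in traffic.py: the function names are hypothetical, and it assumes (from the sample data) that "-" pads the start of a conversation and marks its end.

```python
import random


def next_packet(ngrams, prev2, prev1, rng):
    """Pick the next label, weighted by the counts stored in the model."""
    counts = ngrams.get("%s\t%s" % (prev2, prev1))
    if not counts:
        return "-"  # unknown context: treat as end of conversation
    labels = sorted(counts)
    pick = rng.randrange(sum(counts[l] for l in labels))
    for label in labels:
        pick -= counts[label]
        if pick < 0:
            return label


def generate(ngrams, rng, max_len=50):
    """Walk the model from the start marker until "-" or max_len."""
    prev2, prev1 = "-", "-"
    out = []
    while len(out) < max_len:
        nxt = next_packet(ngrams, prev2, prev1, rng)
        if nxt == "-":
            break
        out.append(nxt)
        prev2, prev1 = prev1, nxt
    return out


# A few entries copied from testdata/traffic-sample-very-short.model
SAMPLE_NGRAMS = {
    "-\t-": {"dns:0": 1},
    "-\tdns:0": {"dns:0": 1},
    "dns:0\tdns:0": {"dns:0": 5, "ldap:3": 1, "wait:0": 1},
    "dns:0\tldap:3": {"wait:1": 1},
    "ldap:3\twait:1": {"ldap:2": 1},
    "wait:1\tldap:2": {"dns:0": 1},
    "ldap:2\tdns:0": {"dns:0": 1},
}

conversation = generate(SAMPLE_NGRAMS, random.Random(0), max_len=20)
```

With this subset every walk starts with two dns:0 packets, because the two start contexts each have a single successor; after that the choice depends on the random seed.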


From aa1e956685f75a0c823107ec058e75ce74c1b3f6 Mon Sep 17 00:00:00 2001
From: Tim Beale <timbeale at catalyst.net.nz>
Date: Wed, 28 Jun 2017 12:50:13 +1200
Subject: [PATCH 5/7] manpages/traffic: Add man page for traffic_replay script

This is a first cut at a man page for the new tool. Some options are
still a bit up in the air, so not everything has been documented.

Signed-off-by: Tim Beale <timbeale at catalyst.net.nz>
---
 docs-xml/manpages/traffic_replay.7.xml | 553 +++++++++++++++++++++++++++++++++
 docs-xml/wscript_build                 |   1 +
 2 files changed, 554 insertions(+)
 create mode 100644 docs-xml/manpages/traffic_replay.7.xml

diff --git a/docs-xml/manpages/traffic_replay.7.xml b/docs-xml/manpages/traffic_replay.7.xml
new file mode 100644
index 0000000..3c9a7d4
--- /dev/null
+++ b/docs-xml/manpages/traffic_replay.7.xml
@@ -0,0 +1,553 @@
+<?xml version="1.0" encoding="iso-8859-1"?>
+<!DOCTYPE refentry PUBLIC "-//Samba-Team//DTD DocBook V4.2-Based Variant V1.0//EN" "http://www.samba.org/samba/DTD/samba-doc">
+<refentry id="traffic_replay.7">
+
+<refmeta>
+	<refentrytitle>traffic_replay</refentrytitle>
+	<manvolnum>7</manvolnum>
+	<refmiscinfo class="source">Samba</refmiscinfo>
+	<refmiscinfo class="manual">Miscellanea</refmiscinfo>
+	<refmiscinfo class="version">4.7</refmiscinfo>
+</refmeta>
+
+
+<refnamediv>
+	<refname>traffic_replay</refname>
+	<refpurpose>Samba traffic generation tool.
+	</refpurpose>
+</refnamediv>
+
+<refsynopsisdiv>
+	<cmdsynopsis>
+		<command>traffic_replay</command>
+		<arg choice="opt">summary-file</arg>
+		<arg choice="opt">dns-hostname</arg>
+		<arg choice="opt">-h, --help</arg>
+		<arg choice="opt">-F, --fixed-password &lt;test-password&gt;</arg>
+		<arg choice="opt">-S, --scale-traffic &lt;scale by factor&gt;</arg>
+		<arg choice="opt">-r, --replay-rate &lt;scale by factor&gt;</arg>
+		<arg choice="opt">-D, --duration &lt;seconds&gt;</arg>
+		<arg choice="opt">--traffic-summary &lt;output file&gt;</arg>
+		<arg choice="opt">-I, --instance-id &lt;id&gt;</arg>
+		<arg choice="opt">-K, --prefer-kerberos</arg>
+		<arg choice="opt">-B, --badpassword-frequency &lt;frequency&gt;</arg>
+		<arg choice="opt">--dns-rate &lt;rate&gt;</arg>
+		<arg choice="opt">-t, --timing-data &lt;file&gt;</arg>
+		<arg choice="opt">-U, --username &lt;user&gt;</arg>
+		<arg choice="opt">--password &lt;password&gt;</arg>
+		<arg choice="opt">-W, --workgroup &lt;workgroup&gt;</arg>
+		<arg choice="opt">--realm &lt;realm&gt;</arg>
+		<arg choice="opt">-s, --config-file &lt;file&gt;</arg>
+		<arg choice="opt">-k, --kerberos &lt;kerberos&gt;</arg>
+		<arg choice="opt">--ipaddress &lt;address&gt;</arg>
+		<arg choice="opt">-P, --machine-pass</arg>
+		<arg choice="opt">--option &lt;option&gt;</arg>
+		<arg choice="opt">-d, --debuglevel &lt;debug level&gt;</arg>
+		<arg choice="opt">-V, --version</arg>
+	</cmdsynopsis>
+
+	<cmdsynopsis>
+		<command>traffic_replay</command>
+		<arg choice="opt">dns-hostname</arg>
+		<arg choice="opt">-G, --generate-users-only</arg>
+		<arg choice="opt">-F, --fixed-password &lt;test-password&gt;</arg>
+		<arg choice="opt">-n, --number-of-users &lt;total users&gt;</arg>
+		<arg choice="opt">--number-of-groups &lt;total groups&gt;</arg>
+		<arg choice="opt">--average-groups-per-user &lt;average number&gt;</arg>
+		<arg choice="opt">--group-memberships &lt;total memberships&gt;</arg>
+	</cmdsynopsis>
+
+	<cmdsynopsis>
+		<command>traffic_replay</command>
+		<arg choice="opt">dns-hostname</arg>
+		<arg choice="opt">-c|--clean-up</arg>
+	</cmdsynopsis>
+</refsynopsisdiv>
+
+<refsect1>
+	<title>DESCRIPTION</title>
+	<para>This tool is part of the <citerefentry><refentrytitle>samba</refentrytitle>
+	<manvolnum>7</manvolnum></citerefentry> suite.</para>
+	<para>This tool generates traffic in order to measure the performance
+	of a Samba DC, and to test how well Samba will scale as a network
+	increases in size. It can simulate multiple different hosts making
+	multiple different types of requests to a DC.</para>
+
+	<para>This tool is intended to run against a dedicated test DC (rather
+	than a live DC that is handling real network traffic).</para>
+
+	<para>Note that a side-effect of running this tool is that user
+	accounts will be created on the DC, in order to test various Samba
+	operations. As creating accounts can be very time-consuming, these
+	users will remain on the DC by default. To remove these accounts, use
+	the --clean-up option.
+	</para>
+</refsect1>
+
+<refsect1>
+	<title>OPTIONS</title>
+
+	<variablelist>
+
+	<varlistentry>
+	<term>-h|--help</term>
+	<listitem><para>
+	Print a summary of command line options.
+	</para></listitem>
+	</varlistentry>
+
+	<varlistentry>
+	<term>summary-file</term>
+	<listitem><para>
+	File containing the network traffic to replay. This should either be
+	a traffic-summary (generated by <command>traffic_summary.pl</command>)
+	or a traffic-model (generated by <command>traffic_learner</command>).
+	Based on this file, this tool will generate 'conversations' which
+	represent Samba activity between a network host and the DC.
+	</para></listitem>
+	</varlistentry>
+
+	<varlistentry>
+	<term>dns-hostname</term>
+	<listitem><para>
+	The full DNS hostname of the DC that's being tested. The Samba activity
+	in the summary-file will be replicated and directed at this DC. It's
+	recommended that you use a dedicated DC for testing and don't try to run
+	this tool against a DC that's processing live network traffic.
+	</para></listitem>
+	</varlistentry>
+
+	<varlistentry>
+	<term>-F|--fixed-password &lt;test-password&gt;</term>
+	<listitem><para>
+	Test users are created when this tool is run, so that actual Samba
+	activity, such as authenticating users, can be mimicked. This option
+	specifies the password that will be used for any test users that are
+	created.</para>
+
+	<para>Note that any users created by this tool will remain on the DC
+	until you run the --clean-up option. Therefore, the fixed-password
+	option needs to be the same each time the tool is run, otherwise the
+	test users won't authenticate correctly.
+	</para></listitem>
+	</varlistentry>
+
+	<varlistentry>
+	<term>Traffic Model Options</term>
+	<listitem><para>
+	When the summary-file is a traffic-model (produced by
+	<command>traffic_learner</command>), use these options to alter the
+	traffic that gets generated.</para>
+	<itemizedlist>
+		<varlistentry>
+		<term>-D|--duration &lt;seconds&gt;</term>
+		<listitem><para>
+		Specifies the approximate duration in seconds to generate
+		traffic for. The default is 60 seconds.
+		</para></listitem>
+		</varlistentry>
+
+		<varlistentry>
+		<term>-r|--replay-rate &lt;factor&gt;</term>
+		<listitem><para>
+		Replays the traffic faster by this factor. This option won't
+		affect the number of conversations (which is based on the
+		traffic model), but the rate at which the packets are sent will
+		be increased.
+		</para></listitem>
+		</varlistentry>
+
+		<varlistentry>
+		<term>-S|--scale-traffic &lt;factor&gt;</term>
+		<listitem><para>
+		Increases the number of conversations by this factor. This
+		option won't affect the rate at which packets get sent (which
+		is still based on the traffic model), but it will mean more
+		conversations get replayed.
+		</para></listitem>
+		</varlistentry>
+
+		<varlistentry>
+		<term>--traffic-summary &lt;output-file&gt;</term>
+		<listitem><para>
+		Instead of replaying a traffic-model, this option generates a
+		traffic-summary file based on what traffic would be sent. Using
+		a traffic-model allows you to scale the packet rate and number
+		of packets sent. However, using a traffic-model introduces
+		some randomness into the traffic generation. So running the
+		same traffic_replay command multiple times using a model file
+		may result in some differences in the actual traffic sent.
+		However, running the same traffic_replay command multiple times
+		with a traffic-summary file will always result in the same
+		traffic being sent. </para>
+		<para>
+		For taking performance measurements over several test runs,
+		it's recommended to use this option and replay the traffic from
+		a traffic-summary file.
+		</para></listitem>
+		</varlistentry>
+	</itemizedlist>
+	</listitem>
+	</varlistentry>
+
+	<varlistentry>
+	<term>--generate-users-only</term>
+	<listitem><para>Add extra user/groups on the DC to increase the DB
+	size. By default, this tool automatically creates test users that map
+	to the traffic conversations being generated. This option allows extra
+	users to be created on top of this. Note that these extra users may
+	not actually be used for traffic generation - the traffic generation is
+	still based on the number of conversations from the model/summary file.
+	</para>
+	
+	<para>
+	Generating a large number of users can take a long time, so this
+	option allows this to be done only once.</para>
+
+	<para>Note that the users created will remain on the DC until the
+	tool is run with the --clean-up option. This means that it is best to
+	only assign group memberships once, i.e. run --clean-up before
+	assigning a different allocation of group memberships.</para>
+	<itemizedlist>
+
+		<varlistentry>
+		<term>-n|--number-of-users &lt;total-users&gt;</term>
+		<listitem><para>
+		Specifies the total number of test users to create (excluding
+		any machine accounts required for the traffic). Note that these
+		extra users simply populate the DC's DB - the actual user
+		traffic generated is still based on the summary-file.
+		</para></listitem>
+		</varlistentry>
+
+		<varlistentry>
+		<term>--number-of-groups &lt;total-groups&gt;</term>
+		<listitem><para>
+		Creates the specified number of groups, for assigning the test
+		users to. Note that users are not automatically assigned to
+		groups - use either --average-groups-per-user or
+		--group-memberships to do this.
+		</para></listitem>
+		</varlistentry>
+
+		<varlistentry>
+		<term>--average-groups-per-user &lt;average-groups&gt;</term>
+		<listitem><para>
+		Randomly assigns the test users to the test groups created.
+		The group memberships are distributed so that the average
+		number of groups that a user is a member of matches this number.
+		Some users will belong to more groups and some users will
+		belong to fewer groups. This option is incompatible with
+		the --group-memberships option.
+		</para></listitem>
+		</varlistentry>
+
+		<varlistentry>
+		<term>--group-memberships &lt;total-memberships&gt;</term>
+		<listitem><para>
+		Randomly assigns the test users to the test groups created.
+		The group memberships are distributed so that the total
+		number of group memberships, across all users, matches
+		this number. For example, with 100 users and 10 groups,
+		--group-memberships=300 would assign a user to 3 groups
+		on average. Some users will belong to more groups and some
+		users will belong to fewer groups, but the total of all
+		member linked attributes would be 300. This option is
+		incompatible with the --average-groups-per-user option.
+		</para></listitem>
+		</varlistentry>
+	</itemizedlist>
+	</listitem>
+	</varlistentry>
+
+	<varlistentry>
+	<term>--clean-up</term>
+	<listitem><para>
+	Cleans up any users and groups that were created by previously running
+	this tool. It is recommended you always clean up after running the tool.
+	</para></listitem>
+	</varlistentry>
+
+	<varlistentry>
+	<term>-I|--instance-id &lt;id&gt;</term>
+	<listitem><para>
+	Use this option to run multiple instances of the tool on the same DC at
+	the same time. This adds a prefix to the test users generated to keep
+	them separate on the DC.
+	</para></listitem>
+	</varlistentry>
+
+	<varlistentry>
+	<term>-K|--prefer-kerberos</term>
+	<listitem><para>
+	Use Kerberos to authenticate the test users.
+	</para></listitem>
+	</varlistentry>
+
+	<varlistentry>
+	<term>-B|--badpassword-frequency &lt;frequency&gt;</term>
+	<listitem><para>
+	Use this option to simulate users trying to authenticate with an
+	incorrect password.
+	</para></listitem>
+	</varlistentry>
+
+	<varlistentry>
+	<term>--dns-rate &lt;rate&gt;</term>
+	<listitem><para>
+	Increase the rate at which DNS packets get sent.
+	</para></listitem>
+	</varlistentry>
+
+	<varlistentry>
+	<term>-t|--timing-data &lt;file&gt;</term>
+	<listitem><para>
+	This writes extra timing data to the file specified. This is mostly
+	used for reporting options, such as generating graphs.
+	</para></listitem>
+	</varlistentry>
+
+	<varlistentry>
+	<term>Samba Common Options</term>
+	<listitem>
+	<itemizedlist>
+		&stdarg.client.debug;
+		&stdarg.configfile;
+		&stdarg.option;
+		<varlistentry>
+		<term>--realm=REALM</term>
+		<listitem><para>
+		Set the realm name
+		</para></listitem>
+		</varlistentry>
+		&stdarg.version;
+	</itemizedlist>
+	</listitem>
+	</varlistentry>
+
+	<varlistentry>
+	<term>Credential Options</term>
+	<listitem>
+	<itemizedlist>
+		<varlistentry>
+		<term>--simple-bind-dn=DN</term>
+		<listitem><para>
+		DN to use for a simple bind
+		</para></listitem>
+		</varlistentry>
+
+		<varlistentry>
+		<term>--password=PASSWORD</term>
+		<listitem><para>
+		Password
+		</para></listitem>
+		</varlistentry>
+
+		<varlistentry>
+		<term>-U USERNAME|--username=USERNAME</term>
+		<listitem><para>
+		Username
+		</para></listitem>
+		</varlistentry>
+
+		<varlistentry>
+		<term>-W WORKGROUP|--workgroup=WORKGROUP</term>
+		<listitem><para>
+		Workgroup
+		</para></listitem>
+		</varlistentry>
+
+		&stdarg.kerberos;
+
+		<varlistentry>
+		<term>--ipaddress=IPADDRESS</term>
+		<listitem><para>
+		IP address of the server
+		</para></listitem>
+		</varlistentry>
+
+		&stdarg.machinepass;
+	</itemizedlist>
+	</listitem>
+	</varlistentry>
+
+	</variablelist>
+</refsect1>
+
+<refsect1>
+<title>OPERATIONS</title>
+
+<refsect2>
+	<title>Generating a traffic-summary file</title>
+	<para>To use this tool, you need either a traffic-summary file or a
+	traffic-model file. To generate either of these files, you will need a
+	packet capture of actual Samba activity on your network.</para>
+
+	<para>Use wireshark to take a packet capture on your network of the
+	traffic you want to generate. For example, if you want to simulate lots
+	of users logging on, then take a capture at 8:30am when users are
+	logging in.</para>
+
+	<para>Next, you need to convert your packet capture into a traffic
+	summary file, using <command>traffic_summary.pl</command>. Basically
+	this removes any sensitive information from the capture and summarizes
+	what type of packet was sent and when.</para>
+
+	<para>Refer to <command>traffic_summary.pl --help</command> for more
+	details, but the basic command will look something like:</para>
+
+	<para><command>tshark -r capture.pcapng -T pdml |
+	traffic_summary.pl > traffic-summary.txt</command></para>
+</refsect2>
+
+<refsect2>
+	<title>Replaying a traffic-summary file</title>
+	<para>Once you have a traffic-summary file, you can use it to generate
+	traffic. The traffic_replay tool gets passed the traffic-summary file,
+	along with the full DNS hostname of the DC being tested. You also need
+	to provide some user credentials, and possibly the Samba realm and
+	workgroup (although the realm and workgroup may be determined
+	automatically, for example from the /etc/smb.conf file, if one is
+	present). E.g.</para>
+
+	<para><command>traffic_replay traffic-summary.txt
+	my-dc.samdom.example.com -UAdmin%password -W samdom
+	--realm=samdom.example.com --fixed-password=blahblah123!</command>
+	</para>
+
+	<para>This simply regenerates Samba activity seen in the traffic
+	summary. The traffic is grouped into 'conversations' between a host and
+	the DC. A user and machine account is created on the DC for each
+	conversation, in order to allow logon and other operations to succeed.
+	The script generates the same types of packets as those seen in the
+	summary.</para>
+
+	<para>Creating users can be quite a time-consuming process, especially
+	if a lot of conversations are being generated. To save time, the test
+	users remain on the DC by default. You will need to run the --clean-up
+	option to remove them, once you have finished generating traffic.
+	Because the same test users are used across multiple runs of the tool,
+	a consistent password for these users needs to be used - this is
+	specified by the --fixed-password option.
+	</para>
+
+	<para>The benefit of this tool over simply using tcpreplay is that the
+	traffic generated is independent of any specific network. No setup is
+	needed beforehand on the test DC. The traffic no longer contains
+	sensitive details, so the traffic summary could be potentially shared
+	with other Samba developers.</para>
+
+	<para>However, replaying a traffic-summary directly is somewhat limited
+	in what you can actually do. A more flexible approach is to generate
+	the traffic using a model file.</para>
+</refsect2>
+
+<refsect2>
+	<title>Generating a traffic-model file</title>
+	<para>To create a traffic-model file, simply pass the traffic-summary
+	file to the <command>traffic_learner</command> script. E.g.</para>
+
+	<para><command>traffic_learner traffic-summary.txt
+	-o traffic-model.txt</command></para>
+
+	<para>This generates a model of the Samba activity in your network.
+	This model-file can now be used to generate traffic.</para>
+</refsect2>
+
+<refsect2>
+	<title>Replaying the traffic-model file</title>
+	<para>Packet generation using a traffic-model file uses the same
+	command as a traffic-summary file, e.g.</para>
+
+	<para><command>traffic_replay traffic-model.txt
+	my-dc.samdom.example.com -UAdmin%password</command>
+	</para>
+
+	<para>By default, this will generate 60 seconds worth of traffic based
+	on the model. You can specify longer using the --duration parameter.
+	</para>
+
+	<para>The traffic generated is an approximation of what was seen in
+	the network capture. The traffic generation involves some randomness,
+	so running the same command multiple times may result in slightly
+	different traffic being generated.</para>
+
+	<para>As well as changing how long the model runs for, you can also
+	change how many conversations get generated and how fast the traffic
+	gets replayed. To roughly double the number of conversations that get
+	replayed, use --scale-traffic=2 or to approximately halve the number
+	use --scale-traffic=0.5. To approximately double how quickly the
+	conversations get replayed, use --replay-rate=2, or to halve this use
+	--replay-rate=0.5.</para>
+
+	<para>For example, to generate approximately 10 times the amount of
+	traffic seen over a two-minute period (based on the network capture),
+	use:</para>
+
+	<para><command>traffic_replay traffic-model.txt
+	my-dc.samdom.example.com -UAdmin%password --fixed-password=blahblah123!
+	--scale-traffic=10 --duration=120</command></para>
+</refsect2>
+
+<refsect2>
+	<title>Scaling the number of users</title>
+	<para>The performance of a Samba DC running a small subset of test
+	users will be different to a fully-populated Samba DC running in a
+	network. As the number of users increases, the size of the DB
+	increases, and a very large DB will perform worse than a smaller DB.
+	</para>
+
+	<para>To increase the size of the Samba DB, this tool can also create
+	extra users and groups. These extra users are basically 'filler' for
+	the DB. They won't actually be used to generate traffic, but they may
+	slow down authentication of the test users.</para>
+
+	<para>For example, to populate the DB with an extra 5000 users (note
+	this will take a while), use the command:</para>
+
+	<para><command>traffic_replay my-dc.samdom.example.com
+	-UAdmin%password --generate-users-only --fixed-password=blahblah123!
+	--number-of-users=5000</command></para>
+
+	<para>You can also create groups and assign users to groups. The users
+	can be randomly assigned to groups - this includes any extra users
+	created as well as the users that map to conversations. Use either
+	--average-groups-per-user or --group-memberships to specify how many
+	group memberships should be assigned to the test users.</para>
+
+	<para>For example, to assign the users in the replayed conversations
+	into 10 groups on average, use a command like:</para>
+
+	<para><command>traffic_replay traffic-model.txt my-dc.samdom.example.com
+	-UAdmin%password --fixed-password=blahblah123!
+	--generate-users-only --number-of-groups=25 --average-groups-per-user=10
+	</command></para>
+
+	<para>The users created by the test will have names like STGU-0-xyz.
+	The groups generated have names like STGG-0-xyz.</para>
+</refsect2>
+</refsect1>
+
+
+<refsect1>
+	<title>VERSION</title>
+
+	<para>This man page is complete for version 4 of the Samba
+	suite.</para>
+</refsect1>
+
+<refsect1>
+	<title>AUTHOR</title>
+
+	<para>The original Samba software and related utilities
+	were created by Andrew Tridgell. Samba is now developed
+	by the Samba Team as an Open Source project similar
+	to the way the Linux kernel is developed.</para>
+
+	<para>The traffic_replay tool was developed by the Samba team at
+	Catalyst IT Ltd.</para>
+
+	<para>The traffic_replay manpage was written by Tim Beale.</para>
+</refsect1>
+
+</refentry>
diff --git a/docs-xml/wscript_build b/docs-xml/wscript_build
index cbc09a5..841740c 100644
--- a/docs-xml/wscript_build
+++ b/docs-xml/wscript_build
@@ -46,6 +46,7 @@ manpages='''
          manpages/smbtar.1
          manpages/smbtree.1
          manpages/testparm.1
+         manpages/traffic_replay.7
          manpages/vfs_acl_tdb.8
          manpages/vfs_acl_xattr.8
          manpages/vfs_aio_fork.8
-- 
1.9.1
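
To make the --group-memberships arithmetic in the man page above concrete (100 users, 10 groups, 300 memberships, so 3 groups per user on average), here is a small sketch. It assumes memberships are drawn uniformly from all possible user/group pairs; that is an assumption for illustration only and is not necessarily how traffic.generate_users_and_groups() distributes them. Both function names are hypothetical.

```python
import random


def assign_memberships(n_users, n_groups, total, rng):
    """Return `total` distinct (user, group) index pairs, chosen uniformly."""
    if total > n_users * n_groups:
        raise ValueError("more memberships requested than user/group pairs")
    # Each integer in [0, n_users * n_groups) encodes one (user, group) pair.
    picks = rng.sample(range(n_users * n_groups), total)
    return sorted(divmod(p, n_groups) for p in picks)


def groups_per_user(memberships, n_users):
    """Count how many groups each user index belongs to."""
    counts = [0] * n_users
    for user, _group in memberships:
        counts[user] += 1
    return counts


memberships = assign_memberships(100, 10, 300, random.Random(0))
counts = groups_per_user(memberships, 100)
average = sum(counts) / 100.0
```

The total is always exactly 300 and the average exactly 3.0, while individual users end up in more or fewer groups, which is the behaviour the man page describes.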


From 2d12ce5e459900b3c5e4aa0d1e97943138e21ebd Mon Sep 17 00:00:00 2001
From: Tim Beale <timbeale at catalyst.net.nz>
Date: Thu, 29 Jun 2017 11:07:47 +1200
Subject: [PATCH 6/7] manpages/traffic: Add man page for traffic_learner script

Signed-off-by: Tim Beale <timbeale at catalyst.net.nz>
---
 docs-xml/manpages/traffic_learner.7.xml | 139 ++++++++++++++++++++++++++++++++
 docs-xml/manpages/traffic_replay.7.xml  |  11 ++-
 docs-xml/wscript_build                  |   1 +
 3 files changed, 150 insertions(+), 1 deletion(-)
 create mode 100644 docs-xml/manpages/traffic_learner.7.xml

diff --git a/docs-xml/manpages/traffic_learner.7.xml b/docs-xml/manpages/traffic_learner.7.xml
new file mode 100644
index 0000000..35e39d3
--- /dev/null
+++ b/docs-xml/manpages/traffic_learner.7.xml
@@ -0,0 +1,139 @@
+<?xml version="1.0" encoding="iso-8859-1"?>
+<!DOCTYPE refentry PUBLIC "-//Samba-Team//DTD DocBook V4.2-Based Variant V1.0//EN" "http://www.samba.org/samba/DTD/samba-doc">
+<refentry id="traffic_learner.7">
+
+<refmeta>
+	<refentrytitle>traffic_learner</refentrytitle>
+	<manvolnum>7</manvolnum>
+	<refmiscinfo class="source">Samba</refmiscinfo>
+	<refmiscinfo class="manual">Miscellanea</refmiscinfo>
+	<refmiscinfo class="version">4.7</refmiscinfo>
+</refmeta>
+
+
+<refnamediv>
+	<refname>traffic_learner</refname>
+	<refpurpose>Samba tool to assist with traffic generation.
+	</refpurpose>
+</refnamediv>
+
+<refsynopsisdiv>
+	<cmdsynopsis>
+		<command>traffic_learner</command>
+		<arg choice="opt">-h</arg>
+		<arg choice="opt">SUMMARY_FILE</arg>
+		<arg choice="opt">SUMMARY_FILE ...</arg>
+		<arg choice="opt">-o OUTPUT_FILE ...</arg>
+		<arg choice="opt">--dns-mode {inline|count}</arg>
+	</cmdsynopsis>
+</refsynopsisdiv>
+
+<refsect1>
+	<title>DESCRIPTION</title>
+	<para>This tool is part of the <citerefentry><refentrytitle>samba</refentrytitle>
+	<manvolnum>7</manvolnum></citerefentry> suite.</para>
+
+	<para>This tool assists with generation of Samba traffic.
+	It takes a traffic-summary file (produced by
+	<command>traffic_summary.pl</command>) as input and produces a
+	traffic-model file that can be used by <command>traffic_replay</command>
+	for traffic generation.</para>
+
+	<para>The model file summarizes the types of traffic ('conversations'
+	between a host and a Samba DC) that occur on a network. The model file
+	describes the traffic in a way that allows it to be scaled so that
+	either more (or fewer) packets get sent, and the packets can be sent at
+	a faster (or slower) rate than that seen in the network.</para>
+</refsect1>
+
+<refsect1>
+	<title>OPTIONS</title>
+
+	<variablelist>
+
+	<varlistentry>
+	<term>-h|--help</term>
+	<listitem><para>
+	Print a summary of command line options.
+	</para></listitem>
+	</varlistentry>
+
+	<varlistentry>
+	<term>SUMMARY_FILE</term>
+	<listitem><para>
+	File containing a network traffic-summary. The traffic-summary file
+	should be generated by <command>traffic_summary.pl</command> from a
+	packet capture of actual network traffic.
+	More than one file can be specified, in which case the traffic will
+	be combined into a single traffic-model. If no SUMMARY_FILE is
+	specified, this tool will read the traffic-summary from STDIN, i.e.
+	you can pipe the output from traffic_summary.pl directly to this tool.
+	</para></listitem>
+	</varlistentry>
+
+	<varlistentry>
+	<term>-o|--out OUTPUT_FILE</term>
+	<listitem><para>
+	The traffic-model that is produced will be written to this file. The
+	OUTPUT_FILE can then be passed to <command>traffic_replay</command>
+	to generate (and manipulate) Samba network traffic.
+	</para></listitem>
+	</varlistentry>
+
+	<varlistentry>
+	<term>--dns-mode {inline|count}</term>
+	<listitem><para>
+	How DNS traffic should be handled by the model.
+	</para></listitem>
+	</varlistentry>
+
+	</variablelist>
+</refsect1>
+
+<refsect1>
+	<title>EXAMPLES</title>
+
+	<para>To take a traffic-summary file and produce a traffic-model
+	file, use:</para>
+
+	<para><command>traffic_learner traffic-summary.txt
+	-o traffic-model.txt</command></para>
+
+	<para>To generate a traffic-model from a packet capture, you can
+	pipe the traffic summary to STDIN using:</para>
+
+	<para><command>tshark -r capture.pcapng -T pdml |
+	traffic_summary.pl | traffic_learner -o traffic-model.txt</command></para>
+</refsect1>
+
+<refsect1>
+	<title>VERSION</title>
+
+	<para>This man page is complete for version 4 of the Samba
+	suite.</para>
+</refsect1>
+
+<refsect1>
+	<title>SEE ALSO</title>
+	<para>
+	<citerefentry>
+	<refentrytitle>traffic_replay</refentrytitle><manvolnum>7</manvolnum>
+	</citerefentry>.
+	</para>
+</refsect1>
+
+<refsect1>
+	<title>AUTHOR</title>
+
+	<para>The original Samba software and related utilities
+	were created by Andrew Tridgell. Samba is now developed
+	by the Samba Team as an Open Source project similar
+	to the way the Linux kernel is developed.</para>
+
+	<para>The traffic_learner tool was developed by the Samba team at
+	Catalyst IT Ltd.</para>
+
+	<para>The traffic_learner manpage was written by Tim Beale.</para>
+</refsect1>
+
+</refentry>
diff --git a/docs-xml/manpages/traffic_replay.7.xml b/docs-xml/manpages/traffic_replay.7.xml
index 3c9a7d4..c5ac1b7 100644
--- a/docs-xml/manpages/traffic_replay.7.xml
+++ b/docs-xml/manpages/traffic_replay.7.xml
@@ -537,6 +537,15 @@
 </refsect1>
 
 <refsect1>
+	<title>SEE ALSO</title>
+	<para>
+	<citerefentry>
+	<refentrytitle>traffic_learner</refentrytitle><manvolnum>7</manvolnum>
+	</citerefentry>.
+	</para>
+</refsect1>
+
+<refsect1>
 	<title>AUTHOR</title>
 
 	<para>The original Samba software and related utilities
@@ -544,7 +553,7 @@
 	by the Samba Team as an Open Source project similar
 	to the way the Linux kernel is developed.</para>
 
-	<para>The tcp_replay tool was developed by the Samba team at
+	<para>The traffic_replay tool was developed by the Samba team at
 	Catalyst IT Ltd.</para>
 
 	<para>The traffic_replay manpage was written by Tim Beale.</para>
diff --git a/docs-xml/wscript_build b/docs-xml/wscript_build
index 841740c..e329ad4 100644
--- a/docs-xml/wscript_build
+++ b/docs-xml/wscript_build
@@ -47,6 +47,7 @@ manpages='''
          manpages/smbtree.1
          manpages/testparm.1
          manpages/traffic_replay.7
+         manpages/traffic_learner.7
          manpages/vfs_acl_tdb.8
          manpages/vfs_acl_xattr.8
          manpages/vfs_aio_fork.8
-- 
1.9.1
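
Reviewer aid for the man page above: a minimal Python sketch of how
several traffic summaries could be combined into one set of counts, in
the spirit of the (Markov) model described in the cover letter. This is
not the traffic_learner implementation. The function names and the
packet names "x", "y", "a", "b" are hypothetical, and each conversation
is assumed to be already parsed into an ordered list of packet names.

```python
from collections import Counter, defaultdict


def learn_ngrams(conversations):
    """Count which packet follows each pair of successive packets.

    `conversations` is an iterable of packet-name lists. Conversations
    from several summary files can simply be chained together, which is
    one way the inputs could be merged into a single model.
    """
    ngrams = defaultdict(Counter)
    for packets in conversations:
        # A continuation is only counted once two earlier packets exist.
        for i in range(2, len(packets)):
            context = (packets[i - 2], packets[i - 1])
            ngrams[context][packets[i]] += 1
    return ngrams


def to_model(ngrams):
    """Flatten (x, y) tuple keys into tab-separated string keys."""
    return {"\t".join(context): dict(counts)
            for context, counts in ngrams.items()}


# Two hypothetical summary files, already parsed into conversations.
summary_1 = [["x", "y", "a"], ["x", "y", "b"]]
summary_2 = [["x", "y", "a"], ["x", "y", "b"], ["x", "y", "b"]]

model = to_model(learn_ngrams(summary_1 + summary_2))
```

With these inputs, the pair (x, y) is followed by "a" twice and by "b"
three times, so the single entry in `model` carries those two counts.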


From 1482a4ad96e886b7b6764248f21b431d2bcfc0f5 Mon Sep 17 00:00:00 2001
From: Garming Sam <garming at catalyst.net.nz>
Date: Thu, 29 Jun 2017 12:57:56 +1200
Subject: [PATCH 7/7] manpages/traffic: Add some description of the output file
 format

Signed-off-by: Garming Sam <garming at catalyst.net.nz>
---
 docs-xml/manpages/traffic_learner.7.xml | 36 +++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/docs-xml/manpages/traffic_learner.7.xml b/docs-xml/manpages/traffic_learner.7.xml
index 35e39d3..c906fae 100644
--- a/docs-xml/manpages/traffic_learner.7.xml
+++ b/docs-xml/manpages/traffic_learner.7.xml
@@ -107,6 +107,42 @@
 </refsect1>
 
 <refsect1>
+	<title>OUTPUT FILE FORMAT</title>
+
+	<para>The output model file describes a Markov model estimating the
+	probability of a packet occurring given the last two packets.</para>
+
+	<para>The file is JSON. It holds n-grams of the form (x, y) -> { a : 2, b : 3 }
+	as well as a conversation rate and a DNS traffic rate.</para>
+
+	<para>Each n-gram entry has the following layout, where
+	<literal>\t</literal> is a tab character separating the two
+	preceding packets:</para>
+
+	<programlisting>
+[Packet x] \t [Packet y] : {
+    [Packet a] : [count of packet],
+    [Packet b] : [count of packet],
+    "-" : [count of wait]
+}
+	</programlisting>
+
+	<para>[Packet x] and [Packet y] are the two most recent packets,
+	in the order they occurred.</para>
+
+	<para>[Packet a] and [Packet b] are packets that were observed
+	to follow that pair. Each count is the number of times that
+	continuation was observed in the traffic-summary
+	input.</para>
+
+	<para>NOTE: "-" represents a wait rather than a packet.</para>
+
+	<para>The counts form ratios that determine how likely each packet
+	is to be sent next.</para>
+
+</refsect1>
+
+<refsect1>
 	<title>VERSION</title>
 
 	<para>This man page is complete for version 4 of the Samba
-- 
1.9.1
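
To illustrate the output format documented in this patch, here is a
minimal Python sketch of how the counts in one n-gram entry could be
turned into ratios and used to pick the next packet. It is an
illustration only, not the traffic_replay code. The helper names and
the entry contents (packets "x", "y", "a", "b" and their counts) are
hypothetical, and the entry is assumed to be keyed by the two previous
packets joined with a tab.

```python
import random


def next_packet_probabilities(counts):
    """Turn the raw counts of one n-gram entry into probabilities."""
    total = sum(counts.values())
    return {packet: n / total for packet, n in counts.items()}


def choose_next(model, x, y, rng=random):
    """Pick the next packet after (x, y), weighted by its count.

    A result of "-" means "wait" rather than "send a packet".
    """
    counts = model["\t".join((x, y))]
    packets = list(counts)
    weights = [counts[p] for p in packets]
    return rng.choices(packets, weights=weights, k=1)[0]


# Hypothetical entry: after x then y, "a" was seen twice, "b" three
# times, and a wait five times.
model = {"x\ty": {"a": 2, "b": 3, "-": 5}}

probs = next_packet_probabilities(model["x\ty"])
picked = choose_next(model, "x", "y", rng=random.Random(0))
```

Here `probs` gives "a" a weight of 2 in 10, "b" 3 in 10 and a wait 5 in
10. Passing a seeded `random.Random` makes a run repeatable, which is
only a convenience of this sketch.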