Retrieve tracking data since account inception using WSProxy

Whether you’re building a data warehouse or are dealing with a legal compliance issue, access to historical tracking data from your Salesforce Marketing Cloud account using the out-of-the-box features is limited.

Data Views are powerful, but only allow you to query for up to six months of subscriber and tracking information.

Tracking extracts provide granular tracking data regarding several different aspects of email send jobs, such as clicks, bounces and unsubscribes. They are not time-restricted like Data Views, but if you want to get tracking data from a specific time range, the start and end dates can only be up to 30 days apart.

The third option to access historical data is to reach out to Salesforce support and ask them for a data export for a specific time range, but this service will come at a hefty price.

The method of retrieving tracking data using SOAP API and WSProxy also has some limitations, mostly related to performance when working with big data volumes, but it can be very useful in many cases where using Data Views or Tracking Extracts is not enough.

Retrieve Tracking Data with SOAP API

To retrieve tracking data using the SOAP API, we will call the SentEvent, OpenEvent, ClickEvent, BounceEvent and UnsubEvent objects, which will allow us to create a structure similar to the one we know from Data Views. Each of those objects has a persistent set of properties, which include: BatchID, ClientID, EventDate, SendID, SubscriberKey and TriggeredSendDefinitionObjectID.

Here is an example SOAP envelope that you could use in Postman to retrieve data from any of the tracking event objects:

<?xml version="1.0" encoding="UTF-8"?>
<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope&quot; xmlns:a="http://schemas.xmlsoap.org/ws/2004/08/addressing&quot; xmlns:u="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd"&gt;
<s:Header>
<a:Action s:mustUnderstand="1">Retrieve</a:Action>
<a:ReplyTo>
<a:Address>http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</a:Address&gt;
</a:ReplyTo>
<a:To s:mustUnderstand="1">https://{{your endpoint}}.soap.marketingcloudapis.com/Service.asmx</a:To>
<fueloauth xmlns="http://exacttarget.com"&gt;{{accessToken}}</fueloauth>
</s:Header>
<s:Body>
<RetrieveRequestMsg xmlns="http://exacttarget.com/wsdl/partnerAPI"&gt;
<RetrieveRequest>
<ObjectType>SentEvent</ObjectType>
<Properties>Client.ID</Properties>
<Properties>TriggeredSendDefinitionObjectID</Properties>
<Properties>SendID</Properties>
<Properties>SubscriberKey</Properties>
<Properties>EventDate</Properties>
<Properties>EventType</Properties>
<Properties>BatchID</Properties>
<Properties>ListID</Properties>
</RetrieveRequest>
</RetrieveRequestMsg>
</s:Body>
</s:Envelope>

Interact with SOAP API using WSProxy

To retrieve data from the tracking event objects using WSProxy, we will have to specify the properties to retrieve and perform the retrieve on each of the objects we are interested in. Here’s an example of retrieving data from the SentEvent object:

<script runat="server">
Platform.Load("Core","1");
var prox = new Script.Util.WSProxy();
var cols = [
"SendID",
"SubscriberKey",
"EventDate",
"EventType",
"TriggeredSendDefinitionObjectID",
"BatchID",
"Client.ID",
"ListID"
];
var data = prox.retrieve("SentEvent", cols);
Write(Stringify(data));
</script>

We can also add a filter to narrow down the results. You can filter the results by JobID, SubscriberKey or any other object property. Here, we are going to filter by EventDate, which will show us all the events that took place in the given timeframe:

<script runat="server">
Platform.Load("Core","1");
var prox = new Script.Util.WSProxy();
var cols = [
"SendID",
"SubscriberKey",
"EventDate",
"EventType",
"TriggeredSendDefinitionObjectID",
"BatchID",
"Client.ID",
"ListID"
];
var filter = {Property: "EventDate", SimpleOperator: "between", Value: ["2018-01-01T03:00:00.000Z", "2019-02-02T02:59:59.999Z"]};
var data = prox.retrieve("SentEvent", cols, filter);
Write(Stringify(data));
</script>

If you run the above script in an account that sends large volumes of emails, you will notice that it only pulls a part of the results and displays a status message: MoreDataAvailable. That’s because the SOAP API returns up to 2500 records at a time per retrieve call. To retrieve all available data for a given timeframe, we need to include pagination of retrieves in our script. Let’s also add an HTML table, so that data is displayed in a more accessible way:

<table border="1">
<tr>
<th>Ordinal</th>
<th>SendID</th>
<th>SubscriberKey</th>
<th>Eventhate</th>
<th>EventType</th>
<th>TriggeredSendDefinitionObjectID</th>
<th>BatchID</th>
<th>ClientID</th>
<th>ListID</th>
</tr>
<script runat="server">
Platform.Load("Core","1.1.1");
try{
var prox = new Script.Util.WSProxy(),
objectType = "SentEvent",
cols = [
"SendID",
"SubscriberKey",
"EventDate",
"EventType",
"TriggeredSendDefinitionObjectID",
"BatchID",
"Client.ID",
"ListID"
],
filter = {Property: "EventDate", SimpleOperator: "between", Value: ["2016-01-01T03:00:00.000Z", "2017-02-02T02:59:59.999Z"]},
moreData = true,
reqID = null,
numItems = 0;
while(moreData) {
moreData = false;
var data = reqID == null ?
prox.retrieve(objectType, cols, filter):
prox.getNextBatch(objectType, reqID);
if(data != null) {
moreData = data.HasMoreRows;
reqID = data.RequestID;
if(data && data.Results) {
for(var i=0; i< data.Results.length; i++) {
var j = +i +1
Write("<tr>");
Write("<td>" + j + "</td>");
Write("<td>" + data.Results[i].SendID + "</td>");
Write("<td>" + data.Results[i].SubscriberKey + "</td>");
Write("<td>" + data.Results[i].EventDate + "</td>");
Write("<td>" + data.Results[i].EventType + "</td>");
Write("<td>" + data.Results[i].TriggeredSendDefinitionObjectID + "</td>");
Write("<td>" + data.Results[i].BatchID + "</td>");
Write("<td>" + data.Results[i].Client.ID + "</td>");
Write("<td>" + data.Results[i].PartnerProperties[0].Value + "</td>");
Write("</tr>");
numItems++;
}
}
}
}
Platform.Response.Write("<b><br>" + numItems + " total " + objectType + ":<br><br></b>");
}catch(e){
Write(Stringify(e));
}
</script>
</table>

You can do the same with the remaining objects, here are the scripts:

Write retrieved data into a Data Extension

Although this will increase the processing time of the call, it will allow you to have the data ready for further processing without any additional preparations. Below script will create a new Data Extension called 01_SentEvent in your main Data Extensions folder and insert all the retrieved data.

<script runat="server">
Platform.Load("core","1.1");
try{
//Set time range of retrieved tracking data
var filter = {
Property: "EventDate",
SimpleOperator: "between",
Value: ["2016-01-01T00:00:00.000Z", "2017-10-30T00:00:00.000Z"]
};
//Set Data Extension properties
var fields = [
{ "Name" : "SendID", "FieldType" : "Number"},
{ "Name" : "SubscriberKey", "FieldType" : "Text", "MaxLength" : 50 },
{ "Name" : "EventDate", "FieldType" : "Date", "Ordinal" : 2 },
{ "Name" : "EventType", "FieldType" : "Text", "MaxLength" : 50 },
{ "Name" : "TriggeredSendDefinitionObjectID", "FieldType" : "Text", "MaxLength" : 50 },
{ "Name" : "BatchID", "FieldType" : "Number"},
{ "Name" : "ClientID", "FieldType" : "Number"},
{ "Name" : "ListID", "FieldType" : "Number"}
]
var SentEventDE = {
"CustomerKey" : "01_SentEvent",
"Name" : "01_SentEvent",
"Fields" : fields
};
//Create Data Extensions
var SentDE = DataExtension.Add(SentEventDE);
//Set columns to retrieve
var cols = ["SendID","SubscriberKey","EventDate","EventType","TriggeredSendDefinitionObjectID","BatchID","Client.ID","ListID"];
//Initiate data extensions
var logSentData = DataExtension.Init("01_SentEvent");
//Initiate WSProxy and set object properties
var prox = new Script.Util.WSProxy(),
objectType = "SentEvent",
moreData = true,
reqID = null,
numItems = 0;
//Retrieve data using pagination and insert data to SentEvent DE
while(moreData) {
moreData = false;
var data = reqID == null ?
prox.retrieve(objectType, cols, filter):
prox.getNextBatch(objectType, reqID);
if(data != null) {
moreData = data.HasMoreRows;
reqID = data.RequestID;
if(data && data.Results) {
for(var i=0; i< data.Results.length; i++) {
logSentData.Rows.Add({
SendID: data.Results[i].SendID,
SubscriberKey: data.Results[i].SubscriberKey,
EventDate: data.Results[i].EventDate,
EventType: data.Results[i].EventType,
TriggeredSendDefinitionObjectID: data.Results[i].TriggeredSendDefinitionObjectID,
BatchID: data.Results[i].BatchID,
ClientID: data.Results[i].Client.ID,
ListID: data.Results[i].PartnerProperties[0].Value
});
numItems++;
}
}
}
}
Platform.Response.Write("<br>Inserted " + numItems + " records<br>")
}catch(e){
Write(Stringify(e));
}
</script>

Here are the scripts that will do the same for the remaining objects:

Performance of the retrieve calls

The official Marketing Cloud SOAP API documentation states the following:

In most production implementations, data volume in the account requires you to include specific JobIDs in the filter criteria of the request. If requests that don’t specify a JobID time out during processing, add a JobID in the filter. If specifying a JobID is not possible, or if your implementation requires a broad range of JobIDs, use a data extract-based procedure instead.

Above is true if you are sending heavy volumes of emails each month, but you are not limited to filtering by JobID. You can include any of the object properties in filter criteria. You can, for example, retrieve all send and tracking history for a specific subscriber if you use the SubscriberKey in the filter: filter = {Property: "SubscriberKey", SimpleOperator: "equals", Value: "{{SubscriberKey}}"}

If you’re experiencing time outs when working with the script, try narrowing down the amount of retrieved data by using filters.

Security

Last but not least, always remember to put security measures into practice when setting up this kind of functionality on a CloudPage to prevent your data from being exposed in a breach. For the script to run and either return results onscreen or insert them into a Data Extension, you don’t have to actually publish the CloudPage – just press “Publish”, without further saving it. Or, to be completely safe – create a Script Activity in Automation Studio and run it from there.

6 thoughts on “Retrieve tracking data since account inception using WSProxy

  1. Curious if you’ve gotten any results from a non-JobID filter on the SentEvent object. Based on my testing, SentEvent is pretty much useless if you can’t expand the reach beyond the _Sent data view.

    Like

  2. wvpv

    Curious if you’ve been able to a meaningful retrieve SentEvent data without using JobId. In my testing, it’s pretty much useless if you can’t retrieve anything beyond the _Sent data view.

    Like

    1. wvpv

      How large of a range? And what about the normal daily send volume in the account? This is the one elusive piece in a thing I’m building. I just can’t find a way to reliably retrieve SentEvents for a single subscriber in a high-volume sending account.

      Like

  3. I was able to get ~5 years for a single subscriber, but you’re right, the account’s volume isn’t that high – maybe around 5k emails per day.
    Does it time out, or are you experiencing other issues?

    Like

Leave a comment